CVE-2021-41125
5.7 MEDIUM
Scrapy is a high-level web crawling and scraping framework for Python
Published: 2021-10-06 · Last updated: 2026-06-17
Severity and scoring
- CVSS: 5.7 MEDIUM
- Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:U/C:H/I:N/A:N
- CWE: CWE-200, CWE-522
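The 5.7 score can be reproduced from the vector string with the CVSS v3.1 base-score equations. The following is a minimal sketch, not a full calculator: it handles only Scope: Unchanged vectors such as this one, the metric weights are the ones published in the CVSS v3.1 specification, and the names `WEIGHTS`, `roundup`, and `base_score` are hypothetical helpers chosen for illustration.

```python
import math

# Metric weights from the CVSS v3.1 specification.
# The PR weights below apply to Scope: Unchanged only.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1 Appendix A."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (math.floor(int_input / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score of a CVSS 3.1 vector with Scope: Unchanged."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are handled in this sketch")
    metrics = dict(part.split(":") for part in parts)
    if metrics["S"] != "U":
        raise ValueError("only Scope: Unchanged is handled in this sketch")

    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:U/C:H/I:N/A:N"))  # 5.7
```

For this vector the impact sub-score is about 3.60 and the exploitability sub-score about 2.07; their sum, 5.66, rounds up to 5.7.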
Affected products
| Vendor | Product |
|---|---|
| debian | debian_linux |
| scrapy | scrapy |
Description
Scrapy is a high-level web crawling and scraping framework for Python. If you use `HttpAuthMiddleware` (i.e. the `http_user` and `http_pass` spider attributes) for HTTP authentication, all requests will expose your credentials to the request target. This includes requests generated by Scrapy components, such as `robots.txt` requests sent by Scrapy when the `ROBOTSTXT_OBEY` setting is set to `True`, as well as requests reached through redirects.
Upgrade to Scrapy 2.5.1 and use the new `http_auth_domain` spider attribute to control which domains are allowed to receive the configured HTTP authentication credentials. If you are using Scrapy 1.8 or a lower version, and upgrading to Scrapy 2.5.1 is not an option, you may upgrade to Scrapy 1.8.1 instead.
If you cannot upgrade, set your HTTP authentication credentials on a per-request basis instead of defining them globally with `HttpAuthMiddleware`. For example, use the `w3lib.http.basic_auth_header` function to convert your credentials into a value that you can assign to the `Authorization` header of your request.
Source: NVD
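The idea behind both mitigations in the description (restricting credentials to one domain, or attaching them per request) can be sketched with the standard library alone. This is an illustration under stated assumptions, not Scrapy's or w3lib's implementation: `basic_auth_value`, `host_matches`, and `headers_for` are hypothetical helpers, the helper here stands in for `w3lib.http.basic_auth_header`, and treating subdomains of the allowed domain as matching is an assumption made for the sketch.

```python
import base64
from urllib.parse import urlsplit


def basic_auth_value(username: str, password: str) -> str:
    """Build an HTTP Basic `Authorization` header value.

    Hypothetical stand-in for w3lib.http.basic_auth_header.
    """
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def host_matches(url: str, auth_domain: str) -> bool:
    """Return True if the URL's host is auth_domain or one of its subdomains.

    The subdomain rule is an assumption of this sketch.
    """
    host = (urlsplit(url).hostname or "").lower()
    auth_domain = auth_domain.lower()
    return host == auth_domain or host.endswith("." + auth_domain)


def headers_for(url: str, username: str, password: str, auth_domain: str) -> dict:
    """Attach credentials only when the request targets the allowed domain."""
    headers = {}
    if host_matches(url, auth_domain):
        headers["Authorization"] = basic_auth_value(username, password)
    return headers


# Credentials are sent to the intended site...
print(headers_for("https://intranet.example.com/report", "user", "pass", "example.com"))
# ...but not to an unrelated host reached through a redirect or a robots.txt fetch.
print(headers_for("https://other.test/robots.txt", "user", "pass", "example.com"))
```

In Scrapy itself, the equivalent on a fixed version is to set `http_auth_domain` on the spider alongside `http_user` and `http_pass`, or to set the `Authorization` header on individual requests.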
References
- [NVD](https://nvd.nist.gov/vuln/detail/CVE-2021-41125)
- [Vendor advisory](http://doc.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.httpauth)
- [Patch](https://github.com/scrapy/scrapy/commit/b01d69a1bf48060daec8f751368622352d8b85a6)
- [Other](https://github.com/scrapy/scrapy/security/advisories/GHSA-jwqp-28gf-p498)
- [Other](https://lists.debian.org/debian-lts-announce/2022/03/msg00021.html)
- [Other](https://w3lib.readthedocs.io/en/latest/w3lib.html#w3lib.http.basic_auth_header)
Related CVEs
Same vendor
- CVE-2026-49975 — Memory Allocation with Excessive Size Value vulnerability in Apache HTTP Server's mod_http leads to denial of service via malicious HTTP ... (7.5 HIGH)
- CVE-2026-31431 — In the Linux kernel, the following vulnerability has been resolved: crypto: algif_aead - Revert to operating out-of-place This mostly r... (7.8 HIGH)
- CVE-2026-4775 — A flaw was found in the libtiff library (7.8 HIGH)
- CVE-2026-3497 — Vulnerability in the OpenSSH GSSAPI delta included in various Linux distributions (7.5 HIGH)
- CVE-2026-2219 — It was discovered that dpkg-deb (a component of dpkg, the Debian package management system) does not properly validate the end of the dat... (7.5 HIGH)
Same CWE
- CVE-2026-12117 — Improper access control in the social login connection endpoint in Devolutions Server 2026.2.5 allows an authenticated vault member to ...
- CVE-2026-53840 — OpenClaw before 2026.5.12 contains an information disclosure vulnerability in streamable-http MCP servers that forwards operator-configur... (7.1 HIGH)
- CVE-2026-12320 — Information disclosure in the Password Manager component (4.3 MEDIUM)
- CVE-2026-12311 — Information disclosure, sandbox escape in the Security: Process Sandboxing component (4.7 MEDIUM)
- CVE-2026-50870 — An information disclosure vulnerability in the configuration endpoint of Ben Busby whoogle-search v1.2.3 allows attackers to obtain sensi... (7.5 HIGH)