Skip to main content

Exposure of Sensitive Information to an Unauthorized Actor

CVE-2021-41124

Severity High
Score 7.5/10

Summary

Scrapy-splash is a library which provides Scrapy and JavaScript integration. In affected versions users who use [`HttpAuthMiddleware`](http://doc.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.httpauth) (i.e. the `http_user` and `http_pass` spider attributes) for Splash authentication will have any non-Splash request expose your credentials to the request target. This includes `robots.txt` requests sent by Scrapy when the `ROBOTSTXT_OBEY` setting is set to `True`. Upgrade to scrapy-splash 0.8.0 and use the new `SPLASH_USER` and `SPLASH_PASS` settings instead to set your Splash authentication credentials safely. If you cannot upgrade, set your Splash request credentials on a per-request basis, [using the `splash_headers` request parameter](https://github.com/scrapy-plugins/scrapy-splash/tree/0.8.x#http-basic-auth), instead of defining them globally using the [`HttpAuthMiddleware`](http://doc.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.httpauth). Alternatively, make sure all your requests go through Splash. That includes disabling the [robots.txt middleware](https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#topics-dlmw-robots).

  • LOW
  • NETWORK
  • NONE
  • UNCHANGED
  • NONE
  • NONE
  • HIGH
  • NONE

CWE-200 - Information Exposure

An information exposure vulnerability is categorized as an information flow (IF) weakness, which can potentially allow unauthorized access to otherwise classified information in the application, such as confidential personal information (demographics, financials, health records, etc.), business secrets, and the application's internal environment.

Advisory Timeline

  • Published