Security researchers have apparently discovered more than 1.6 million secrets leaked by websites, including more than 395,000 exposed by the one million most popular domains.
Modern web applications typically embed API keys, cryptographic secrets, and other credentials within JavaScript files in client-side source code.
Aided by a tool developed specifically for the task, researchers from RedHunt Labs sought information disclosure vulnerabilities via a “non-intrusive” probe of millions of website home pages and exceptions thrown by debug pages used in popular frameworks.
“The number of secrets exposed via the front end of hosts is alarmingly huge,” said Pinaki Mondal, security researcher at RedHunt Labs, in a blog post.
“Once a valid secret gets leaked, it paves the path for lateral movement amongst attackers, who may decide to abuse the business service account leading to financial losses or total compromise.”
Millions of secrets
The first of two mammoth scans focused on the one million most heavily trafficked websites. It yielded 395,713 secrets, three quarters of which (77%) were related to Google services reCAPTCHA, Google Cloud, or Google OAuth.
Google’s reCAPTCHA alone accounted for more than half (212,127) of these secrets – and the top five exposed secret types was completed by messaging app LINE and Amazon Web Services (AWS).
Phase two, which involved scanning around 500 million hosts, surfaced 1,280,920 secrets, most commonly pertaining to Stripe, followed by Google reCAPTCHA, Google Cloud API, AWS, and Facebook.
A majority of exposures across both phases – 77% – occurred in frontend JavaScript files.
Most JavaScript was served through content delivery networks (CDNs), with the Squarespace CDN leading the way with over 197,000 exposures.
Mondal blamed the “decades”-old problem of leaked secrets on the “complexities of the software development lifecycle”, adding: “As the code-base enlarges, developers often fail to redact the sensitive data before deploying it to production.”
‘Non-intrusive’ research
The RedHunt Labs research team told The Daily Swig that they are still “continuously reporting the secrets through automation to their source domains provided they have an email [address] mentioned on their home page”.
The researchers said they had encountered no legal problems related to the research so far.
“We received a few abuse reports against the boxes on which the scan was run and we have handled them,” they said.
The “extremely non-intrusive” process involved no “more than a few HTTP requests per domain” and no written actions – “only read requests to HTTP URLs and JavaScript files were sent”.
The captured secrets, meanwhile, are “stored on an encrypted volume with access to very limited folks” and “will be disposed of after a month”, added the researchers.
Red Hunt Labs has open-sourced the tool developed for the research and created a demonstration video:
Called HTTPLoot, it can crawl and scrape URLs asynchronously, check for leaked secrets in JavaScript files, find and complete forms to trigger error/debug pages, extract secrets from debug pages, and automatically detect tech stacks.
Redhunt Labs has set out four best practices for preventing and mitigating leaked secrets, including setting restrictions on access keys, centrally managing secrets in a restricted environment or config file, setting up alerts for leaked secrets, and continuously monitoring source code for information leakage issues.