CVE-2024-21538
Regular Expression Denial of Service (ReDoS) vulnerability in cross-spawn (npm)

Regular Expression Denial of Service (ReDoS) | No known exploit | Fixable by Resolved Security

What is CVE-2024-21538 About?

This vulnerability is a Regular Expression Denial of Service (ReDoS) in the argument-escaping code of the `cross-spawn` npm package. Input is not properly sanitized before it reaches a backtracking-prone regular expression, so an attacker who can supply a very large, specially crafted string can drive up CPU usage and hang or crash the program. Exploitation is relatively easy if an attacker can control a string that the application passes to cross-spawn as a command argument.

Affected Software

  • cross-spawn
    • <6.0.6
    • >=7.0.0, <7.0.5

Technical Details

The escapeArgument function in affected cross-spawn versions uses regular expressions to find runs of backslashes that precede a double quote or the end of the string, so that those backslashes can be doubled. When the input contains a very long run of backslashes that is not followed by the expected character, the regex engine retries the match from each position in the run and backtracks through the rest of it each time, so the work grows far faster than the input length. A sufficiently large crafted argument therefore keeps the process busy in regex matching for a long time. Because Node.js performs this matching synchronously, the event loop is blocked while it runs and the application stops responding.
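The backtracking behind CVE-2024-21538 can be illustrated with a minimal sketch. It is not the cross-spawn source and not the upstream patch: `escapeWithRegex` and `escapeWithScan` are hypothetical names, the regular expressions are only assumed to have the same general shape as the problematic ones (a run of backslashes followed by a quote or by the end of the string), and the single-pass scan is a swapped-in alternative chosen here because its cost is easy to reason about.

```javascript
// Illustrative only: hypothetical helpers, not cross-spawn's code or its fix.

// Regex-based escaping with a backtracking-prone shape. On a long run of
// backslashes with no quote after it, the engine restarts at every position
// in the run and rescans the remainder, so the cost grows quadratically.
function escapeWithRegex(arg) {
  return arg
    .replace(/(\\*)"/g, '$1$1\\"') // backslashes before a quote: double them, escape the quote
    .replace(/(\\*)$/, '$1$1');    // backslashes at the end: double them
}

// Single-pass alternative: count backslashes as they are seen and decide how
// many to emit once the next character is known. Each character is visited once.
function escapeWithScan(arg) {
  let out = '';
  let pending = 0; // backslashes seen but not yet written
  for (const ch of arg) {
    if (ch === '\\') {
      pending += 1;
      continue;
    }
    if (ch === '"') {
      out += '\\'.repeat(pending * 2) + '\\"';
    } else {
      out += '\\'.repeat(pending) + ch;
    }
    pending = 0;
  }
  // Trailing backslashes are doubled, matching the second regex above.
  return out + '\\'.repeat(pending * 2);
}

// Small inputs: both approaches agree.
for (const sample of ['plain', 'a\\"b', 'dir\\']) {
  console.log(JSON.stringify(sample), '->', JSON.stringify(escapeWithScan(sample)));
}

// A long backslash run is handled by the scan without rescanning.
// Passing an input like this to escapeWithRegex is what triggers the slowdown.
const longRun = '\\'.repeat(200000) + 'x';
console.log('scan output length:', escapeWithScan(longRun).length);
```

Timing `escapeWithRegex` on progressively longer backslash runs is a simple way to observe the growth on a given runtime; exact figures depend on the JavaScript engine and hardware.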

What is the Impact of CVE-2024-21538?

Successful exploitation may allow attackers to exhaust CPU in the affected process, causing the application to hang, become unresponsive, or crash. The result is a denial of service; the flaw does not by itself expose or modify data.

What is the Exploitability of CVE-2024-21538?

Exploitation is of low complexity. An attacker requires the ability to control, in whole or in part, a string that the application passes to cross-spawn as a command argument. No authentication or specific privileges are inherently required; whether they are needed depends on how the application exposes that input. The attack can be remote when the input arrives over the network, and it consists of submitting a very long string crafted to trigger the backtracking, such as a long run of backslashes. The primary constraint is that attacker-influenced data must actually reach the argument-escaping code. The risk of exploitation is higher in applications that build command arguments from user-supplied values such as file names, paths, or form fields, and lower where arguments are fixed by the developer.

What are the Known Public Exploits?

No known exploits

What are the Available Fixes for CVE-2024-21538?

A Fix by Resolved Security Exists!
Fix open-source vulnerabilities without upgrading your dependencies.

About the Fix from Resolved Security

The patch modifies the regular expressions in the escapeArgument function to avoid excessive backtracking, preventing denial-of-service attacks caused by carefully crafted input strings. It restructures the backslash-matching capture groups around lookaheads, which addresses the ReDoS vulnerability identified as CVE-2024-21538: inefficient regex processing that could hang the application.

Available Upgrade Options

  • cross-spawn
    • <6.0.6 → Upgrade to 6.0.6
    • >=7.0.0, <7.0.5 → Upgrade to 7.0.5
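The upgrade mapping above can be expressed as a small check, sketched here under a few assumptions: `recommendedUpgrade` and its helpers are hypothetical names, only plain `major.minor.patch` strings are handled (no pre-release tags or range syntax), and 7.0.0 itself is treated as part of the affected 7.x range.

```javascript
// Hypothetical helper: maps an installed cross-spawn version to the fixed
// release listed in the upgrade options, or null if it is outside both ranges.

function parseVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version.trim());
  if (!match) {
    throw new Error(`Unsupported version string: ${version}`);
  }
  return match.slice(1).map(Number); // [major, minor, patch]
}

// Returns -1, 0, or 1, comparing major, then minor, then patch.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i += 1) {
    if (a[i] !== b[i]) {
      return a[i] < b[i] ? -1 : 1;
    }
  }
  return 0;
}

function recommendedUpgrade(installed) {
  const v = parseVersion(installed);
  if (compareVersions(v, [6, 0, 6]) < 0) {
    return '6.0.6'; // affected range: <6.0.6
  }
  if (compareVersions(v, [7, 0, 0]) >= 0 && compareVersions(v, [7, 0, 5]) < 0) {
    return '7.0.5'; // affected range: 7.0.0 up to, but not including, 7.0.5
  }
  return null; // not in a listed affected range
}

for (const version of ['5.1.0', '6.0.5', '6.0.6', '7.0.3', '7.0.5']) {
  console.log(version, '->', recommendedUpgrade(version));
}
```

The installed version can usually be read from the lockfile or from the package's own `package.json` inside `node_modules`; transitive copies of cross-spawn need the same check.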

Additional Resources

What are Similar Vulnerabilities to CVE-2024-21538?

Similar Vulnerabilities: CVE-2023-38035, CVE-2023-36845, CVE-2023-32314, CVE-2022-45868, CVE-2022-38685