CVE-2020-13092: Deserialization vulnerability in scikit-learn (PyPI)

What is CVE-2020-13092 About?

This vulnerability in scikit-learn exists because `joblib.load()` deserializes pickle-based files and can execute commands embedded in them. If a malicious file passed to this function contains a pickled object whose `__reduce__` method returns a call to `os.system`, loading the file leads to arbitrary command execution. Exploitation requires the ability to supply a crafted file for deserialization. The issue is disputed because `joblib.load()` is documented as unsafe.

Affected Software

scikit-learn
- <0.23.1
- <=0.23.0

Technical Details

scikit-learn (aka sklearn) through version 0.23.0 relies on `joblib.load()` to deserialize model files. `joblib.load()` is built on Python's pickle machinery and can therefore reconstruct arbitrary Python objects. Under the pickle protocol, an object's `__reduce__` method may return a callable together with its arguments, and the unpickler invokes that callable while rebuilding the object. An attacker who can supply a crafted file (e.g., a pickled model) to a system that later calls `joblib.load()` on it can have `__reduce__` return `os.system` with attacker-controlled commands, so the application executes those commands on the host as soon as the file is loaded. Third parties dispute that this is a vulnerability in scikit-learn itself, noting that `joblib.load()` is explicitly documented as unsafe and that responsibility for secure usage lies with the developer.
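The mechanism can be illustrated with a deliberately harmless sketch. It uses the standard library's `pickle` module rather than `joblib`, and it substitutes the built-in `print` for `os.system`; the class name `Demo` is hypothetical. The point is only that the callable returned by `__reduce__` runs during loading, before the application has any chance to inspect the result.

```python
import io
import pickle
from contextlib import redirect_stdout


class Demo:
    """Hypothetical class whose pickled form triggers a call on load."""

    def __reduce__(self):
        # Instructs the unpickler: "to rebuild this object, call
        # print('...')". A real attack would return os.system and a
        # shell command here; print is a benign stand-in.
        return (print, ("callable invoked during unpickling",))


# Serializing stores a reference to the callable plus its arguments.
blob = pickle.dumps(Demo())

# Deserializing invokes the callable. Output is captured so the side
# effect can be observed programmatically.
buffer = io.StringIO()
with redirect_stdout(buffer):
    restored = pickle.loads(blob)

output = buffer.getvalue()
```

After loading, `restored` is whatever the callable returned (here `None`), not a `Demo` instance, and `output` holds the text that `print` wrote during deserialization.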

What is the Impact of CVE-2020-13092?

Successful exploitation may allow attackers to execute arbitrary commands on the underlying system, leading to full system compromise, data theft, or denial of service.

What is the Exploitability of CVE-2020-13092?

Exploitation complexity is moderate. An attacker must be able to supply a malicious serialized file (e.g., a pickle file) to an application that loads it with `joblib.load()` without adequate trust boundaries or validation. This might involve tricking a user into uploading a malicious file or compromising a data source from which the application loads models.

Authentication and privilege requirements depend entirely on the context in which `joblib.load()` is called. If the application processes user-uploaded content, no prior authentication may be needed; if it loads internal models, the attacker must first compromise the model storage. The issue is remotely exploitable when file upload or external data processing is exposed. The primary risk factor is any application that loads serialized scikit-learn models (or other joblib-compatible objects) from untrusted sources without robust security checks, despite `joblib.load()` being documented as unsafe.
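One possible way to enforce a trust boundary is to authenticate a serialized file before deserializing it. The sketch below is an illustration under stated assumptions, not a scikit-learn or joblib feature: the helper names `sign` and `load_if_trusted` are hypothetical, the secret key is assumed to be stored where an attacker cannot read it, and `pickle.loads` stands in for `joblib.load()`. It only helps when the producer of the file is itself trusted; it does not make untrusted pickles safe.

```python
import hashlib
import hmac
import pickle


def sign(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the serialized bytes."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def load_if_trusted(data: bytes, expected_tag: str, key: bytes):
    """Deserialize only if the bytes match the tag created at save time.

    Hypothetical helper: in a real application the final call would be
    the project's own loader (for example joblib.load on a file).
    """
    actual_tag = sign(data, key)
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(actual_tag, expected_tag):
        raise ValueError("integrity check failed; refusing to deserialize")
    return pickle.loads(data)


# Usage: the trusted producer signs the artifact when saving it ...
key = b"example-key-kept-out-of-attacker-reach"  # assumption: managed secret
artifact = pickle.dumps({"coef": [0.1, 0.2]})
tag = sign(artifact, key)

# ... and the consumer verifies the tag before loading.
model = load_if_trusted(artifact, tag, key)
```

If the bytes are altered after signing, `load_if_trusted` raises `ValueError` before any deserialization takes place.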

What are the Known Public Exploits?

PoC Author	Link	Commentary
No known exploits

What are the Available Fixes for CVE-2020-13092?

Available Upgrade Options

scikit-learn
- <0.23.1 → Upgrade to 0.23.1

Additional Resources

What are Similar Vulnerabilities to CVE-2020-13092?

Similar Vulnerabilities: CVE-2020-8025, CVE-2019-14272, CVE-2017-1000350, CVE-2018-1000656, CVE-2018-1000101
