CVE-2026-4372: Hugging Face Transformers RCE via Malicious Config.json; 232 Million Downloads Pre-Patch Expose GPU-Accelerated AI Inference
Date: 2026-06-06
Tags: supply-chain, model-poisoning
Executive Summary
CVE-2026-4372 affects multiple versions of Hugging Face Transformers when the optional kernels package is installed; although the package is not enabled by default, it is commonly used in GPU-accelerated inference environments and is often included through the transformers[all] installation option. Vulnerable Transformers versions were downloaded about 232 million times before a patch was released, creating supply chain risk for organizations using third-party AI models. Researchers at Pluto disclosed a remote code execution vulnerability that bypasses the library's built-in trust_remote_code=False security control; CVE-2026-4372 allows remote code execution through malicious Hugging Face model configurations, bypassing the library's trust_remote_code=False security control.
Campaign Summary
| Field | Detail |
|---|---|
| Campaign / Malware | Hugging Face Transformers Exploitation via Malicious Model Config |
| Attribution | Unknown; Potential supply chain attack vector (confidence: medium) |
| Target | Organizations using Hugging Face Transformers with GPU-accelerated inference (common in cloud ML pipelines, MLOps platforms, production LLM serving) |
| Vector | Malicious model uploaded to Hugging Face with poisoned config.json; victim organization loads model using from_pretrained(); arbitrary code executes during deserialization before trust_remote_code checks |
| Status | active |
| First Observed | 2026-06-06 |
Detailed Findings
The flaw originates in how Transformers processes model configuration files (config.json); the library relied on a generic setattr() mechanism that applied configuration parameters directly to internal objects, including private attributes that were never intended to be influenced by untrusted input; attackers could manipulate internal settings through a specially crafted model configuration. One poisoned field in a model's config.json silently executes arbitrary code on anyone who loads it with no special flags, no warnings, just the standard from_pretrained() call. Organizations using vulnerable versions of the Hugging Face Transformers library could unknowingly execute attacker-controlled code simply by loading a malicious AI model; the RCE vulnerability bypasses the library's built-in trust_remote_code=False security control, potentially exposing cloud credentials, SSH keys, API tokens, and other sensitive assets. The vulnerability is particularly dangerous in GPU-accelerated inference environments, where Transformers runs with elevated privileges and direct cloud provider access.
MITRE ATT&CK Mapping
| Technique | ID | Context |
|---|---|---|
| Supply Chain Compromise | T1195.001 | Malicious model uploaded to Hugging Face, a trusted model registry, compromises downstream Transformers users |
| Code Injection | T1059.006 | Arbitrary Python code embedded in model config.json executes on deserialization |
| Credential Access | T1552.001 | Executed code accesses environment variables, cloud metadata endpoints, and stored credentials |
IOCs
Domains
_No specific malicious model hash or URL published; focus on detecting Transformers versions and config.json deserialization anomalies; monitor Hugging Face for suspicious model config files using setattr on private attributes_
Full URL Paths
https://huggingface.co/ (legitimate platform; malicious models can be hosted here)
Splunk Format
"https://huggingface.co/ (legitimate platform; malicious models can be hosted here)"
Package Indicators
Hugging Face Transformers (vulnerable versions before patch)
Detection Recommendations
Upgrade Hugging Face Transformers immediately to patched version; disable transformers[all] installation in favor of explicit dependency specification; implement model signature verification using cryptographic attestation before loading from Hugging Face; deploy cloud credential monitoring to detect unauthorized credential enumeration originating from model loading processes; use network monitoring to block unexpected egress from ML inference servers; validate config.json files for suspicious setattr() calls or private attribute modifications before model load; implement DLP on ML inference servers to prevent credential exfiltration via model-generated outbound connections.
References
- [TechRepublic] Malicious Hugging Face Models Could Trigger Remote Code Execution (2026-06-06) — https://www.techrepublic.com/article/news-hugging-face-transformers-rce-flaw/
- [Pluto Security Research] CVE-2026-4372: Hugging Face Transformers RCE via Malicious Model Configuration (2026-06-06) — https://www.techrepublic.com/article/news-hugging-face-transformers-rce-flaw/