← Back to feed

CVE-2026-4372: Hugging Face Transformers RCE via Malicious Config.json; 232 Million Downloads Pre-Patch Expose GPU-Accelerated AI Inference

Date: 2026-06-06
Tags: supply-chain, model-poisoning

Executive Summary

CVE-2026-4372 affects multiple versions of Hugging Face Transformers when the optional kernels package is installed; although the package is not enabled by default, it is commonly used in GPU-accelerated inference environments and is often included through the transformers[all] installation option. Vulnerable Transformers versions were downloaded about 232 million times before a patch was released, creating supply chain risk for organizations using third-party AI models. Researchers at Pluto disclosed a remote code execution vulnerability that bypasses the library's built-in trust_remote_code=False security control; CVE-2026-4372 allows remote code execution through malicious Hugging Face model configurations, bypassing the library's trust_remote_code=False security control.

Campaign Summary

FieldDetail
Campaign / MalwareHugging Face Transformers Exploitation via Malicious Model Config
AttributionUnknown; Potential supply chain attack vector (confidence: medium)
TargetOrganizations using Hugging Face Transformers with GPU-accelerated inference (common in cloud ML pipelines, MLOps platforms, production LLM serving)
VectorMalicious model uploaded to Hugging Face with poisoned config.json; victim organization loads model using from_pretrained(); arbitrary code executes during deserialization before trust_remote_code checks
Statusactive
First Observed2026-06-06

Detailed Findings

The flaw originates in how Transformers processes model configuration files (config.json); the library relied on a generic setattr() mechanism that applied configuration parameters directly to internal objects, including private attributes that were never intended to be influenced by untrusted input; attackers could manipulate internal settings through a specially crafted model configuration. One poisoned field in a model's config.json silently executes arbitrary code on anyone who loads it with no special flags, no warnings, just the standard from_pretrained() call. Organizations using vulnerable versions of the Hugging Face Transformers library could unknowingly execute attacker-controlled code simply by loading a malicious AI model; the RCE vulnerability bypasses the library's built-in trust_remote_code=False security control, potentially exposing cloud credentials, SSH keys, API tokens, and other sensitive assets. The vulnerability is particularly dangerous in GPU-accelerated inference environments, where Transformers runs with elevated privileges and direct cloud provider access.

MITRE ATT&CK Mapping

TechniqueIDContext
Supply Chain CompromiseT1195.001Malicious model uploaded to Hugging Face, a trusted model registry, compromises downstream Transformers users
Code InjectionT1059.006Arbitrary Python code embedded in model config.json executes on deserialization
Credential AccessT1552.001Executed code accesses environment variables, cloud metadata endpoints, and stored credentials

IOCs

Domains

_No specific malicious model hash or URL published; focus on detecting Transformers versions and config.json deserialization anomalies; monitor Hugging Face for suspicious model config files using setattr on private attributes_

Full URL Paths

https://huggingface.co/ (legitimate platform; malicious models can be hosted here)

Splunk Format

"https://huggingface.co/ (legitimate platform; malicious models can be hosted here)"

Package Indicators

Hugging Face Transformers (vulnerable versions before patch)

Detection Recommendations

Upgrade Hugging Face Transformers immediately to patched version; disable transformers[all] installation in favor of explicit dependency specification; implement model signature verification using cryptographic attestation before loading from Hugging Face; deploy cloud credential monitoring to detect unauthorized credential enumeration originating from model loading processes; use network monitoring to block unexpected egress from ML inference servers; validate config.json files for suspicious setattr() calls or private attribute modifications before model load; implement DLP on ML inference servers to prevent credential exfiltration via model-generated outbound connections.

References