Critical vLLM RCE Vulnerability (CVE-2026-22778): Heap Overflow via Malicious Video Links Affects Millions of AI Servers

Date: 2026-04-11
Tags: malware

Executive Summary

vLLM is a high-throughput, memory-efficient engine for serving Large Language Models (LLMs). A critical vulnerability in vLLM, CVE-2026-22778, allows an attacker to achieve Remote Code Execution (RCE) simply by sending a malicious video link to a vLLM API. The vulnerability has been actively exploited in the wild.

Campaign Summary

Field	Detail
Campaign / Malware	vLLM RCE Exploitation Campaign
Attribution	Unknown (confidence: low)
Target	Organizations running vLLM-based LLM inference servers, multimodal AI endpoints
Vector	Malicious video URL submission to vLLM Completions API; heap overflow in OpenCV/FFmpeg JPEG2000 decoder
Status	active
First Observed	2026-02-02 (public disclosure); exploitation ongoing as of 2026-04-11

Detailed Findings

The issue is a chain of two vulnerabilities that together lead to Remote Code Execution (RCE).

The first is an information leak that is used to bypass the ASLR mitigation. When an invalid image is sent to the LLM's multimodal endpoint, PIL returns an error indicating that it cannot identify the image file, and that error message exposes a heap address.

The second is a heap overflow in the JPEG2000 decoder used by OpenCV/FFmpeg. Because OpenCV is used for video decoding, a video constructed from JPEG2000 frames can reach the vulnerable decoder. The decoder trusts the cdef (channel definition) box, which allows channels to be remapped without validating buffer sizes. In other words, Y data can be written into the U buffer, and vice versa. If Y contains significantly more data than U, writing Y into U fills the U buffer and overflows into adjacent heap memory.

The resulting RCE can be used for a full server takeover, including arbitrary command execution, data exfiltration, and lateral movement.
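The size mismatch behind the overflow can be illustrated with simple arithmetic. The sketch below is not the decoder's code. It assumes an 8-bit 4:2:0 layout, in which the U and V planes hold a quarter as many samples as the Y plane, and the names `plane_sizes` and `unsafe_remaps` are hypothetical helpers invented for this illustration. It reports which cdef-style remappings would write more bytes than the destination plane can hold.

```python
from typing import Dict, List, Tuple


def plane_sizes(width: int, height: int) -> Dict[str, int]:
    """Bytes per plane for 8-bit 4:2:0 video (an assumed layout, for illustration)."""
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return {
        "Y": width * height,
        "U": chroma_w * chroma_h,
        "V": chroma_w * chroma_h,
    }


def unsafe_remaps(sizes: Dict[str, int], remap: Dict[str, str]) -> List[Tuple[str, str, int]]:
    """Return (source, destination, overflow_bytes) for every remap entry
    whose source plane is larger than the destination buffer."""
    findings = []
    for src, dst in remap.items():
        overflow = sizes[src] - sizes[dst]
        if overflow > 0:
            findings.append((src, dst, overflow))
    return findings


sizes = plane_sizes(640, 480)
# A remap that swaps Y and U, as in the description above.
swapped = {"Y": "U", "U": "Y", "V": "V"}
for src, dst, extra in unsafe_remaps(sizes, swapped):
    print(f"{src} -> {dst}: {extra} bytes past the end of the {dst} buffer")
```

Under these assumptions, a 640x480 frame has a 307,200-byte Y plane and a 76,800-byte U plane, so copying Y into U runs 230,400 bytes past the end of the U buffer. A decoder that validated the remap would reject any entry with a positive overflow before copying.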

MITRE ATT&CK Mapping

Technique	ID	Context
Exploit Public-Facing Application	T1190	Attacker exploits CVE-2026-22778 in publicly exposed vLLM API to achieve RCE on inference servers
Abuse Elevation Control Mechanism	T1548	Heap overflow allows arbitrary code execution with privileges of vLLM process, potentially root or container orchestration privileges

IOCs

Domains

_No specific IOCs published; vulnerability is in vLLM library itself. Organizations should monitor network traffic for unusual video URLs submitted to vLLM endpoints and inspect server logs for JPEG2000 processing errors._

Full URL Paths

Splunk Format

_No IOCs available for Splunk query_

Package Indicators

vLLM < 0.14.1
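
As a minimal sketch, the package indicator above can be checked on a host with a few lines of Python. The helper names (`parse_version`, `is_vulnerable`, `installed_vllm_is_vulnerable`) are hypothetical, and the parser only compares the leading numeric components, so pre-release builds of the fixed version would need separate handling.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

PATCHED = (0, 14, 1)  # first fixed release, per the indicator above


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract the leading major.minor[.patch] numbers.

    Suffixes such as '.post1' or '+cu121' are ignored. Pre-release tags
    (for example 'rc1') are ignored too, which is a known limitation.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_vulnerable(version: str) -> bool:
    """True if the version is older than the patched release."""
    return parse_version(version) < PATCHED


def installed_vllm_is_vulnerable() -> Optional[bool]:
    """Check the vLLM package installed in this environment, if any."""
    try:
        return is_vulnerable(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None  # vLLM is not installed here


if __name__ == "__main__":
    print(installed_vllm_is_vulnerable())
```

Running this inside each inference container or virtual environment, rather than on the host alone, gives a more reliable inventory, since vLLM is often installed per image.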

Detection Recommendations

Immediately update vLLM to version 0.14.1 or later. Monitor vLLM API logs for suspicious video URL submissions, particularly those containing JPEG2000-encoded frames or unusual channel remapping (cdef box) parameters. Implement strict input validation on video submissions and reject the JPEG2000 format if it is not required. Deploy EDR on servers running vLLM to detect unexpected processes spawned by the vLLM process. For network-based detection, flag HTTP/HTTPS requests to vLLM APIs that carry video payloads from untrusted sources.
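
One way to approximate the "reject JPEG2000 if not required" recommendation is a coarse byte-level pre-filter applied to fetched media before it reaches the decoder. The sketch below is an assumption-laden illustration, not part of vLLM: `looks_like_jpeg2000` is a hypothetical helper, the JP2 signature box and codestream start markers come from the JPEG 2000 file format, and the container tags (`mjp2`, `MJ2C`) are assumed to be the codec tags commonly used for Motion JPEG 2000 in MOV/MP4 and AVI files. It can produce false positives and will not catch every encoding, so it complements patching rather than replacing it.

```python
# JP2 file signature box: 12 bytes at the start of a .jp2 file.
JP2_SIGNATURE = b"\x00\x00\x00\x0cjP  \r\n\x87\n"
# Raw codestream start: SOC marker (FF4F) followed by SIZ marker (FF51).
J2K_CODESTREAM = b"\xff\x4f\xff\x51"
# Codec tags assumed to identify Motion JPEG 2000 inside video containers.
CONTAINER_TAGS = (b"mjp2", b"MJ2C")

MARKERS = (JP2_SIGNATURE, J2K_CODESTREAM) + CONTAINER_TAGS


def looks_like_jpeg2000(data: bytes) -> bool:
    """Return True if any JPEG2000-related marker appears in the buffer.

    This is a substring scan, so short markers may match unrelated data.
    """
    return any(marker in data for marker in MARKERS)


if __name__ == "__main__":
    sample = JP2_SIGNATURE + b"\x00" * 16
    print(looks_like_jpeg2000(sample))
```

A gateway in front of the API could call such a check on downloaded media and refuse the request when it returns true, logging the source URL for follow-up.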

References

[OX Security] Millions of AI Servers at Risk: Critical vLLM RCE Lets Attackers Take Over via Video Link (CVE-2026-22778) (2026-02-02) — https://www.ox.security/blog/cve-2026-22778-vllm-rce-vulnerability/