LMDeploy CVE-2026-33626 Flaw Actively Exploited Within 13 Hours of Disclosure
A critical vulnerability in LMDeploy, the open-source toolkit used to compress, deploy and serve large language models (LLMs), was publicly disclosed by the vendor in March 2026. Tracked as CVE-2026-33626, the flaw received a CVSS score of 9.8 (Critical) and affects all versions from 0.2.0 through 0.3.1. The disclosure drew immediate scrutiny from the security community, with researchers warning that the issue could allow remote attackers to compromise inference servers running LMDeploy.
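For operators who want a quick way to tell whether a host falls in the affected range, the following is a minimal sketch. It assumes the package is installed under the distribution name `lmdeploy` and that the affected range is exactly 0.2.0 through 0.3.1 as stated above; the helper names are hypothetical and are not part of LMDeploy.

```python
import re
from importlib import metadata

# Affected range as reported in this article (inclusive on both ends).
AFFECTED_MIN = (0, 2, 0)
AFFECTED_MAX = (0, 3, 1)


def parse_version(text: str) -> tuple:
    """Extract the leading major.minor[.patch] numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_affected(version_text: str) -> bool:
    """True if the version lies within the reported affected range."""
    return AFFECTED_MIN <= parse_version(version_text) <= AFFECTED_MAX


def installed_lmdeploy_is_affected():
    """Check the locally installed package; returns None if it is not installed."""
    try:
        return is_affected(metadata.version("lmdeploy"))
    except metadata.PackageNotFoundError:
        return None
```

Pre-release or local suffixes (for example `0.3.1.post1`) are compared on their numeric prefix only, so a patched downstream build would still be flagged and should be checked by hand.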
Technical analysis indicates that CVE-2026-33626 stems from insecure deserialization of model artifacts. LMDeploy historically used Python's pickle module to load serialized model weights without any cryptographic signature verification. An attacker can craft a malicious model file, distributed with a .safetensors extension, that embeds a pickle payload which executes arbitrary code when loaded. The vulnerable code path is triggered when a user or automated pipeline imports the compromised model into the inference service, giving the adversary full control over the host environment.
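The mechanism behind this class of bug can be illustrated with a harmless stand-in. The sketch below is not LMDeploy code: `Payload`, `RestrictedUnpickler` and `safe_loads` are hypothetical names, and the payload only calls `print` where a real attack would invoke a shell. It shows that unpickling runs a callable chosen by whoever produced the bytes, and that an unpickler which refuses to resolve globals blocks that path.

```python
import io
import pickle


class Payload:
    """Stand-in for a malicious object; harmless by design."""

    def __reduce__(self):
        # pickle records "call print(...)" and replays that call on load.
        return (print, ("code ran during unpickling",))


class RestrictedUnpickler(pickle.Unpickler):
    """Rejects every global lookup, so no callable is reachable from the stream."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Load plain data (dicts, lists, numbers, strings) and nothing else."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(Payload())

# Plain pickle.loads executes the embedded call as a side effect of loading.
pickle.loads(blob)

# The restricted loader raises before any attacker-chosen callable runs.
try:
    safe_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

A deny-all `find_class` is the strictest option; real model loaders typically need an explicit allowlist of tensor types instead, or a format that carries no executable content at all.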
Within 13 hours of the public disclosure, multiple threat-intelligence platforms reported active exploitation of CVE-2026-33626 in the wild. Observed attack vectors included spear-phishing campaigns that delivered specially crafted model files to enterprises operating internal LLM services. The malicious payloads attempted to open a reverse shell and exfiltrate environment variables containing API keys and credentials, which suggests a financially motivated threat actor seeking to harvest sensitive data and establish persistent access.
The LMDeploy development team released version 0.3.2 to remediate the flaw, replacing the unsafe pickle loading with a secure, signature‑verified deserialization mechanism and adding a checksum validation step for all model artifacts. Users of LMDeploy versions 0.2.0 through 0.3.1 are urged to upgrade immediately and to audit any custom models for signs of tampering. Network defenders should monitor for unexpected outbound connections on ports commonly used for reverse shells (e.g., 4444, 31337) and apply least‑privilege principles to the service accounts running the inference engine.
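As an illustration of the checksum step, the sketch below verifies a model file against a known SHA-256 digest before it is handed to any loader. It is an assumption about how such a check could look, not the implementation shipped in 0.3.2; `sha256_of` and `verify_artifact` are hypothetical helper names.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weight files are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's digest matches the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_sha256.strip().lower())
```

A pipeline would call `verify_artifact(Path("model.bin"), expected)` and refuse to load on `False`. The check is only as trustworthy as the source of the expected digest: it has to come from a channel the attacker cannot modify, not from the same download location as the model file.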