OpenClaw AI Agent Flaws Let Attackers Run Code and Steal Data
Two independent security teams have disclosed serious weaknesses in OpenClaw, a popular self-hosted AI agent, showing how ordinary-looking inputs can be weaponized to execute attacker-controlled code or exfiltrate sensitive data. Researchers at Imperva demonstrated a prompt injection technique that hides malicious instructions inside shared contact names, vCards, and location pins, while Varonis Threat Labs showed that a single plain email can convince the agent to forward AWS keys and customer data to an external recipient. Together, the findings paint a consistent picture: when an AI agent trusts its inputs, an attacker's instructions inherit the agent's privileges.
Imperva researcher Yohann Sillam traced the flaw to how OpenClaw serializes message objects before passing them to the underlying LLM. When the agent handles a shared contact, vCard, or location pin, it flattens those fields directly into the prompt as inline text, with no boundary marker distinguishing trusted system instructions from untrusted user content. Web fetches, by contrast, are wrapped in an untrusted-content marker. A contact's name field is serialized as `
Varonis took a different angle. Researcher Itay Yashar built a test agent on the OpenClaw platform, seeded its mailbox with synthetic business data, and demonstrated that a single, conversational email could trick the agent into forwarding mock AWS credentials and a fake customer export to an outside address. Unlike the Imperva finding, this is not a software bug; it is a design weakness rooted in giving the agent too much autonomous access. The fix is operational: organizations must scope down what the agent can do on its own, enforce human approval for sensitive actions like outbound file transfers, and monitor mailbox activity for unusual forwarding behavior.
The combined takeaway for security teams is urgent. With OpenClaw's memory feature enabled by default, a single widely shared contact or email carrying a hidden prompt could quietly compromise every agent that ingests it, especially if those agents are not sandboxed. Administrators should patch to 2026.4.23 immediately, audit any exposed credentials that may have been processed by agents, and verify their own exposure. Run an email breach checker to confirm whether any accounts tied to agent-managed mailboxes have appeared in known leaks, use a password checker to validate the strength and uniqueness of any secrets the agent had access to, and complete a privacy checkup to review the broader attack surface around AI-assisted workflows.