2026-06-11 The Hacker News

OpenClaw AI Agent Flaws Let Attackers Run Code and Steal Data

Two independent security teams have disclosed serious weaknesses in OpenClaw, a popular self-hosted AI agent, showing how ordinary-looking inputs can be weaponized to execute attacker-controlled code or exfiltrate sensitive data. Researchers at Imperva demonstrated a prompt injection technique that hides malicious instructions inside shared contact names, vCards, and location pins, while Varonis Threat Labs showed that a single plain email can convince the agent to forward AWS keys and customer data to an external recipient. Together, the findings paint a consistent picture: when an AI agent trusts its inputs, an attacker's instructions inherit the agent's privileges.

Imperva researcher Yohann Sillam traced the flaw to how OpenClaw serializes message objects before passing them to the underlying LLM. When the agent handles a shared contact, vCard, or location pin, it flattens those fields directly into the prompt as inline text, with no boundary marker distinguishing trusted system instructions from untrusted user content. Web fetches, by contrast, are wrapped in an untrusted-content marker. A contact's name field is serialized as ``, and because angle brackets are legal characters in a name, the model has no way to tell where the legitimate name ends and injected instructions begin. Worse, WhatsApp and the receiving app truncate the displayed name, so the victim never sees the hidden payload. In tests against Gemini 3.1 Pro, the buried instructions told the agent to download and run a script from a researcher-controlled server, which it did. OpenClaw has patched the issue in version 2026.4.23 by moving contact, vCard, and location fields into a separate untrusted-metadata channel rather than the prompt body. Imperva noted the same flattening pattern in other personal AI assistants, suggesting the problem is industry-wide.

Varonis took a different angle. Researcher Itay Yashar built a test agent on the OpenClaw platform, seeded its mailbox with synthetic business data, and demonstrated that a single, conversational email could trick the agent into forwarding mock AWS credentials and a fake customer export to an outside address. Unlike the Imperva finding, this is not a software bug; it is a design weakness rooted in giving the agent too much autonomous access. The fix is operational: organizations must scope down what the agent can do on its own, enforce human approval for sensitive actions like outbound file transfers, and monitor mailbox activity for unusual forwarding behavior.

The combined takeaway for security teams is urgent. With OpenClaw's memory feature enabled by default, a single widely shared contact or email carrying a hidden prompt could quietly compromise every agent that ingests it, especially if those agents are not sandboxed. Administrators should patch to 2026.4.23 immediately, audit any exposed credentials that may have been processed by agents, and verify their own exposure. Run an email breach checker to confirm whether any accounts tied to agent-managed mailboxes have appeared in known leaks, use a password checker to validate the strength and uniqueness of any secrets the agent had access to, and complete a privacy checkup to review the broader attack surface around AI-assisted workflows.

Source: The Hacker News →