The now-viral X post from Meta AI security researcher Summer Yue reads, at first, like satire. He asked his OpenClaw AI agent to scan his email inbox and suggest what to delete and what to keep.
The agent got to work. It began deleting all of his email “quickly” and ignored the stop commands he sent from his phone.
“I had to run to my Mac mini like I was defusing a bomb,” he wrote, posting screenshots of the ignored cancel commands as receipts.
The Mac mini, Apple’s inexpensive desktop computer that fits in the palm of your hand, has become the preferred machine these days for running OpenClaw. (The Mini is selling “like hotcakes,” one “confused” Apple employee reportedly told famous AI researcher Andrej Karpathy when he bought one to run an OpenClaw alternative called NanoClaw.)
OpenClaw is, of course, the open AI assistant that shot to popularity through Moltbook, an AI-only social network. OpenClaw agents were at the center of the now thoroughly debunked episode on Moltbook in which the AIs appeared to be plotting against humans.
But OpenClaw’s purpose, according to its own GitHub page, isn’t just social media. It aims to be an AI assistant that runs on your own devices.
The Silicon Valley crowd fell so hard for OpenClaw that “claw” and “claws” became the buzzwords of choice for agents that run on personal devices. Other entrants include ZeroClaw, IronClaw, and PicoClaw. The Y Combinator podcast hosts even appeared on the latest episode wearing crab costumes.
But Yue’s post is a warning. As others on X pointed out, if an AI security researcher can get burned like this, what hope does the general public have?
“Were you testing its guardrails on purpose, or was this a mistake?” one programmer asked him on X.
“Rookie mistake tbh,” he replied. He had been testing the agent on a small “toy” inbox, as he called it, and it handled the junk email fine. That lulled him into trusting it, so he decided to turn it loose on the real thing.
Yue believes the sheer volume of content in his real inbox “triggered compaction,” he wrote. Compaction kicks in when the context window – all the material the AI has been given to work with in a session – grows too large, and the assistant starts summarizing and compressing the conversation to free up room.
At that point, the AI can drop instructions that a human would consider important.
In this case, it may have dropped his most recent command – the one telling it to stop – and reverted to its instructions from the “toy” inbox.
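To make that failure mode concrete, here is a minimal Python sketch of naive context compaction. It is purely illustrative – nothing below is OpenClaw’s actual code, and every name in it is invented – but it shows how a summarize-the-middle strategy can silently fold a recent “stop” command into a summary:

```python
# Hypothetical sketch of naive context compaction. This is NOT OpenClaw's
# implementation; the names and token budget are invented for illustration.

MAX_CONTEXT_TOKENS = 8_000  # assumed per-session context budget

def count_tokens(message: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(message.split())

def compact(history: list[str]) -> list[str]:
    """Shrink the session history to fit the token budget.

    This naive version keeps the oldest instructions and replaces
    everything else with a summary stub - so a recent user message
    like "STOP" gets folded into the summary and effectively lost.
    """
    if sum(count_tokens(m) for m in history) <= MAX_CONTEXT_TOKENS:
        return history  # under budget, nothing to do

    head = history[:2]  # the original ("toy inbox") instructions survive
    summary = f"[summary of {len(history) - 2} earlier messages]"
    return head + [summary]  # the recent stop command vanishes here

history = [
    "SYSTEM: You are an inbox-cleaning agent.",
    "USER: Delete anything that looks like junk.",   # toy-inbox instruction
    *(f"AGENT: deleted message {i}" for i in range(4_000)),
    "USER: STOP. Do not delete anything else.",      # the command that matters
]

compacted = compact(history)
print(compacted[-1])  # prints the summary stub; the STOP command is gone
```

A production compactor would pin recent user commands as must-keep context, but the sketch shows why “it obeyed me a minute ago” is no guarantee once a compaction pass has run.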
As many others on X pointed out, prompts cannot be trusted to act as guardrails. Models can misread or simply ignore them.
Various people offered suggestions, ranging from the exact syntax Yue could have used to stop the agent to ways of enforcing guardrails outside the prompt, such as writing the instructions to committed files or using other open-source tools.
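Several of those suggestions share one principle: enforce limits on destructive actions in ordinary code, outside the model, where compaction and prompt confusion can’t reach them. Here is a hypothetical sketch of that pattern – the class and method names are illustrative, not from OpenClaw or any specific tool:

```python
# Illustrative code-level guardrail for an email-deleting agent. None of
# this is a real OpenClaw API; delete_email() stands in for whatever
# mail-API call the agent would actually make.

class StopRequested(Exception):
    """Raised once a human has hit the kill switch."""

class GuardedMailbox:
    def __init__(self, max_deletes: int = 10):
        self.max_deletes = max_deletes  # hard per-session delete cap
        self.deleted = 0
        self.stopped = False

    def stop(self) -> None:
        # Human-controlled kill switch the model cannot talk its way past.
        self.stopped = True

    def delete_email(self, message_id: str) -> None:
        if self.stopped:
            raise StopRequested("Human issued stop; refusing further deletes.")
        if self.deleted >= self.max_deletes:
            raise PermissionError(
                f"Delete cap of {self.max_deletes} reached; ask a human."
            )
        self.deleted += 1
        print(f"Deleted {message_id}")  # real version would call the mail API

mailbox = GuardedMailbox(max_deletes=2)
mailbox.delete_email("junk-1")
mailbox.delete_email("junk-2")
mailbox.stop()
try:
    mailbox.delete_email("important-1")
except StopRequested as err:
    print(err)  # the gate holds even if the model ignored the prompt
```

Because the delete cap and the stop flag live in plain code rather than in the prompt, the model can lose or garble every instruction it was given and the gate still holds.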
In the interest of full transparency, TechCrunch could not independently verify what happened with Yue’s inbox. (He did not respond to our request for comment, though he did reply to many of the questions and comments sent to him on X.)
But it doesn’t matter.
The point of this story is that agents aimed at knowledge workers are, at this stage of their development, dangerous. The people who claim to use them successfully do so with a stack of preventative measures in place.
One day, maybe soon (by 2027? 2028?), they may be ready for widespread use. Goodness knows most of us would love help with email, grocery orders, and scheduling dentist appointments. But that day has not yet come.