Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124

Even OpenAI works hard Atlas AI browser against cyberattacks, the company admits that quick injectiona type of attack that uses AI agents to follow malicious instructions often hidden in web pages or emails, is a threat that isn’t going away anytime soon – raising questions about how AI agents can work safely on the open internet.
“Risk injection, like phishing and cyber engineering, cannot be ‘solved’,” OpenAI wrote on Monday. blog post detailing how the company is preparing the Atlas suite to deal with persistent threats. The company admitted that the ‘agent mode’ in ChatGPT Atlas “increases the security risk.”
OpenAI launched its ChatGPT Atlas browser in October, and security researchers rushed to publish their demonstrations, showing that it was possible to write a few words in Google Docs that could change the behavior of the underlying browser. The same day, Brave published a blog post explaining that injection speed is a systematic problem for AI-powered browsers, plus The Comet of Perplexity.
OpenAI is not alone in realizing that instant injection is not going away. The The UK’s National Cyber ​​Security Center earlier this month warned which makes injection molding against AI software “absolutely impossible,” putting websites at risk of data breaches. The UK government has advised cyber experts to reduce the risk and impact of injections quickly, instead of assuming that the attacks will “end”.
For OpenAI’s part, the company said: “We see rapid injection as a long-term AI security problem, and we need to continue to strengthen our defenses against it.”
The company’s response to this Sisyphean task? The rapid, rapid response that the company says shows early promise in helping find new insider attack methods before they are used “in the wild.”
This is not really different from what rivals such as Anthropic and Google have been saying: that to deal with the constant threat of rapid attacks, security must be stable and constantly tested. Google’s latest projectfor example, it focuses on building controls and operating system policies.
But where OpenAI is taking a different approach is an “LLM-based automated attacker.” This attacker is actually a bot that OpenAI trained, using reinforcement learning, to act like a hacker that looks for ways to sneak malicious instructions into the AI ​​assistant.
The bot can try an attack in a simulation before actually using it, and the simulator shows how the target AI would think and what it would do if it saw an attack. The bot can learn the response, control the attack, and try again and again. Understanding the internal logic of AI is something that outsiders don’t have access to, so, in theory, an OpenAI bot should find errors faster than a real-world attacker could.
It’s a common technique for AI security testing: design an agent to find edge cases and test against them quickly in comparison.
“Our attacker (reinforced learning) can manipulate the agent to perform dangerous actions that span tens (or hundreds) of steps,” wrote OpenAI. “We also saw new methods of attack that were not seen in our public campaign or outside reports.”

In the demonstration (pictured above), OpenAI showed how the attacker downloaded a malicious email to a user’s inbox. After the AI ​​assistant scanned the box, it followed the instructions hidden in the email and sent a resignation message instead of writing an out-of-office reply. But following security updates, “assistants” were able to detect injection attempts quickly and notify the user, according to the company.
The company says that while rapid injection is difficult to make foolproof, it’s leaning on extensive testing and a quick patch cycle to harden its systems before they appear in the world.
An OpenAI spokesperson declined to share whether Atlas’ security updates have made the injections less likely, but says the company has been working with third parties to harden Atlas against early injections since its launch.
Rami McCarthy, senior security researcher at cybersecurity company WizIt is said that learning to reinforce is one way to continuously adapt to attack patterns, but it is only part of the picture.
“A useful way to think about risk in AI systems is increased autonomy and opportunism,” McCarthy told TechCrunch.
“Agentina browsers tend to live in a very difficult environment: independence and high availability,” McCarthy said. “Many current guidelines reflect this inconsistency. Limiting access to accounts primarily reduces transparency, while requiring review of authentication requests hinders independence.”
These are two of OpenAI’s recommendations for users to reduce their risk, and a spokesperson said Atlas is also trained to verify users before sending messages or making payments. OpenAI also suggests that users give agents specific instructions, rather than allowing them to log into your inbox and tell them to “do whatever it takes.”
“Greater latitude makes it easier for hidden or malicious content to affect the agent, even if security is in place,” according to OpenAI.
While OpenAI says protecting Atlas users from being injected too quickly is a top priority, McCarthy calls out skepticism about refunding vulnerable browsers.
“For many of the things that are used every day, browsers that don’t pay enough to verify their content are at risk,” McCarthy told TechCrunch. “The risk is high because of access to sensitive information such as emails and payment information, although the opportunity is what makes them powerful. This trend has changed, but today the inconsistency is still real.”