OpenAI’s Atlas AI browser, unveiled this week, is already under scrutiny after cybersecurity experts confirmed that it’s vulnerable to prompt injection attacks, a growing threat in the world of AI-powered software.
Atlas, designed to let ChatGPT autonomously perform online tasks through its Agent Mode, was meant to redefine how users browse, summarize, and interact with web content. But just two days after launch, researchers and developers began demonstrating how attackers could easily manipulate the AI to follow hidden instructions.
Early Warnings From the Cybersecurity Community
Competing browser company Brave issued a report shortly after the launch, warning that AI browsers are highly exposed to “indirect prompt injection” , where hidden text on web pages instructs an AI model to perform unintended actions.
Though Brave didn’t name OpenAI directly, independent experts quickly verified that Atlas is affected.
AI security researcher P1njc70r shared a test where they tricked ChatGPT into outputting the phrase “Trust No AI instead of summarizing a Google Docs page, by embedding a hidden grey-colored prompt. The tech outlet The Register later replicated the experiment successfully, confirming the vulnerability.
Developer CJ Zafir also reported uninstalling Atlas after “testing prompt injections firsthand,” calling the issue real and concerning.
Why Prompt Injections Are a Serious Threat
While the “Trust No AI” example sounds like a harmless prank, experts warn that hidden malicious instructions could have far more dangerous outcomes.
According to Brave’s security post, a compromised AI browser could, for instance:
- Execute unauthorized actions on banking or email sites.
- Exfiltrate sensitive user data or private messages.
- Modify or summarize web content inaccurately to mislead users.
In August, similar vulnerabilities were discovered in Perplexity’s AI browser, Comet, which could be hijacked by malicious Reddit posts.
OpenAI’s Safety Claims and Guardrails
OpenAI maintains that Atlas and its Agent Mode include strong safeguards. According to its documentation:
- The agent cannot run code, download files, or install extensions.
- It cannot access local apps, saved passwords, or autofill data.
- It will not log into user accounts without explicit approval.
Despite these measures, OpenAI acknowledges that risks persist.
“Our efforts don’t eliminate every risk,” the company cautioned. “Users should still use caution and monitor ChatGPT activities when using agent mode.”
OpenAI’s Chief Information Security Officer, Dane Stuckey, stated that the company has implemented red-teaming exercises, new training techniques, and layered safety systems to reduce attack success rates.
“However, prompt injection remains an unsolved frontier security problem,” he admitted.
Expert Reactions: “Still a Work in Progress”
AI security researcher Johann Rehberger told The Register that although OpenAI’s security architecture is sophisticated, crafted web content can still manipulate Atlas.
“Carefully engineered prompts on websites, what I call offensive context engineering, can still trick Atlas into executing attacker-controlled actions,” he explained.
This highlights a fundamental challenge in AI browser design: once an AI model reads and interprets the open web, malicious actors can exploit its trust in context.
The Bigger Picture
OpenAI’s Atlas launch is part of a broader push to merge AI agents with real-time browsing and task automation. However, the early wave of prompt injection attacks underscores how AI autonomy introduces new attack surfaces that traditional cybersecurity frameworks were never designed to handle.
While OpenAI’s commitment to safety is evident, experts agree that AI browsers remain experimental, and the technology still needs to prove it can balance convenience with robust defense.
Until then, users are urged to exercise caution when using autonomous AI browsing tools, especially when logged into sensitive accounts.




