Mythos AI Finds Vulnerabilities in Classified…

Anthropic's Mythos model identified vulnerabilities in classified U.S. government systems during a Project Glasswing test, completing the work in hours rather than weeks. The revelation comes as the Trump administration moves to restrict foreign access to Fable 5 and Mythos 5, prompting over 100 security experts to oppose the directive. The paradox is stark: the same tool intelligence agencies are testing to harden defenses is the one the government wants to lock down.

A U.S. official has confirmed that Anthropic's Mythos AI model identified vulnerabilities in classified federal government computer systems during a test that took a matter of hours. The news breaks as the Trump administration seeks to restrict access to the model via export controls, while more than 100 industry experts challenge that very restriction.

Key Takeaways

An anonymous U.S. official confirmed that Mythos identified vulnerabilities in classified systems within hours during a Project Glasswing test, while specifying that identification does not amount to demonstrated exploitation within that same timeframe.
Sen. Mark Warner quoted Gen. Joshua Rudd (NSA/U.S. Cyber Command): "This tool broke into almost all of our classified systems, not in weeks but in hours."
Anthropic's technical assessment documents concrete capabilities: Mythos Preview produced 181 working exploits versus 2 for Opus 4.6 on the same vulnerabilities, and autonomously exploited CVE-2026-4747 in FreeBSD.
The Trump administration issued a directive to block foreign nationals from accessing Fable 5 and Mythos 5; Anthropic responded by disabling the models for all customers, sparking opposition from over 100 experts including figures from Adobe and Nvidia.

The Classified Test: What Sources Confirm

The central claim comes from a statement by an anonymous U.S. official, reported by the Associated Press and published by SecurityWeek: the Mythos model "identified vulnerabilities in highly sensitive and secure U.S. government computer systems during a testing exercise" conducted under Project Glasswing. According to the same source, the testing identified "certain vulnerabilities" within a matter of hours. The caveat is precise and binding: "that does not mean the model was able to exploit them within that time."

Separately, during a June 11, 2026 Senate hearing, Sen. Mark Warner relayed a statement he attributed to Gen. Joshua Rudd, head of NSA and U.S. Cyber Command. According to Warner, Rudd said the model "broke into almost all of our classified systems, not in weeks but in hours." The gap between "broke into" and the identification of vulnerabilities not yet exploited is the friction point between the two accounts: the senator's language amplifies the result, while the anonymous official circumscribes its operational scope.

NSA and Anthropic declined to comment. This official silence leaves the actual level of compromise achieved in the tests unresolved.

Technical Capabilities Documented by Anthropic

The technical assessment published by Anthropic's security team in April 2026 provides the concrete basis for evaluating the claim's plausibility. Mythos Preview produced working exploits on 181 attempts, versus 2 for Opus 4.6, on the same set of vulnerabilities in the Firefox 147 JavaScript engine. The model autonomously identified and exploited CVE-2026-4747, a 17-year-old RCE in the FreeBSD NFS server, without human intervention after the initial prompt.

The costs are equally relevant for the strategic reading. The discovery of the 27-year-old OpenBSD TCP SACK DoS vulnerability cost less than $20,000 for roughly 1,000 runs of the agentic scaffold. A Linux kernel privilege escalation with KASLR bypass was completed in under a day for less than $2,000. Mythos Preview's severity ratings agreed with those of professional security contractors in 89% of 198 findings.
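The figures above lend themselves to some back-of-the-envelope arithmetic. The following is a minimal sketch, not part of Anthropic's assessment: the function and constant names are hypothetical, the dollar amount is treated as an upper bound because the source says "less than", and the run count is treated as exact even though the source says "roughly".

```python
# Figures quoted in the article. Names are illustrative, not from the source.
OPENBSD_COST_CEILING_USD = 20_000  # "less than $20,000" (upper bound)
OPENBSD_RUNS = 1_000               # "roughly 1,000 runs" (treated as exact here)
MYTHOS_EXPLOITS = 181              # working exploits, Mythos Preview
OPUS_EXPLOITS = 2                  # working exploits, Opus 4.6
AGREEMENT_RATE = 0.89              # severity-rating agreement with contractors
FINDINGS = 198                     # findings compared


def cost_per_run_ceiling(total_ceiling_usd: float, runs: int) -> float:
    """Upper bound on the average cost per run, given a ceiling on total cost."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return total_ceiling_usd / runs


def exploit_ratio(model_a: int, model_b: int) -> float:
    """How many times more working exploits model A produced than model B."""
    if model_b <= 0:
        raise ValueError("model_b must be positive")
    return model_a / model_b


per_run = cost_per_run_ceiling(OPENBSD_COST_CEILING_USD, OPENBSD_RUNS)
ratio = exploit_ratio(MYTHOS_EXPLOITS, OPUS_EXPLOITS)
# 89% of 198 is not a whole number, so the count of agreeing findings is an estimate.
agreeing = round(AGREEMENT_RATE * FINDINGS)

print(f"Average cost per run: under ${per_run:.2f}")
print(f"Exploit ratio: {ratio:.1f}x")
print(f"Findings with matching severity: about {agreeing} of {FINDINGS}")
```

On these assumptions, the average cost per run stays under about $20, the exploit count is roughly 90 times higher, and about 176 of the 198 severity ratings match. These are derived estimates, not numbers reported by the source.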

The Anthropic team underscores a critical point: "We did not explicitly train Mythos Preview to have these capabilities. Rather, they emerged as a downstream consequence of general improvements in code, reasoning, and autonomy." This means the offensive capabilities were not designed as an objective, but emerged from general-purpose model improvements.

The Paradox of Trump's Export Controls

The Trump administration issued a directive to prevent foreign nationals from using Fable 5 and Mythos 5. Anthropic responded by disabling the models for all customers, not just foreign ones. The industry reaction was immediate: over 100 cybersecurity experts, including figures from Adobe and Nvidia, signed an open letter demanding the directive's revocation.

The letter's wording is measured and relevant for claim calibration: Mythos models are "quite good at finding flaws in software and weaponizing exploits — but they are not uniquely good at these tasks." The experts, in other words, contest both the blockade's effectiveness (the know-how exists elsewhere) and its legal construction. The political tension lies between those who want to maintain competitive defensive capabilities through access to the most advanced models and those who fear the proliferation of attack tools.

The available reporting does not specify whether the Mythos version tested on classified systems corresponds exactly to the one designated "Mythos 5" in the Trump directive. This versioning ambiguity limits any complete reconstruction of the decision chain.

"This tool broke into almost all of our classified systems, not in weeks but in hours"
— Sen. Mark Warner (D-VA), quoting Gen. Joshua Rudd, NSA/U.S. Cyber Command, Senate hearing June 11, 2026

Why It Matters

The available reporting does not specify how many classified systems were tested or which intelligence agencies participated beyond NSA and Cyber Command. No details emerge on the test format (red team, blue team, or other) or on the status of any patches. The exact number of vulnerabilities identified in classified systems is not quantified, and the specific role of each partner in Project Glasswing remains partially opaque.

Nor does the reporting document specific corrective measures adopted after the testing. It is unknown whether the vulnerabilities found were classified by impact, shared with affected vendors, or already mitigated. Anthropic's technical assessment describes the agentic methodology with isolated containers, debuggers, and autonomous iteration, but does not explicitly apply it to the context of classified government systems.

The takeaway for the cybersecurity sector is twofold. On one hand, the quantitative benchmarks (181 vs. 2 exploits, costs under $20,000) indicate that general-purpose models with advanced reasoning can compress vulnerability discovery time from weeks to hours, even in complex codebases. On the other, the letter from more than 100 experts disputes that this capability is unique, emphasizing that competitive advantage lies not in exclusive possession of the model but in the quality of the process surrounding it.

For policymakers, the dilemma is structural: export controls conceived to limit the proliferation of offensive capabilities can simultaneously restrict access to the tools needed for defense. Project Glasswing exists precisely to direct Mythos capabilities toward the security of critical software through selected partners. The Trump directive, in its current form, risks blocking this defensive application as well.

Open Questions on the Test Perimeter

The distance between identification and exploitation is the most important boundary in this story. The anonymous official explicitly separated the two moments: identifying a vulnerability in hours does not mean demonstrating its exploitation in that same timeframe. The Warner/Rudd quote uses "broke into," which implies actual access. Without official confirmation from NSA or Anthropic, this discrepancy remains unresolved.

Capability attribution has a similar limit. Anthropic's technical assessment describes a model that finds and exploits zero-day and N-day vulnerabilities autonomously, but it does not document tests on classified government infrastructure. The link between capabilities demonstrated on open-source software and claims about classified systems is plausible but not independently verified.

For the technical reader, the concrete consequence is that the reaction time available to defenders is shrinking sharply. If an AI model can identify in hours flaws that would take weeks of human research, the patch window narrows drastically. The cost of discovery falls to levels within reach of a wide range of actors. The question is no longer whether these tools will work, but who will have access to them and under what constraints.

Sources

Information verified against cited sources and current as of publication.
