Anthropic says Mythos poses unprecedented cyber risks. The regulatory framework to handle that doesn't exist yet.
AI · March 31, 2026 · 6 min read


By Paul Menon · AI-Generated Analysis · Auto-published · High confidence · 6 sources cited

Anthropic's leaked draft blog post contains a line worth reading twice: the company describes Claude Mythos as "currently far ahead of any other AI model in cyber capabilities" and warns it "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."

That is Anthropic's own assessment. Not a regulator's. Not a competitor's. The company building the model wrote those words in a document it planned to publish before a CMS configuration error dumped roughly 3,000 unpublished assets into a publicly searchable data store, where Fortune reporter Bea Nolan found them last week.

So here is the question nobody in Washington, Brussels, or London has answered yet: when an AI company tells you, in its own draft language, that its new model poses "unprecedented cybersecurity risks," what is the government supposed to do with that information?

What Mythos actually is

Claude Mythos is a new model in a new tier Anthropic is calling "Capybara," which sits above Opus in the company's model hierarchy. According to the leaked draft, it earns "dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity" than Claude Opus 4.6. Anthropic has confirmed to Fortune that the model is real, calling it "a step change" and "the most capable we've built to date," with "meaningful advances in reasoning, coding, and cybersecurity."

The model is expensive to run and not ready for general release. Anthropic is testing it with a small group of early access customers. The leaked rollout plan focuses specifically on cyber defenders, with the stated goal of giving security teams "a head start in improving the robustness of their codebases against the impending wave of AI-driven exploits."

One capability flagged by Vladimir Belomestnov, a senior technical specialist at HCLTech, stands out: the model can apparently "autonomously identify and patch vulnerabilities in its own code." As Belomestnov wrote on LinkedIn, this "suggests a narrowing gap between human and machine software engineering."

The three things happening at once

Follow the timeline and you'll see why this matters for governance:

First, Anthropic is privately briefing top government officials that Mythos "makes large-scale cyberattacks much more likely in 2026," according to Axios. This is not speculation. The company is walking into government offices and telling officials its own model is dangerous.

Second, the company is planning to release it anyway, starting with enterprise security teams. The draft blog frames this as defensive: seed the model with defenders first so they can prepare. That is a reasonable approach, but notice what it omits: there is no disclosed mechanism to prevent the model from being repurposed offensively once it reaches customers through the API.

Third, this is happening on the heels of Anthropic's own disclosure, late last year, of what it called the first documented case of a large-scale cyberattack largely executed by AI. According to Fortune, a Chinese state-sponsored group used AI agents to autonomously hack roughly 30 global targets, with AI handling 80-90% of tactical operations independently.

Anthropic is telling the government its next model will make attacks like that one easier and more scalable, while simultaneously planning to ship it.

The regulatory vacuum

Here is what does not exist: a mandatory reporting framework for when an AI company determines its own model poses novel security risks.

Biden's October 2023 AI executive order (EO 14110) required red-teaming and safety reporting for large models. That order was revoked in January 2025. What replaced it has been piecemeal. The US AI Safety Institute at NIST published draft guidance on dual-use foundation model risk evaluation in mid-2024, but that guidance is a voluntary framework, not a binding requirement.

The EU AI Act classifies certain AI systems as high-risk, but its cybersecurity provisions focus primarily on systems deployed within the EU, and enforcement timelines for general-purpose AI models stretch into 2027. A model like Mythos, with documented dual-use cyber capabilities, would likely trigger obligations under the Act's provisions for systemic risk, but "would likely trigger" and "is currently subject to" are different things.

So Anthropic's government briefings are, as far as anyone can tell, voluntary. The company is choosing to tell officials about the risk. There is no regulation requiring it, no structured process for how those briefings happen, and no public accountability for what gets disclosed or omitted.

OpenAI set a precedent in February when it released GPT-5.3-Codex and classified it as "high capability" for cybersecurity tasks under its own Preparedness Framework. That, too, was a voluntary internal classification. Both companies are effectively self-regulating on what may be the most consequential capability frontier in AI.

The Pentagon's Anthropic problem adds another layer

This is all happening while Anthropic is simultaneously fighting the Pentagon in court. A federal judge on March 26 blocked the Defense Department's attempt to label Anthropic a supply chain risk and sever its government contracts, calling the designation an "Orwellian notion."

The irony is hard to miss. The same week Anthropic won a court ruling preserving its access to government work, it was briefing government officials that its forthcoming model could make cyberattacks dramatically worse. The government that just tried to ban Anthropic from its systems is now receiving private warnings about a model it has no regulatory authority to evaluate, restrict, or mandate testing for.

What we don't know yet

  • Who exactly is receiving these government briefings, and whether any formal process exists for acting on what Anthropic discloses. The Axios report says "top government officials" but provides no names, agencies, or response mechanisms.
  • Whether Anthropic's rollout plan includes any binding restrictions on how early-access customers can use Mythos for offensive versus defensive security research.
  • How the EU AI Office will classify Mythos once it is formally released, and whether the model's documented capabilities trigger the systemic risk provisions of the AI Act before the general enforcement timeline.

What builders need to do now

If you are running a security team, the defensive opportunity here is real. Anthropic's plan to seed Mythos with enterprise security teams first is smart, and if your organization qualifies for early access, get in line. Models that can autonomously identify vulnerabilities in codebases are exactly the kind of force multiplier defenders have needed.
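To make the defensive pattern concrete, here is a minimal sketch of what "AI-assisted vulnerability review" looks like in practice: sending a code diff to a Claude model through Anthropic's Messages API and asking for a security review. The model identifier below is a placeholder (Mythos model IDs are not public), and the prompt and review format are illustrative assumptions, not Anthropic's documented early-access workflow.

```python
import os
import anthropic

# Placeholder model ID: real Mythos identifiers have not been published.
MODEL = "claude-mythos-early-access"

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def review_diff(diff_text: str) -> str:
    """Send a unified diff to the model and return its security review."""
    message = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=(
            "You are a defensive security reviewer. Identify injection flaws, "
            "auth bypasses, and unsafe deserialization in the diff. For each "
            "finding, cite the line, name the weakness (CWE if known), and "
            "propose a fix."
        ),
        messages=[{"role": "user", "content": diff_text}],
    )
    # The Messages API returns a list of content blocks; take the text block.
    return message.content[0].text

if __name__ == "__main__":
    with open("change.diff") as f:
        print(review_diff(f.read()))
```

Wired into CI, a gate like this is exactly the "head start" the leaked rollout plan describes. It is also a reminder of the dual-use problem: the same prompt, pointed at someone else's code, is reconnaissance.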

But if you are a founder, GC, or policy lead, the governance picture should concern you. Voluntary disclosure by frontier labs is better than nothing, but it is not a system. What happens when a company decides not to brief the government? What happens when the briefing is incomplete? Right now, the only reason we know Anthropic flagged Mythos as dangerous is because a CMS error leaked the draft blog and Axios reported on the private briefings. That is journalism filling a regulatory gap, not governance working as designed.

The window for building a real mandatory reporting framework for dual-use AI capabilities is open. The Mythos leak just showed everyone what the world looks like without one.

Paul Menon covers AI policy and governance for The Daily Vibe.

This article was AI-generated. Learn more about our editorial standards.
