Anthropic says Claude Mythos found severe zero-days across every major operating system and browser. The claim rests on 198 manually reviewed reports and generous extrapolation. The technology is real. The narrative surrounding it is a masterclass in something this series has been documenting all along: the strategic use of fear as a policy instrument.
By Vladimir Tsakanyan, PhD · Center for Cyber Diplomacy and International Security · cybercenter.space
There is a moment in Anthropic’s 250-page Project Glasswing report on Claude Mythos — tucked under the understated subheading “and several thousand more” — where the company acknowledges it cannot actually confirm that all the thousands of bugs its model claims to have found are critical security vulnerabilities. The number, it turns out, is extrapolated from a finding that expert contractors agreed with Claude’s severity assessment in approximately 90 percent of 198 manually reviewed vulnerability reports. The “thousands of severe zero-days in every major operating system and browser” headline, which dominated technology coverage for several days and prompted a race among major vendors to patch vulnerabilities that may or may not be exploitable, rests on that sample and that projection.
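It is worth asking how much statistical weight 198 reviews can bear. The sketch below is a minimal back-of-the-envelope check, not Anthropic’s methodology: it assumes, generously, that the reviewed reports were a simple random sample of the model’s findings, and it treats the “approximately 90 percent” agreement as 178 of 198.

```python
# Back-of-the-envelope check on the 198-report sample behind the
# "thousands of severe zero-days" extrapolation. Assumes the reviewed
# reports were a simple random sample of all model findings -- an
# assumption the Glasswing report does not establish.
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95 percent Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

# "approximately 90 percent" of 198 reviews, treated here as 178 of 198
lo, hi = wilson_interval(178, 198)
print(f"95% CI for the agreement rate: {lo:.1%} to {hi:.1%}")  # ~84.9% to ~93.4%
```

The sampling error alone is modest. What the interval cannot speak to is how the 198 reports were selected, or the distance between a contractor agreeing with a severity label and a bug being an exploitable zero-day; the extrapolation fails on those counts, not on the arithmetic.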
This is worth pausing on — not because the underlying capability is unimpressive, but because of what the gap between the finding and the framing reveals about how AI companies are choosing to position their most capable models. Anthropic is a serious organisation doing serious work. Claude Mythos appears to be a genuinely capable automated vulnerability discovery tool, more powerful than its predecessors. The FFmpeg bug it found, for example, had persisted undetected for sixteen years. It identified several potential Linux kernel exploits that human researchers had missed. In OSS-Fuzz-style testing of over seven thousand open-source software stacks, Mythos found exploitable crashes in around six hundred of them and ten severe vulnerabilities — a meaningful improvement over prior Claude models.
But Anthropic’s own analysis of the FFmpeg finding acknowledged that the vulnerability “ultimately is not a critical severity vulnerability” and “would be challenging to turn into a functioning exploit.” Several of the Linux kernel exploits identified could not actually be exploited because of Linux’s defence-in-depth security architecture. A number had already been patched. The inclusion of these findings in a headline count of “thousands of severe zero-days” is not a technical assessment. It is a communications strategy.
The Political Economy of AI Fear
To understand why Anthropic is deploying this strategy, it helps to look at the company’s trajectory rather than its latest announcement in isolation. Claude was the first large language model to receive security clearance for use by the US government and military — a distinction that became more complicated after Anthropic drew a line at allowing its models to be used for mass surveillance or fully autonomous targeting, a decision that reportedly led to Claude’s exclusion from Pentagon contracts, subsequently filled by OpenAI. Anthropic’s consumer-facing products are its coding tools. Its strategic ambitions are clearly oriented toward large enterprises and government clients. Claude Mythos, kept internal and offered selectively to major tech companies and government entities, is not a consumer product. It is a sales vehicle.
This is not, as Nvidia CEO Jensen Huang pointed out in mid-2025, a novel strategy. OpenAI was doing exactly the same thing in 2019, when it announced that its GPT-2 text generation model was too dangerous to release in full — a decision widely interpreted, at the time and since, as primarily a mechanism for generating press coverage and establishing the company’s authority as the responsible steward of powerful AI. The pattern is well established: announce a capability, express grave concern about its implications, position the company as uniquely qualified to manage those implications safely, and leverage that positioning to attract the government contracts and enterprise relationships that the consumer market cannot provide at sufficient scale.
Anthropic has refined this approach more systematically than most. Its publication cadence of alarming papers, reports, and studies — on AI hacking attempts, on AI-driven unemployment, on the existential risks of frontier models — has been consistent, credible in parts, and cumulatively effective at establishing the company as the safety-conscious voice in an industry where safety rhetoric is competitively advantageous. The Mythos release, with its report containing twenty-plus pages of Anthropic staff reflecting on the model’s “fondness for particular philosophers” and repeated suggestions that the model might be conscious, is the latest and most elaborate iteration of this established playbook.
Analyst note
The consciousness and sentience suggestions embedded in the Glasswing report deserve analytical attention separate from the vulnerability findings. Anthropic is a company that has invested significantly in the question of AI moral status and has published serious philosophical work on the subject. Its researchers raising these questions about Mythos is not, in itself, implausible or cynical. But the placement of these reflections in a document designed to establish the model’s strategic significance — and to support the argument for keeping it under controlled access rather than public release — creates a rhetorical dynamic that serves the company’s commercial interests regardless of the sincerity of the underlying inquiry. “We’re not sure if this is conscious” is, in the context of a government sales pitch, a more powerful claim than any vulnerability count.
What the Capability Actually Demonstrates
Stripping away the narrative, what does Claude Mythos actually demonstrate about AI and cybersecurity? The honest answer is two things at once: something significant, and something considerably less dramatic than the headlines suggested.
Automated vulnerability discovery is a real and valuable capability. The ability to analyse software codebases at scale, identify patterns consistent with exploitable conditions, and flag them for human expert review has genuine security value — both for defenders trying to patch their own software and, inevitably, for attackers seeking exploitable conditions. The fact that Mythos found exploitable crashes in roughly eight percent of the open-source stacks it tested is meaningful. The fact that it improved on previous Claude models in this domain is expected; the curve of AI capability in vulnerability research is real and will continue upward.
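Converting the reported counts into rates makes the scale question concrete. The sketch below is illustrative only: it uses the rounded figures as publicly reported, and the exact denominators are not known.

```python
# Illustrative rates from the OSS-Fuzz-style figures as reported:
# over 7,000 open-source stacks tested, ~600 with exploitable crashes,
# 10 vulnerabilities rated severe. Rounded inputs; treat as bounds,
# not Anthropic's own numbers.
stacks_tested = 7_000   # "over seven thousand" -- used as a floor
crashing = 600          # "around six hundred"
severe = 10

print(f"crash-finding rate: {crashing / stacks_tested:.1%}")  # ~8.6%
print(f"severe-vuln rate:   {severe / stacks_tested:.2%}")    # ~0.14%

# At a measured severe rate of roughly one per seven hundred stacks,
# a headline of "thousands of severe zero-days" implies a scanned
# population on the order of a million codebases -- a scale the
# Glasswing report does not document.
```

The point is not precision, since the inputs are rounded, but the mismatch in orders of magnitude between the one fully documented test and the headline figure.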
What it does not demonstrate is the emergence of a sentient super-hacker capable of single-handedly compromising every major piece of software infrastructure simultaneously. Red Hat’s analysis of the release found that many of the flagged bugs are functionality flaws rather than security vulnerabilities. Linux’s defence-in-depth architecture neutralised the kernel exploits Mythos found. The extrapolated “thousands” figure, derived from 198 manual reviews, does not survive the scrutiny that a peer-reviewed security research paper would require. The gap between the findings and the framing is not a rounding error. It is a deliberate rhetorical choice.
The technology is real. The narrative around it is doing different work — establishing Anthropic as the indispensable steward of a capability too dangerous for general release and too valuable to offer to anyone but the most trusted government and enterprise clients.
The Policy Implications — Which Are Real
There is a risk that appropriate scepticism about Anthropic’s framing tips into dismissal of the underlying policy questions that AI-assisted vulnerability research genuinely raises. These questions are real, consequential, and not yet adequately addressed by any regulatory framework.
AI-assisted vulnerability discovery does accelerate the identification of exploitable conditions in widely deployed software. If that capability becomes accessible to actors without the ethical and legal constraints that Anthropic claims to apply, the consequences for critical infrastructure cybersecurity are not trivial. The question of how AI vulnerability research tools should be governed — who can access them, under what conditions, with what oversight, and with what accountability for findings that are not responsibly disclosed — is a legitimate policy problem that the AI industry, governments, and the security research community have not yet resolved.
The problem with Anthropic’s approach to surfacing these questions is precisely that it conflates them with a sales pitch. When a company simultaneously announces a capability, exaggerates its immediate threat potential, offers itself as the responsible steward of that capability, and positions its controlled-access model as the policy solution, it degrades the quality of the policy conversation. Governments and enterprises making decisions about AI governance based on Anthropic’s framing of Claude Mythos are making those decisions in an information environment that has been shaped by a party with a strong commercial interest in the outcome.
This is not unique to Anthropic. It is the structural condition of AI policy in 2026: the companies best positioned to explain the capabilities and risks of frontier AI models are also the companies with the strongest commercial interest in the policy outcomes that follow from those explanations. That conflict of interest does not make their technical claims false. It makes them insufficient as the primary basis for policy decisions that will shape the governance of AI security capabilities for years.
Analyst note
The timing of the Mythos announcement is itself analytically informative. Days after Anthropic’s release, reporting emerged that OpenAI was also working on an advanced cybersecurity AI model with similar capabilities, whose rollout it likewise planned to restrict. The symmetry is not coincidental: as frontier AI models reach comparable capability thresholds, the companies producing them face comparable strategic incentives to deploy those capabilities as government and enterprise sales instruments rather than general consumer products. Claude Mythos and its OpenAI equivalent are not primarily security research tools. They are, as the market structure of their deployment makes clear, the latest iteration of a competition for government AI contracts in which the currency is not product superiority but trusted-partner status. Understanding that competition is essential context for evaluating the claims made in support of it.
Reading Anthropic’s Strategy Clearly
None of this analysis should be read as a dismissal of Anthropic as a company, of its safety research as a field, or of Claude Mythos as a technical achievement. The capability is real. The safety concerns about AI-assisted vulnerability research are legitimate. The questions about AI consciousness, however philosophically premature, are not inherently cynical.
What this analysis does suggest is that the appropriate response to a 250-page document combining genuine technical findings, generous extrapolation, philosophical speculation about machine consciousness, and an implicit sales argument for controlled-access government deployment is not uncritical acceptance of its framing. The security research community — Red Hat’s analysis being an example — is applying the appropriate scrutiny to the technical claims. The policy and diplomatic community needs to apply equivalent scrutiny to the strategic framing that surrounds them.
AI companies making claims about the dangers of their own models, the necessity of controlled access, and the unique trustworthiness of their safety frameworks are not neutral parties in the regulatory conversations those claims are designed to influence. Treating their self-assessments as the primary evidence base for AI security policy is the equivalent of commissioning a pharmaceutical company to conduct the clinical trials that determine whether its own drug receives regulatory approval — and accepting those trials, without independent verification, as sufficient.
The capability is advancing. The policy framework for governing it is not keeping pace. And the companies best positioned to explain the gap are also the companies most invested in a particular answer to the question of who should fill it.
Bottom line assessment
Claude Mythos is a real and meaningful advance in AI-assisted vulnerability research. It found real bugs in real software, some of which had persisted undetected for years, and its capabilities will improve further. The framing Anthropic has deployed around these findings — thousands of devastating zero-days, possible machine consciousness, a model too dangerous for general release — is a strategic communication exercise that serves the company’s commercial interests in government and enterprise markets more directly than it serves the public interest in accurate AI capability assessment. The policy questions that AI-assisted vulnerability research genuinely raises are important and not yet adequately addressed. Those questions deserve a policy conversation grounded in independent technical assessment, not in the claims of the company seeking to establish itself as the indispensable steward of the capability in question. Mythos is real. The mythos around it is doing different work.
Tags: Anthropic · Claude Mythos · AI Security · Cybersecurity Policy · AI Governance · Zero-Day Vulnerabilities · Vladimir Tsakanyan

