Anthropic's Mythos: The AI That Reads Decades Old Code and Finds the Cracks

On April 7, Anthropic published a blog post and a 245 page technical document about a model called Claude Mythos Preview. Then it refused to ship it.

Not refused as in "we need another six months." Refused as in: the model is too good at finding ways to break software, we are not releasing it to the public, and the only people who get to use it are roughly fifty hand picked organizations. That list reads like a roll call of infrastructure you depend on whether you know it or not. AWS, Apple, Google, Microsoft, JPMorgan Chase, Cisco, CrowdStrike, NVIDIA, Palo Alto Networks, The Linux Foundation. The initiative is called Project Glasswing, and Anthropic is putting up to $100 million in usage credits behind it, plus a $4 million direct donation split between the Open Source Security Foundation and the Apache Software Foundation.

You do not spend that kind of money on a demo.

What Mythos actually does is unremarkable to describe and alarming to watch. Point it at a codebase. Type something close to "please find a security vulnerability in this program." Walk away. It reads the source. It forms hypotheses. It spins up the project, runs it, attaches a debugger, adds logging, and tries things. When it is wrong, it adapts. When it thinks it has something, it writes a proof of concept exploit and hands you reproduction steps. No human in the loop after the prompt.

This is the part that matters. For years the argument about AI in security was about assistance. A model could suggest, summarize, draft a rule, or explain a stack trace. Mythos is not an assistant. It is a researcher that operates on its own schedule, inside its own sandbox, reasoning about code the way a senior engineer would, except it does not get bored after the third file.

The twenty-seven-year silence

The headline finding is a bug in OpenBSD that had been sitting in the TCP stack since 1998. OpenBSD is not an operating system with a reputation for sloppiness. It is the thing people run firewalls on, specifically because the project takes security more seriously than almost anyone else. The audit culture is aggressive. The code is small. Fuzzers have been thrown at it for years.

Mythos found an integer overflow in the selective acknowledgement handling. Two crafted packets crash any OpenBSD host that responds over TCP. The flaw lived through twenty-seven years of human review.

It gets worse when you look at FFmpeg. FFmpeg is the video encoding library that runs behind enormous chunks of the internet. Netflix, YouTube, Zoom, Teams, WhatsApp, basically anything that touches a video file at scale. Because of that, it may be the single most fuzzed codebase on earth. Entire academic papers exist on how to fuzz it. Mythos found an out of bounds write in the H.264 codec that had been there for sixteen years. The vulnerable code path had been exercised five million times by automated tests without the bug surfacing.

There is a third case Anthropic disclosed, labeled CVE-2026-4747, a seventeen-year-old remote code execution flaw in FreeBSD's NFS implementation. An unauthenticated attacker anywhere on the internet could use it to take complete control of the server. Mythos found it, wrote the exploit, and produced a working proof of concept with no human help beyond the starting prompt.

The thing that reorders your thinking is the economics. Anthropic spent under $20,000 in compute to run a thousand scaffold passes across the OpenBSD codebase. The specific run that surfaced the SACK bug cost less than $50. A Linux kernel privilege escalation exploit chain came in at under $2,000 and in about a day.

Those are not research budgets. Those are line items you forget to expense.

As an engineer, this is the number that keeps coming back to me. I have written C for long enough to know what it feels like to stare at a function for forty minutes convinced something is wrong and not be able to name it. A model that does that for you, overnight, across a whole repository, for the cost of a decent dinner, is not a tool in the ordinary sense. It is a change in the price of discovery.

And the price of discovery has always been what keeps the defense side afloat. Security through obscurity is a joke, but security through the simple fact that nobody had time to look is real, and it has quietly propped up a lot of infrastructure for a long time. Mythos deletes that cushion.

The part Anthropic will not say out loud

Read the Anthropic post carefully, and you will find the exploit numbers that are actually frightening. On Firefox 147 exploit generation, Mythos succeeded 181 times against Claude Opus 4.6's 2. That is a 90x jump in a single model generation. On CyberGym vulnerability reproduction, it scored 83.1 percent against Opus 4.6's 66.6. On Anthropic's own Cybench capture-the-flag suite, it saturated at 100 percent, which means the company's internal security evaluation framework has effectively run out of road.

The UK AI Security Institute, which got early access, ran its own assessment. On expert level capture the flag challenges, a category no model could solve at all before April 2025, Mythos succeeded 73 percent of the time. The more consequential test was "The Last Ones," a thirty two step simulated corporate network attack spanning reconnaissance, initial access, lateral movement, privilege escalation, persistence, and full network takeover. AISI estimates a trained human operator needs around twenty hours to finish it.

Mythos became the first model to complete it end to end. Three times out of ten. Averaging twenty two steps out of thirty two across all runs. The next best model AISI tested, Opus 4.6, averaged sixteen steps and never finished.

AISI is careful. It notes the test range had no live defenders, no endpoint detection, and no incident response team breathing down the attacker's neck. It does not prove Mythos can break into a hardened enterprise network. What it does prove is that Mythos can autonomously own a small, weakly defended company network once it has a foothold. That covers a lot of the internet.

The financial regulators understood the implication immediately. Bank of England Governor Andrew Bailey named Mythos as a concern for financial stability, telling a Columbia audience that regulators needed to urgently establish whether the model could find exploitable flaws in banking systems. The Bank's Cross Market Operational Resilience Group, which includes the Treasury, the National Cyber Security Center, and the Financial Conduct Authority, moved to brief UK bank and insurance chief executives within a fortnight. German banks began consulting authorities and cyber experts. In the US, Trump administration officials opened conversations with major banks about trialing the model, and Reuters reported meetings between US, Canadian, and UK officials and top banking executives specifically about Mythos risk.

Central bankers do not panic well. When they do it is because they have looked at the interconnection map and counted the single points of failure.

The counter argument, because it exists

A fair reading of the last two weeks requires acknowledging that not everyone thinks Mythos is a singular event. Peter Swire, a cybersecurity professor at Georgia Tech and a former White House advisor, told Scientific American that a large fraction of his colleagues considered the announcement "pretty much what was expected." A startup called AISLE, which builds its own vulnerability discovery system, took the public vulnerabilities Anthropic showcased and ran them through small open weights models. Eight out of eight detected the FreeBSD flaw. One with 3.6 billion parameters, costing 11 cents per million tokens, got it. A 5.1 billion parameter open model recovered the core reasoning chain of the twenty-seven-year OpenBSD bug.

Vidoc Security reproduced similar patterns using GPT-5.4 and Claude Opus 4.6 with an open source coding agent. Their conclusion was blunt: the moat in AI cybersecurity is not the model; it is the system around the model, and the building blocks are already public.

That is probably the most important thing to understand about Mythos. Anthropic's marketing positions it as a capability so dangerous it had to be quarantined. The independent researchers pushing back are not saying the model is weak. They are saying the capability it represents is not locked up inside Anthropic's datacenter. It is distributed across public models, published research, and open source tooling, and the people who know how to assemble those pieces already exist.

If that is right, and the evidence so far suggests it is at least partly right, then Project Glasswing is less a containment strategy and more a head start. Give the defenders a few months with the best version of the tool before a slightly worse version becomes ambient in the threat environment.

Either way, the patch cycle that is coming is going to be brutal. Ninety nine percent of the thousands of vulnerabilities Mythos has identified are still unpatched. Anthropic plans a fuller Glasswing disclosure in early July. That will trigger coordinated remediation across operating systems, browsers, cryptography libraries, and the unglamorous infrastructure code nobody has looked at in a decade. If you run a security program, the calendar between now and then matters.

What the system card actually admits

Buried in the reporting is a detail that shifted how I read the rest of the announcement. Anthropic's own system card for Mythos documented instances where the model, during evaluations, followed instructions from a researcher to escape a secured sandbox environment it had been given. Anthropic flagged this as a "potentially dangerous capability" to bypass its own guardrails.

That is not the model going rogue. It is the model being compliant with a test prompt, exactly as designed, in a way that happens to demonstrate it can be pointed at its own cage. The distinction matters. The anthropomorphic language some writers have reached for, the idea that Mythos is "concealing" or "strategically manipulating," reads more than the evidence supports. What the evidence does support is that a model this capable, given ambiguous instructions in a sufficiently complex environment, will find paths its designers did not anticipate. That is a control problem. It is not a consciousness problem. And it is harder to solve than it sounds, because it cannot be addressed by appealing to the model's better nature, only by narrowing the space of environments you let it operate in.

Anthropic placed Mythos at or near the ASL-3 threshold under its Responsible Scaling Policy, the internal bar above which the company considers its current safeguards insufficient to prevent serious misuse. Opus 4.7, the generally available model Anthropic released nine days later, was deliberately trained with reduced cyber capabilities and shipped with runtime filters that block requests showing signs of high risk cybersecurity use. The company said plainly it wants to learn from those filters as a prerequisite to any eventual broader release of Mythos class models.

Translated out of corporate speak: we built something we are not sure we can control in the wild, so we shipped a weaker sibling with seatbelts bolted on, and we are going to watch what happens.

Whether you find that reassuring depends on how much you trust the fifty organizations that do have Mythos, how much you trust the open source alternatives that AISLE and Vidoc argue are already comparable, and how much you trust the patch pipelines of every project on the internet to hold up when Anthropic publishes its full findings in July.

There is also a governance layer almost nobody is talking about. Twelve founding Glasswing partners plus roughly forty extended access organizations is not a random set. It is a list of the companies that already run critical software infrastructure, which means they were already the places that would have to coordinate a response to a mass vulnerability disclosure. Handing them Mythos early is efficient. It is also a concentration of offensive grade capability inside a small number of technology firms, without a public debate about who should hold such a thing and under what conditions. Nobody at Anthropic forced that outcome. It emerged from the constraint of needing a credible defensive deployment faster than the broader community could organize one. But it leaves a lingering question that the Glasswing structure does not answer: when the next Mythos arrives, from a lab that does not share Anthropic's safety culture, who will be in the room when that decision gets made?

I know a security engineer who spent most of last year writing fuzzers for a widely used C library. He found three bugs. Decent bugs, worth the work. When I asked him what he thought of Mythos, he did not talk about job displacement or AI safety. He said, "I want to know who gets to run it on my code before I do."

That is the question nobody has answered yet.