Project Glasswing: The AI Model So Dangerous Anthropic Refuses to Sell It

 

⚡ AI Investigation · June 2026

A leaked draft. A 27-year-old security hole nobody had ever found. Fifty-two companies handed a digital weapon. And one loud question: is this the most responsible thing an AI lab has ever done, or the most brilliant marketing stunt of the decade?

By QuvirAI Team

On the night of March 26, 2026, two security researchers went poking around where they probably shouldn't have.

Roy Paz of LayerX Security and Alexandre Pauwels from the University of Cambridge had stumbled onto something odd sitting on Anthropic's servers: an unsecured cache of nearly 3,000 internal documents, wide open, no login required. Among the images and PDFs and draft pages was a blog post for a product nobody outside the company had ever heard of.

Its name was Claude Mythos. Internal codename: Capybara. And the draft described it, in Anthropic's own words, as "by far the most powerful AI model we've ever developed."

Fortune broke the story that same evening. By the time markets opened Friday morning, the damage was done. According to TradingView's reporting, CrowdStrike alone shed roughly $15 billion in market value in a single trading session. The catalyst wasn't an earnings miss or a hack or a CEO scandal. It was a draft blog post sitting on a URL someone forgot to lock.

Less than two weeks later, Anthropic stopped pretending the model didn't exist. On April 7, it announced Claude Mythos Preview officially, and then did something almost no AI company has ever done on purpose. It said: we built this, and we're not going to sell it to you.

Here's the full story of what Mythos actually is, who got their hands on it, and why a growing chorus of experts thinks the whole thing might be a very expensive magic trick.

So what exactly is Claude Mythos?

Strip away the drama and Mythos is a general-purpose AI model — the same kind of thing that powers ChatGPT or Claude Code. You could, in theory, ask it to write you an email or plan a trip. That's not why it's famous.

During pre-release testing, Anthropic's team noticed something they hadn't quite seen before. Logan Graham, who leads offensive cyber research at Anthropic, told NBC News that the model wasn't just good at finding software vulnerabilities. It could weaponize them — write the exploit code, then chain multiple flaws together into a working break-in, almost entirely on its own.

"We've regularly seen it chain vulnerabilities together. The degree of its autonomy and long-ranged-ness… I think, is a particular thing about this model."

— Logan Graham, Anthropic offensive cyber research lead, to NBC News

That's the line that changed everything. A model that can independently find a hole nobody knew about, build the tool to crawl through it, and then use that tool — at a speed and scale no human team could match. Anthropic looked at that capability and decided the responsible move was to lock it in a vault.

The numbers that made the industry sit up

Benchmarks are easy to roll your eyes at. These are harder to dismiss, because the jumps over Anthropic's previous flagship (Opus 4.6) aren't incremental; they're vertical.

181 Firefox exploits produced (Opus 4.6 made just 2)
10,000+ high- or critical-severity vulnerabilities found by partners in weeks
27 yrs: age of the OpenBSD bug it uncovered

Benchmark | Mythos | Opus 4.6 | Jump (points)
USAMO 2026 (math) | 97.6% | 42.3% | +55.3
SWE-bench (coding) | 93.9% | 80.8% | +13.1
CyberGym (exploits) | 83.1% | 66.6% | +16.5
GPQA Diamond (science) | 94.6% | not listed | record
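
A quick way to sanity-check the Jump column: it is a difference in percentage points, which is not the same thing as relative improvement. The sketch below recomputes both from the scores quoted in the table. The `jump` helper and the `SCORES` dictionary are our own illustrative names, not anything published by Anthropic.

```python
# Scores (Mythos, Opus 4.6) in percent, copied from the table in this article.
SCORES = {
    "USAMO 2026": (97.6, 42.3),
    "SWE-bench": (93.9, 80.8),
    "CyberGym": (83.1, 66.6),
}


def jump(mythos: float, opus: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    points = round(mythos - opus, 1)
    relative = round((mythos / opus - 1) * 100, 1)
    return points, relative


if __name__ == "__main__":
    for name, (mythos, opus) in SCORES.items():
        points, relative = jump(mythos, opus)
        print(f"{name}: +{points} points, +{relative}% relative")
```

The distinction matters when reading claims like these: a gain of a dozen points on a benchmark that was already above 80% is a very different thing from a gain of fifty points on one that sat near 40%.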

The result that genuinely rattled people wasn't in the benchmark table, though. According to a technical breakdown on Medium by security writer Tahir, on a test harness where Opus 4.6 managed to produce just two working Firefox exploits, Mythos produced 181.

The bugs it found should not have been findable

This is the part of the story that stops feeling like a press release and starts feeling unsettling.

Mythos found a 27-year-old integer overflow bug in OpenBSD — an operating system that exists more or less specifically to be paranoid about security. It found a 16-year-old flaw in FFmpeg (the video library running inside basically everything) that had survived more than 5 million automated tests. According to the Medium analysis, it wrote a remote code execution exploit for FreeBSD's NFS server that chained six separate requests together to hand an unauthenticated stranger root access.

Across the Project Glasswing partner network, the tally reached more than 10,000 high- or critical-severity vulnerabilities in a matter of weeks, per Anthropic's own update. As one detail from the partner data put it, progress used to be limited by how fast humans could find bugs. Now it's limited by how fast anyone can patch the flood the AI keeps surfacing.

⚠️ Why "dual-use" is the scary phrase here

The exact same skill that lets Mythos protect critical software lets it attack critical software. As Bain & Company put it: AI doesn't create new vulnerabilities, it exposes the ones that were always there. The catch is that defenders and attackers get the same superpower at the same moment — and only one of them has to patch everything.

Meet the 52 organizations holding the keys

Since Anthropic wouldn't sell Mythos to the public, it built a walled garden instead: Project Glasswing, named after the glasswing butterfly with its transparent, see-through wings. The idea is a defensive coalition — give the world's most important software defenders a head start before the attackers catch up.

It launched with 12 founding partners, plus around 40 additional critical-infrastructure organizations. That's the list of names that now have access to a tool most security teams on earth can only read about.

Category | Who's In
Big Tech | AWS, Apple, Google, Microsoft, NVIDIA, Broadcom
Cybersecurity | CrowdStrike, Palo Alto Networks, Cisco
Finance & Open Source | JPMorganChase, the Linux Foundation
Government | The NSA used it, despite the Pentagon blacklisting Anthropic over a separate dispute
Refused | China's government requested access in April 2026 and was turned down

By June 3, 2026, Anthropic had expanded the program by roughly 150 organizations across 15+ countries. To get in, each one has to clear Anthropic's security requirements first. Partners share a pool of $100 million in usage credits, and Anthropic threw in $4 million in donations to open-source security groups on top.

For everyone outside the club, the price was theoretical anyway — Mythos ran at $25 per million input tokens and $125 per million output tokens, roughly five times the cost of Anthropic's public Opus model.

Now the uncomfortable question: is any of this a stunt?

Here's where the story splits the room. Not everyone bought the "too dangerous to release" framing — and some of the loudest skeptics are serious people.

David Sacks, the entrepreneur and investor who heads the White House council of advisors on technology, told reporters the threat should be taken seriously — but couldn't resist adding the obvious:

"It's hard to ignore that Anthropic has a history of scare tactics."
— David Sacks, White House tech advisor, via TechXplore

The timing didn't help. Both Anthropic and OpenAI are reportedly positioning for IPOs later in 2026. Security expert Bruce Schneier noted on his blog that he found the timing of all the hype "interestingly coincidental with their IPO." When you announce that you've built something so powerful you're scared to sell it, the thing gets enormous attention precisely because nobody can have it.

Then came the technical pushback, which is harder to wave away:

The Skeptic | Their Argument
AISLE Security | 8 of 8 cheap open-weight models found the same bug (CVE-2026-4747), one at just $0.11 per million tokens
Cisco (June 8) | Tested 6 models across 1.8 billion lines of code; concluded the scaffolding does the hard work, not the frontier model
Niels Provos | The engineer who wrote the original BSD flaw reproduced the "discovery" using older, cheaper models, calling it "an orchestration problem, not a frontier-model one"
EA Forum analysis | On most cyber tasks, Mythos isn't dramatically ahead of GPT-5.5, and GPT-5.5 is cheaper

There's also the detail that gives skeptics ammunition: per the flyingpenguin verification roundup, over 99% of the vulnerabilities Mythos found remain unpatched and undisclosed, so most of the eye-popping claims can't be independently checked yet. Anthropic published cryptographic commitments to prove the exploits existed at the time of writing — but "trust us, we have the receipts" is a tougher sell to a skeptical security community.
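
For readers who haven't met the idea: a cryptographic commitment lets you publish a fingerprint of a secret today and reveal the secret later, so anyone can check that what you reveal is what you committed to. The coverage cited here doesn't describe the scheme Anthropic actually used, so what follows is only a generic hash-based sketch. The function names `commit` and `verify` and the placeholder exploit text are our own assumptions.

```python
import hashlib
import hmac
import secrets


def commit(secret: bytes) -> tuple[str, bytes]:
    """Return (commitment, nonce).

    The commitment can be published immediately. The nonce and the secret
    stay private until the reveal. The random nonce stops anyone from
    guessing the secret by hashing likely candidates.
    """
    nonce = secrets.token_bytes(32)  # fixed length, so nonce + secret is unambiguous
    commitment = hashlib.sha256(nonce + secret).hexdigest()
    return commitment, nonce


def verify(commitment: str, nonce: bytes, secret: bytes) -> bool:
    """Check that a revealed (nonce, secret) pair matches the published commitment."""
    recomputed = hashlib.sha256(nonce + secret).hexdigest()
    return hmac.compare_digest(recomputed, commitment)


if __name__ == "__main__":
    exploit_writeup = b"placeholder text standing in for an undisclosed exploit"
    published, nonce = commit(exploit_writeup)
    print("published now:", published)
    # Later, after the bug is patched, the author reveals nonce and write-up.
    print("reveal checks out:", verify(published, nonce, exploit_writeup))
    print("tampered reveal:", verify(published, nonce, b"a different write-up"))
```

Note what this does and doesn't prove. A commitment shows the author held that exact content when the hash was published, provided the publication date itself is trustworthy. It says nothing about whether the exploit works or how the bug was found, which is exactly why skeptics want the patched details in hand.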

The part nobody voted on

Strip away the benchmark fight, and there's a deeper unease that writer Ricardo Garcês captured well on Medium. His question wasn't "is this real?" The capabilities are real enough. His question was about who decided.

"Who decided that 52 of the world's largest technology companies should be the ones to control this? No public debate. No independent oversight. Just a press release, a list of approved partners, and a message that was simultaneously reassuring and unsettling."
— Ricardo Garcês, writing on Medium

That tension got sharper because of where Anthropic sits politically. According to TradingView, the company has been blacklisted by the Trump administration after it set limits on military use of its models, and is currently in litigation with the federal government. So a private company, at odds with its own government, ended up deciding which institutions worldwide get early access to what some are calling a cyber-weapon. The UK's AI Security Institute ran its own independent tests and confirmed Mythos was a genuine step up — but even they noted their test environments were softer than the real world.

What happened next: the vault opened a crack

Anthropic always said Mythos was a beginning, not an endpoint. The plan was to bring Mythos-class power to the public eventually, once the safety scaffolding could catch up. In June 2026, that's exactly what started happening.

The company released Claude Fable 5, described across coverage as the first publicly available Mythos-class model. The trick is that Fable 5 and Mythos 5 are reportedly the same underlying model, with one difference: Fable 5 ships with safety classifiers that block the most dangerous cybersecurity and biology outputs, while Mythos 5 (the unrestricted version) stays locked behind Glasswing. Fable 5 lands at $10 per million input tokens and $50 per million output tokens, which is 40% of the original Mythos Preview's $25 and $125.
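
The price gap is easy to check with a few lines of arithmetic. This is a minimal sketch using the per-million-token prices quoted in this article ($25 and $125 for Mythos Preview, $10 and $50 for Fable 5). The `job_cost` helper and the workload size are made up for illustration.

```python
# USD per million tokens (input, output), as quoted in this article.
PRICES = {
    "Mythos Preview": (25.0, 125.0),
    "Fable 5": (10.0, 50.0),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one job at the quoted list prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M tokens of code read, 0.5M tokens written.
    mythos = job_cost("Mythos Preview", 2_000_000, 500_000)
    fable = job_cost("Fable 5", 2_000_000, 500_000)
    print(f"Mythos Preview: ${mythos:.2f}")
    print(f"Fable 5: ${fable:.2f}")
    print(f"Fable 5 costs {fable / mythos:.0%} of Mythos Preview")
```

Because both the input and output prices are cut by the same proportion, the ratio comes out the same for any mix of input and output tokens.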

In other words: the public finally gets the brain. The vault keeps the version without the muzzle.

The QuvirAI take

Here's where this lands for us, after reading every announcement, benchmark, and angry rebuttal we could find.

Both things are true at once, and that's what makes it interesting. The skeptics are clearly right that Anthropic benefits enormously from the "too dangerous to sell" narrative, especially with an IPO circling. When the people refuting your claims can reproduce your headline discovery on a $0.11 model, the "watershed moment" framing starts to look a little convenient.

But dismissing the whole thing as marketing feels lazy too. A 27-year-old bug in OpenBSD is a 27-year-old bug whether or not Anthropic has an IPO. The capability trend is real even if this specific model's lead is exaggerated. And the genuinely scary line in all of this isn't about Mythos at all — it's Anthropic's own warning that within 6 to 12 months, other labs will have Mythos-class models, and some of them won't bother with a Glasswing or a safety classifier.

That's the part worth losing sleep over. Not the one company that built a weapon and locked it up. The next ten that build the same thing and leave the door open.

FAQ

Can I use Claude Mythos?

No. Mythos Preview and Mythos 5 are restricted to approved Project Glasswing partners and vetted critical-infrastructure organizations. There's no public API and no waitlist. The closest you can get is Claude Fable 5, which is reportedly the same model with safety classifiers added.

Why won't Anthropic just release it?

Officially, because its ability to autonomously find and exploit software vulnerabilities is too dangerous to hand to everyone at once. Critics argue the restriction also generates huge publicity ahead of a rumored IPO. Both can be true.

Was it really leaked by accident?

Anthropic blamed "human error in the CMS configuration," and said the incident was unrelated to any of its AI tools. Around 3,000 internal documents sat on an unsecured URL until researchers flagged them and Fortune reported the story.

Is this an AGI moment?

No serious source is calling Mythos AGI. It's a frontier model with an unusually sharp spike in one domain (cybersecurity). Impressive and concerning, but narrow — not general superintelligence.

The bottom line

Project Glasswing is either the most responsible thing a frontier AI lab has done — building a weapon and choosing not to sell it — or the most sophisticated piece of capability marketing the industry has produced. The honest answer, reading the evidence, is that it's a bit of both, and the ratio depends on benchmarks nobody outside the club can fully verify yet.

What isn't in dispute is the direction. Vulnerability discovery just stopped being a human-speed activity, and every security team on earth now has to plan for a world where the attacker might be a model running for a few hundred dollars overnight. Whether Mythos itself deserves the legend or not, that future is already loading.
