Testing reveals Claude Mythos’s offensive capabilities and limits

Could Claude Mythos Preview, Anthropic’s latest large language model, be leveraged for fully automated cyber attacks?

The UK government’s AI Security Institute (AISI) tested the model on capture-the-flag (CTF) challenges and multi-step attack scenarios, and found that while its cybersecurity capabilities exceed those of previously available models, it can’t reliably execute autonomous attacks on hardened networks.


What is Claude Mythos Preview?

Anthropic introduced Claude Mythos Preview to the public earlier this month, and stated that the LLM is exceptionally good at discovering previously overlooked and difficult-to-detect bugs and vulnerabilities in operating systems, software, web applications, and cryptography libraries.

Given its effectiveness, the model will not be publicly released, as malicious actors could leverage it to discover zero-day vulnerabilities and develop exploits for both novel and known-but-unpatched weaknesses.

Instead, Anthropic launched Project Glasswing, a selective program giving major technology, cybersecurity, and financial organizations early access to the model. Joining them are the Linux Foundation and 40 organizations that build or maintain critical software infrastructure, all working to secure the world’s most important software before comparable AI tools reach wider audiences.

Claude Mythos: Cyber attack capabilities and current limits

What Claude Mythos Preview means for cybersecurity is being hotly debated online and offline. The results of tests performed by the AI Security Institute offer more insight into the dangers defenders may soon face.

The model is good at solving capture-the-flag (CTF) challenges, which are aimed at identifying and exploiting weaknesses in target systems, AISI researchers found.

“On expert-level tasks — which no model could complete before April 2025 — Mythos Preview succeeds 73% of the time,” they shared.

When it comes to more complex attacks, it’s less effective.

“Real-world cyber-attacks require chaining dozens of steps together across multiple hosts and network segments — sustained operations that take human experts many hours, days, or weeks to complete,” AISI noted.

“As a first step towards measuring this, we built ‘The Last Ones’ (TLO): a 32-step corporate network attack simulation spanning initial reconnaissance through to full network takeover, which we estimate to require humans 20 hours to complete. Claude Mythos Preview is the first model to solve TLO from start to finish, in 3 out of its 10 attempts.”

That said, three successes out of ten attempts tell only part of the story. The test environment was, by the researchers’ own admission, an easier target than most real-world networks: there were no active defenders, no defensive tooling, and no consequences for tripping alerts.

“This means we cannot say for sure whether Mythos Preview would be able to attack well-defended systems,” the researchers said.
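A sample of ten runs also leaves a wide margin of statistical uncertainty. The sketch below is our own illustration, not part of AISI’s analysis: it computes a 95% Wilson score interval for a success rate observed as 3 out of 10. It assumes the ten attempts were independent and equally likely to succeed, which the report does not state, and the function name wilson_interval is ours.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    # Centre of the interval is pulled from the raw rate towards 0.5.
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials**2)) / denom
    return center - half, center + half


# Assumption: the 10 TLO attempts are treated as independent trials.
low, high = wilson_interval(3, 10)
print(f"3/10 observed; plausible success rate between {low:.0%} and {high:.0%}")
```

Under those assumptions the interval runs from roughly 11% to 60%, so ten runs are too few to pin down how often the model would complete the simulation.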

Still, the model can autonomously carry out an attack on a small, poorly defended system once attackers have gained initial access.

“This highlights the importance of cybersecurity basics, such as regular application of security updates, robust access controls, security configuration, and comprehensive logging,” they said, and pointed organizations towards the UK National Cyber Security Centre’s advice on how cyber defenders should use AI to their own advantage.

Advice for AI-assisted defense

Anthropic’s researchers also advised defenders to use available AI models to strengthen their defenses: to discover vulnerabilities, analyze cloud environments for misconfigurations, accelerate migrations from legacy systems to more secure ones, automate parts of incident response, and more.

Mythos Preview’s ability to write n-day exploits autonomously means that patch cycles will have to be shortened, as well. “Software users and administrators will need to drive down the time-to-deploy for security updates, including by tightening the patching enforcement window, enabling auto-update wherever possible, and treating dependency bumps that carry CVE fixes as urgent, rather than routine maintenance,” Anthropic warned.
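One way to picture that advice is a patch queue with two deadlines. This is a minimal sketch under our own assumptions: the window lengths, the PendingUpdate type, the package names, and the placeholder CVE identifier are all hypothetical and do not come from Anthropic. Updates that carry CVE fixes get a much shorter enforcement window than routine dependency bumps.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical policy values, chosen only for illustration.
URGENT_WINDOW = timedelta(days=2)    # updates that carry CVE fixes
ROUTINE_WINDOW = timedelta(days=30)  # everything else


@dataclass
class PendingUpdate:
    package: str
    released: date
    cve_ids: tuple = ()  # empty when the update fixes no known CVE


def deadline(update: PendingUpdate) -> date:
    """Date by which the update must be deployed under the policy above."""
    window = URGENT_WINDOW if update.cve_ids else ROUTINE_WINDOW
    return update.released + window


def overdue(updates: list, today: date) -> list:
    """Updates past their deadline, most overdue first."""
    late = [u for u in updates if deadline(u) < today]
    return sorted(late, key=deadline)


# Made-up queue; the names and the CVE identifier are placeholders.
queue = [
    PendingUpdate("libexample", date(2030, 1, 5), ("CVE-PLACEHOLDER-1",)),
    PendingUpdate("toolexample", date(2030, 1, 1)),
]
for update in overdue(queue, today=date(2030, 1, 10)):
    print(update.package, "missed its deadline of", deadline(update))
```

Here the security fix released on January 5 is already overdue by January 10, while the older routine update is still inside its window.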

A paper recently released by the Cloud Security Alliance, written with the input of cybersecurity experts and the wider cybersecurity community, provides more specific guidance for Chief Information Security Officers on how to adapt their organization’s security program to this emerging threat landscape.
