Autonomous AI-driven worm can reason its way through corporate networks
Researchers at the University of Toronto, the Vector Institute, and the University of Cambridge have built and tested a proof-of-concept AI-driven worm that does not operate on a fixed list of exploits.
Instead, it analyzes each target it encounters, reasons about how to attack it, and creates a strategy on the fly, all with the help of a small, free large language model (LLM) running directly on machines it has already compromised.
A worm that runs on open-weight models hosted on compromised hardware
“Our prototype targets publicly disclosed but unpatched vulnerabilities, misconfigurations, and recurring weakness classes — which is what the majority of real-world cyberattacks rely on,” the researchers noted.
“It does not require the capability to discover novel zero-days, only an AI model that is capable enough to operationalize known vulnerabilities against diverse target configurations.”
Deployed inside an isolated 33-host test network spanning Linux servers, Windows machines, and IoT devices with known vulnerabilities, misconfigurations, and recurring weakness classes, the worm ran for 7 days per experimental trial across 15 independent runs.
How the AI worm spreads from an agent process running on a Kali machine (Source: CleverHans Lab at the University of Toronto)
On average it correctly identified 31.3 vulnerabilities, exploited 23.1 hosts to elevated access, and propagated to 20.4 hosts, the researchers explained in their (pre-print) paper.
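To put those averages in proportion, they can be set against the size of the test network. The sketch below is only back-of-the-envelope arithmetic on the figures quoted above. It assumes all 33 hosts count as potential targets (the article does not say whether the launch machine is among them), and the variable and function names are illustrative rather than taken from the paper.

```python
# Figures quoted in the article; means across the experimental runs.
TOTAL_HOSTS = 33        # size of the isolated test network
MEAN_EXPLOITED = 23.1   # mean hosts exploited to elevated access
MEAN_PROPAGATED = 20.4  # mean hosts the worm propagated to


def share(part: float, whole: float) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# ASSUMPTION: the full 33-host network is the denominator.
exploited_share = share(MEAN_EXPLOITED, TOTAL_HOSTS)
propagated_share = share(MEAN_PROPAGATED, TOTAL_HOSTS)
# Of the hosts exploited on average, how many did the worm also spread to?
propagated_given_exploited = share(MEAN_PROPAGATED, MEAN_EXPLOITED)

print(f"exploited:  {exploited_share}% of hosts")
print(f"propagated: {propagated_share}% of hosts")
print(f"propagated / exploited: {propagated_given_exploited}%")
```

Under that assumption, the averages correspond to roughly 70% of hosts exploited and about 62% propagated to, with propagation following exploitation in close to nine cases out of ten.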
The worm also demonstrated an ability to exploit vulnerabilities disclosed after its underlying model’s training cutoff (Copy Fail, Dirty Frag and a Marimo RCE), by reading publicly available security advisories at runtime and using the information to craft working exploits.
Finally, the worm diagnosed unexpected failures and found workarounds using general reasoning. For example, it found the hardcoded IP blocklist in its own source code and rewrote it, unprompted.
In another example, when replicas crashed on Alpine Linux and Windows Server 2008 hosts due to a VM-detection check bug, the parent worm found the attestation source file on the target machine, removed the failing check, and retried successfully.
The prototype worm does not rely on commercial AI platforms
What makes the prototype particularly unsettling is how it sustains itself: the worm hijacks GPU-equipped machines it infects and runs its language model locally on stolen compute. Low-resource devices such as IoT sensors, which cannot host the model themselves, route their reasoning queries upstream to infected GPU nodes.
The research indicates that controls put in place by commercial AI platforms are ineffective at stopping this new type of threat, and that safety guardrails on open-weight models can be bypassed when an attacker fully controls the local execution environment.
“The proof-of-concept we evaluated inherits capability limitations of the underlying model. Individual exploitation attempts succeeded in 44% of cases, with the majority of failures attributable to malformed payloads rather than incorrect strategy,” the researchers noted.
“The worm struggled particularly with web application structures, Windows command environments, and payload syntax requiring precise string manipulation. These reflect the code-generation ceiling of a current-generation single-GPU model, not a fundamental constraint on the approach, and are expected to narrow as language models improve at code generation and structured output. Despite this per-attempt fragility, the swarm architecture compensated through parallel, independent reasoning trajectories to achieve our reported results.”
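One way to see why repeated, parallel attempts can offset a 44% per-attempt success rate is a simple probability sketch. It assumes every attempt is independent and succeeds with the same probability, which is an idealization and not a model the researchers describe. The function names are hypothetical.

```python
def p_any_success(p_single: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    if not 0.0 <= p_single <= 1.0 or attempts < 0:
        raise ValueError("p_single must be in [0, 1] and attempts >= 0")
    return 1.0 - (1.0 - p_single) ** attempts


def attempts_needed(p_single: float, target: float) -> int:
    """Smallest number of independent tries whose combined success
    probability reaches `target`."""
    if not 0.0 < p_single <= 1.0 or not 0.0 <= target < 1.0:
        raise ValueError("need 0 < p_single <= 1 and 0 <= target < 1")
    n = 1
    while p_any_success(p_single, n) < target:
        n += 1
    return n


PER_ATTEMPT = 0.44  # per-attempt success rate quoted by the researchers

# ASSUMPTION: attempts are independent and identically likely to succeed.
for n in (1, 2, 3, 5):
    print(n, round(p_any_success(PER_ATTEMPT, n), 3))
print("attempts for 95%:", attempts_needed(PER_ATTEMPT, 0.95))
```

In practice, attempts against the same host are unlikely to be fully independent, since a misread of the target can recur across tries, so these numbers are best read as an optimistic upper bound on what retrying alone would deliver.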
The best defenses against AI-driven worms, for now
The researchers are candid about the dual-use nature of their work and have withheld operational details from the public paper, including the agent’s reasoning architecture, its full toolset, and the name of the LLM used.
Before release, they disclosed their findings to several Canadian science, security and defense authorities, and received help to ensure the paper did not contain information that could help attackers. (Security researchers may request access to the prototype from the University of Toronto.)
Because of the worm’s self-replicating capabilities, the researchers were also extremely careful to keep it contained within their testing lab.
“This work provides empirical evidence that autonomous cyberoffence has crossed from theoretical risk to demonstrated capability, a challenge that spans AI research, cybersecurity, and public policy,” they pointed out.
“This research uncovered a new cybersecurity threat the world is not prepared to face. Researchers, industry, policymakers and everyday people need to come together with urgency to address this new cybersecurity threat.”
On the defensive side, the research lays out two priorities:
- Organizations should employ AI-assisted automated penetration testing and fuzzing tools against themselves, to reveal (and patch) exploitable weaknesses in their own infrastructure before an adversary finds them.
- Good network segmentation can substantially contain the worm’s spread. Zero-trust principles, which require continuous authentication for every access request rather than trusting anything inside the perimeter, and micro-segmentation, which limits how far a foothold can be leveraged, are essential.
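The containment effect of segmentation can be illustrated with a toy reachability model. The sketch below is a simplified defensive illustration, not the researchers' methodology: the host names, segment names, and allowed flows are all hypothetical, and network reachability is only an upper bound on how far a foothold could spread.

```python
from collections import deque

# Hypothetical hosts mapped to hypothetical network segments.
HOST_SEGMENT = {
    "kiosk-1": "guest",
    "laptop-1": "office",
    "laptop-2": "office",
    "web-1": "dmz",
    "db-1": "data",
    "sensor-1": "iot",
    "sensor-2": "iot",
}

# Default-deny policy: only these (source segment, destination segment)
# flows are permitted. Everything else is blocked.
ALLOWED_FLOWS = {
    ("office", "office"),
    ("office", "dmz"),
    ("dmz", "data"),
    ("iot", "iot"),
}


def reachable(start, host_segment, allowed_flows=None):
    """Return the hosts reachable from `start` by chaining allowed flows.

    Breadth-first search over hosts. `allowed_flows=None` models a flat
    network in which every host can talk to every other host.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        src = queue.popleft()
        for dst, dst_segment in host_segment.items():
            if dst in seen:
                continue
            flow = (host_segment[src], dst_segment)
            if allowed_flows is None or flow in allowed_flows:
                seen.add(dst)
                queue.append(dst)
    return seen


flat = reachable("sensor-1", HOST_SEGMENT)
segmented = reachable("sensor-1", HOST_SEGMENT, ALLOWED_FLOWS)
print("flat network:     ", len(flat), "hosts reachable")
print("segmented network:", len(segmented), "hosts reachable")
```

In this toy setup, a foothold on an IoT sensor can reach every host on a flat network but only the other sensor once the default-deny policy is applied. Micro-segmentation tightens the same idea further by defining allowed flows per workload rather than per segment.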
While the behavioral signatures of this prototype worm can be spotted by network monitoring and intrusion detection systems, future ones created by malicious actors may be more adept at evading them, they warned.


