$20 per zero-day is already the WordPress plugin reality
Vulnerability researchers have spent the past year arguing about whether AI agents can find real bugs at scale or whether they mostly generate noise. A pipeline built in three days by researchers from TrendAI and CHT Security supplies an answer, along with a price tag that the security industry will have to reckon with.

The system, presented at Ekoparty Miami, pairs AI-driven static analysis with automated Docker provisioning and dynamic verification through Chrome DevTools MCP. It surfaced more than 300 critical zero-day vulnerabilities across the WordPress plugin ecosystem in 72 hours of scanning. Every finding was manually verified by the researchers and responsibly disclosed before publication.
The economics
The AgentForge orchestration dashboard logged roughly 222 million tokens consumed across 95 tasks during the campaign. Steven Yu, a threat research engineer at TrendAI, translated that to an average of about $20 per vulnerability discovered.
He qualified the number carefully. “This doesn’t mean you can easily find a vulnerability in any WordPress site for just $20,” Yu told Help Net Security. “It depends heavily on the security of the codebase. The WordPress ecosystem is extremely vast and complex, leading to highly variable code quality. In other frameworks or ecosystems, we might not see the same results at this cost threshold.”
The qualifier matters because WordPress plugins are an outlier. The ecosystem runs to more than a million plugins, many maintained by solo volunteers without security budgets, and the code quality reflects that. A hardened enterprise codebase would not surrender bugs at the same rate or at the same cost.
What is settled, by Yu’s account, is that the cost is already low enough for anyone willing to look. “We are already in a state where any motivated attacker with a credit card can execute this,” he said. “Both white-hat and black-hat actors are already implementing these types of actions at scale.”
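The reported figures can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes exactly 300 findings (the article says "more than 300") and takes the roughly $20 average and roughly 222 million tokens at face value. The implied total spend and blended token price are derived estimates under those assumptions, not numbers the researchers reported.

```python
# Back-of-the-envelope check on the reported economics.
# Assumptions (not from the researchers): exactly 300 findings and a flat
# $20 average; every derived value below is only as good as those inputs.
TOKENS_CONSUMED = 222_000_000   # approximate figure from the dashboard
TASKS = 95                      # tasks logged during the campaign
FINDINGS = 300                  # lower bound; the article says "more than 300"
COST_PER_FINDING_USD = 20.0     # Yu's approximate average


def campaign_estimates(tokens, tasks, findings, cost_per_finding):
    """Derive rough totals and ratios from the reported campaign figures."""
    total_cost = findings * cost_per_finding
    return {
        "total_cost_usd": total_cost,
        "tokens_per_finding": tokens / findings,
        "tokens_per_task": tokens / tasks,
        "usd_per_million_tokens": total_cost / (tokens / 1_000_000),
    }


estimates = campaign_estimates(
    TOKENS_CONSUMED, TASKS, FINDINGS, COST_PER_FINDING_USD
)

if __name__ == "__main__":
    for name, value in estimates.items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the campaign works out to about $6,000 in total and about 740,000 tokens per finding. A higher finding count at the same average would raise the implied total, so these are rough magnitudes only.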
Vulnerability classes the pipeline surfaced
The 300-plus findings span pre-authentication remote code execution, SQL injection hidden behind PHPCS annotations that mark vulnerable queries as safe, privilege escalation through the WordPress hook system, server-side request forgery, and a downgrade attack chain. One pre-auth RCE was identified in a plugin with more than 1,000 GitHub stars.
The downgrade chain was assembled by the AI without human guidance. The agent located a vulnerability that allowed it to roll a target plugin back to an earlier version, recognized that the earlier version carried its own exploitable flaws, and chained the two into a working attack. Yu confirmed no manual prompts or pre-taught patterns were involved. The same vulnerability class was identified through pattern hunting across OpenCart and Joomla codebases.
Disclosure infrastructure under strain
The pipeline addresses what the security industry has taken to calling “AI slop,” the wave of low-quality, AI-generated vulnerability reports that has pushed several major open-source projects to reject AI submissions outright. By requiring every AI-generated finding to pass dynamic verification before reaching the disclosure queue, the system eliminated more than 80% of false positives.
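The gating idea can be illustrated with a minimal sketch. Everything here is hypothetical: the `Finding` type, the `gate_findings` function, and the stand-in verifier are invented for illustration and are not the researchers' code. In the described pipeline, the verifier role would be played by the Docker and Chrome DevTools MCP stage.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one candidate produced by static analysis."""
    plugin: str
    vuln_class: str
    proof_of_concept: str


def gate_findings(
    candidates: Iterable[Finding],
    verify: Callable[[Finding], bool],
) -> List[Finding]:
    """Keep only the candidates that the dynamic verifier confirms."""
    return [finding for finding in candidates if verify(finding)]


def stand_in_verifier(finding: Finding) -> bool:
    """Placeholder check; a real verifier would run the exploit in a sandbox."""
    return finding.proof_of_concept != ""


candidates = [
    Finding("example-forms", "sql-injection", "id=1 OR 1=1"),
    Finding("example-gallery", "ssrf", ""),  # no working proof of concept
]

disclosure_queue = gate_findings(candidates, stand_in_verifier)

if __name__ == "__main__":
    for finding in disclosure_queue:
        print(f"{finding.plugin}: {finding.vuln_class}")
```

Passing the verifier in as a callable keeps the gate independent of how verification is performed, so the placeholder can be swapped for a sandboxed run without changing the queue logic.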
The downstream pressure remains. Yu said manual verification of each WordPress plugin vulnerability took his team between 30 and 60 minutes. He described the human review layer as the primary bottleneck.
“Organizations such as ZDI and NIST are currently struggling with massive backlogs due to the explosion of AI-assisted vulnerability reports,” Yu said. “When AI can scale discovery from a few findings per day to hundreds per second, the traditional human-centric triage model becomes unsustainable.”
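A rough calculation shows why the review layer dominates. The sketch below assumes exactly 300 findings (the article says "more than 300") and applies Yu's 30 to 60 minutes per finding uniformly. The resulting person-hours are an estimate under those assumptions, not a figure from the researchers.

```python
# Rough estimate of the manual verification workload.
# Assumptions: exactly 300 findings, each taking 30 to 60 minutes to verify.
FINDINGS = 300
MIN_MINUTES_PER_FINDING = 30
MAX_MINUTES_PER_FINDING = 60
SCAN_HOURS = 72  # wall-clock scanning time reported for the campaign


def review_hours(findings, minutes_per_finding):
    """Convert a per-finding review time into total person-hours."""
    return findings * minutes_per_finding / 60


low_hours = review_hours(FINDINGS, MIN_MINUTES_PER_FINDING)
high_hours = review_hours(FINDINGS, MAX_MINUTES_PER_FINDING)

if __name__ == "__main__":
    print(f"Manual review: {low_hours:.0f} to {high_hours:.0f} person-hours")
    print(f"Automated scanning: {SCAN_HOURS} hours")
```

Under these assumptions, manual review takes between 150 and 300 person-hours, roughly two to four times the 72 hours of scanning.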
His expectation for the next six months is a higher volume of disclosed vulnerabilities and a parallel rise in zero-day abuse by attackers running similar pipelines. He anticipates a structural shift in how disclosure programs accept submissions, with several vendors moving toward invite-only or membership-based models that prioritize researchers with established track records and ban accounts that submit AI-generated noise.
The longer-term answer Yu pointed to is more automation, applied at the receiving end. “The ultimate solution is to fight AI magic with AI magic,” he said. AI-assisted triage that automates environment setup and verification would let human experts concentrate on the most complex cases.
Where the AI still stops
Yu was direct about the ceiling. Plugins that depend on drag-and-drop builders such as Elementor fall into the “computer use” category, where the agent has to operate a graphical interface, and will likely yield to the next wave of agent tooling within months. Other failure modes are harder. Exploits that need a working payment API key, a valid user account, or an SMS verification code stop the agent because the gap is in the environment, not in the model. Some calls require a human to decide whether a feature is intended or malicious in the first place, a judgment that more training data will not resolve.
