Only 11% of production agents pass the AI agent security bar

Enterprise teams are running AI agents that write code, drive browsers, answer customer calls, manage cloud infrastructure, and query data warehouses with standing credentials. A new independent assessment of 100 production agents finds that nearly all of them carry the conditions for a single hostile document to take them over.


The AI Risk Quadrant (AIRQ) report, a 2026 Q2 edition produced by independent researchers, scores 100 commercial and publicly available AI agents across three dimensions: attack surface, blast radius, and defense controls. The picture that emerges is one of fast capability growth pulling ahead of the controls meant to contain it.

The lethal trifecta is the default

The report describes a “lethal trifecta” common across the cohort: private data access, exposure to untrusted content, and the ability to take outbound actions. The combination appears in 98 percent of agents scored. Eight of the ten agent classes show 100 percent trifecta exposure; the General Assistant and Data Engineering classes each contain a single exception.

External data ingestion is the universal attack surface in the cohort. Documents, web pages, tickets, emails, and retrieved snippets produce indirect prompt injection on nearly every agent scored. The combination of trifecta exposure and ingested content from outside sources means a single poisoned message can steer agent behavior across every system the agent can reach.
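The trifecta test described above reduces to a three-way conjunction. A minimal sketch in Python follows; the class and field names are hypothetical labels chosen for illustration and are not taken from the AIRQ methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical capability flags for one agent (illustrative only)."""
    name: str
    reads_private_data: bool
    ingests_untrusted_content: bool
    takes_outbound_actions: bool


def has_lethal_trifecta(agent: AgentProfile) -> bool:
    # All three conditions must hold at once; removing any one breaks the chain.
    return (
        agent.reads_private_data
        and agent.ingests_untrusted_content
        and agent.takes_outbound_actions
    )


def trifecta_rate(agents: list[AgentProfile]) -> float:
    # Share of agents exposed, as a fraction between 0 and 1.
    if not agents:
        return 0.0
    return sum(has_lethal_trifecta(a) for a in agents) / len(agents)
```

Framing the check this way makes the mitigation options explicit: an agent that cannot reach private data, never reads outside content, or cannot act outbound fails the test and drops out of the exposed group.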

Capability and defense move in opposite directions

The two riskiest categories in the cohort are coding agents and computer-use agents. They pair the widest attack surfaces and largest blast radii with the thinnest defenses. Coding agents rank second in capability and eighth in defense across the ten classes scored. Computer agents post an average output guardrail score of exactly zero. Every agent in that class earned zero points on output validation, exfiltration-channel blocking, and rendering sanitization.

Work Copilot and Business Process agents sit at the other end. They count among the most heavily defended classes in the cohort, with smaller blast radii to begin with.

Only 11 percent of agents land in the Fortified Leaders quadrant, where high attack surface combines with strong defenses. Most of these agents are enterprise solutions where the defense is inherited from platform-level governance: tenant isolation, role-based access, and audit frameworks that existed long before AI was added on top. Forty percent of the cohort sits in the Exposed Giants quadrant, which the report says holds 60 percent of the total risk budget.
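The Exposed Giants figures imply a concentration of risk that is easy to work out. The sketch below assumes the "risk budget" is an additive total across the cohort, which is a reading of the article and not something the report is quoted as confirming; the function name is made up for illustration.

```python
from fractions import Fraction


def risk_per_agent(risk_share_pct: int, cohort_share_pct: int) -> Fraction:
    """Risk held per unit of cohort, relative to a cohort-wide average of 1."""
    return Fraction(risk_share_pct, cohort_share_pct)


# Figures from the paragraph above: 40 percent of agents, 60 percent of risk.
exposed_giants = risk_per_agent(60, 40)
everyone_else = risk_per_agent(100 - 60, 100 - 40)

# How much more risk an average Exposed Giant carries than an average
# agent outside that quadrant.
concentration = exposed_giants / everyone_else

print(float(exposed_giants), float(concentration))  # 1.5 2.25
```

Under that assumption, an average Exposed Giant carries 1.5 times the cohort-average risk and 2.25 times the risk of an average agent in the other three quadrants combined.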

Eugene Neelou, AIRQ Project Lead and AI Agent Security Expert, told Help Net Security that the agents arriving with the weakest defenses tend to be the ones arriving through the back door of the enterprise. “Our data shows that coding agents and computer agents rank as the top 2 highest attack surfaces, top 2 highest blast radius, and top 2 lowest defense controls,” he said. “These agents are self-serve products with bottom-up adoption that usually bypass procurement gates.” Top-down adoption of enterprise-heavy AI agents goes through compliance review; self-serve adoption skips it.

Audit without defense

The report finds that 37 percent of the cohort scores well on logging and observability but poorly on the four defense components that prevent or limit harm. For those agents, audit capabilities function as a forensic asset: they record what happened without stopping it. A further 38 percent complete irreversible actions before any monitoring path can plausibly fire.

Eighty-three percent of claimed defenses lack independent verification, according to the assessment. Only 17 percent of assigned defense credits carry an independent verification mark. The components most relevant to blast radius reduction, such as execution isolation, are the least verifiable.

Neelou described how the verification process works. “AIRQ is designed to reward vendor transparency, mimicking a regular enterprise vendor selection process. Independent verification means evidence from public sources, as opposed to confidential vendor docs,” he said. The gap exists, he added, because most vendors claim or are expected to have certain controls and the technical evidence stays weak.

Tool execution is the dividing line

Tool execution is the single variable that best predicts blast radius. It alone explains 76 percent of the variation in blast radius, outpredicting agent class, vendor reputation, and every individual defense component. The report describes agent risk in the cohort as effectively bimodal, with tool-executing agents forming one population and the rest forming the other.

The recommended procurement gate is documented and tested sandboxing. Sandboxing cuts residual risk roughly 2.6-fold. Cloud or container-level isolation delivers about a 6-fold reduction. Most of the benefit comes from the first step.
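The claim that the first step delivers most of the benefit can be checked with simple arithmetic. The sketch below assumes both reduction factors are measured against the same unsandboxed baseline, and that an N-fold reduction leaves 1/N of the original risk; neither assumption is spelled out in the article.

```python
def residual_fraction(reduction_factor: float) -> float:
    """Fraction of baseline risk left after an N-fold reduction."""
    if reduction_factor <= 0:
        raise ValueError("reduction factor must be positive")
    return 1.0 / reduction_factor


def risk_removed(reduction_factor: float) -> float:
    """Fraction of baseline risk eliminated by an N-fold reduction."""
    return 1.0 - residual_fraction(reduction_factor)


sandbox = risk_removed(2.6)      # documented and tested sandboxing
isolation = risk_removed(6.0)    # cloud or container-level isolation

# Share of the stronger control's benefit already captured by sandboxing.
share_from_first_step = sandbox / isolation

print(f"{sandbox:.1%} {isolation:.1%} {share_from_first_step:.1%}")
# 61.5% 83.3% 73.8%
```

On these assumptions, sandboxing removes about 61.5 percent of baseline risk and full isolation about 83.3 percent, so roughly three quarters of the achievable reduction arrives with the first control.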

Vendor-shipped and customer-configured diverge

A recurring theme in the report is that the same platform can score very differently depending on which build is evaluated, with spreads wider than those between entire agent classes. Procurement signs off on one configuration; security inherits another.

Neelou drew an analogy to cloud computing. “Similar to the shared responsibility model in cloud security, a final agentic product deployed by the buyer often has a different security posture than a default platform configuration,” he said. He pointed buyers toward the AIRQ methodology itself as the minimum question set, covering 5 to 10 factors per scoring dimension that a buyer should demand answers to before deployment.
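One way to make the vendor-shipped versus customer-configured gap visible is to score both builds on the same controls and diff them. The sketch below is illustrative only: the control names echo defenses mentioned in this article, but the scores, the scale, and the function names are hypothetical and do not come from the AIRQ methodology.

```python
def config_drift(
    vendor_default: dict[str, int], customer_build: dict[str, int]
) -> dict[str, int]:
    """Per-control score change (customer minus vendor).

    A control missing from one build is treated as scoring 0 there.
    Negative values mean the control is weaker as deployed.
    """
    controls = sorted(set(vendor_default) | set(customer_build))
    return {
        c: customer_build.get(c, 0) - vendor_default.get(c, 0)
        for c in controls
    }


def weakened_controls(drift: dict[str, int]) -> list[str]:
    # Controls where the deployed build scores below the vendor default.
    return [control for control, delta in drift.items() if delta < 0]


# Hypothetical scores for one platform, evaluated twice.
vendor_default = {"execution_isolation": 3, "output_validation": 2, "logging": 4}
customer_build = {
    "execution_isolation": 1,
    "output_validation": 2,
    "logging": 4,
    "exfiltration_blocking": 1,
}

drift = config_drift(vendor_default, customer_build)
print(weakened_controls(drift))  # ['execution_isolation']
```

Keeping the two scorecards separate, as opposed to averaging them, preserves the detail a security team needs: which specific control changed between what procurement approved and what is running.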

The long view

CVE volume in the AI agent market is climbing quarter over quarter. The report recommends quarterly re-audits because categories with low CVE counts are in a pre-discovery phase, where research attention has yet to surface the issues that exist.

Buyers should treat the agent, not the underlying model, as the unit of risk; compare agents within the same class and the same quadrant; separate compliance certifications from technical defense scoring; and score every platform twice, once as the vendor ships it and once as the customer configures it.

Neelou said the scoring framework is built for the long arc, with the quarterly edition serving as a snapshot. The methodology, he said, is designed to be open, usable, and reproducible at any time.
