TrojAI unveils new capabilities to secure agentic AI beyond the prompt layer
TrojAI has announced major new capabilities designed to secure the growing deployment of agentic AI in the enterprise going beyond the prompt layer.
“The innovations we are unveiling this week address some of the most significant and rapid changes to the AI security ecosystem. Enterprise deployment of agents is accelerating quickly, and these new TrojAI capabilities enable a new level of visibility and protection needed for the Agentic enterprise,” said Lee Weiner, CEO of TrojAI. “Enterprises need to understand exactly what their AI agents are doing and to enforce policy across entire workflows, not just prompts. This is fundamental to deploying AI securely at scale.”
Agent-Led AI Red Teaming
TrojAI Detect now includes Agent-Led AI Red Teaming, which uses coordinated autonomous agents to conduct red team testing on AI agents, applications and models. This advancement allows AI security teams to easily perform complex testing scenarios that map to a wide range of known security frameworks with the click of a button.
Key features include:
- Agentic testing: Specialized agents work together to test AI models, apps and agents, automatically correlating results into a single, actionable report.
- Multi-turn attacks: Agents automatically orchestrate multi-turn and dynamic attack chains, eliminating manual configuration and using TrojAI’s vast library of datasets and manipulations.
- Adaptive learning: Testing agents retain history and memory to evolve strategies across attacks, becoming more effective with each new cycle of testing.
- Framework mapping: Test results are automatically mapped to OWASP, MITRE and NIST.
Agent-Led AI Red Teaming transforms AI security testing from a complex, multi-step process into a streamlined, intelligent assessment aligned to industry-standard frameworks.
Agent Runtime Intelligence
To complement build-time risk assessment, Agent Runtime Intelligence is available as a new platform capability in private preview. It goes beyond the prompt layer to capture and analyze full AI agent execution traces, giving enterprises deep visibility into how AI agents behave at runtime, including tool usage, memory access, data retrieval patterns and system prompt exposure. This enables security teams to govern, test and enforce policy across complex, multi-step AI workflows.
With Agent Runtime Intelligence, TrojAI enables visibility for:
- Tool exposure and excessive agency
- Prompt injection propagation across workflows
- Sensitive data access during retrieval
- System prompt exposure and memory interactions
The capability integrates seamlessly with TrojAI’s existing dashboards, MCP governance, SIEM integrations and compliance tooling.
Real-Time Protection of Coding Agents
As AI coding agents become embedded in development workflows, they introduce a new class of security risk. Real-Time Protection of Coding Agents extends TrojAI Defend to safeguard AI coding assistants such as Claude Code and Codex as they generate, retrieve and modify code.
The capability detects exposed secrets, prevents sensitive data leakage, including PII, and blocks indirect prompt injection attacks, such as malicious instructions embedded within a retrieved file. By monitoring agent behavior in real time, TrojAI ensures that coding agents operate within defined security guardrails without disrupting developer productivity.
With these three platform enhancements, TrojAI is redefining how enterprises protect the next generation of intelligent systems so they can confidently embrace AI innovation securely, transparently, and at scale.