It’s time to give AI security its own playbook and the people to run it

In this Help Net Security interview, Dr. Nicole Nichols, Distinguished Engineer in Machine Learning Security at Palo Alto Networks, discusses why existing security models need to evolve to address the risks of AI agents. She explains how organizations should approach threat modeling, governance, and monitoring for agents that can reason and act.

Nichols also shares practical steps, like logging and clone-on-launch, to help keep systems secure as these agents grow more capable.

Do you think we need a new security paradigm for AI agents, or can we adapt existing ones (like zero trust, SDLC, etc.)? What’s missing from current models?

There are two pieces of this question that need to be answered separately: the security paradigms themselves, and the security threats, which are effectively the design requirements for those paradigms. The AI agent landscape is changing rapidly, and given today's knowledge, there are elements of current paradigms that can be updated and adapted. To keep those paradigms and adaptations responsive to the next emerging threats, which may not fit either an adapted paradigm or our current concept of an AI agent-specific paradigm, it will be prudent to put more resources than we have previously toward tracking emerging threats and their implications.

In addition to potentially exploiting novel attack paths or changing the calculus of what is “likely,” AI has the potential to scale attacks much faster, necessitating faster deployment of defenses. In short, we must keep a continuous eye on the specific threats to and from AI systems, and be prepared to adapt at a faster pace. This is more important than the need for any particular paradigm, which may become obsolete much faster than we have been accustomed to.

How should organizations think about threat modeling for AI agents that have both reasoning capabilities and access to operational tools?

To effectively model AI agent threats, organizations need to incorporate two principles. First, the system will likely be a compound one, with different models customized to perform specific tasks, using reasoning, memory and operational tools, which may be third-party agents outside the organization’s direct control. It is more important than ever to adopt a holistic approach to identifying when, where and how the elements interact, and which types of exploits are relevant at those junctures, as well as upstream and downstream of the interaction.

Second, we will soon need to treat AI security vulnerability research as a sub-discipline alongside reverse engineering, cryptography, and API and cloud security. Teams will need representation from all of these perspectives to account for the complex interactions emerging in the AI agent ecosystem.

What kind of governance or oversight structure would you recommend for organizations deploying autonomous or semi-autonomous AI agents at scale?

It will be important to be proactive here, as governance and oversight practices are still being established. At the operational level, we ideally want strict boundaries on what agents are allowed to do; however, there are ways in which current permissioning models do not map cleanly onto AI agents. New paradigms and technologies are needed, and adapting existing paradigms is a critical first step in evolving to address the novel differences that are emerging.
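To make the idea of strict operational boundaries concrete, here is a minimal, hypothetical sketch of an allowlist-style permission check wrapped around an agent's tool calls. The policy structure, tool names, and function names are illustrative assumptions, not a description of any particular product or of Palo Alto Networks' tooling.

```python
# Hypothetical sketch: allowlist-based permission boundary for agent tool calls.
# Tool names, policy fields, and the exception type are illustrative assumptions.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AgentPolicy:
    agent_id: str
    allowed_tools: set = field(default_factory=set)    # tools the agent may invoke
    allowed_domains: set = field(default_factory=set)  # outbound hosts the agent may reach

class PermissionDenied(Exception):
    pass

def authorize_tool_call(policy: AgentPolicy, tool: str, target_domain: Optional[str] = None) -> None:
    """Raise PermissionDenied unless the call fits the agent's declared boundaries."""
    if tool not in policy.allowed_tools:
        raise PermissionDenied(f"{policy.agent_id} is not permitted to use tool '{tool}'")
    if target_domain is not None and target_domain not in policy.allowed_domains:
        raise PermissionDenied(f"{policy.agent_id} may not reach '{target_domain}'")

# Example: an agent limited to read-only lookups against an internal wiki.
policy = AgentPolicy(
    agent_id="support-triage-agent",
    allowed_tools={"search_tickets", "read_wiki"},
    allowed_domains={"wiki.internal.example.com"},
)
authorize_tool_call(policy, "read_wiki", "wiki.internal.example.com")  # permitted
# authorize_tool_call(policy, "send_email")  # would raise PermissionDenied
```

The point of the sketch is that the boundary is enforced outside the model's reasoning loop, so a hijacked goal or poisoned instruction still cannot expand what the agent is technically able to do.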

A related challenge is the ambiguity of ownership and accountability for security within the AI supply chain. When models are re-tuned to perform on a new domain, data poisoning or other tampering could occur at any stage. Internally, it is much easier to coordinate a joint assessment. However, in cases where third parties are obfuscating information such as specific weights of the model, training procedures or the exact training data to protect private IP, the ability to share the information necessary to reproduce and study AI vulnerabilities is diminished.

How feasible is it to implement runtime monitoring for agent behavior in real time? Are there specific observability techniques you’ve seen work well?

Real-time security monitoring is critical for AI agents. As their sophistication and autonomy increase over time, the speed of defenses must be commensurate. It’s essential that agent runtime security is built into and around each of the building blocks of agents. This includes the models, data ingestion, and connection protocols for the actions and communications within and by an agent. We have a tremendous opportunity to build that upfront, as the AI agent ecosystem grows.
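As a rough illustration of what building security "into and around each of the building blocks" can mean, the sketch below places a runtime check on the data-ingestion step, so retrieved content is screened before it reaches the model. The screening rule, patterns, and function names are illustrative assumptions only.

```python
# Minimal sketch of a runtime guard on the data-ingestion building block.
# The pattern list and function names are illustrative assumptions only.

import re

SUSPICIOUS_PATTERNS = [
    r"ignore (all|previous) instructions",  # crude prompt-injection indicator
    r"BEGIN PRIVATE KEY",                   # secrets accidentally pulled into context
]

def screen_ingested_content(source: str, content: str) -> str:
    """Inspect retrieved content before it is handed to the model; raise on a hit."""
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, content, flags=re.IGNORECASE):
            raise ValueError(f"Blocked content from {source}: matched {pattern!r}")
    return content

# The agent would call this at the boundary between retrieval and reasoning:
safe_text = screen_ingested_content("wiki.internal.example.com", "Routine troubleshooting steps...")
```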

There is a spectrum of techniques that can provide introspection for real-time AI agent security monitoring, and the feasibility of adopting them, now and in the near term, is still evolving. Some of the most practical near-term techniques are built on logging agent identities alongside the decisions and actions they correspond to. In the longer term, clone-on-launch patterns can be used: an agent is spun up from a secure baseline and discarded after achieving its goal, which reduces the potential for security incidents to propagate.
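To make the near-term technique concrete, here is a minimal sketch of identity-tagged audit logging, where every decision and action is written as a structured record keyed to the agent's identity. The field names and use of Python's standard logging module are assumptions for illustration; in a clone-on-launch deployment, each record would also carry the identifier of the freshly provisioned clone so incidents can be traced to, and contained within, a single disposable instance.

```python
# Illustrative sketch: identity-tagged audit logging of agent decisions and actions.
# Field names and the logging backend are assumptions, not a specific product's schema.

import json
import logging
import uuid
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(message)s")
audit_log = logging.getLogger("agent.audit")

def log_agent_event(agent_id: str, clone_id: str, event_type: str, detail: dict) -> None:
    """Emit one structured, identity-tagged record per decision or action."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,      # stable identity of the agent definition
        "clone_id": clone_id,      # per-launch instance, useful for clone-on-launch
        "event_type": event_type,  # e.g. "decision" or "tool_call"
        "detail": detail,
    }
    audit_log.info(json.dumps(record))

# Example usage: a fresh clone gets its own ID at launch, then every step is logged.
clone_id = str(uuid.uuid4())
log_agent_event("support-triage-agent", clone_id, "decision", {"chose_tool": "read_wiki"})
log_agent_event("support-triage-agent", clone_id, "tool_call", {"tool": "read_wiki", "status": "ok"})
```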

What’s your view on simulating real-world environments to safely test agent decision-making under stress or edge conditions?

One of the fundamental security differences of AI agent systems is how dynamic and functionally connected they are with their operating environment. They are designed to observe and act within that environment to achieve objectives. Data poisoning, goal hijacking and other AI-specific vulnerabilities must be identified and prevented at this interface between the agent and its environment because this is a prime operational pathway to subtly insert false data, which can have major system security impacts.

It’s very time-consuming to establish synthetic environments for agent security evaluations that accurately reflect the nuance and resolution of real-world conditions. It is also important to have a common environment that serves as a comparable baseline for observing offensive and defensive security events, in both common and edge-case scenarios.

Simultaneously, the popularization of plug-and-play agent creation tools makes it imperative that tools for securing AI agents are as abundant and available as the tools for building them. The cybersecurity community has embraced the value and necessity of freely available anti-malware tools for similar reasons: insecure computers on a network pose a risk to other connected systems. Likewise, insecure agents will be a weak link in the AI agent ecosystem.