Agentic AI memory attacks spread across sessions and users, and most organizations aren’t ready
In this Help Net Security interview, Idan Habler, AI Security Researcher at Cisco, breaks down a threat most security teams haven’t named yet: agentic memory as an attack surface.
Habler walks through MemoryTrap, a disclosed and remediated method to compromise Claude Code’s memory, showing how a single poisoned memory object can spread across sessions, users, and subagents. He explains why AI memory needs the same governance as secrets and identities, and what organizations must rebuild to contain trust propagation between agents before contamination becomes invisible.

When we talk about “agentic memory surfaces,” we’re describing something most security teams have no vocabulary for yet. How do you explain this attack surface to a CISO who still thinks of memory threats in terms of buffer overflows?
Memory has a completely new meaning in agentic systems. It is a persistent retrieval and instruction layer. It stores preferences, earlier context, summaries, workflow patterns, and learned behavior that can be used in future sessions.
That matters because when memory is reused between tasks, sessions, or users, it becomes an important part of the system’s decision context. The risk is not that an attacker will corrupt memory in the classic sense. The concern is that an attacker will alter what the model later recognizes as legitimate context. In this way, agent memory resembles a persistent control surface rather than a momentary state. That’s the frame I’d like security leaders to adopt.
That is also why the MemoryTrap case study, our recently disclosed (and remediated) method to compromise Claude Code’s memory, is a useful example. What made this discovery significant was not only the initial infection but also the fact that attacker-controlled impact reached persistent memory and other trusted instruction surfaces, allowing it to shape future behavior over time.
Prompt injection via poisoned memory retrieval seems like the obvious nightmare scenario. But what are the less discussed vectors that practitioners are underestimating?
When it comes to AI memory, there are a few security concerns that people don’t think about. Trust laundering is the most complex type of attack. It involves mixing untrusted data with trusted data and using it as a shared input. Because it blends in as normal reasoning, it is incredibly hard to trace and can quietly manipulate the AI long after the initial attack.
The way AI systems share context between users, sessions, and automated agents makes this risk much worse. If an infected memory object or resource is retained during one interaction, it rarely stays isolated; it might quickly spread to a completely different user chat or be passed along to a subagent doing a background operation. In the end, a single poisoned memory might propagate across the whole system because context is always being shared. This turns a single trick into widespread vulnerability.
Human memory is notoriously unreliable, and we’ve built legal and organizational systems that account for that. Agent memory presents the opposite problem: it’s persistent, searchable, and treated as authoritative. What organizational assumptions need to be rebuilt?
AI memory must be managed with the same strict governance as secrets, identities, and essential system configurations, including monitoring its origins, setting expiration dates, and demanding explicit authorization.
Companies often mistake the permanence of AI memory for accuracy. In reality, an instruction’s persistence is no guarantee of its validity; it can easily be a ‘stale’ rule or a successful manipulation that the system simply refuses to forget. To adapt, enterprises must treat long-term retrieved inputs (e.g., memory files, RAG indexing) as important operational data.
Cross-agent memory sharing introduces a trust propagation problem. If Agent A trusts Agent B’s memory, and Agent B was compromised three tasks ago, the contamination is invisible. How do you contain that?
When AI agents share memories, they transfer trust rather than just data. If Agent A reads Agent B’s memory, it will inherit any hidden faults or malicious inputs that B may have picked up. To prevent a single compromised memory from infecting the system, businesses must use tight validation scanning, which acts as an automated fact-checking process when data is transferred between agents.
Furthermore, attacks like MemoryTrap demonstrate the importance of properly isolating high-trust locations, such as the AI’s system prompt, from untrusted, user-controlled data. If these trust levels are mixed up, an agent could easily misinterpret a hidden harmful input for a trusted system instruction.
To safeguard against system hijacking via memory corruption, organizations should prioritize a separation of system instructions from user inputs. We recommend implementing real-time scanning during data transfers, maintaining rigorous provenance tracking for all memory sources, and establishing protocols for the rapid quarantine of corrupted data.

Download: 2026 SANS Identity Threats & Defenses Survey