How attackers poison AI tools and defenses

Cyberattackers are using generative AI to draft polished spam, create malicious code and write persuasive phishing lures. They are also learning how to turn AI systems themselves into points of compromise.

Recent findings highlight this shift. Researchers from Columbia University and the University of Chicago studied malicious email traffic collected over three years. Barracuda Research has also tracked attackers exploiting weaknesses in AI assistants and tampering with AI-driven security tools.

AI in email-based attacks

Messages created with LLMs tend to be more formal, free of grammatical slips and linguistically sophisticated. That polish makes them more challenging for filters to catch and more convincing to recipients, especially when the attacker’s first language is not English.

Attackers are also using AI to test different versions of subject lines and body text. This kind of variation, similar to A/B testing in marketing, helps them identify what slips past defenses and tempts more victims to click.

Email attacks targeting AI assistants

Researchers have also detected attackers targeting the AI assistants that many employees rely on for daily work. Tools like Microsoft Copilot scan inboxes, messages and documents to provide context when answering questions, and that access creates a new risk.

Consider the following scenario: A malicious prompt hides inside a harmless-looking email. The message sits in a inbox until the AI assistant retrieves it while helping with a task. At that point, the assistant may follow the hidden instructions that tell it to do things like leaking sensitive information, altering records or running commands.

Other attacks focus on systems that use retrieval-augmented generation (RAG). By corrupting the data that feeds these systems, attackers can influence what an assistant reports back. The result is unreliable answers, poor decisions and unintended actions based on poisoned context.

Tampering with AI-based security tools

Attackers are also trying to bend AI-powered defenses to their advantage. Many email security platforms now include features like auto-replies, smart forwarding, automated ticket creation, and spam triage, and each one is a potential entry point.

If an attacker slips a malicious prompt into the system, they might trigger an auto-reply that reveals sensitive data, or they could escalate a helpdesk ticket without proper checks, gaining access they should never have. In some cases, attackers use workflow automation against the organization itself. A poisoned command could deploy malware, alter critical records or shut down key processes.

The same AI features that simplify defenses also expand the attack surface. Without safeguards, bad actors can turn tools designed to protect into attack vectors.

Identity and integrity risks

AI systems that act with a high degree of autonomy carry another risk: impersonating users or trusting impostors.

One tactic is known as a “Confused Deputy” attack. Here, an AI agent with high privileges performs a task on behalf of a low-privileged attacker. Another involves spoofed API access, where attackers trick integrations with services like Microsoft 365 or Gmail into leaking information or sending fraudulent emails.

Manipulation can also spread through what researchers call cascading hallucinations. A single poisoned prompt may distort summaries, inbox reports or task lists. For example, a fake “urgent” message from a spoofed executive could be treated as real, changing how teams prioritize work or make decisions. Once trust in AI outputs erodes, every system that depends on those outputs is at risk.

How defenses must adapt

Older controls, such as SPF, DKIM and IP blocklists, are no longer enough. To counter attacks that exploit AI, organizations need multiple layers of defense. One crucial step is to make filters aware of how LLMs generate content, so they can flag anomalies in tone, behavior or intent that might slip past older systems. Another is to validate what AI systems remember over time. Without that check, poisoned data can linger in memory and influence future decisions.

Isolation also matters. AI assistants should run in contained environments where unverified actions are blocked before they can cause damage. Identity management needs to follow the principle of least privilege, giving AI integrations only the access they require. Finally, treat every instruction with skepticism. Even routine requests must be verified before execution if zero-trust principles are to hold.

Technology is only part of the answer. Employees are a critical line of defense, and awareness training helps them recognize suspicious messages and report them quickly, reducing the chance that poisoned prompts or AI-crafted attacks ever gain traction.

AI-aware security

The next wave of threats will involve agentic AI-powered systems that reason, plan and act on their own. While these tools can deliver tremendous productivity gains to users, their autonomy makes them attractive targets. If attackers succeed in steering an agent, the system could make decisions, launch actions or move data undetected.

Email remains a favorite attack vector because the inbox has become a staging ground for threat execution as AI assistants and agents integrate with calendars, workflows and collaboration platforms. Every prompt and request carries the potential to trigger downstream actions.

Protecting this space requires more than filtering: it demands constant validation, zero-trust guardrails and proactive modeling of how an attacker might exploit an AI agent.

AI is already changing how attackers operate, and it is reshaping how defenders must respond. Security leaders must build resilience on two fronts: spotting when adversaries deploy AI against them and hardening their own AI systems so attackers cannot poison or mislead them.

More about