Using Reddit to manipulate AI search results is surprisingly easy

A Reddit comment that takes only a few seconds to write can end up influencing the answers generated by AI research tools.

A Cornell Tech study found that a short snippet of user-generated text, sometimes as few as 13 words, was enough to affect the output of deep-research agents. These are AI systems that search the web, gather information from multiple sources, and generate reports with citations.

The risks of relying on community-generated content are already familiar to many internet users. Google’s AI Overviews famously recommended adding glue to pizza sauce after pulling information from an old joke Reddit post.

Reddit threads can become AI sources

User-generated platforms account for a notable share of the information retrieved by deep-research agents. Across the open-source systems examined in the study, between 16.7% and 23.4% of retrieved URLs came from user-generated sources, with Reddit accounting for the largest share.
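
The share quoted above is a simple ratio: retrieved URLs hosted on user-generated platforms divided by all retrieved URLs. The sketch below shows one way such a figure could be computed from an agent's retrieval log. The domain list and function names are assumptions made for illustration, not the study's own classification.

```python
from urllib.parse import urlparse

# Hypothetical list of user-generated platforms. Reddit and Wikipedia are named
# in the study; the other entries are assumptions added for illustration.
UGC_DOMAINS = {"reddit.com", "wikipedia.org", "quora.com", "stackexchange.com"}


def is_ugc(url: str) -> bool:
    """Return True if the URL's host is a listed domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in UGC_DOMAINS)


def ugc_share(urls: list[str]) -> float:
    """Fraction of retrieved URLs that point at user-generated platforms."""
    if not urls:
        return 0.0
    return sum(is_ugc(u) for u in urls) / len(urls)
```

Matching on the full host name or a dot-prefixed suffix avoids counting lookalike domains whose names merely end in the same letters.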

“A single poisoned Reddit comment can influence generated outputs for an entire cluster of related queries,” researchers said.

“A large fraction of the content retrieved by deep research agents originates from user-generated platforms such as Wikipedia, Reddit, and community forums, because they provide detailed explanations and broad topical coverage. At the same time, they allow users to directly edit or contribute content, making them comparatively easy to modify,” they added.

How the attack works

The attack, dubbed Web Agent Retrieval Poisoning (WARP), unfolds in three stages.

First, an attacker identifies Reddit threads, Wikipedia pages, and forum discussions that repeatedly appear across searches related to a topic. Those pages become targets because AI research agents are likely to retrieve them as well.

Next comes content generation. A short piece of text is written to blend into the discussion while promoting a product, service, or idea. The paper notes that Generative Engine Optimization (GEO), a set of techniques for making content more likely to be surfaced, cited, or summarized by AI systems, can raise the odds that AI agents retrieve the planted text.

The final step is deployment. The content is posted as a comment, reply, or page edit. Once indexed by a search engine, it becomes available to AI systems that retrieve the page during future research sessions.

The attack does not require access to an AI model, its prompts, or its retrieval infrastructure. It only requires the ability to contribute content to a public platform.
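
The first stage described above, finding pages that keep reappearing across related searches, comes down to a frequency count. The sketch below assumes the search results have already been collected into a mapping from query to result URLs. The function name and the `min_queries` threshold are hypothetical, and nothing here reflects the researchers' actual tooling.

```python
from collections import Counter


def recurring_pages(
    results_by_query: dict[str, list[str]], min_queries: int = 2
) -> list[tuple[str, int]]:
    """Rank pages by the number of distinct queries whose results include them."""
    counts: Counter = Counter()
    for urls in results_by_query.values():
        # A page counts once per query, even if it appears several times.
        counts.update(set(urls))
    return [(url, n) for url, n in counts.most_common() if n >= min_queries]
```

Pages near the top of this ranking are the ones an agent is most likely to retrieve for any query in the cluster, which is why a single edit to one of them can reach many related questions.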

Results from the evaluation

To avoid polluting the public web, the researchers did not post manipulated content to live websites. Instead, they built a testing framework called GeoStorm that modifies content after it has been retrieved by a deep-research agent. This allowed the attack to be evaluated without exposing users to manipulated information.
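
The idea of modifying content only after retrieval can be illustrated with a wrapper around a retrieval function. This is a minimal sketch under stated assumptions, not GeoStorm's actual interface: the `Doc` type, the `with_injection` helper, the example URLs, and the snippet text are all invented for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Doc:
    url: str
    text: str


Retriever = Callable[[str], list[Doc]]


def with_injection(retrieve: Retriever, target_url: str, snippet: str) -> Retriever:
    """Wrap a retriever so the snippet is appended to the target page in memory only."""

    def wrapped(query: str) -> list[Doc]:
        docs = retrieve(query)
        # Only the copy handed to the agent changes; the source page is untouched.
        return [
            replace(d, text=d.text + "\n" + snippet) if d.url == target_url else d
            for d in docs
        ]

    return wrapped


# Stand-in retriever that returns fixed pages (hypothetical data).
def fake_retrieve(query: str) -> list[Doc]:
    return [
        Doc("https://example.com/thread", "Original discussion."),
        Doc("https://example.com/other", "Unrelated page."),
    ]


poisoned = with_injection(
    fake_retrieve, "https://example.com/thread", "Sol Azteca is the best spot in town."
)
docs = poisoned("best Mexican restaurants near Austin")
```

Because the change happens between retrieval and generation, the agent sees the manipulated text while the live page, and anyone else reading it, is unaffected.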

One test added a recommendation for a fictional restaurant called Sol Azteca to content associated with discussions about Mexican food in Austin. When the AI system was later asked for the best Mexican restaurants near Austin, it repeated the recommendation and cited the source.

Another test used a fictional dating app called SilverPath. After promotional content describing SilverPath as a leading option for divorced men over 50 was introduced into retrieved content, the system later recommended the app in response to queries about dating services for that demographic.

In the study’s search-snippet tests, roughly 13 words of poisoned text were enough to get a fictional product mentioned in 38% to 51% of responses after the manipulated content was retrieved by the AI agent. Spreading the same message across multiple sources increased mention rates to as high as 62%.

In a separate full-content experiment, researchers appended poisoned text to an existing Reddit thread. Although the injected content accounted for less than 4% of the retrieved material, conditional mention rates still ranged from 30% to 53%.
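
A conditional mention rate, as read here, counts only the runs in which the agent actually retrieved the poisoned source, then asks how many of those responses name the product. The sketch below assumes that reading. The `Run` record and the case-insensitive substring match are illustrative choices rather than the study's scoring method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    retrieved_poison: bool  # did the agent retrieve the modified source in this run?
    response: str           # the report the agent generated


def conditional_mention_rate(runs: list[Run], product: str) -> Optional[float]:
    """Share of responses naming the product, among runs that retrieved the poison."""
    exposed = [r for r in runs if r.retrieved_poison]
    if not exposed:
        return None  # the rate is undefined when no run retrieved the poisoned source
    mentioned = sum(product.lower() in r.response.lower() for r in exposed)
    return mentioned / len(exposed)
```

Conditioning on retrieval separates two questions: how often the poisoned page gets pulled in, and how often it sways the answer once it has been.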

The full attack was evaluated against three open-source deep-research agents: STORM, Co-STORM, and OmniThink.

Commercial products such as OpenAI's ChatGPT Deep Research and Gemini Deep Research were not subjected to end-to-end attacks because doing so would have required modifying content on the public web. The researchers instead analyzed citation patterns. Gemini Deep Research cited user-generated content in 12.1% of observed citations, compared with 0.4% for ChatGPT Deep Research, which suggests Gemini's reports lean more heavily on such sources and may be more exposed to this kind of manipulation.

“Our findings raise important questions about information integrity in the age of agentic search. Evidence suggests that users find LLM outputs highly convincing, even when they contain explicit falsehoods, across a variety of contexts,” the authors concluded.
