LLM privacy policies keep getting longer, denser, and nearly impossible to decode
People expect privacy policies to explain what happens to their data. Instead, users face a growing wall of text that gets harder to read each year. In a new study, researchers reviewed the privacy policies of LLM providers and traced how they have changed over time.

Policies keep getting longer
The researchers examined privacy policies from 11 providers, tracking 74 versions over several years. The average policy reached about 3,346 words, roughly 53 percent longer than the average reported for general software policies in 2019.
This growth follows the expansion of AI services. New features introduce new data types and new usage scenarios, and rules differ across regions, which leads to new disclosures. Providers tend to build on top of existing text rather than revise it, so each update adds more material for users to work through.
Beyond the main policies, providers publish extra documents such as model training notices or regional supplements, which users must read alongside the core policy to get the full picture. With information spread across several pages, looking up how a service collects or uses data takes more effort.
Privacy policies written at a level few users can manage
Length is not the only challenge. Readability scores place these policies at a level usually expected of advanced college students, while earlier software policies measured far lower. This rise means users must work through dense sentences and complex explanations.
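The article does not name the formula behind that estimate, but grade-level figures like this are commonly produced with measures such as the Flesch-Kincaid grade. As a rough sketch (assuming the third-party textstat package and a made-up policy excerpt), the score for a passage can be checked like this:

```python
# Rough readability check for a policy excerpt (illustrative only).
# Uses the common Flesch-Kincaid grade-level measure; the study's
# exact metric is not named in this article.
import textstat

policy_excerpt = (
    "We may process, retain, and disclose the information you provide, "
    "including prompts and uploaded content, to the extent permitted by "
    "applicable law and as described in the applicable regional addendum."
)

# A grade of roughly 13-16 corresponds to college-level reading difficulty.
print(textstat.flesch_kincaid_grade(policy_excerpt))
```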
The writing often includes long phrases describing legal grounds, data handling steps, retention rules, and regional rights. These topics are hard enough on their own, and the style used in the documents adds extra strain. Sentences stack several conditions together, leaving users to sort through technical points that feel far removed from everyday experience.
Vague wording makes it hard to know what happens to data
Besides length and difficulty, the study found widespread vagueness. Providers often avoid firm commitments by relying on terms like "may" or "might," leaving readers unsure how their information is handled. This wording also makes it hard to know what triggers certain actions in the service.
Vague wording is not unusual in privacy notices, but its persistence in LLM policies stands out. Readers cannot tell when a process will apply or how often it will occur. This matters because prompts, uploads, and outputs can include sensitive material. When a policy describes these flows with softened language, users have little sense of what will happen to the information they provide.
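One way to make this softened language visible is to count hedging terms relative to policy length. The snippet below is a simple illustration of that idea, not the method used in the study; the word list and sample sentence are assumptions:

```python
# Count hedging words in a policy text and report them per 100 words.
# Illustrative sketch only; not the researchers' methodology.
import re

HEDGE_WORDS = {"may", "might", "could", "possibly", "generally", "typically"}

def hedge_density(text: str) -> float:
    """Hedging words per 100 words of policy text."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = sum(1 for w in words if w in HEDGE_WORDS)
    return 100 * hits / max(len(words), 1)

sample = "We may share your prompts with vendors and might retain outputs."
print(f"{hedge_density(sample):.1f} hedging words per 100 words")
```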
Training and user rights bring new questions
Commitments about training on user data appear across policy versions with varying limits. Some providers use firm language in early policies; later versions soften the tone or add conditions. After a regulatory action in Europe, one provider revised its terms to describe training practices in greater detail and added new user controls.
The researchers note that providers often say data used for training is aggregated or stripped of identifiers, though later edits may soften those claims. Some policies add that the provider can match data back to a person when required by law. This creates tension for users who want certainty about how their information is handled.
User rights sections grow more complex too. They cover access, correction, and deletion rights as well as options tied to model development, and some of these rights come with limits. One policy states that the provider will try to correct any inaccurate detail produced by the model, then adds that this may not be possible. Some age limits have gone up, and one policy even classifies anyone under 18 as a child.
User rights aren’t worth much if they’re buried under layers of legal fog. If LLM providers want trust, they’ll need to deliver policies that users can understand and rights that people can actually use.