When you enter text into an AI interface, that text goes somewhere. Not in a paranoid sense — most providers have serious privacy policies — but in a technical and legal sense: the data you enter passes through third-party servers, may be used for future training and is subject to each provider’s privacy policies. Understanding exactly what happens to that data is the user’s responsibility.
What happens to your data
When you send a message to an AI service, several things happen:
The text is transmitted to the provider’s servers. There is no local processing in most consumer services. Your text, including any personal or confidential information it contains, travels over the internet to the company’s servers.
The provider may store the conversations. The duration and conditions of that storage vary by provider and plan. Some keep the history indefinitely by default; others have shorter retention policies.
Conversations may be used for training. This is the most important point for privacy. Many providers, especially on their free plans, reserve the right to use conversations to improve their models. In practice, this means the text you enter may become training data.
Provider staff may have access. For content moderation, incident management or quality review, provider employees may access conversations. Policies on who can access and under what conditions vary.
Differences between providers
Policies vary significantly between providers and between plans from the same provider:
OpenAI (ChatGPT):
- Free plan: conversations may be used for training by default (can be disabled in settings)
- Paid plan (Plus): same policy, with the option to disable
- API: data is not used for training by default
- Enterprise: greater privacy guarantees and data isolation
Anthropic (Claude):
- Similar structure: consumer interfaces have more permissive policies than the API
- The API does not use data for training by default
- Teams and Enterprise offer additional guarantees
Google (Gemini):
- Integration with the Google ecosystem: review what data is shared with other Google services
In general: paid plans and especially API/Enterprise plans have better privacy guarantees than free plans. Reading the Terms of Service and Privacy Policy before entering sensitive information is not paranoia: it is basic diligence.
Information you should never enter
Regardless of the provider and the plan, there are categories of information that carry high risk if entered into external AI services:
Personal data of third parties: full names combined with identifiable information, identity numbers, health data, financial information about customers or employees. In many cases, sharing this data with an external provider without the explicit consent of the data subject is a GDPR violation.
Trade secrets: unpublished technical specifications, confidential business strategies, internal financial data before publication. Once sent, that text has left your control.
Access credentials: passwords, API tokens, private keys. This seems obvious, but it happens more often than expected when code or configuration files are pasted.
Information subject to NDA: any information you have contractually agreed not to share with third parties. AI providers are third parties.
Sensitive data about minors: especially in educational or children’s service contexts.
Options for greater privacy
Disable the use of data for training. Most providers allow this in their settings. It is the first step for anyone using consumer services with sensitive information.
Use the API instead of the consumer interface. APIs have stricter privacy policies by default. Text sent via API is generally not used for training.
Anonymise or generalise before entering. If you need to analyse customer information, replace real names with identifiers (“Customer A,” “Company X”), remove specific financial data and work with representative rather than real data.
Local models. Models executable locally (Llama, Mistral, Qwen via Ollama or LM Studio) do not send any data to external servers. Privacy is total. The trade-off is quality: local models on consumer hardware are significantly less capable than the most advanced commercial models.
Enterprise solutions with contractual guarantees. For companies with serious privacy requirements, the enterprise plans of OpenAI, Anthropic or solutions like Azure OpenAI Service offer data processing agreements, data isolation and guarantees of non-use for training.
Security in AI systems
If you are building applications or systems that incorporate AI, there are additional security risks that go beyond data privacy:
Prompt injection. An attacker may try to include malicious instructions in the text the agent will process. “Ignore the previous instructions and do X” in content the agent is processing can modify its behaviour. It is a serious vulnerability in systems where the agent processes content from untrusted sources.
System prompt exfiltration. Users may try to extract the “system prompt” — the confidential instructions that configure the model’s behaviour — through specific questions.
Third-party dependency. If your product depends on a provider’s API, a provider outage or a change in their policies or pricing directly affects your product.
Unexpected costs. “Prompt flooding” attacks can generate unexpected API costs if usage limits are not implemented.
Privacy and security with AI are not obstacles to its use: they are part of the responsible design of systems and the professional use of tools. Most problems have technical solutions; the first step is being aware that they exist.