Valtik Studios
AI · high · 2026-04-16 · 14 min read

What ChatGPT, Claude, and Gemini Actually Keep About You

Every AI chatbot retains your conversations. Retention periods, training use, law enforcement access, and breach history vary dramatically. A practical data privacy map of ChatGPT, Claude, Gemini, Copilot, Grok, and Meta AI — including the NYT v. OpenAI court order requiring indefinite retention.

What ChatGPT, Claude, and Gemini actually remember

You have probably had more honest conversations with ChatGPT than with most people in your life. Typed in medical symptoms you wouldn't bring up with your doctor. Pasted in the contract clause that's been stressing you out. Drafted the email you didn't send. Worked through the breakup you couldn't process.

None of those conversations are private in the way you probably assume they are. Every major AI chatbot retains data. The retention periods, the access policies, the legal compulsion exposure, and the training use vary dramatically between providers. This post is the practical map — what each platform actually keeps, what they do with it, what law enforcement can pull, and how to minimize your exposure.

Baseline: what all AI chatbots collect

Regardless of provider:

  • Prompt text and your conversation history. Retained for some period.
  • Account data. Email, signup IP, payment data if paid, device fingerprint.
  • Usage metadata. Timestamps, session duration, feature usage, model selected.
  • Crash reports and telemetry. App-side error reporting.
  • Support interactions. If you've emailed support, those threads are retained indefinitely in most cases.

The variability starts with conversation content. Below is each major provider's actual policy as of April 2026.

ChatGPT (OpenAI)

Retention policy for the Free and Plus tiers.

  • Conversations are retained to provide service history in the app.
  • Training use: by default, your conversations are used to train OpenAI's models unless you opt out via Data Controls → "Improve the model for everyone" toggle.
  • Explicit deletion. User-deleted conversations are removed from service history within 30 days. They may persist in OpenAI's training data if training already occurred.

Retention policy for Team and Enterprise.

  • Enterprise: data not used for training by default.
  • Admin-configurable retention (down to zero retention).
  • SOC 2, HIPAA BAA available.

The New York Times lawsuit court order. In mid-2025, a federal court order in the NYT v. OpenAI case required OpenAI to retain every user conversation indefinitely for potential discovery, even deleted ones. OpenAI publicly objected. As of April 2026, the order remains in force.

Practical implication. Every ChatGPT conversation you have had since mid-2025, including ones you explicitly deleted, is preserved by OpenAI under the court order. It is not searched, it is not actively indexed, but it exists on OpenAI infrastructure and could be compelled into discovery.

Memory feature. ChatGPT has a "Memory" feature that persists user-specified facts across conversations. This memory is visible to OpenAI staff for abuse investigations and is subject to the same retention rules as conversations.

Third-party plugin and connector access. If you connect ChatGPT to Google Drive, GitHub, calendar, email, or other services, those integrations share data with OpenAI during use. The connector access is audit-logged by OpenAI.

Claude (Anthropic)

Retention policy for free and Pro.

  • Conversations retained for 30 days by default, then automatically deleted.
  • Training use: Anthropic does not train on consumer conversations by default under its 2024 policy. Explicit opt-in is required for training participation.
  • User-deleted conversations removed immediately from the user interface; backend purging within a few days.

Retention policy for Team and Enterprise.

  • Zero Data Retention (ZDR) option available for enterprise contracts.
  • With ZDR, prompts are processed but not stored.
  • SOC 2 Type 2 certified.
  • HIPAA BAA available.

Claude's default position on training data. Anthropic has taken a public stance that consumer conversations are not used for model training. This is the cleanest default among major providers.

Legal compulsion. Anthropic is US-based and subject to US legal process. Valid legal process (a warrant, for content) can compel conversation data that is still within the retention window.

Gemini (Google)

Retention policy for Gemini Free and Gemini Advanced.

  • Conversations retained by default for 18 months.
  • User can change to 3 months, 36 months, or "off" (no retention).
  • Training use: human reviewers see a sample of conversations for quality review. Training on sampled conversations was the default until late 2024; Google then made it opt-in for users who specifically enable "Gemini Apps Activity."

The human reviewer concern. Google has historically used human reviewers on Gemini (and Google Assistant before it) to rate response quality. Conversations selected for review are persisted for up to 3 years beyond the user's general retention setting. Google says reviewers see content without direct user identifiers, but real names, addresses, and other personal details mentioned in conversations still end up in reviewer queues.

Workspace integration. Gemini in Google Workspace (Gmail, Docs, Drive) uses separate retention policies. Workspace data is not used for training.

Data download. Google Takeout includes your Gemini history, which is useful for auditing everything you can see and delete. Court-compelled data can extend beyond what Takeout shows.

Microsoft Copilot

Consumer Copilot.

  • Conversations retained for 6 months by default.
  • Tied to Microsoft account across services.
  • Training use varies — Microsoft has reduced training on consumer data since 2024.

Copilot for Microsoft 365 (Enterprise).

  • Data stays within the Microsoft 365 tenant boundary.
  • No training on customer tenant data.
  • Follows existing Microsoft 365 data residency and retention policies.

GitHub Copilot.

  • Code suggestions are generated from models trained on public GitHub repositories.
  • Copilot Business does not use your code for training by default.
  • For consumer GitHub Copilot, "code snippets sent" telemetry is collected unless disabled.

xAI Grok

Retention policy.

  • Conversations retained until user-deleted.
  • Training use: Grok trains on X/Twitter data and can use Grok conversations for training by default.
  • Grok is linked to your X account, so everything you do on X and everything in your Grok chats sits in the same data lake, accessible to xAI.

Notable. Grok's data governance is the loosest of major providers. Use it for creative tasks, not for anything you don't want Musk's team or court-ordered discovery to eventually surface.

Meta AI (in WhatsApp, Messenger, Instagram)

Retention policy.

  • Conversations retained tied to your Meta identity.
  • Used for training Meta's Llama models by default.
  • Opt-out requires specific navigation (jurisdictions vary).

Notable. Every Meta AI conversation is cross-referenced with your Facebook/Instagram ad profile. The surveillance overlap is total.

What each provider hands to law enforcement

All US-based providers respond to valid legal process:

  • Subpoena: subscriber information (email, signup IP, payment details, phone number).
  • Court order under 18 U.S.C. § 2703(d): non-content records including timestamps, IP logs, and service usage patterns.
  • Warrant: conversation content within retention window.

Provider transparency reports (law enforcement requests):

  • OpenAI: publishes annual transparency reports. Thousands of government requests per year, most complied with partially or in full.
  • Anthropic: smaller volume, publishes transparency reports.
  • Google: extensive. Gemini data subject to same process as Gmail and Drive.
  • Microsoft: extensive. Copilot data subject to Microsoft 365 tenancy rules.

Foreign governments. Data requests from non-US governments vary by provider's data localization and MLAT cooperation agreements.

What happens when the AI provider itself gets breached

This is not hypothetical.

  • OpenAI March 2023 breach. Redis bug exposed other users' conversation titles and partial subscription data to unrelated users. Limited scope but demonstrated risk.
  • DeepSeek January 2025 exposure. A public-facing database with over 1 million DeepSeek chat records, API keys, and infrastructure credentials was discovered by security researchers. DeepSeek locked down access after disclosure.
  • Authentication bypass on AI services throughout 2023-2025. Multiple minor incidents where session isolation failed and users saw each other's conversations.

Assume every AI provider will eventually have a public exposure of conversation data. Ranking providers by likelihood of maintaining isolation: Anthropic > Microsoft Enterprise > Google Workspace > OpenAI Enterprise > OpenAI Consumer > Meta > Grok.

What you should do

1. Use the right tier for the content.

  • Anything work-related with proprietary information: use an enterprise tier with ZDR, or don't use an AI chatbot at all.
  • Personal health, legal, financial: Claude Pro (most conservative default) or ChatGPT Plus with training opt-out.
  • Casual creative work: any provider.
  • Anything illegal: no AI provider. They have compliance teams that will report you.

2. Turn off training participation.

  • ChatGPT: Settings → Data Controls → disable "Improve the model for everyone."
  • Gemini: myactivity.google.com → Gemini Apps Activity → turn off.
  • Grok: Settings → Data sharing → opt out.
  • Claude: training participation is already off by default; nothing to change.

3. Shorten retention when possible.

  • Gemini supports 3 months minimum.
  • Claude is 30 days by default.
  • ChatGPT Enterprise can be set to zero retention.

4. Don't paste secrets.

  • API keys, passwords, unreleased source code, customer data: do not paste. There is no "safe for AI only" paste.
  • If you need AI to analyze sensitive data, use the enterprise ZDR tier or run local models (Llama, Mistral, Qwen via Ollama).

5. Audit your history periodically.

  • Go to each provider's history settings and delete conversations containing anything you would regret seeing in a future breach disclosure.

6. For the most sensitive use cases, run local.

  • Ollama with Llama 3.1 70B, Mistral Large, or Qwen 2.5 runs on a $2000 workstation.
  • Zero network egress. Zero provider retention. You control the deletion.
  • The performance gap to GPT-5 and Claude 4.7 is real but narrowing.
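To make the "zero network egress" point concrete, here is a minimal sketch of talking to a locally running Ollama server over its HTTP API on the default port (localhost:11434). It assumes you have already pulled a model (the model name `llama3.1:70b` is illustrative); the request never leaves the loopback interface, so nothing reaches a third-party provider.

```python
import json
import urllib.request

# Assumes Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.1:70b") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint.

    stream=False requests one complete JSON response instead of
    newline-delimited chunks.
    """
    return {"model": model, "prompt": prompt, "stream": False}


def ask_local(prompt: str, model: str = "llama3.1:70b") -> str:
    """Send the prompt to the local server; traffic stays on loopback."""
    body = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # Ollama returns the generated text under the "response" key.
        return json.loads(resp.read())["response"]
```

Usage is just `ask_local("Summarize this clause: ...")` on a machine with Ollama running. There is no account, no server-side history, and deletion is an `rm` on your own disk.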

The honest summary

AI chatbot privacy is more variable than user experience suggests. Anthropic is the cleanest default for consumer use. OpenAI has the best enterprise tier. Google Gemini's default retention is the longest. Meta AI and Grok should be assumed to have zero privacy.

Your conversations are text files on someone else's computer. Treat them accordingly.

Sources

  1. [OpenAI Privacy Policy](https://openai.com/policies/privacy-policy)
  2. [NYT v. OpenAI Retention Order Coverage](https://arstechnica.com/tech-policy/2025/06/openai-ordered-to-preserve-deleted-chatgpt-data/)
  3. [Anthropic Privacy Policy](https://www.anthropic.com/legal/privacy)
  4. [Anthropic Usage Policies](https://www.anthropic.com/legal/aup)
  5. [Google Gemini Apps Activity](https://myactivity.google.com/product/gemini)
  6. [Google Gemini Privacy Hub](https://support.google.com/gemini/answer/14474820)
  7. [Microsoft Copilot Privacy](https://support.microsoft.com/en-us/topic/copilot-privacy-and-protections-80c9dcfd-e11d-4cd4-9b99-aaf67d19a3a6)
  8. [xAI Privacy Policy](https://x.ai/legal/privacy-policy)
  9. [DeepSeek Exposure Report](https://www.wiz.io/blog/wiz-research-uncovers-exposed-deepseek-database-leak)
  10. [OpenAI March 2023 Outage Incident Report](https://openai.com/blog/march-20-chatgpt-outage)
Tags: ai security · chatgpt · claude · gemini · data privacy · data retention · consumer cybersecurity · compliance · research

Want us to check your AI setup?

Our scanner detects exposures like these, plus dozens more across 38 platforms. Free website check available, no commitment required.