AI Agents Are Leaking Company Secrets

Your AI coding assistant may be sending your API keys, source code, and customer data to external servers right now — without asking. Here's what we found.

By They Didn't Ask
Your AI coding assistant just uploaded your entire codebase to a remote server. Your API keys are in someone else's training data. Your customer's PII is now part of a language model's weights. And you don't know it yet.

AI-powered coding tools, the same assistants that engineers use to write code faster, are exposing proprietary source code, secrets, and sensitive data through access control failures, public-by-default settings, and context window uploads to remote LLM providers. The breaches aren't always caused by hackers. They're caused by the tools companies installed voluntarily.

## The Lovable Incident

Lovable, a Swedish AI-powered app builder with millions of users, was the first major vibe-coding tool caught in the open. On April 20, 2026, a security researcher publicly reported that data within public Lovable projects, including source code and AI chat histories, could be accessed by any authenticated user.

The cause: a backend regression in February 2026 had silently re-enabled public access to chat history and source code on public projects, undoing access controls Lovable had deliberately put in place between March and November 2025. Projects on the free tier had always been public by default. Many users understood "public" to mean their published app was visible on the web, not that someone could read through their entire build conversation, including pasted database credentials and API keys.

Lovable's initial response made things worse. When the researcher reported the issue through HackerOne in February 2026, it was closed without escalation because Lovable's own documentation still described the access as intended behavior. After the researcher went public, Lovable shipped a fix within two hours. Their blog post acknowledged the failure: "We should have led with what happened, who was affected, and what we were doing to fix it."
A broader investigation by cybersecurity firm RedAccess, reported by Axios in May 2026, found 380,000 publicly accessible assets built with Lovable, Base44, Replit, and Netlify, including roughly 5,000 containing sensitive corporate data like patient records, financial information, and internal documents from Fortune 500 companies.

## AI Coding Agents and Secrets Exposure

The Lovable incident exposed an access control problem. But AI coding agents create a different kind of leak: they send your code to remote servers as part of their normal operation, and the data handling is often opaque.

GitGuardian's 2026 State of Secrets Sprawl report found that Claude Code-assisted commits leaked secrets at a rate of approximately 3.2%, roughly double the GitHub-wide baseline. AI-assisted coding has democratized software development, but less experienced developers may lack security awareness and can ignore AI warnings or explicitly prompt tools to include sensitive information.

The same report documented an 81% year-over-year increase in AI service credential leaks (to 1,275,105), and identified 24,008 unique secrets exposed in MCP (Model Context Protocol) configuration files alone, where documentation often recommends placing credentials directly in config files rather than using safer authentication patterns.

None of this required a breach. It required only that the tools work as designed, and that nobody looked too closely at what "as designed" actually meant.

## How AI Coding Agents Actually Work

To understand the leak, you need to understand the mechanism:

The Context Window: Modern AI coding agents don't just see the file you're editing. They see:

- Your entire project structure
- All open files in your editor
- Your terminal output
- Your clipboard history
- Your git diff and commit history
- Your environment variables

This context is sent to the LLM provider's servers to generate code suggestions.
The larger the context window, the more data the AI sees, and the more data leaves your machine.

The Default Setting: Most AI coding tools default to full context mode, which maximizes the AI's accuracy by sending everything. The alternative, a limited context or local only mode, exists in many tools but is buried in preferences and never presented during onboarding.

The Retention Policy: What happens to your data after the AI processes it varies by provider:

- Some providers delete context immediately after generating a response
- Some retain context for 30 days for quality improvement
- Some use context to train future models
- Almost none provide audit logs of what was accessed

## Why Companies Don't Know

The AI agent security gap exists because of overlapping blind spots:

The Developer Blind Spot: Developers install AI coding tools as browser extensions, IDE plugins, or CLI helpers. These installations often bypass corporate security review because:

- They're productivity tools, not security tools
- Individual developers install them without IT approval
- They use OAuth flows that appear legitimate
- They don't trigger traditional DLP (Data Loss Prevention) rules

The DLP Blind Spot: Traditional DLP tools look for:

- Email attachments with sensitive files
- USB drive extractions
- Unauthorized cloud storage uploads

They don't typically look for:

- Data embedded in LLM prompts
- Context window uploads via WebSocket
- AI tool API calls with source code payloads

The Audit Blind Spot: Most companies have no way to answer basic questions about AI tool usage:

- Which developers are using AI coding assistants?
- What data have those tools accessed?
- Where has that data been sent?
- Is it being retained, logged, or used for training?
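One way to narrow the DLP blind spot described above is to check text before it is placed in a prompt. The sketch below is a minimal illustration, not a real DLP product: `redact_secrets` and its three patterns are hypothetical names and rules chosen for this example, and production scanners rely on far larger rule sets. It assumes you control the code that assembles the context sent to an LLM.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
PATTERNS = [
    # AWS access key IDs have a well-known shape: "AKIA" plus 16 characters.
    r"\bAKIA[0-9A-Z]{16}\b",
    # Header line of a PEM-encoded private key.
    r"-----BEGIN [A-Z ]*PRIVATE KEY-----",
    # Assignments whose name mentions api key / secret / token / password,
    # followed by a value of at least 8 non-space characters.
    r"\b[\w-]*(?i:api[_-]?key|secret|token|password)[\w-]*\s*[:=]\s*['\"]?[^\s'\"]{8,}",
]

# One combined expression so each span of text is replaced at most once.
SECRET_RE = re.compile("|".join(PATTERNS))


def redact_secrets(text):
    """Return (redacted_text, match_count) for a chunk of prompt context."""
    return SECRET_RE.subn("[REDACTED]", text)
```

Run on a snippet that contains a password assignment and an AWS-style key ID, the function replaces both with `[REDACTED]` and reports two matches, leaving ordinary code untouched. Pattern matching like this misses secrets with unusual shapes, so it complements secret rotation and does not replace it.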
## The GitGuardian 2026 Report

GitGuardian's annual State of Secrets Sprawl report, published March 2026, found that AI coding tools have accelerated secrets exposure to record levels:

| Finding | Statistic |
|---------|-----------|
| New secrets detected on public GitHub | ~29 million (up 34% YoY) |
| AI service credential leaks | 1,275,105 (up 81% YoY) |
| Claude Code secret leak rate | ~3.2% (2x the baseline) |
| Secrets in MCP configuration files | 24,008 unique secrets |
| Valid secrets from 2022 still unrevoked in 2026 | 64% |

The report concludes: AI coding agents have created a new category of insider risk, one that traditional security tools are not designed to detect. And the industry is failing to remediate: nearly two-thirds (64%) of valid secrets from 2022 remain unrevoked years later.

## What Gets Leaked

Based on disclosed incidents and security research, AI coding agents routinely access and transmit:

Source Code:

- Entire project repositories
- Proprietary algorithms and business logic
- Internal API documentation
- Code comments containing architecture decisions

Secrets:

- API keys and authentication tokens
- Database connection strings
- OAuth client secrets
- Encryption keys and certificates
- Cloud provider credentials

Customer Data:

- Database schemas and sample records
- PII embedded in test fixtures
- Customer analytics and behavior data
- Financial records in development databases

Business Intelligence:

- Product roadmaps and strategy documents
- Revenue and growth metrics
- Acquisition targets (from code comments)
- Security vulnerabilities (from TODO comments)

## What Should Companies Do

The AI agent security gap requires new practices, not just new tools:

Immediate Actions:

- Inventory AI tool usage: Survey developers about which AI coding tools they use, how they're configured, and what data they've accessed.
- Audit DLP rules: Verify that data loss prevention tools can detect data exfiltration via LLM APIs and WebSocket connections.
- Review AI tool settings: Many tools have enterprise or team modes with different data handling.
  Default to the most restrictive setting.
- Rotate exposed secrets: If AI tools have had access to production systems, assume secrets are compromised and rotate them.

Policy Changes:

- Require approval for AI tools: Treat AI coding assistants as security software requiring security review before installation.
- Default to local processing: Configure tools to use on-device models where available rather than cloud-based LLMs.
- Establish data boundaries: Define which systems, repositories, and data categories are off-limits to AI tools.
- Create audit requirements: Mandate that AI tool providers offer audit logs of accessed data and retention policies.

## What Developers Can Do

Individual developers can reduce AI agent exposure:

- Read the settings: Most AI coding tools have data sharing toggles. Find them. Turn off everything you don't need.
- Use .gitignore for secrets: Never commit API keys, even in private repos. AI tools scan all tracked files.
- Separate work environments: Don't use AI tools in terminals or editors with access to production systems.
- Ask your security team: Before installing any AI tool, ask whether it's approved and how data is handled.

Use our Browser Identity Audit to see what your browser exposes. Check Data Broker Opt-Out to remove your information from data broker databases.
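The .gitignore advice above is easier to follow with a quick check of what git is already tracking. The sketch below is one possible approach under stated assumptions: `RISKY_NAME_PATTERNS`, `flag_risky_paths`, and `tracked_files` are hypothetical names for this example, the pattern list is a starting point and not an exhaustive rule set, and `tracked_files` relies on the standard `git ls-files` command being available on the PATH. It only covers tracked files and says nothing about what an agent may read directly from disk.

```python
import fnmatch
import subprocess
from pathlib import PurePosixPath

# Hypothetical starting list of file names that commonly hold credentials.
RISKY_NAME_PATTERNS = [".env", ".env.*", "*.pem", "*.key", "id_rsa", "credentials.json"]


def flag_risky_paths(paths):
    """Return the paths whose file name matches one of the risky patterns."""
    flagged = []
    for path in paths:
        name = PurePosixPath(path).name  # compare against the file name only
        if any(fnmatch.fnmatchcase(name, pat) for pat in RISKY_NAME_PATTERNS):
            flagged.append(path)
    return flagged


def tracked_files(repo_dir="."):
    """List the files git tracks in repo_dir (raises if it is not a repository)."""
    result = subprocess.run(
        ["git", "ls-files"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Calling `flag_risky_paths(tracked_files())` inside a repository returns any tracked file with a name such as `.env` or `server.pem`. Those are candidates to untrack, add to .gitignore, and treat as exposed when deciding what to rotate.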
## The Regulatory Response

In response to growing AI agent leaks, regulators are beginning to act:

- The FTC has been pursuing AI-related enforcement actions under Section 5 of the FTC Act, including "Operation AI Comply" (2024) targeting deceptive AI claims, and has signaled increased scrutiny of AI tools that collect data without adequate consent
- The EU AI Act's transparency obligations take effect August 2, 2026, requiring AI system operators to inform users when they're interacting with AI and to label AI-generated content, with fines up to 3% of worldwide annual turnover for non-compliance
- Several states are considering legislation requiring explicit consent before proprietary code can be used to train AI models

But regulation moves slowly. AI tools move fast. The gap between them is where your data lives right now.

## The Bottom Line

AI coding agents are not malicious. They're designed to help. But their design assumes that your entire codebase is appropriate training material, and their default settings assume you don't care where that code goes.

The companies discovering AI agent leaks in 2026 aren't being targeted by hackers. They're being betrayed by their own productivity tools.

They didn't ask if uploading your source code to a remote server was OK. They didn't ask if logging your API keys was acceptable. They built the feature, enabled it by default, and hoped nobody would look too closely.

Nobody asked you. The AI was asked instead.

---

_This article draws on incidents reported by Axios, TechCrunch, 404 Media, Ars Technica, Lovable's own incident response, and GitGuardian's 2026 State of Secrets Sprawl report._

_Updated May 14, 2026: Corrected the description of the Lovable incident (was fabricated as a "context window analysis" data upload; actual incident was an access control regression). Removed the fabricated "Passions" breach (no such platform exists).
Replaced fabricated GitGuardian statistics (4.7M secrets, 340% increase) with actual figures from the 2026 report (~29M secrets, 34% YoY increase). Corrected FTC claim (no specific March 2026 announcement exists; replaced with documented FTC enforcement actions). Clarified EU AI Act transparency requirements (August 2026 date is correct, but obligations cover user notification and AI content labeling, not specifically training data disclosure). See our corrections policy._