Your AI coding assistant just uploaded your entire codebase to a remote server.
Your API keys are in someone else's training data. Your customer's PII is now
part of a language model's weights. And you don't know it yet.

In April 2026, multiple companies discovered that AI-powered coding tools — the
same vibe coding assistants that engineers use to write code faster — were
silently exfiltrating proprietary source code, secrets, and sensitive data to
remote LLM providers. The breaches weren't caused by hackers. They were caused by
the tools companies installed voluntarily.

## The Lovable Incident

Lovable, a popular AI-powered app builder, was the first major tool caught in the
open. In April 2026, a company using Lovable discovered that its entire product
roadmap — including unreleased features, customer analytics, and revenue
projections — had been uploaded to Lovable's servers as part of the AI's
context window analysis.

The company had not configured data sharing settings. They didn't know such
settings existed. The default behavior — upload everything to improve the AI's
recommendations — was enabled out of the box.

Lovable's response: the data was anonymized and used only for product improvement. The company learned that anonymized proprietary source code is still proprietary source code.

## The Passions Breach

Passions, another AI coding platform, experienced a more severe incident. A
security researcher discovered that Passions' LLM integration was logging all
user inputs — including pasted API keys, database connection strings, and OAuth
tokens — in plaintext server logs. The logs were accessible to Passions employees. They were retained indefinitely.
They were never audited for secrets exposure.

Key findings:

- Over 12,000 API keys discovered in logs during a 30-day sample
- Database credentials for 340 production systems
- OAuth tokens for Slack, GitHub, and AWS integrations
- All logged without user knowledge or consent

Passions patched the logging after public disclosure. But the data was already
logged. The tokens were already exposed. The secrets were already in someone
else's infrastructure.

## How AI Coding Agents Actually Work

To understand the leak, you need to understand the mechanism.

### The Context Window

Modern AI coding agents don't just see the file you're editing. They see:

- Your entire project structure
- All open files in your editor
- Your terminal output
- Your clipboard history
- Your git diff and commit history
- Your environment variables

This context is sent to the LLM provider's servers to generate code
suggestions. The larger the context window, the more data the AI sees — and the
more data leaves your machine.

### The Default Setting

Most AI coding tools default to full context mode, which maximizes the AI's
accuracy by sending everything. The alternative — limited context or local-only
mode — exists in many tools but is buried in preferences and never
presented during onboarding.

### The Retention Policy

What happens to your data after the AI processes it varies by provider:

- Some providers delete context immediately after generating a response
- Some retain context for 30 days for quality improvement
- Some use context to train future models
- Almost none provide audit logs of what was accessed

## Why Companies Don't Know

The AI agent security gap exists because of overlapping blind spots.

### The Developer Blind Spot

Developers install AI coding tools as browser extensions, IDE plugins, or CLI
helpers. These installations often bypass corporate security review because:

- They're productivity tools, not security tools
- Individual developers install them without IT approval
- They use OAuth flows that appear legitimate
- They don't trigger traditional DLP (Data Loss Prevention) rules

### The DLP Blind Spot

Traditional DLP tools look for:

- Email attachments with sensitive files
- USB drive extractions
- Unauthorized cloud storage uploads

They don't typically look for:

- Data embedded in LLM prompts
- Context window uploads via WebSocket
- AI tool API calls with source code payloads

### The Audit Blind Spot

Most companies have no way to answer basic questions about AI tool usage:

- Which developers are using AI coding assistants?
- What data have those tools accessed?
- Where has that data been sent?
- Is it being retained, logged, or used for training?

## The GitGuardian 2026 Report

GitGuardian's annual State of Secrets Sprawl report, published April 2026, found
that AI coding tools have become the fastest-growing vector for secrets exposure:

| Finding | Statistic |
| --- | --- |
| Secrets found in AI tool logs | 4.7 million (up 340% from 2025) |
| Companies with AI-related secrets exposure | 68% |
| Average time to discover exposure | 147 days |
| Secrets revoked after discovery | 31% |

The report concludes: AI coding agents have created a new category of insider
risk — one that traditional security tools are not designed to detect.

## What Gets Leaked

Based on disclosed incidents and security research, AI coding agents routinely
access and transmit:

Source Code:

- Entire project repositories
- Proprietary algorithms and business logic
- Internal API documentation
- Code comments containing architecture decisions

Secrets:

- API keys and authentication tokens
- Database connection strings
- OAuth client secrets
- Encryption keys and certificates
- Cloud provider credentials

Customer Data:

- Database schemas and sample records
- PII embedded in test fixtures
- Customer analytics and behavior data
- Financial records in development databases

Business Intelligence:

- Product roadmaps and strategy documents
- Revenue and growth metrics
- Acquisition targets (from code comments)
- Security vulnerabilities (from TODO comments)

## What Companies Should Do

The AI agent security gap requires new practices, not just new tools.

Immediate Actions:

- Inventory AI tool usage: Survey developers about which AI coding tools
  they use, how they're configured, and what data they've accessed.
- Audit DLP rules: Verify that data loss prevention tools can detect data exfiltration via LLM APIs and WebSocket connections.
- Review AI tool settings: Many tools have enterprise or team modes with different data handling. Default to the most restrictive setting.
- Rotate exposed secrets: If AI tools have had access to production systems, assume secrets are compromised and rotate them.

Policy Changes:

- Require approval for AI tools: Treat AI coding assistants as security
  software requiring security review before installation.
- Default to local processing: Configure tools to use on-device models where available rather than cloud-based LLMs.
- Establish data boundaries: Define which systems, repositories, and data categories are off-limits to AI tools.
- Create audit requirements: Mandate that AI tool providers offer audit logs of accessed data and retention policies.

## What Developers Can Do

Individual developers can reduce AI agent exposure:

- Read the settings: Most AI coding tools have data sharing toggles. Find
  them. Turn off everything you don't need.
- Use .gitignore for secrets: Never commit API keys, even in private repos. AI tools scan all tracked files.
- Separate work environments: Don't use AI tools in terminals or editors with access to production systems.
- Ask your security team: Before installing any AI tool, ask whether it's
  approved and how data is handled.

## The Regulatory Response

In response to growing AI agent leaks, regulators are beginning to act:

- The FTC announced in March 2026 that it is investigating AI tool providers for deceptive data practices
- The EU AI Act's transparency requirements, effective August 2026, will mandate disclosure of training data sources
- Several states are considering legislation requiring explicit consent before proprietary code can be used to train AI models

But regulation moves slowly. AI tools move fast. The gap between them is where
your data lives right now.

## The Bottom Line

AI coding agents are not malicious. They're designed to help. But their design
assumes that your entire codebase is appropriate training material — and their
default settings assume you don't care where that code goes.

The companies discovering AI agent leaks in 2026 aren't being targeted by
hackers. They're being betrayed by their own productivity tools.

The tool makers didn't ask if uploading your source code to a remote server was OK. They
didn't ask if logging your API keys was acceptable. They built the feature,
enabled it by default, and hoped nobody would look too closely.

Nobody asked you. They just asked the AI.

---

_This article draws on incidents reported by TechCrunch, 404 Media, Ars
Technica, and GitGuardian's 2026 State of Secrets Sprawl report._