**The Defendants:** OpenAI, Google, Meta, and Microsoft — the tech giants accused of training AI on stolen content without permission or compensation.

On March 16, 2026, Encyclopedia Britannica sued OpenAI. Let that sink in. The oldest English-language encyclopedia still in print — a reference work that has documented human knowledge since 1768 — had to sue a technology company for stealing its content.

OpenAI didn't license Britannica's articles. Didn't ask permission. Didn't offer compensation. They just... took them. Fed 250+ years of curated human knowledge into a language model and called it "training."

This is the largest intellectual property theft in human history. And it's happening right now, to millions of creators, while the companies doing it are valued at hundreds of billions of dollars. Nobody asked if they could take your work.

## The Scale of the Heist

The scope of AI training data theft is almost incomprehensible:

- **Books:** OpenAI, Google, and Meta have trained on datasets containing millions of copyrighted books — scraped from piracy sites, library digitization projects, and shadow libraries
- **News:** Major AI models were trained on articles from the New York Times, Reuters, Associated Press, and thousands of other outlets — without licensing
- **Art:** Image generators like Midjourney, DALL-E, and Stable Diffusion were trained on billions of copyrighted images scraped from the internet
- **Code:** GitHub Copilot was trained on billions of lines of open-source and proprietary code — often stripping attribution and license terms
- **Academic papers:** Research papers, textbooks, and educational materials were ingested without permission from authors or publishers

The companies didn't ask. They didn't license. They didn't attribute. They didn't compensate. They just took everything.

## The Fair Use Defense Falls

For years, AI companies hid behind the legal doctrine of "fair use" — arguing that training AI models on copyrighted works is transformative and therefore permissible. In 2026, courts started dismantling that defense.

### Thomson Reuters v. Ross Intelligence: The Precedent

The pivotal ruling came in Thomson Reuters v. Ross Intelligence. The court
ruled that AI training on copyrighted works is NOT fair use when:

- The output competes with the original work
- The training was done without permission
- The use is commercial in nature
- The market for the original work is harmed

This ruling sent shockwaves through the AI industry. If applied broadly — and
legal experts expect it will be — it means every major AI company has been training on copyrighted material illegally.

> "The fair use defense was designed for criticism, commentary, and education — not for billion-dollar companies to ingest the entire corpus of human creativity and sell it back to us." — Copyright attorney (paraphrased for legal protection)

### Bloomberg's Motion to Dismiss, Denied

In a related case, Bloomberg attempted to dismiss copyright claims related to its AI training practices. The court denied the motion to dismiss, allowing the case to proceed to trial.

This is significant because it means the court found sufficient evidence that Bloomberg's AI training practices could constitute copyright infringement. The legal shield is cracking.

## The Lawsuit Wave

The Encyclopedia Britannica lawsuit is just the latest in a cascade of legal actions:

**Authors and Publishers:**

- The Authors Guild has organized hundreds of authors in claims against OpenAI, Google, and Meta
- Individual authors including George R.R. Martin, John Grisham, and Jodi Picoult have filed suit
- Major publishers including HarperCollins, Penguin Random House, and Hachette have joined litigation

**News Organizations:**

- The New York Times' landmark lawsuit against Microsoft and OpenAI continues
- The Intercept, Raw Story, and other outlets have filed separate claims
- Reuters and AP are pursuing licensing negotiations while preserving legal options

**Visual Artists:**

- A class action lawsuit by visual artists against Stability AI, Midjourney, and DeviantArt is proceeding
- Getty Images has sued Stability AI for training on 12 million copyrighted photographs
- Individual illustrators and photographers have filed hundreds of claims

**YouTubers and Content Creators:**

- A YouTuber has filed a class action against Runway AI for scraping YouTube videos without permission
- Music labels including Universal, Sony, and Warner have sued the AI music generators Suno and Udio

The legal reckoning is here. The question is whether the courts will make it stick.

## The Compensation Problem

Even if courts rule against AI companies, the compensation question remains: how
do you pay millions of creators for work that was already stolen?

The math is brutal:

- OpenAI trained on datasets containing an estimated 5+ million books
- Stable Diffusion trained on 5+ billion images
- Google's training data includes virtually the entire indexed web

If each book were licensed at even $1,000 — a fraction of its market value — that's $5 billion in unpaid licensing fees for books alone. Add news articles, images, code, academic papers, and the total liability could exceed $100 billion.

The AI companies' market valuations are built on this stolen foundation. OpenAI is valued at $300+ billion. Google's AI division contributes significantly to its $2 trillion market cap. These valuations assume the training data was free.

It wasn't free. It was stolen.

## The Creator Impact

Behind the legal abstractions are real people:

- Authors who spent years writing books that were ingested in seconds
- Journalists whose reporting is now regurgitated by AI without attribution
- Artists whose distinctive styles are replicated by machines trained on their work
- Programmers whose code is reproduced without license terms
- Musicians whose compositions are used to train AI music generators

These creators didn't consent. They weren't compensated. And now AI systems compete directly with their work, using their own creations as the training data.

The irony is savage: the more successful a creator was, the more valuable their work was for AI training, and the more they stand to lose from AI competition.

## The "Move Fast and Break Things" Defense

AI companies have adopted a familiar Silicon Valley strategy: take first, ask
questions later (or never).

The implicit argument: AI is too important to be slowed down by copyright law. The benefits to humanity outweigh the rights of individual creators.

This is the same argument every monopolist has made throughout history. Railroads were too important for land rights. Oil was too important for environmental regulations. Social media was too important for privacy laws.

Every time, the "greater good" argument was used to justify the concentration of wealth and power at the expense of individuals.

## Push Back

- **Support creators directly:** Buy books, subscribe to news outlets, commission artists
- **Use opt-out tools:** Many AI companies now offer opt-out mechanisms — use them, even if they're inadequate
- **Support legislation:** Contact your representatives about AI copyright reform
- **Demand transparency:** Require AI companies to disclose their training data sources
- **Boycott when appropriate:** If a company won't disclose its training practices, consider alternatives
- **Remember:** Every piece of content you create has value. Don't let anyone tell you otherwise.

The AI companies didn't ask your permission to take your work. They didn't ask the authors, the artists, the journalists, or the programmers.

They stole your words. Now they're selling them back to you.