AI Caught Red-Handed: The New Smoking Gun in Courtroom Battles

A new study shows that AI language models can reproduce books almost verbatim.

AI models were supposed to learn and generate, not imitate. Yet it turns out they can reproduce well-known novels with striking accuracy. Reports from January 2026 indicate that several leading language models can regenerate copyrighted books almost word for word. The finding could decide the future of the industry and cost tech giants billions of dollars.

Groundbreaking research into language models may soon become key evidence in copyright lawsuits against artificial intelligence titans, according to Futurism. This issue goes far beyond technical details. It hits a basic question: does training AI on everything online represent progress, or does it scale up piracy? Courts in the United States and around the world now weigh the answer, while massive fortunes—and the future of innovation—hang in the balance.

Artificial Intelligence or a High-Tech Photocopier?

Since the start of the AI boom, companies like OpenAI, Google, and Anthropic have insisted that their models do not “copy” protected material. They argue that the systems “learn” more like people do—absorbing concepts, styles, and patterns, then generating something new. Their legal teams have built much of the defense on that claim as lawsuits pile up.

Much of the debate also circles around the American doctrine of “fair use.” Sam Altman, CEO of OpenAI, has suggested that the industry could hit a dead end if it cannot use copyrighted data. Critics, however, focus on a simpler test: if users can pull back protected text at will, then the “learning” story starts to look like marketing.

Shocking Results: Language Models Reproduce Entire Novels

New research from scientists at Stanford and Yale gives critics fresh ammunition. The team tested four leading models: GPT-4.1, Gemini 2.5 Pro, Grok 3, and Claude 3.7 Sonnet. The results read like a warning label for the tech industry.

Claude reproduced entire books with accuracy reaching 95.8 percent. Gemini generated large sections of Harry Potter with nearly 77 percent precision. The same Claude model echoed Orwell’s 1984 with over 94 percent accuracy. The researchers argue that determined users can extract large amounts of protected text from these systems. It no longer sounds theoretical; it looks repeatable.
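What might a figure like "95.8 percent" mean in practice? The article does not say how the researchers scored their results, so the sketch below is only one plausible way to express overlap as a percentage: it counts how many words of a reference text reappear, in order, in a model's output. The function name `verbatim_overlap` and the word-level matching are assumptions made for illustration, not the study's metric.

```python
from difflib import SequenceMatcher


def verbatim_overlap(reference: str, generated: str) -> float:
    """Percentage (0-100) of reference words found in in-order matching
    runs shared with the generated text.

    Illustrative only: the study's actual scoring method is not
    described in the article.
    """
    ref_words = reference.split()
    gen_words = generated.split()
    if not ref_words:
        return 0.0
    # Compare word sequences; autojunk=False keeps frequent words in play.
    matcher = SequenceMatcher(None, ref_words, gen_words, autojunk=False)
    # Sum the lengths of all matching runs (the final dummy block has size 0).
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(ref_words)


if __name__ == "__main__":
    original = "the quick brown fox jumps over the lazy dog"
    output = "the quick brown fox leaps over the lazy dog"
    print(f"{verbatim_overlap(original, output):.1f} percent overlap")
```

A score near 100 on a long passage would mean the output tracks the original almost word for word, which is the kind of result the researchers describe.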

Extracting AI “Memories”

To probe what the models retained, researchers used a technique called “Best-of-N.” They asked the model the same question many times, then selected the most likely response.
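As described here, the idea is simple: sample many times, keep the best candidate. The sketch below follows that description only and may differ from the researchers' actual procedure. It talks to no real chatbot. `make_toy_sampler` is a hypothetical stand-in that returns canned text with made-up likelihood scores, and `best_of_n` is an illustrative name.

```python
from itertools import cycle
from typing import Callable, Tuple

# A sample is (text, log-probability). Higher log-probability = more likely.
Sample = Tuple[str, float]


def best_of_n(sample_fn: Callable[[str], Sample], prompt: str, n: int) -> Sample:
    """Ask the same prompt n times and keep the most likely response."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample_fn(prompt) for _ in range(n)]
    return max(candidates, key=lambda candidate: candidate[1])


def make_toy_sampler() -> Callable[[str], Sample]:
    """Hypothetical stand-in for a model: cycles through canned answers
    with invented scores. Not a real API."""
    canned = cycle([
        ("continuation A", -12.5),
        ("continuation B", -3.2),
        ("continuation C", -7.9),
    ])

    def sample(prompt: str) -> Sample:
        return next(canned)

    return sample


if __name__ == "__main__":
    text, score = best_of_n(make_toy_sampler(), "Continue this passage:", n=3)
    print(text, score)
```

The point of the sketch is that more attempts give more chances for a close match to surface, which is why companies describe this as unusual usage and why plaintiffs treat it as a test of capability.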

AI firms often push back in court and call this “unnatural use” or “jailbreaking.” They argue that average users do not interact with chatbots this way. Attorneys may not care. If a model can reconstruct an original text in full, the capability matters—especially when courts assess harm, access, and practical substitution.

Why AI Copying Books Matters in Court

For authors and publishers, these findings could change the tone of the fight. Instead of debating abstract “learning,” they can point to outputs that track protected text closely enough to resemble duplication.

Alex Reisner of The Atlantic offered a blunt assessment: this kind of evidence could trigger enormous legal liability and cost the industry billions in damages. In court, results like these can function as a “smoking gun”—the closest thing to catching the system in the act. Companies already seem to feel the pressure. Some pursue licensing deals with publishers, while engineers work on methods that reduce memorization and limit verbatim output.

Regulators also raise the stakes. In Europe, rules such as the EU AI Act increase demands for transparency around training data and compliance practices, which can amplify legal and public scrutiny.

AI at a Crossroads

Final rulings will decide more than dollars. They will shape how developers build and govern future models. Will the next phase rely on broad licensing and tighter safeguards, or will courts redraw the boundaries of copyright for the AI era? Only time will tell. Still, as more researchers show that AI copies books, judges may gain the concrete evidence they need to settle this historic dispute.



Published by Radosław Różycki


