Anthropic ordered to pay $1.5 billion for training on pirated books

The Anthropic settlement is a turning point in the AI-versus-authors fight, but the larger battle is only beginning

The tentative deal announced this week, in which Anthropic has agreed to pay at least $1.5 billion to resolve a U.S. class action alleging that the company trained its Claude models on pirated books, marks one of the clearest commercial reckonings so far between generative-AI companies and the creators whose work fuels them.

On its face, the settlement is startling: the filing says it covers roughly 500,000 titles and would translate to about $3,000 per work, roughly four times the minimum statutory damages under U.S. copyright law. It also requires Anthropic to destroy the pirated copies it admits to having downloaded for training, while keeping rights to books it legitimately purchased and scanned. For authors and their advocates, the deal sends a potent message. “This settlement sends a strong message to the AI industry that there are serious consequences when they pirate authors’ works to train their AI,” Mary Rasenberger, CEO of the Authors Guild, said in support of the agreement.

Not a clean victory for either side

Yet the legal record is more complicated than a straightforward win for plaintiffs. In June, U.S. District Judge William Alsup concluded that Anthropic’s use of books in the act of training — the moment when text helps a model learn patterns of language — was sufficiently “transformative” to qualify as fair use. But Judge Alsup stopped short of endorsing the company’s broader practice of creating and retaining a permanent digital library of millions of downloaded books, some of them pirated. That, he ruled, falls outside the protective reach of fair use.

The net effect is a split decision: training can be framed as fair use, but building a searchable, permanent trove from pirated sources cannot. The settlement appears to pay for the latter, while leaving open larger questions about how far companies may go in assembling and reusing data in the future.

What this means for AI’s business model

Anthropic’s settlement arrives at a moment when the industry is flush with cash: the company disclosed a $13 billion funding round that values it at about $183 billion. Investors and competitors alike — from OpenAI and Google to Meta and Microsoft — are racing to scale the foundation models that promise new productivity tools, search replacement and creative companions. Those models, however, are thirsty for data.

Anthropic’s experience underlines a dilemma for builders: should they keep harvesting the internet’s messy troves, rely only on licensed or public-domain materials, or pay creators upfront? Each path carries costs and trade-offs. Licensing contracts scale into millions or billions of dollars for the largest models; restricting training data to cleared sources could slow progress or concentrate power among a handful of deep-pocketed firms; and aggressive scraping invites more lawsuits, bad press and, as here, settlement bills running into the billions.

Legal patchwork, global consequences

The case also highlights the legal uncertainty still surrounding AI training. Courts have issued divergent signals. A San Francisco judge recently sided with Meta, finding that the company’s alleged use of authors’ work to train its Llama models was “transformative” enough to be fair use. Other suits, including a fresh complaint accusing Apple of using pirated books to train features branded as “Apple Intelligence,” are working their way through the system.

Outside the U.S., the picture is even more fragmented. Copyright laws vary: some jurisdictions place greater emphasis on authors’ economic rights, others on public interest or innovation. That regulatory patchwork creates incentives for companies to adopt broad, catch-all strategies that may pass legal muster in one place but founder in another. It also raises thorny cross-border enforcement questions: how will settlements negotiated in U.S. courts shape the conduct of companies operating globally?

For writers, a mixture of relief and lingering unease

For many authors, the settlement brings both vindication and unease. Plaintiffs Andrea Bartz, Charles Graeber and Kirk Wallace Johnson — whose suit helped precipitate this settlement — argued that their books were copied without permission, credit or compensation. “This landmark settlement far surpasses any other known copyright recovery,” plaintiffs’ attorney Justin Nelson said, framing the deal as precedent-setting for the AI era.

At the same time, judges’ recognition that training can be “transformative” leaves open a pathway for companies to argue that some uses of copyrighted material are lawful. Creators face an uncertain horizon: will they negotiate licensing frameworks that deliver ongoing revenue, or will they find their work turned into training fodder that courts deem acceptable? And even if licensing emerges as the norm, who will pay — a handful of giant tech firms, the end-user, or a mixture passed down through subscription fees?

Broader questions for society

This episode asks bigger questions than any single settlement can answer. How should democracies balance the economic rights of creators against the social benefits of broadly capable AI systems? Could mass licensing that compensates authors fairly coexist with open-science and open-source movements that prize freely shared data? What protections should exist for less-established creators — the independent novelist, the academic scholar, the small-press poet — who lack the clout to litigate or negotiate?

And there is a cultural angle as well. Books are not merely data points; they are labor, cultures, ideas and livelihoods. The sight of an algorithm regurgitating phrases learned from a stranger’s novel can feel like a violation of an implicit social compact. As AI moves from novelty to infrastructure, societies must confront whether old intellectual property frameworks suffice or whether new norms and laws are needed.

The Anthropic settlement does not settle those debates. It does, however, change the calculus for companies and creators alike: in an industry awash with capital, copyright enforcement can still bite, and the costs of ignoring authors are real. The next chapters will be written in courtrooms, legislatures and boardrooms around the world. Will policymakers seize the moment to clarify rules, or will companies and courts continue to establish precedents incrementally — and unevenly? The answer will shape not just markets and lawsuits, but the very literary and cultural ecosystem that feeds our increasingly automated future.

By Ali Musa
Axadle Times international–Monitoring.
