AI has opened a new frontier of law and technology, where policy and legal frameworks are struggling to keep pace with the latest innovations. One key area that has yet to be definitively settled in the courts is fair use in the training of AI models. Copyright law provides an exception that permits third parties to create derivative works without permission of the copyright owner: fair use.
Guided largely by public policy considerations, the fair use doctrine expressly permits use of copyrighted material without the owner’s permission for purposes such as news reporting, comment and criticism, or scholarship (17 U.S.C. § 107), as well as for parody or satire. 1 A common inquiry in fair use analysis is whether the derivative work is transformative, that is, whether it uses the original work for a new purpose or in a different manner.
Extending that inquiry to AI model training, a key question at the heart of many AI copyright lawsuits is whether training an AI model on copyrighted works is sufficiently transformative of those works to constitute fair use, and is therefore a permitted use rather than infringement.
Though these issues have not been settled, AI companies continue to train and develop their models, and parties continue to sue those companies for infringement. One of these cases, involving Anthropic, maker of the AI chatbot Claude, recently reached a settlement. 2
Several authors sued Anthropic for copyright infringement, and the parties ultimately agreed to settle for about $1.5 billion.3 More recently, however, Judge Alsup rejected the proposed settlement, directing counsel to further consider important factual questions bearing on how authors (i.e., class members) could appropriately partake in the settlement before preliminary approval.4 Here are three takeaways from that case and settlement as they relate to copyright and AI:
1. Judges are willing and able to both understand and rule upon fair use in training AI models
AI training data has been at the heart of the issues surrounding copyright infringement and AI for some time now. How potentially infringing material is copied, stored, and maintained, and whether a model has access to copies of the work or to transformed data based upon the work, are all elements to consider when assessing how an AI model was trained on registered works, and whether that training might support a claim of infringement beyond the fair use exception.5 Assessing whether an AI model retains improper copies of a work is an area where DisputeSoft’s expert witnesses can help litigators get the answers they need to try their cases.
In this specific case, there were essentially two “buckets” of data: one bucket of legitimately acquired works “that were purchased, scanned, and used in training LLMs,” and another bucket of improperly acquired “pirated” data.6
Judge William Alsup succinctly summarized his findings. For the legitimately acquired works, he ruled that “the use of the books at issue to train Claude and its precursors was exceedingly transformative and was a fair use under Section 107 of the Copyright Act. And, the digitization of the books purchased in print form by Anthropic was also a fair use but not for the same reason as applies to the training copies.
Instead, it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies.”7
Essentially, he deemed training the AI model on legitimately acquired copies of registered works to create the chatbot Claude “exceedingly transformative” and “a fair use.”8 This ruling gives some precedent and support to a position that AI tech companies have held for many years: that using legitimately acquired data to train their models will be considered fair use in the eyes of the law.
2. Whether training data was acquired legitimately is and will be a key issue moving forward
Judge Alsup did, however, take issue with the second bucket of “pirated” works. These copies were not legitimately acquired, and it is on these grounds that the judge allowed the case to proceed. He summarized the issue and ruled that “Anthropic had no entitlement to use pirated copies for its central library. Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy.”
In short, the ends do not justify the means, even if fair use is in play. The ruling draws a line between purchasing a book and then using it to train a model, and stealing a book. The Judge is saying that the potential fair use of a general-purpose library does not excuse the improper acquisition.
3. Even if these issues aren’t settled in court, AI tech companies are willing to settle
Anthropic is attempting to settle this case for a whopping $1.5 billion, essentially a $3,000-per-book license for each of the 500,000 books at issue in this lawsuit that it used as AI training data.9 In the “move fast and break things” tech startup world, there are real risks when tech companies improperly use or acquire registered works to train their models in the race to be first to market with the next big AI application.
It is possible, and even likely, that we will see more settlements in the future from companies that made mistakes in their rush to get a product out the door. If anything, the fact that Anthropic is still going strong, and can afford to pay such a huge settlement, indicates that even with the risks and harm, it was still lucrative enough to ask for forgiveness instead of permission, which is a somewhat chilling lesson.
When it comes to complex litigation, technology due diligence, and AI and copyright infringement, DisputeSoft is equipped with the skills and ability to help you and your litigation team make sense of fast-moving technology. We understand the issues at the intersection of AI, copyright, and code.
1. For more information from DisputeSoft about “fair use”, see https://www.disputesoft.com/artificial-intelligence-systems-present-copyright-infringement-concerns-and-challenges/
2. https://www.anthropic.com/company
3. https://www.npr.org/2025/09/05/nx-s1-5529404/anthropic-settlement-authors-copyright-ai
4. https://news.bloomberglaw.com/ip-law/anthropic-judge-blasts-copyright-pact-as-nowhere-close-to-done
5. For more information from DisputeSoft about “fair use”, see https://www.disputesoft.com/artificial-intelligence-systems-present-copyright-infringement-concerns-and-challenges/
6. https://media.npr.org/assets/artslife/arts/2025/order.pdf, p. 9.
7. https://media.npr.org/assets/artslife/arts/2025/order.pdf, p. 9.
8. https://news.bloomberglaw.com/ip-law/anthropic-judge-blasts-copyright-pact-as-nowhere-close-to-done
9. https://www.npr.org/2025/09/05/nx-s1-5529404/anthropic-settlement-authors-copyright-ai

