U.S. Court’s Landmark Ruling on AI Training as Fair Use: The Anthropic Copyright Case

6/25/20252 min read

On June 23, 2025, the U.S. District Court for the Northern District of California issued a pivotal ruling in the copyright infringement lawsuit against AI company Anthropic. For the first time, the court explicitly recognized that training artificial intelligence models using copyrighted works constitutes “fair use” under Section 107 of the Copyright Act. However, obtaining training data from pirated websites remains a separate infringement.

Case Background

The plaintiffs—authors Andrea Bacz, Charles Grabber, and Kirk Wallace Johnson—accused Anthropic of using their copyrighted works without authorization to train its AI models. This case centers on the legal boundaries of AI training data copyrights and is a landmark decision closely watched by the industry.

Key Court Findings

Recognition of Fair Use
The court held that using copyrighted works to train AI is a noncommercial, highly transformative use that qualifies as fair use, providing crucial legal support for AI innovation.
Zero Tolerance for Pirated Data
The court emphasized that acquiring training data from pirated or unauthorized websites constitutes independent copyright infringement, allowing copyright holders to pursue separate legal action against such misuse.
Clear Boundary Between Innovation and Infringement
The ruling delineates lawful AI training practices from illegal data acquisition, helping to standardize industry data collection methods.、

Legal Analysis

1. Balancing the Four Factors of Fair Use

The court conducted a detailed assessment under Section 107 of the Copyright Act, focusing on:

Purpose and Character of the Use: AI training is noncommercial and transformative, adding new value beyond the original work.
Nature of the Copyrighted Work: Although protected, the use is permissible within reasonable limits for transformative purposes.
Amount and Substantiality Used: Training involves learning from large data sets rather than verbatim copying.
Effect on the Market: The training use does not significantly harm the market for original works and may even foster innovation.

2. Legal Risks of Pirated Data

Data sourced from pirated websites clearly violates copyright laws.
Entities using such data, knowingly or unknowingly, face separate infringement lawsuits and monetary damages.
Copyright owners can pursue legal claims both against piracy sources and unauthorized users.

3. Implications for the AI Industry

AI developers must implement rigorous compliance frameworks to ensure lawful data sourcing and avoid pirated content.
Fair use is not a blanket exemption; rights clearance and due diligence remain essential.
Ongoing dialogue between technology and law will refine regulations and judicial interpretations.

Outlook

Advancing AI Copyright Compliance Standards
This case sets a judicial precedent globally, encouraging enterprises to standardize copyright-respecting practices in AI training.
Enhancing Copyright Protection Awareness
It motivates copyright holders to intensify anti-piracy efforts and safeguard their intellectual property rights.
Upgrading Industry Norms
AI developers are urged to tighten control over data sources and ensure proper licensing, mitigating legal risks.

Conclusion

The Anthropic case underscores the vital role of law in balancing AI innovation with copyright protection. As AI applications continue to expand, legally and ethically sound use of copyrighted materials will remain a central challenge for the industry and legal community alike.