AI Copyright Issues in 2026: Ethics & Best Practices

Updated March 7, 2025

Quick Answer

AI copyright law in 2026 is being shaped by many pending lawsuits worldwide. Training on copyrighted works without a licence may be permitted under exceptions whose scope varies by jurisdiction (US fair use, the EU text and data mining exceptions, UK Section 29A), but outputs that are substantially similar to protected works can still infringe.

Training is NOT automatically infringement — and it is NOT automatically fair use
Generative outputs can infringe if substantially similar to protected works
The US Copyright Office holds that purely AI-generated works are not copyrightable because they lack human authorship

What Is the AI Copyright Landscape?

The open questions fall into three layers:

Input (is collecting and using training data lawful?)
Model (can a trained model itself infringe?)
Output (is a generated work a derivative?)

Key authorities are the US Copyright Office (Reports on Copyright and AI: Part 1, July 2024; Part 2, January 2025; Part 3, May 2025), the UK IPO, Articles 3 and 4 of the EU Copyright Directive (the TDM exceptions), and Article 30-4 of Japan's Copyright Act.

Key Details / Requirements

Major Lawsuits (Selected)

Case	Plaintiffs	Defendants	Filed	Core Issue
New York Times v. OpenAI & Microsoft	NYT	OpenAI, Microsoft	Dec 2023	Training and verbatim memorisation
Andersen v. Stability AI	Artists	Stability AI, Midjourney, DeviantArt	2023	Training on artworks
Getty Images v. Stability AI (US + UK)	Getty	Stability AI	2023	Training on Getty library
Authors Guild v. OpenAI	Authors	OpenAI	2023	Novels in training data
Concord Music v. Anthropic	Publishers	Anthropic	2023	Song lyrics
Bartz v. Anthropic	Authors	Anthropic	2024	Books in training (settled September 2025 for USD 1.5B)

Global TDM and Fair-Use Regimes

Jurisdiction	Rule	Opt-Out Allowed?
USA	Fair use (17 USC 107)	N/A
EU	Copyright Directive Art. 3 (research) and Art. 4 (commercial)	Yes for Art. 4 via machine-readable opt-out
UK	Sec 29A CDPA (non-commercial TDM only)	N/A
Japan	Art. 30-4 Copyright Act (non-enjoyment exception)	No
Singapore	Computational Data Analysis (Sec 244 Copyright Act 2021)	No
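
The Art. 4 opt-out in the table above only works if it is expressed in a machine-readable form that a crawler actually checks. The following is a minimal Python sketch of such a check, covering two common signals: a robots.txt rule and the `tdm-reservation` HTTP header used by the TDM Reservation Protocol. The function names, the `ExampleTrainingBot` user agent, and the policy of treating either signal as an opt-out are assumptions made for illustration; they are not a statement of what any statute requires.

```python
from urllib.robotparser import RobotFileParser


def robots_allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def tdm_reserved(headers: dict[str, str]) -> bool:
    """Return True if the response headers carry 'tdm-reservation: 1'."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("tdm-reservation") == "1"


def may_collect(robots_txt: str, user_agent: str, url: str,
                headers: dict[str, str]) -> bool:
    """Hypothetical policy: collect only if neither signal opts out."""
    return robots_allows(robots_txt, user_agent, url) and not tdm_reserved(headers)


if __name__ == "__main__":
    # Illustrative robots.txt that blocks one (made-up) training crawler.
    robots = "User-agent: ExampleTrainingBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
    page = "https://example.com/article"
    print(may_collect(robots, "ExampleTrainingBot", page, {}))
    print(may_collect(robots, "OtherBot", page, {"TDM-Reservation": "1"}))
```

The checks take the robots.txt text and the response headers as arguments rather than fetching them, so the same logic can be replayed later against stored crawl records.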

Real-World Examples / Case Studies

Bartz v. Anthropic (2025) — The first major AI training settlement: USD 1.5 billion class-action settlement over books used in training, though Judge Alsup had ruled earlier that training itself was transformative fair use when done on lawfully acquired copies.

New York Times v. OpenAI (ongoing) — Federal complaint alleges GPT-4 reproduces Times articles verbatim and competes with the Times' own business.

Getty Images v. Stability AI (UK): the High Court trial concluded in 2025 with a partial win for Getty on trademark grounds.

US Copyright Office Zarya of the Dawn (2023) — Comic authored by Kris Kashtanova; text and arrangement protected, but Midjourney-generated images denied registration.

What This Means for AI Teams

In 2026, AI teams should:

License training data whenever practical (Getty, Shutterstock, Reuters have all signed licensing deals)
Keep training-data provenance records (they support the public training-content summary required by EU AI Act Art. 53(1)(d))
Respect robots.txt signals and TDM opt-outs (EU Copyright Directive Art. 4(3); EU AI Act Art. 53(1)(c))
Add output filters for memorisation and near-duplicate generation
Indemnify customers against third-party copyright claims (as Adobe, Microsoft, Google, and OpenAI now do for enterprise customers)
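
The output-filter item in the list above can be approximated with a word n-gram overlap check between a generated output and a set of protected reference texts. This is a rough sketch under stated assumptions: the function names, the 8-word window, and the 0.2 threshold are illustrative choices, and n-gram overlap is only a screening heuristic, not a legal test for substantial similarity.

```python
def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of lowercase word n-grams in text."""
    words = text.lower().split()
    # For texts shorter than n words the range is empty, giving an empty set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(output: str, reference: str, n: int = 8) -> float:
    """Fraction of the output's n-grams that also occur in the reference."""
    output_grams = word_ngrams(output, n)
    if not output_grams:
        return 0.0
    shared = output_grams & word_ngrams(reference, n)
    return len(shared) / len(output_grams)


def flag_output(output: str, references: list[str],
                n: int = 8, threshold: float = 0.2) -> bool:
    """Flag the output if it overlaps heavily with any reference text."""
    return any(overlap_ratio(output, ref, n) >= threshold for ref in references)
```

A flagged output would typically be blocked, regenerated, or routed to human review. Paraphrases and non-text media need different techniques, such as embedding similarity or perceptual hashing.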

Compliance Checklist

Publish a training-data sources document
Honour machine-readable opt-outs (robots.txt, TDM Reservation Protocol, C2PA training-and-data-mining assertions)
License copyrighted datasets where feasible
Build memorisation tests into evaluation pipelines
Offer customer IP indemnification where commercially appropriate
For deployers: record prompts and outputs to demonstrate non-infringement
Track ongoing cases and US Copyright Office guidance
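
Several checklist items above depend on keeping a record of where each training item came from. One possible shape for such a record, sketched in Python: the `ProvenanceRecord` fields, the licence labels, and the JSON Lines output are all hypothetical choices for illustration, not a format prescribed by any regulator or standard.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    source_url: str
    licence: str           # illustrative labels, e.g. "licensed", "public-domain"
    opt_out_checked: bool   # whether opt-out signals were checked at collection time
    sha256: str             # content hash, ties the record to the exact bytes used
    retrieved_at: str       # ISO 8601 timestamp in UTC


def make_record(source_url: str, content: bytes, licence: str,
                opt_out_checked: bool) -> ProvenanceRecord:
    """Build a provenance record for one collected item."""
    return ProvenanceRecord(
        source_url=source_url,
        licence=licence,
        opt_out_checked=opt_out_checked,
        sha256=hashlib.sha256(content).hexdigest(),
        retrieved_at=datetime.now(timezone.utc).isoformat(),
    )


def to_jsonl(record: ProvenanceRecord) -> str:
    """Serialise a record as one JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Appending one such line per collected item gives an audit trail that can later be aggregated into a published sources document. A similar append-only log could hold prompts and outputs on the deployer side.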

Conclusion

AI copyright remains one of the most unsettled areas of AI law. Teams that document provenance, license data, and indemnify customers will be best placed to weather the lawsuits.
