Encyclopedia Britannica and Merriam-Webster Sue OpenAI Over AI Training
The publishers of Encyclopedia Britannica and Merriam-Webster have escalated the legal battle against generative AI, filing a lawsuit Friday that accuses OpenAI of illicitly harvesting copyrighted material. The complaint alleges that GPT-4 has effectively memorized proprietary content, allowing it to output near-verbatim text that directly cannibalizes the publishers' web traffic.

The legal filing highlights specific instances where OpenAI’s models reproduced entire passages from Britannica’s archives, undermining the publisher's role as an original source. Rather than functioning as a traditional search tool that directs traffic to external websites, the lawsuit argues that ChatGPT acts as a direct substitute, stripping away the utility of the publishers' platforms. According to the plaintiffs, this unauthorized ingestion of data constitutes a systemic infringement of intellectual property rights, as the models were built using their content without license or compensation.

This litigation is part of a broader industry pushback against AI firms over the legality and ethics of data scraping. The New York Times is currently engaged in a similar legal challenge against OpenAI, while other sectors continue to test the boundaries of fair use. Anthropic recently set a significant benchmark by settling a class-action lawsuit with authors for $1.5 billion, signaling that the cost of training large language models on protected works is becoming a central liability for the industry.