Landmark intellectual property cases usually reshape industries.
Sony Corp v. Universal City Studios (1984) reshaped entertainment by making home video recording legal, which paved the way for VHS, DVDs, and streaming.
Apple v. Samsung (2011–2018) defined what “copying” means in tech design and influenced how products are engineered to avoid litigation. The latest landmark case, however, might reshape something even more fundamental: the way humans learn about the world.
In December 2023, The New York Times filed a lawsuit against OpenAI and Microsoft. The complaint alleges that OpenAI used millions of NYT articles, many of them behind the paywall, to train generative AI systems like ChatGPT, and that this use violated copyright law. What began as a dispute over who owns information has quickly escalated into one of the most consequential intellectual property cases of the century.
Generative AI can now write almost anything: essays, summaries, even scripts, and it answers questions faster than any journalist can type. Behind this convenience, however, lies an uncomfortable truth: much of what these models produce originates from the very reporters whose work AI might one day replace.
NYT stated in its December 2023 complaint: “Independent journalism is vital to our democracy. It is also increasingly rare and valuable. For more than 170 years, The Times has given the world deeply reported, expert, independent journalism. Times journalists go where the story is, often at great risk and cost, to inform the public about important and pressing issues. They bear witness to conflict and disasters, provide accountability for the use of power, and illuminate truths that would otherwise go unseen.”
So what is this case? Is it a fight over licensing fees, over the limits of fair use, or over the survival of independent journalism in an algorithmic age? In reality, it is all three. This case will determine not only how AI learns, but how people continue to learn, and whether the pursuit of knowledge can remain a human act in a world that is slowly but surely becoming machine-driven.
In an interview with Harvard Law Today, Mason Kortz, a clinical instructor at the Harvard Law School Cyberlaw Clinic, said that this suit might be the first big test for AI in copyright law. The article quotes him: “This is sci-fi type stuff. We don’t have to worry about that.” With business, the media, and the public all watching, this lawsuit could have a significant impact on the development of AI systems. “I’m interested to see where all this goes,” he added.
At the heart of The New York Times v. OpenAI lies a centuries-old tension between innovation and ownership. The case hinges on the doctrine of fair use, a flexible exception in U.S. copyright law that allows limited use of protected works without permission, particularly when the use is transformative and serves the public interest.
The concept dates back to 1841, when Justice Joseph Story proposed in Folsom v. Marsh that copying may be lawful depending on its purpose, the amount taken, and the effect on the market. In 1994, the Supreme Court’s decision in Campbell v. Acuff-Rose Music, Inc. added a further idea: a use is more likely to be fair if it transforms the original work, giving it new meaning or purpose.
NYT claims that millions of its copyrighted pieces, including paywalled ones, were scraped and used to train ChatGPT, letting the system reproduce portions of those works nearly word-for-word. According to NYT, this is not transformative but substitutional: instead of paying for NYT articles, readers can simply ask ChatGPT to summarize them. This undermines the newspaper’s business model and erodes its control over the distribution of its own journalism.
OpenAI argues that training an AI on text is transformative by nature. In its view, the model does not store or republish articles but learns linguistic patterns and knowledge structures from them, much as a human gains knowledge by reading.
The problem is that this is not quite what happens, and OpenAI’s circumvention of the paywall is not the only issue in the case. A human can read something, understand the underlying information, and learn something new; that is not copying. But according to Digiday, “LLMs don’t have the ability to do that since they are machines, meaning the models absorb the ‘expression’ of the facts, not the facts themselves, which should be considered copyright infringement, according to The New York Times’ lawyers.” That is exactly why this lawsuit started: OpenAI did not just use NYT articles as a source of information. It reproduced them in its outputs.
[Excerpt from the December 2023 court complaint filed by The New York Times]
Now, it might seem that this case is all about financial profit: OpenAI uses NYT articles, NYT loses subscribers and, therefore, money. But it is about much more than money, and arguably about more than copyright itself. This case is about power in an age where information has replaced labor as the world’s key currency. For centuries, wealth was measured by what people could build or extract; today it lies in what they can know and in who controls the flow of that knowledge.
NYT represents an old model of value: truth verified through human effort, funded by readers and protected by law. OpenAI represents a new one: knowledge synthesized by machines, distributed instantly and freely, but rooted in data it did not pay to gather. So who gets to profit from knowledge, the producers or the processors?
We have seen this tension before. The Betamax decision once terrified Hollywood and then made it richer. Napster nearly destroyed the music industry, yet it paved the way for Spotify. Google Books turned mass digitization from a threat into a library for the digital age. Each time, new technology looked like an existential danger until society found a balance between innovation and protection; today, that perceived danger is OpenAI.
If the court rules in favor of OpenAI, the decision would legitimize the unrestricted use of journalistic work for AI training. On paper, that is a victory for innovation: training costs drop because AI firms no longer need to negotiate licenses, and more models get built. In practice, it would strip news organizations of their economic foundation. Once AI systems can summarize, paraphrase, or even replicate articles for free, which is exactly what ChatGPT did, subscriptions will decline, ad traffic will fall, and newsrooms will shrink. Readers who once paid for access to original reporting will settle for a quick, algorithmic digest, which is convenient and, well, costless.

The business of journalism, already fragile, would begin to fracture. As funding evaporates, investigative reporting, the most expensive and essential form of journalism, would become unsustainable. The news would still appear to flow, but much of it would be recycled content, passed from one system to another with diminishing verification or context. In that world, journalism’s watchdog function weakens. The press, once a counterweight to power, risks becoming an accessory to it, constrained by financial scarcity.

Finally, AI-generated news summaries would begin to replace direct reporting altogether. Readers would grow accustomed to the voice of machines that mimic authority without accountability. Over time, society could forget the feel of original journalism, and what vanishes is not just the profession, but the very experience of being informed by another human being.
And here is where the paradox lies: it is a self-destructive feedback loop. News stories are the raw material that keeps models accurate, current, and connected to reality. So if journalism collapses because people stop subscribing or reading, the information supply dries up. Without that fresh data, AI will be forced to learn from its own outputs and from whatever remains: government data, corporate press releases (which are inherently self-interested), and user-generated content, a diet that is uneven, unreliable, and often toxic. Without independent journalists, AIs will not be learning from investigated reality. They will be learning from whoever shouts the loudest online, and they will quickly start to degrade. So, ironically, AI needs journalism; it just does not pay for it yet.
And right now, AI is killing the very thing that keeps it alive.
Pauline Iakubova is a Deputy News Editor. Email them at feedback@thegazelle.org