‘Not telling the whole story’: OpenAI challenges the claims of the NYT’s copyright lawsuit
11 months ago Benito Santiago
In response to a lawsuit filed by The New York Times, which accused OpenAI of using the news outlet's content to train its AI models, OpenAI brought the receipts. The leading AI developer reaffirmed its commitment to the news industry, saying, "We support journalism, we cooperate with news organizations, and we believe the New York Times lawsuit is meritless."
OpenAI also accused the New York Times of incomplete reporting, saying, "The New York Times is not telling the whole story." The company pointed out that the examples cited by the newspaper came from old articles widely available on third-party websites, and suggested that the New York Times manipulated its prompts to generate damning evidence.
"It appears that they deliberately crafted prompts, often including lengthy excerpts of articles, to get our model to regurgitate," OpenAI said, suggesting that the New York Times acted in bad faith by submitting unnatural prompts as evidence.
"Even when using these kinds of prompts, our models don't typically behave the way the New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts."
Prompt hacking is a common practice in which people trick an AI model into doing things it wasn't designed to do, using carefully crafted prompts that trigger responses the model wouldn't normally produce.
OpenAI also emphasized its cooperation with the news industry.
"We work hard in our technology design process to support news organizations," the company wrote, highlighting the deployment of AI tools to help journalists and editors and the goal of mutually beneficial development for AI and journalism. OpenAI recently partnered with Axel Springer, publisher of Politico and Business Insider, to provide more accurate news summaries.
Addressing the issue of content "regurgitation," as the New York Times described it, OpenAI acknowledged that it is a rare but real problem the company is working to mitigate.
"Memorization is a rare failure of the learning process that we are continually making progress on," the company explained, defending its training methods: "Training AI models using publicly available internet materials is fair use."
Nevertheless, OpenAI acknowledged the validity of the ethical concerns by providing an opt-out process for publishers.
AI training and content storage
The battle between content creators and AI companies seems like a zero-sum game at the moment, because at the root of it all is the fundamental way in which AI models are trained.
These models are developed using extensive datasets that include articles, books, and websites from a wide range of sources. Other models use pictures, diagrams, videos, sounds, and songs. These models, however, do not retain specific text or information verbatim. Instead, they analyze these materials to learn the patterns and structures of language.
This process is critical to understanding the nature of the lawsuit and OpenAI's defense, and why AI practitioners believe their businesses are using content fairly—much like an art student would study another artist or art style to understand its characteristics.
However, creators, including The New York Times and best-selling authors, argue that companies like OpenAI are using their content in bad faith. They assert that their intellectual property is being exploited without permission or compensation, producing competing AI-generated products and diverting audiences from their original content.
The New York Times accused OpenAI of using its content without express permission and devaluing original journalism, highlighting the potential negative impact on independent journalism and its value to society. And no matter how elaborate the prompt, if the model "regurgitates" any copyrighted material, that material can be said to have been used.
Whether that use is fair is for the court to decide.
This legal battle is part of a broader legal movement that could shape the future of AI, copyright law, and journalism. As it stands, there is no doubt that it will influence the discussion around the integration of AI in content creation and the rights of intellectual property owners in the digital age.
Still, OpenAI doesn't believe this is a zero-sum situation. Although it disputes the central claims of the lawsuit, Altman's company said it is ready to extend an olive branch and find a positive outcome.
“We look forward to a constructive partnership with The New York Times and respect its long history, which includes reporting on the first neural network more than 60 years ago and supporting First Amendment freedoms.”
Edited by Ryan Ozawa.