OpenAI inks deal with Axel Springer on licensing news for model training.

Many generative AI developers contend that fair use entitles them to train models on copyrighted material scraped from the internet without the owners’ consent. But some vendors, OpenAI among them, are hedging their bets, perhaps wary of how pending court cases on the issue will turn out.
OpenAI announced today that it has reached an agreement with Axel Springer, the Berlin-based publisher of outlets such as Business Insider and Politico. The deal lets OpenAI surface recent articles from Axel Springer’s publications in ChatGPT, its popular AI-powered chatbot, and train its generative AI models on the publisher’s content.
This is the second agreement OpenAI has struck with a news organization, following its announcement that it would license a portion of The Associated Press’ archives for model training.
At some point in the future, ChatGPT users will be able to read summaries of “selected” articles from Axel Springer’s publications, including stories that are normally restricted behind a paywall. The summaries will carry attribution and links to the full articles.
In exchange, OpenAI will pay Axel Springer an undisclosed amount on an undisclosed schedule. The agreement spans several years, and Axel Springer says the deal will support the publisher’s existing AI-driven ventures “that build upon OpenAI’s technology.” It does not bind either party to exclusivity, however.
Axel Springer CEO Mathias Döpfner said in a canned statement: “We are excited to have shaped this global partnership between Axel Springer and OpenAI—the first of its kind.” He added that the company wants to explore the possibilities of AI-powered journalism to improve the quality, societal relevance, and business model of journalism.
Publishers have a contentious relationship with generative AI vendors, accusing them of copyright infringement and growing increasingly concerned that generative models will cannibalize their traffic. Vendors’ content strategies have not helped. Google’s new generative AI-powered search experience, SGE, pushes links that would normally appear in conventional search results further down the page, which could reduce traffic to those links by as much as 40%.
Publishers also object to vendors training models on their material without payment agreements in place, especially in light of reports that tech giants such as Google are experimenting with AI tools to summarize news. A recent investigation found that hundreds of news organizations are now using code to block OpenAI, Google, and others from scraping their websites for training data.
In August, several media groups, including Getty Images, The Associated Press, the National Press Photographers Association, and The Authors Guild, published an open letter demanding greater transparency around AI and stronger copyright protections. The signatories asked policymakers to consider regulations that mandate transparency into training data sets and allow media companies to engage with AI model operators.
“[Current] practices undermine the core business models of the media industry, which are predicated on readership and viewership (such as subscriptions), licensing, and advertising,” the letter reads. Beyond violating copyright law, the signatories argue, the result is a significant reduction in media diversity and a weakening of the financial viability of companies that invest in media coverage, further limiting the public’s access to high-quality and trustworthy information.