4 July 2024

In the ongoing legal battle between Google and France’s competition regulator over copyright protections for news snippets, the Autorité de la Concurrence imposed a €250 million fine (approximately $270 million) on the tech behemoth on Wednesday.

The competition watchdog accused Google of not honoring some of its earlier agreements with news publishers. The decision is particularly significant as it highlights Google’s use of news publishers’ content to train its generative AI model, Bard/Gemini.

The competition authority criticized Google for not informing news publishers that their copyrighted content was being used for GenAI training. This runs contrary to Google’s earlier commitments to engage in fair payment discussions with publishers regarding the reuse of their content.

Copyright and Competition Violations

In 2019, the European Union enacted a digital copyright reform that extended copyright protections to news headlines and snippets. News aggregators like Google News, Discover, and the “Top Stories” feature box on search results pages had previously displayed these news stories on their platforms without any financial compensation.

Google initially tried to circumvent the law by disabling Google News in France. However, the competition authority intervened promptly, deeming Google’s unilateral action an abuse of its dominant market position that could harm publishers. This intervention essentially compelled Google to negotiate agreements with local publishers regarding content reuse. In 2021, though, the authority fined Google €500 million (about $592 million) after identifying significant violations in its negotiations with local publishers and agencies.

Google labeled the penalty as “excessive” and announced its intention to appeal. However, it later sought to resolve the dispute by offering a series of commitments and withdrawing its appeal. These commitments, which were accepted by the French Autorité, include providing crucial information to publishers and negotiating in a fair manner.

Google has entered into copyright agreements with hundreds of publishers in France, which are governed by its agreement with the Autorité. Therefore, its operations in this area are heavily regulated.

No Appeal

Google has agreed not to challenge the Autorité’s latest findings, accepting a streamlined settlement process and paying the fine. However, Sulina Connal, Google’s managing director for news and publishing partnerships, expressed her displeasure in a detailed blog post, stating that “the fine is not proportionate to the issues raised” by the authority. The blog post indicates that Google is eager to put an end to this saga, with Connal also stating: “We’ve settled because it’s time to move on and, as our many agreements with publishers show, we want to focus on the larger goal of sustainable approaches to connecting people with quality content and on working constructively with French publishers.”

With generative AI in the spotlight and the competitive rush to launch tools, Google’s approach to the content reuse issue appears to be changing.

GenAI Training in Focus

Today’s enforcement action by France’s competition authority puts the focus on Google’s use of content from news publishers and agencies to train its AI foundation model and the related AI chatbot service, Bard (now known as Gemini). The authority found that Google used content from publishers and press agencies to train Bard, its generative AI tool launched in July 2023, “without notifying the copyright holders or the Authority,” according to its press release.

On this point, Google’s defense is twofold. In its blog post, it states that the competition authority “does not challenge the way web content is used to improve newer products like generative AI, which is already addressed in Article 4 of the EUCD” [EU Copyright Directive].

Article 4 of the Copyright Directive provides an “exception or limitation for text and data mining” — specifically for “reproductions and extractions of lawfully accessible works and other subject matter for the purposes of text and data mining”.

However, in its press release, the Autorité argues that it has not yet been determined whether the exemption applies in this case. (It’s worth noting that the relevant clause refers to “lawfully accessible works” — while Google is under a legally binding commitment to the competition authority to notify copyright holders about uses of their protected works and apparently failed to do so in this case.)

“When it comes to declaring whether using news content to train an artificial intelligence service falls under neighboring rights and protection, this question has not been answered just yet,” the competition authority wrote. “However, the Autorité considers that Google has breached its commitment #1 by failing to inform publishers that their content had been used to train Bard.”

Google’s blog post also briefly mentions the EU AI Act, suggesting its relevance. However, the legislation is not yet in force as it awaits final approval by the Council of the European Union.

The forthcoming AI legislation will also state that developers must comply with the bloc’s copyright rules. It introduces transparency requirements to achieve this goal — requiring them to establish a policy to respect EU copyright law and publicly provide a “sufficiently detailed summary” of the content used for training general-purpose AI models (such as Gemini/Bard).

This upcoming requirement for model makers to publish a training data summary may, in the future, make it easier for news publishers whose protected content has been used for GenAI training to obtain fair remuneration under EU copyright law.

No Technical Opt-Out

The Autorité also notes that Google failed to provide, until at least September 28, 2023, a technical solution to allow publishers and press agencies to opt out of their content being used to train Bard without such a decision affecting the display of their content on other Google services.

“Until this date, publishers and news agencies that wanted to opt out of this use case had to insert an instruction that blocks all content indexation from Google, including for Search, Discover, and Google News services. Those services are specifically part of the negotiation for revenue related to neighboring rights,” it wrote, adding: “In the future, the Autorité will carefully look at the effectiveness of Google’s opt-out processes.” In more technical terms, between July and September 2023, news publishers could insert a “noindex” tag into the robots.txt file to ensure that
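The difference between a blanket block and a selective opt-out can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how Google evaluates robots.txt: it uses the standard library's generic `urllib.robotparser`, the site URL and both rule sets are hypothetical, and the `Google-Extended` token is not named in the Autorité's statement. It is the control Google announced on September 28, 2023, for limiting the use of site content in its generative AI products.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might have used before a dedicated
# control existed: blocking Googlebot removes the site from Search,
# Discover and Google News as well.
BLANKET_BLOCK = """\
User-agent: Googlebot
Disallow: /
"""

# Hypothetical robots.txt using the Google-Extended token (an assumption
# about the opt-out meant here): regular crawling stays allowed while the
# generative AI use is declined.
SELECTIVE_OPT_OUT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""

# Illustrative article URL on a made-up publisher domain.
ARTICLE_URL = "https://news.example/politics/story.html"


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print("blanket block, Googlebot:",
          allowed(BLANKET_BLOCK, "Googlebot", ARTICLE_URL))
    print("selective opt-out, Googlebot:",
          allowed(SELECTIVE_OPT_OUT, "Googlebot", ARTICLE_URL))
    print("selective opt-out, Google-Extended:",
          allowed(SELECTIVE_OPT_OUT, "Google-Extended", ARTICLE_URL))
```

Under the first rule set the only way to refuse AI training is to refuse the crawler altogether, which is the trade-off the Autorité objected to. Under the second, the two uses are controlled separately.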

6 thoughts on “France Fines Google $270M for Misuse of News Publishers’ Data in Gemini Training”

  1. The fine imposed on Google by France’s competition regulator, the Autorité de la Concurrence, is a significant development in the ongoing legal battle over copyright protections for news snippets.

  2. While Google’s decision to settle the dispute and not challenge the Autorité’s findings may seem like an admission of guilt, it’s important to consider the company’s perspective. Google has argued that the fine is disproportionate to the issues raised by the authority.

  3. The Autorité’s focus on Google’s use of content from news publishers and agencies for training its AI foundation model is noteworthy. It raises important questions about the ethical use of such content and the need for clear guidelines and regulations. The forthcoming AI legislation in the EU, which will require developers to comply with the bloc’s copyright rules and establish a policy to respect EU copyright law, could provide some clarity on this issue.

  4. The lack of a technical opt-out solution for publishers and press agencies is a significant concern. The ability to opt out of their content being used to train Bard without affecting the display of their content on other Google services is crucial. The Autorité’s commitment to carefully examine the effectiveness of Google’s opt-out processes in the future is a positive step towards ensuring that the rights of publishers and press agencies are respected.

  5. This case serves as a reminder of the complex interplay between technology companies, regulatory authorities, and content creators. It underscores the need for clear and fair rules that balance the interests of all parties involved. As AI continues to evolve and become more integrated into our daily lives, such cases will likely become more common, necessitating ongoing dialogue and cooperation between these entities.
