by Lara Ricci
3 min read
"The capacity of artificial intelligence systems is made possible by the greatest data theft in history!": with these words Karin Schmidt-Friderichs, president of the German Publishers & Booksellers Association, opened the press conference of the Frankfurter Buchmesse, the world's most important book fair. "Creativity must be remunerated," she continued. "Yet copyrighted texts and images have been used, and continue to be used, millions of times over as training material for artificial intelligence, without royalties being paid and without the consent of the authors. This is unacceptable. Clear rules are needed. The European AI Act is a starting point, but it does not yet address many issues; we are only at the beginning of a long debate that we must have not only as an industry but also as a society."
The threat that artificial intelligence (AI) poses to publishing, printing and the other creative industries is one of the recurring themes of this fair, which hosts several conferences on the subject, such as the one organised by the International Publishers Association (IPA), where the speakers were Silke von Lewinski, of the Max Planck Institute for Innovation and Competition in Munich, lecturer at the University of Zagreb and legal expert for the European Commission, and Scott Zebrak, an American copyright lawyer and founding partner of Oppenheim + Zebrak, LLP ("O+Z").
Von Lewinski pointed out that, although no one knows precisely how LLMs (large language models) work, recently published studies show that generative AI systems continuously make use of protected material, reproducing it countless times both when downloading it for training and when re-processing it to produce the requested answers. She went on to explain that the European AI Act is only a first step towards regulating the sector, and that it is not effective in protecting copyright.
"It is designed to protect the safety of products that use AI, not to protect copyright," she said, explaining that although the Act does touch on copyright, it does so incompletely: there has been no adequate discussion of the subject, for instance with regard to the "text and data mining" exception, which, according to many experts, would not be applicable to generative artificial intelligence, because these are not automated techniques for analysing digital data, i.e. for extracting statistics and information useful for research, but systems that use texts to produce similar texts.
Zebrak explained that about 25 large lawsuits involving large language models are currently pending in the US. These models copy the content they are "fed" many times over: when they "ingest" it, when they process and select it, and when they reassemble it to provide a response. One example is the lawsuit brought by the New York Times against OpenAI and Microsoft for infringing copyright with ChatGPT and Copilot, with the first judgments expected by the end of the year or the beginning of 2025.