Italy, Europe and the rest of the world on the hunt for nationalist chatbots

It's not just the US and China: from Italia 9B to Mistral AI, here are the open-source large language models that have sprung up on our continent.

by Antonino Caffo

23 June 2024

3 min read

Stealing work from humans, reducing manual tasks to automated routines that take a few seconds, even wiping out creativity with a simple text prompt. Among the many negative consequences attributed to generative artificial intelligence, one is often forgotten: linguistic flattening. Anyone who has used tools such as ChatGPT or Gemini for some time will have no trouble recognising a piece of text built on their patterns: bulleted lists and short, often repeated sentences. Translation from one language to another pushes the content even further towards adaptations that lack national character and are impersonal and, in short, uninteresting. This is why many countries, as well as institutions and universities, have moved to create their own native LLMs (Large Language Models), tailor-made to reflect in every respect the customs and habits of a people: a sort of super-man aware of the space and time he is living in.

We were not the first, but neither were we the last. A few days ago 'Italia' became available, the model developed by iGenius and trained by Cineca on a local dataset, i.e. one composed of Italian text. It has 9 billion parameters and a vocabulary of 50 thousand tokens, and was trained on over 1,000 billion individual words. Little or a lot? For comparison, the older GPT-3 has 175 billion parameters, while OpenAI has not published a figure for GPT-4 and the roughly 100 trillion often quoted is an unconfirmed estimate.
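To put the comparison in concrete terms, here is a minimal sketch of the arithmetic. It uses only the two parameter counts quoted above; the helper name `weights_gib` and the choice of 2 bytes per parameter (16-bit weights) are assumptions made for illustration, not figures published by iGenius or OpenAI.

```python
# Back-of-the-envelope comparison of model sizes from parameter counts alone.
# Assumption: each parameter is stored as a 16-bit number (2 bytes).

ITALIA_PARAMS = 9e9    # 9 billion, as quoted in the article
GPT3_PARAMS = 175e9    # 175 billion, as quoted in the article


def weights_gib(params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory needed to hold the weights, in GiB."""
    return params * bytes_per_param / 2**30


# How many times larger GPT-3 is than Italia, by parameter count.
ratio = GPT3_PARAMS / ITALIA_PARAMS

if __name__ == "__main__":
    print(f"GPT-3 / Italia parameter ratio: {ratio:.1f}x")
    print(f"Italia weights at 16 bit: {weights_gib(ITALIA_PARAMS):.1f} GiB")
    print(f"GPT-3 weights at 16 bit:  {weights_gib(GPT3_PARAMS):.1f} GiB")
```

Under these assumptions the smaller model fits on a single high-end accelerator, while the larger one has to be split across several.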

Computing power matters more than the dataset

Clearly, a huge number of parameters matters less than the ability to perform inference, i.e. to turn data into logical sequences. That process has to run on a machine, the hardware underpinning the algorithm or cluster of algorithms. Elon Musk, before launching Grok on X, made a point of buying a large batch of Nvidia GPUs, while Microsoft, in late 2023, unveiled Azure Maia 100 and Cobalt 100, its first two chips designed for AI-powered cloud infrastructure. As if to say: we have built the car and we have the drivers, but there is a shortage of workshops, which are what would truly make AI 'sovereignty' a reality.

Italy is made

Released as open source, Italia aims to be a tool for growth for research and businesses across the country. Downloadable from the iGenius website and from other AI development platforms, Italia is trained on a dataset of Italian text and code drawn from a variety of sources, including Wikipedia, books, journal articles and source code.

It can be used via a web interface or an API. The former is simple to use and requires no programming knowledge. The API is more complex but offers more flexibility and control. Editoriale nazionale is the first partner to contribute to the training of Italia, opening up its historical archive of articles, and others are expected to join in the future.

The cousins have Mistral

Mistral, developed by the French start-up Mistral AI, is a large language model whose features make it a benchmark in the European artificial intelligence landscape. Unlike proprietary LLMs, it is open-weight, which means that its trained parameters are publicly available to download and modify. This allows anyone to inspect how it works, improve it and adapt it to their needs, promoting transparency and collaboration. With 7.2 billion parameters, Mistral is among the most powerful models in Europe, able to process complex information, generate high-quality text and perform demanding tasks with remarkable accuracy. The CEO of Mistral AI, Arthur Mensch, spent almost three years at DeepMind, the artificial intelligence lab that Google acquired in 2014 for $650 million. His two partners come from Meta: Guillaume Lample and Timothée Lacroix, among the creators of the LLaMA language model.

From China to the Emirates

On the other side of the world, the most active country is China. Beijing has reportedly approved more than 40 LLMs for public use, while the rich and technologically advanced United Arab Emirates has created Falcon, a model dedicated to the Arabic language, with the aim of strengthening its presence in the digital landscape and supporting the region's economic and cultural growth. In May, Abu Dhabi's Technology Innovation Institute released Falcon 2, which is multimodal: it can work with and produce text, images and audio.

University research

And it is precisely in the university environment that LLM projects become interesting. Working within a closed, bounded perimeter is the best scenario in which to apply generative artificial intelligence algorithms. This is the goal of Dante, an LLM developed by Fabrizio Silvestri, professor at Sapienza University of Rome. It is based on the Transformer architecture introduced by Google researchers in 2017 and uses recent innovations such as the 'quantization' of weights to obtain a leaner version. The model was trained in Italian, building on the French model Mistral to reduce costs and deployment time.
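To give an idea of what 'quantization' of weights means in practice, here is a minimal sketch of one common scheme, symmetric 8-bit quantization. The article does not say which scheme Dante uses, so this technique, the function names `quantize_int8` and `dequantize`, and the sample weights are all illustrative assumptions.

```python
# Symmetric int8 quantization: store each weight as a small integer plus one
# shared scale factor, instead of a full floating-point number.
# Illustrative only: not necessarily the scheme used by Dante.


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical sample weights
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    for original, approx in zip(weights, restored):
        print(f"{original:+.4f} -> {approx:+.4f}")
```

Each weight now takes one byte instead of two or four, at the price of a small rounding error of at most half the scale factor per weight.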