Digital Economy

Here is Italia, a large language model like GPT, all Italian

It was presented today, and published open source for free download, by the Italian company iGenius in collaboration with Cineca

by Alessandro Longo

4 min read

Here is Italia, a large language model in the style of GPT, all-Italian. It was presented today and released as open source, free to download, by the Italian company iGenius in collaboration with Cineca (Italy's largest computing centre, an inter-university consortium).

Although it is still at version 0.1, Italia stands today as the largest and most accomplished large language model made in Italy, trained on our language and designed to support the development of Italian companies and public administrations.

In short, the Italian soul is present on several levels, as the company explained in today's presentation. It is in the dataset used, which is more than 90 per cent Italian, with the advantage of a better understanding of our language, its nuances, and our historical and cultural context. According to the company, this also brings an efficiency gain of 60 per cent, because current models, built around English, perform continuous translation work, invisible to the user, whenever they handle other languages.

Italianness is also in the spirit of the product: the objective, declared today, is to help Italy be a protagonist in this revolution and not a mere consumer of foreign products. This is why Italia is open source: to be an enabling element for the development of the country, its companies and its public administration (PA), without further dependence on foreign products.

The Distinctive Elements of Italia

From a technical point of view, Italia has 9 billion parameters, a context window of 4,096 tokens and a vocabulary of 50,000 tokens. It was trained on trillions of tokens drawn from a heterogeneous mix of sources: public sources, synthetic data and industry content provided by iGenius' commercial partners.
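To put the parameter count in perspective, a rough calculation shows how much memory the model's weights alone would occupy. The sketch below is only an illustration, not something iGenius has published: the 9 billion figure comes from the presentation, while the numeric formats and the helper name `weight_memory_gb` are assumptions chosen for the example.

```python
# Back-of-the-envelope estimate of the memory taken by the weights alone.
# The parameter count is the one reported in the article; the numeric
# formats below are common choices, NOT details confirmed by iGenius.

PARAMETERS = 9_000_000_000  # 9 billion parameters, as reported

# Bytes needed to store one parameter in each (assumed) numeric format.
BYTES_PER_PARAMETER = {
    "float32": 4,
    "float16": 2,
    "int8": 1,
}


def weight_memory_gb(parameters: int, bytes_per_parameter: int) -> float:
    """Return the size of the weights in gigabytes (1 GB = 10**9 bytes)."""
    return parameters * bytes_per_parameter / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAMETER.items():
        print(f"{fmt}: {weight_memory_gb(PARAMETERS, size):.0f} GB")
```

At 16-bit precision this comes to roughly 18 GB for the weights, before counting the memory needed to process a prompt that fills the 4,096-token context window.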

By collaborating with Editoriale Nazionale, a company of the Monrif group, the company was able to use their historical archive of press articles as a supplementary source to improve the model.

Another feature that sets it apart from the best-known foreign models is its attention to rules and safety.

"In order to build our training dataset and ensure the ethical integrity of the generated content, we have developed language-specific security filters. These filters remove sensitive, explicit and high-bias content from our selected sources," iGenius explains.

"These protection mechanisms, combined with the adoption of state-of-the-art data cleaning techniques, allowed us to mitigate the occurrence of bias, as well as limit hallucinations and the generation of content inconsistent with the conversation."

"Data security and information reliability have always been a priority for iGenius. We have invested in building an Italian dataset of the highest quality to develop a language model that is truly open, transparent and secure, in compliance with European artificial intelligence regulations such as the AI Act."

Other filters concern copyright protection in the data used.

It is true that all models apply safety procedures to reduce the risk of hallucinations and of discriminatory or harmful content; but Italia shows, at least in its stated intentions, unprecedented attention to the full set of European rules. On copyright, recall the recent lawsuits and disputes pitting OpenAI, Microsoft and Google against various parties (publishers, graphic designers, authors and other creatives).

Italia is also distinguished by its target audience: it was conceived from the start for companies and the PA.

It was designed for companies operating in highly regulated sectors, such as financial services or public administration.

Despite being a model specialised in a single language, namely Italian, "the high number of parameters coupled with the quality of the training process make it the ideal choice for the most critical use cases in the enterprise world, where the reliability of the generated content is of paramount importance," the company states.

The search for a national AI champion

Italy is strongly looking for a 'national champion' in this technology (generative AI and large language models). Germany and especially France (whose Mistral had strong support from the French government and Microsoft) now have one.

The Italian government, as we read in the new Artificial Intelligence Strategy (or rather in its summary, since the full text has not yet been released), believes strongly that our country too needs to equip itself with a national AI model to support the development of our economy.

The first candidate seemed to be Minerva, from researchers at Rome's Sapienza University, but from the outset it proved to be unreliable in its results and lacked a clear sectoral focus.

Italia is more promising, not least because iGenius is a company that has been working on AI since 2016 and counts clients such as Intesa Sanpaolo, Allianz, Enel, Aon and Fincantieri for its own business intelligence product (data analysis to support business decisions; they call it "the GPT of numbers").

With Italia they want to make a breakthrough, both in popularity and in providing all-round support to the country.

The name of the model reveals these high ambitions. In the coming weeks, testing and eventual adoption by Italian companies and PAs will show whether today was a good start. For Italia and for Italy.

Copyright reserved ©