
How does ChatMinerva respond? The challenge of building an Italian ChatGPT

The important thing is to evaluate each LLM for what it is, not for what it could never be: a Silicon Valley giant.

by Alessandro Longo

4' min read

Translated by AI


If a brick weighs a kilo plus half a brick, how much does a brick weigh? The riddle is an old one, and many experts used it to test the first generative AI models, GPT-3 and the like, which invariably failed. ChatMinerva, a newly launched Italian chatbot, has this undeniable advantage: it takes us back to the days when we could make fun of generative AI. "The weight of that tile will be exactly 1 kg + 0.5 × 1 kg = 1.5 kg (or 1 500 g). In other words, it has the same weight as twice its mass!", it replies confidently (note the exclamation mark), when by now even stones, or bricks, know that the right answer is two kilos. ChatGPT Instant (a faster version of the current GPT 5.5 model) answers correctly and also gives the equation that leads there, X = 1 + X/2, hence two (kg). What hurts most is perhaps the linguistic slip ("that tile"), which does no credit to a model whose main distinguishing feature should be that it is trained on and for our language, Italian. That is how its creators presented it to the world: the Sapienza NLP research group at Sapienza University of Rome, led by Professor Roberto Navigli, in collaboration with Babelscape, an academic spin-off founded ten years ago.
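For readers who want the arithmetic spelled out, the riddle reduces to a one-line equation. The sketch below writes the brick's weight as x, in kilograms (the symbol is our own notation), and solves it step by step. The closing comment is our reading of where ChatMinerva's reply seems to go wrong, not something its developers have stated.

```latex
% x denotes the weight of the brick in kilograms (our notation).
\begin{align*}
x &= 1 + \frac{x}{2} \\
x - \frac{x}{2} &= 1 \\
\frac{x}{2} &= 1 \\
x &= 2
\end{align*}
% ChatMinerva's reply instead evaluates 1 + 0.5 \times 1 = 1.5, which
% appears to substitute 1 kg for the unknown brick on the right-hand side.
```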

Nor is it fair to be so harsh on a creature that "was built with more passion than budget, thanks to the unceasing work of dozens of researchers, PhD students and collaborators who believe in the possibility of creating Italian AI technology from which to build competitive products," as Navigli put it.


It is a bit like a homegrown hatchback, built by a small but capable team. You cannot put it on the track with Formula One cars such as GPT, Claude or Gemini. Unfortunately, those are what we are used to, so the comparison is inevitable.

"It is not surprising that ChatMinerva cannot solve the riddle of the brick, which nobody fails nowadays. We're talking about a model with a number of parameters (connections) several orders of magnitude lower than Gpt and the like," says Antonio Cisternino, an experienced AI researcher at the University of Pisa. ChatMinerva is the direct evolution of Minerva 7B, the large language model launched earlier by the same Sapienza NLP group, with 7 billion parameters, "very few now", says Cisternino. Navigli announces a further version, with 20 billion parameters, for the autumn. Gpt 3, launched in 2020, had 175 billion. OpenAI has not declared these values since then, but independent analyses (by Semianalysis) speak of almost 2 trillion parameters, which the model now uses in a small part in its responses, thanks to efficiency techniques achieved.

ChatMinerva's responses suffer from these limitations. "They are more prone to errors - hallucinations - or to not following the given instructions," says Cisternino. In our tests, when we ask it to write an article on a topic, it does not do so but summarises a news item instead. When we ask it to summarise a news story, it gives us a few lines and does not elaborate on them even when asked to.

"The answers are often terse: we are still in the early days," confirms Antonio Chella, professor of robotics at the University of Palermo, an international luminary in the field. These days he too, like others, is trying out ChatMinerva, because there is real curiosity and interest in this Italian academic endeavour. And it should be encouraged.

In addition to ChatMinerva, several initiatives are emerging that seek to exploit national expertise and local data.

Among the most advanced projects is Velvet, the family of language models developed by Almawave, a listed company of the Almaviva group.

Velvet was one of the first Italian LLMs to be developed with a focus on European languages and enterprise use cases. It targets regulated sectors, from public administration to financial services, where issues such as data sovereignty, regulatory compliance and transparency are becoming increasingly important.

Another player is Domyn (formerly iGenius), one of the best-known Italian start-ups in the AI sector. Its models include Italia-10B, Colosseum-355B, Domyn Small 10B and the Domyn-Large reasoning models, at around 260-263 billion parameters, aimed at regulated sectors. The project has gained international visibility thanks to collaborations with Nvidia and participation in European initiatives for the creation of sovereign AI infrastructures.

Domyn's approach aims to combine advanced generative capabilities with the security, auditability and data management requirements of organisations. Specialised chatbots are also growing alongside generalist models. Several Italian software houses are integrating open-source models into assistants dedicated to specific sectors: healthcare, manufacturing, tourism, professional services and public administration. In these cases, the value lies not so much in the base model as in the ability to integrate vertical knowledge, corporate workflows and proprietary document bases. The Vitruvian model family from the start-up Asc27 is also dedicated to specialised uses.

As can be seen, it is not possible to say which LLM is the national champion, and it would perhaps not even be fair to do so. There are various attempts, some more oriented towards proof-of-concept research and development, others with ambitions of practical and industrial utility. The important thing is to evaluate each LLM for what it is, not for what it could never be: a Silicon Valley giant.

Copyright reserved ©