The project at Sapienza

ChatMinerva, the Italian AI with real-time web access, arrives

It is a multimodal AI assistant capable of reading texts, interpreting images, analysing documents and surfing the Web in real time, all while conversing in Italian with an unprecedented level of reliability for a model developed entirely in Italy.

by Rome Editorial Staff

3 June 2026

3' min read

Translated by AI

Imagine a system where you can upload photos of pages in a foreign language and have them translated into Italian, and perhaps even summarised, in real time. Or a model that can be asked to analyse scientific articles in detail. These are not absolute novelties in the world of artificial intelligence, but they are in the Italian landscape. Here the novelty comes from ChatMinerva, newly presented by the Sapienza NLP research group of Sapienza University of Rome, led by Professor Roberto Navigli, in collaboration with Babelscape, an academic spin-off founded ten years ago.

It is a multimodal AI assistant capable of reading texts, interpreting images, analysing documents and surfing the Web in real time, all while conversing in Italian with an unprecedented level of reliability for a model developed entirely in Italy. The project stands out for a feature that, in the current landscape, is far from a given: transparency and direct control over the entire life cycle of the system, from pre-training to fine-tuning, up to content moderation mechanisms.

From voice to OCR, up to 32 thousand tokens

There are several technical innovations. On the multimodal understanding front, the model is now able to process photographs, scanned pages, reports and scientific articles, combining visual and textual information and performing optical character recognition (OCR) on digitised documents. It is also possible to interact with the system by voice.

On the information access front, ChatMinerva integrates a web RAG (Retrieval-Augmented Generation) system based on the open search engine DuckDuckGo, which allows the model to draw on up-to-date sources in real time, overcoming the typical limitations of models trained on static data.
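For readers unfamiliar with the technique, the general shape of retrieval-augmented generation can be sketched in a few lines of Python. This is an illustration only, not ChatMinerva's implementation: `search_web`, `overlap` and `build_prompt` are hypothetical names, the search step is a stub that returns canned snippets instead of querying DuckDuckGo, and the word-overlap ranking is a deliberately crude stand-in for a real relevance model.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    url: str
    text: str


def search_web(query: str) -> list[Snippet]:
    """Hypothetical stand-in for a live search call; returns canned results."""
    return [
        Snippet("https://example.org/a", "Minerva 7B was trained on the Leonardo supercomputer at CINECA."),
        Snippet("https://example.org/b", "ChatMinerva was presented by the Sapienza NLP research group."),
        Snippet("https://example.org/c", "Carbonara is a pasta dish from Rome."),
    ]


def overlap(query: str, text: str) -> int:
    """Count distinct query words that also appear in the text."""
    query_words = set(re.findall(r"\w+", query.lower()))
    text_words = set(re.findall(r"\w+", text.lower()))
    return len(query_words & text_words)


def build_prompt(question: str, top_k: int = 2) -> str:
    """Retrieve snippets, keep the top_k best matches, and prepend them to the question."""
    ranked = sorted(search_web(question), key=lambda s: overlap(question, s.text), reverse=True)
    context = "\n".join(f"[{i + 1}] {s.text} ({s.url})" for i, s in enumerate(ranked[:top_k]))
    return f"Answer using only the sources below.\n{context}\n\nQuestion: {question}"


prompt = build_prompt("Which supercomputer at CINECA was Minerva 7B trained on?")
print(prompt)
```

The language model then receives `prompt` rather than the bare question, which is how freshly retrieved text can inform an answer that the static training data could not.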

Also noteworthy is the extension of the context window to 32,000 tokens, achieved through continued training: a threshold that allows long documents and elaborate conversations to be handled without loss of coherence. The whole system is overseen by a dedicated security component, which analyses input and output to filter out unwanted, untrusted or sensitive content.
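What a fixed context window means in practice can be shown with a small Python sketch that trims a conversation so that it fits within 32,000 tokens. It rests on stated assumptions: `count_tokens` and `fit_to_window` are hypothetical helpers, one token is approximated as one whitespace-separated word (real tokenizers split text differently), and dropping the oldest messages first is just one possible policy, not necessarily the one ChatMinerva uses.

```python
MAX_CONTEXT_TOKENS = 32_000  # window size reported in the article


def count_tokens(text: str) -> int:
    """Crude proxy: one token per whitespace-separated word (an assumption)."""
    return len(text.split())


def fit_to_window(messages: list[str], budget: int = MAX_CONTEXT_TOKENS) -> list[str]:
    """Keep the most recent messages whose combined token count fits the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order


# Three messages of 20,000, 10,000 and 5,000 words, oldest first.
history = ["word " * 20_000, "word " * 10_000, "word " * 5_000]
window = fit_to_window(history)
```

Here the oldest message is dropped because all three together would need 35,000 tokens; a larger window simply moves the point at which such trimming becomes necessary.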

The training was made possible by the computational power of CINECA's Leonardo supercomputer, while a decisive contribution came from the user community: interactions collected during the public phase of Minerva 7B fed the fine-tuning, which drew on millions of examples, both textual and multimodal.

The Roots of Minerva 7B

ChatMinerva is the direct evolution of Minerva 7B, the large language model launched earlier by the same Sapienza NLP group. It was presented at the time as the main Italian initiative in the field of large language models developed with full control over the sources and training processes, and as the only one curated by a public university in Italy.

Minerva 7B had already charted an alternative course to the proprietary models of the tech giants, focusing on openness, scientific rigour and independence. ChatMinerva takes up and amplifies that bet, transforming the base model into an all-round interactive assistant, with capabilities that bring it significantly closer to international reference standards.

Rector Antonella Polimeni framed the result within the university's broader strategy: "The evolution of the Minerva project towards multimodal and interactive AI assistants confirms Sapienza's ability to transform frontier research into concrete innovation, at the service of knowledge and society." It is a path that, according to Polimeni, rests on the integration of scientific expertise, advanced infrastructures and collaboration with innovative organisations in the region.

On the research front, Navigli makes no secret of his ambition: "We want to show that it is possible to build frontier AI technology in Europe and Italy too, with an open, scientifically rigorous and independent approach". And with a note of pride that almost sounds like a manifesto: "ChatMinerva was built with more passion than budget, thanks to the tireless work of dozens of researchers, PhD students and collaborators who believe in the possibility of creating Italian AI technology from which to build competitive products".