GOOGLE I/O

From Gemini to TPU clouds: here's how Google wants to be an 'AI-first' company

At the Shoreline Amphitheatre in Mountain View, the 2024 edition of Google's historically most important event is on stage. Here's what's new

by Gianni Rusconi

5 min read

At the Shoreline Amphitheatre in Mountain View, the 2024 edition of Google's historically most important event is being staged: I/O, the annual appointment with the vast community of developers working on the Californian giant's galaxy of applications and platforms. This time around, BigG had to reckon with OpenAI's announcements of two days earlier, concerning the new GPT-4o model and the promise, delivered by Mira Murati (the Chief Technology Officer of the company backed with 12 billion dollars by Microsoft), of an even faster, more usable and, above all, freely available multimodal artificial intelligence for everyone. With the 'danger' of the much-feared rival search engine averted, Sundar Pichai, CEO of Google and Alphabet, and the other executives called on stage focused their attention on the Gemini-related news and made AI, as was to be expected, the leitmotif of the packed round of presentations.

Pichai: 'one million testers for Gemini Advanced'

The promise made years ago by the CEO (again at I/O) of migrating the company's vision from 'mobile first' to 'AI first' has repeatedly been a topic of discussion, with Google 'accused' of lagging behind in applying large language models at scale and, consequently, of failing to keep pace with the competition in rethinking all its products (for a user base of nearly two billion people globally) around machine learning and generative technology, making the latter widely available and easily accessible. The acceleration the company has imparted over the last twelve months, with the release of the foundation models of the Gemini family and the integration of the chatbot into the main Google apps and onto Android phones (replacing the old Assistant), has found a sort of confirmation in the last few hours, giving substance to Pichai's idea of transforming Google into an 'AI first' company. "Our model," the CEO explained in a confidential press briefing, "was built from the ground up to be natively multimodal across text, images, video, code, and more. I see this as a big step forward in turning any input into any output, and we have since made a number of quality improvements in different areas such as translation, reasoning, coding, and more, achieving cutting-edge performance on all benchmarks." Pichai then turned to the numbers, confirming that more than 1.5 million developers now use Gemini, and that the arrival of Gemini 1.5 Pro in the Gemini Advanced tier (available in 35 different languages) involved more than one million testers in just three months.

More intelligence for Workspace apps

The Google CEO then highlighted what he called the most exciting transformation made possible by Gemini: search. "We have answered billions of queries," Pichai pointed out, "as part of our generative search experience, including the longest and most complex ones, and this week we will launch to all users in the United States, and soon in several other countries, a completely new search experience with AI Overviews, which we expect to exceed one billion users by the end of the year." The objective, in short, is clear: to stay ahead of the competition (read OpenAI) and radically change the way information is found within applications, starting with Google Photos, which from the summer will gain a feature that, on a simple command, shows the most important moments involving one of the contacts in the address book. "We are making great progress towards our ultimate goal of infinite context," the CEO added, making it clear that the added intelligence brought by Gemini will be felt across all Google Workspace tools, from Gmail to Meet. It will thus be easier to condense all messages related to a given contact into a clear summary, identify the most relevant emails, or get a rundown of key points and actions to take by analysing attachments. Gemini will also make text-to-speech models far more productive, generating a customised and interactive audio conversation, in which one can also take part, from a selected dataset of source material.

AI agents capable of thinking are on the way

Soon, Pichai went on to explain, "Gemini will be able to mix and match inputs and outputs to get to what we mean when we talk about a whole new generation of AI tools, where Project Astra's universal artificial intelligence agents will come into play: systems with reasoning, planning, and memory capabilities, capable of thinking, working through software and applications, or doing something on your behalf and under your supervision." We are still at the beginning, was the conclusion the Mountain View CEO left journalists with, but all these new experiences are being prototyped, and Google is already thinking about how to deliver them in a way that is personal, safe and suitable for everyone. The one requirement for not missing the goal? Plenty of computing power. Demand for machine learning compute has grown a million-fold in the last six years, increasing tenfold every year. Which is why BigG is ready to play its new ace, the next generation of TPUs (Tensor Processing Units), the proprietary chips that power many of its AI workloads and that train and run, alongside Nvidia's GPUs, generative models such as Gemini.

Trillium, the sixth generation Google Cloud TPU

Tensor Processing Units have been a priority at Google for at least a decade, ever since the Californian company's top management realised the strategic importance of developing custom hardware specifically for AI. Several of this year's I/O announcements, among them Gemini 1.5 Flash, Imagen 3 and Gemma 2, were trained and are being run on TPUs, and Trillium is the sixth generation of BigG's now well-known family of accelerators, the highest-performing and most energy-efficient yet. That title rests on 4.7 times the peak computing capacity per chip compared with the v5e series (in pods of up to 256 chips), twice the bandwidth at both the memory and interconnect level, and an estimated 70 per cent energy saving. Trillium will be available to Google Cloud customers at the end of 2024, and during the announcement the company also confirmed that it will be among the first providers to offer Nvidia's new Blackwell GPUs in early 2025. The idea, in short, is clear: to keep blazing a trail in the development of the supercomputing infrastructure underpinning next-generation artificial intelligence applications. From real-time voice search to object recognition in photos, from interactive language translation to LLMs such as Gemini, most of Google's frontier services would not be possible without TPUs.

