Digital Economy

How much energy does AI consume? One Gemini prompt is equivalent to 9 seconds in front of the TV

Google goes public with a proposed method for measuring the energy footprint of artificial intelligence models. But is 0.24 Wh a lot or a little?

Luca Tremolada

4 min read

There are three questions that chatbots can never answer accurately and unambiguously: "Who told you?", "Are you sure?" and "How much energy do you consume?". The first two are being worked on, but on the third we finally know something concrete.

An average (median) Gemini text query consumes 0.24 Wh and produces 0.03 grams of CO₂. That is comparable to the energy needed to watch TV for less than 9 seconds. These estimates, much lower than those from universities and research centres, were presented at a briefing for journalists on a Google study of the environmental impact of AI queries, with a focus on the Gemini model.
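To make the TV comparison concrete, here is a minimal sketch of the unit conversion behind it. The 0.24 Wh and 9-second figures come from the article; the 100 W television is an assumption chosen for illustration, not a number from Google's study.

```python
# Figures quoted in the article.
ENERGY_PER_PROMPT_WH = 0.24  # median Gemini text prompt
TV_SECONDS = 9               # "less than 9 seconds" of TV

# Assumption for illustration only: a television drawing 100 W.
ASSUMED_TV_WATTS = 100.0


def implied_power_watts(energy_wh: float, seconds: float) -> float:
    """Average power (W) of a device that uses energy_wh in the given seconds."""
    return energy_wh * 3600 / seconds


def seconds_of_use(energy_wh: float, power_watts: float) -> float:
    """How long (s) a device drawing power_watts runs on energy_wh."""
    return energy_wh * 3600 / power_watts


# Power a TV would need to draw for 9 seconds to equal one prompt.
print(round(implied_power_watts(ENERGY_PER_PROMPT_WH, TV_SECONDS), 1))

# Viewing time one prompt buys on the assumed 100 W set.
print(round(seconds_of_use(ENERGY_PER_PROMPT_WH, ASSUMED_TV_WATTS), 2))
```

Under these assumptions the comparison implies a set drawing a little under 100 W, and a 100 W set would run for just under 9 seconds on the energy of one prompt.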

This is the first time since generative AI took off in 2022 that Google has come out into the open with numbers. The reason, it was explained at the event, is also to encourage other AI providers such as Microsoft, AWS and OpenAI, which have so far never disclosed their own power consumption, to do the same.

The image shows the reduction in emissions per prompt for the Gemini model between May 2024 and May 2025. Scope 2 emissions fell 47-fold, from 1.07 gCO₂e/prompt to 0.02 gCO₂e/prompt, mainly thanks to improvements in model efficiency and machine utilisation. Scope 1+3 emissions fell 36-fold, from 0.36 to 0.01 gCO₂e/prompt. In short, the chart shows how model optimisation, better hardware management and clean energy procurement have drastically reduced the carbon footprint of each prompt generated, with a particularly marked reduction in Scope 2 emissions.

As Partha Ranganathan, Engineering Fellow and Vice President at Google, explained, the aim is 'to develop a comprehensive methodology and share the results, with the ultimate goal of encouraging industry-wide consistency in measuring the environmental impact and relative efficiency of AI inference'. The goal, then, is to propose a standard shared across the AI industry.

The chart compares the energy efficiency and performance of various large language models (LLMs), including those from Meta (Llama), OpenAI (GPT) and Google (Gemini). Google's Gemini models rank as the most energy-efficient, with a high number of prompts per kWh and a high Arena Score. The blue bar labelled "Median Gemini" shows a wide range of efficiencies measured with different approaches ("Comprehensive Measurement Approach" and "Existing Measurement Approach").

Many of the current calculation methodologies in the industry overlook several critical factors and do not take into account every level of the AI stack, from the underlying hardware and data centres to the model itself. To remedy this, Google has developed a comprehensive methodology that includes idle machine consumption, CPU, RAM, cooling and power distribution, not just chip power.

The limits of the Google study

As Partha Ranganathan, whose work centres on the company's computing infrastructure and data centres, with a focus on efficiency and sustainability, pointed out, the study covers only typical text queries. The analysis spans a 12-month period and thus includes a wide diversity of prompts.

In particular, the 'median prompt' was measured, defined as the prompt that ranks at the 50th percentile in terms of energy consumption. The process works like this: for each AI model, the energy consumed to process each single prompt (a single user request) is calculated. All user prompts are then put into a list and sorted by the energy required to process them. Finally, the 50th percentile is identified, i.e. the prompt that sits exactly in the middle of this ranking. This is the 'median prompt'.

The limitation of the study is that it tells us nothing about text-to-image and text-to-video systems, which generate images and videos and are for that very reason considered more energy-intensive. The researchers explained that there is little consensus on how to measure the impact of these other types of generation. Moreover, it would be scientifically incorrect to offer estimates comparing the energy impact of a query to Google's search engine with that of a query to Gemini.

'The interaction patterns are different,' said Ranganathan. 'It would be like comparing apples with oranges.'
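The 'median prompt' ranking described in this section can be sketched in a few lines. This is only an illustration of the sort-and-pick-the-middle idea, not Google's actual pipeline; the function name and the sample energy values are invented for the example.

```python
import statistics


def median_prompt_energy(energies_wh):
    """Return the energy of the prompt at the 50th percentile of the ranking."""
    if not energies_wh:
        raise ValueError("need at least one prompt")
    ranked = sorted(energies_wh)       # rank prompts by energy consumed
    middle = (len(ranked) - 1) // 2    # lower-middle entry for even counts
    return ranked[middle]


# Hypothetical per-prompt energies in Wh (invented values, not measurements).
sample = [0.05, 0.10, 0.24, 0.90, 6.00]

print(median_prompt_energy(sample))        # the prompt in the middle of the ranking
print(round(statistics.mean(sample), 3))   # the mean, pulled up by the heaviest prompt
```

With this invented sample the median stays at the middle value while the mean is several times higher, which shows why a median figure says little about the heaviest requests.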

The chart compares different methodologies for measuring the energy consumption of artificial intelligence models, setting existing approaches against a proposed methodology ("Proposed Approach"). The table breaks consumption down into several categories: "Chip Power", "Utilization", "CPU & RAM", "Idle Machines" and "Overhead". This approach, called the "Comprehensive Approach", aims at a more complete and realistic measurement of energy consumption.

How much water does a query consume?

An average query, i.e. a median Gemini prompt, uses 0.26 ml of water (about 5 drops). Google experts pointed out that water cooling can reduce energy consumption by 10 per cent compared with air cooling. At sites in water-stressed areas, such as Mesa in Arizona, air cooling is chosen so as not to burden local resources. More than 25 per cent of data centres use non-potable or recycled water.

So how much is 0.24 Wh per query? A lot or a little?

0.24 Wh per query might not seem much, but is this figure only for a very simple prompt, such as 'hello'? Energy consumption increases as a conversation develops, queries become more complex and the memory required increases accordingly. Many analysts argue that measuring the environmental impact of generative AI per single query is crucial information, but it does not help to estimate the real impact of AI in terms of the volume of queries users make every day. Google's figures for the second quarter of 2025 tell us about users but not about traffic volumes: the Gemini app has more than 450 million monthly active users, while daily queries increased by more than 50 per cent compared with the first quarter. So we do not know how much energy Gemini consumes overall. We have to make do with a unit of measurement which, if adopted, at least has the merit of making it easier to compare the LLMs on the market.
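To see why the missing traffic figure matters, here is a rough sketch of how a per-query number would scale to a daily total. The daily volumes below are purely hypothetical placeholders, since Google has not disclosed them. Multiplying a median by a count is also only a crude approximation, because the median is not the average.

```python
ENERGY_PER_PROMPT_WH = 0.24  # median text prompt, as quoted in the article


def daily_energy_mwh(prompts_per_day: int,
                     energy_per_prompt_wh: float = ENERGY_PER_PROMPT_WH) -> float:
    """Rough daily energy in MWh for a given (hypothetical) prompt volume."""
    return prompts_per_day * energy_per_prompt_wh / 1_000_000  # Wh -> MWh


# Hypothetical volumes, NOT figures published by Google.
for volume in (100_000_000, 1_000_000_000):
    print(f"{volume:,} prompts/day -> {daily_energy_mwh(volume):.0f} MWh/day")
```

A tenfold difference in assumed traffic gives a tenfold difference in the total, which is why the per-query figure alone cannot settle the question.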

The crux of energy efficiency?

The size of AI models, such as Gemini, has grown exponentially (doubling every 3.5 months or so). Google, they explained, is innovating to improve energy efficiency, implementing intelligent controls and redesigning power distribution systems. They are also shifting AI workloads to use electricity during periods of least stress on the grid.

In 12 months, Google reduced its footprint per query by 33 times in terms of energy and 44 times in terms of CO₂ emissions.
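As a back-of-the-envelope check, the reduction factors can be turned into implied figures for twelve months earlier. The values computed here are inferences from the rounded numbers quoted in the article (0.24 Wh, 0.03 g, 33 times, 44 times), not figures published by Google, so they should be read as approximate.

```python
# Current per-prompt figures and reduction factors quoted in the article.
ENERGY_NOW_WH = 0.24
CO2_NOW_G = 0.03
ENERGY_REDUCTION = 33
CO2_REDUCTION = 44


def implied_previous(current: float, reduction_factor: float) -> float:
    """Value a year earlier implied by a 'reduced N times' claim."""
    return current * reduction_factor


# Implied per-prompt values twelve months earlier (approximate, from rounded inputs).
print(round(implied_previous(ENERGY_NOW_WH, ENERGY_REDUCTION), 2), "Wh per prompt")
print(round(implied_previous(CO2_NOW_G, CO2_REDUCTION), 2), "g CO2 per prompt")
```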

Copyright reserved ©