New research from the United Nations Educational, Scientific and Cultural Organization (UNESCO) and University College London (UCL) has found that small changes to Large Language Models (LLMs) can yield substantial energy savings. The report, Smarter, smaller, stronger: resource-efficient generative AI and the future of digital transformation, found that small adjustments to AI models can reduce energy consumption by as much as 90%, all without reducing performance[i].
This is crucial given that the use of Artificial Intelligence (AI) is growing rapidly, with compute demand doubling every 100 days[ii]. So too is the technology’s energy consumption: the International Energy Agency (IEA) estimates that electricity consumption from AI has grown by around 12% per year since 2017, more than four times faster than the growth in total electricity consumption.
UNESCO and UCL explain that the generative AI platform ChatGPT uses roughly 0.34 Wh of electricity per query, similar to the energy needed to power an LED lightbulb for a few minutes. However, given that ChatGPT receives approximately 1 billion queries each day, this adds up to roughly 124 GWh per year.
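The annual figure follows directly from the per-query estimate. As a quick back-of-the-envelope check, using only the two numbers quoted above, a short Python snippet:

    # Sanity check of the annual energy figure quoted above.
    wh_per_query = 0.34              # UNESCO/UCL estimate per ChatGPT query
    queries_per_day = 1_000_000_000  # approximately 1 billion queries per day

    gwh_per_year = wh_per_query * queries_per_day * 365 / 1e9  # 1 GWh = 1e9 Wh
    print(f"{gwh_per_year:.0f} GWh per year")                  # -> 124 GWh per year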
“Generative AI’s annual energy footprint is already equivalent to that of a low-income country, and it is growing exponentially. To make AI more sustainable, we need a paradigm shift in how we use it, and we must educate consumers about what they can do to reduce their environmental impact,” said Tawfik Jelassi, Assistant Director-General for Communication and Information at UNESCO.
On a broader scale, AI has been identified as a key driver of surging data centre energy demand. The IEA’s recent Energy and AI research found that growth in the use of Artificial Intelligence will cause global electricity demand from data centres to more than double over the next five years, reaching 945 terawatt-hours (TWh) in 2030. Notably, this figure is equivalent to Japan’s current annual electricity usage and represents an almost 128% increase on the 415 TWh of power used by data centres in 2024[iii].
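The quoted figures are internally consistent, as the same kind of quick check confirms:

    # Check that 945 TWh in 2030 is "almost 128%" above 415 TWh in 2024.
    demand_2024_twh, demand_2030_twh = 415, 945
    increase_pct = (demand_2030_twh - demand_2024_twh) / demand_2024_twh * 100
    print(f"{increase_pct:.1f}% increase")  # -> 127.7%, i.e. more than doubling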
To cut energy use, the researchers from UNESCO and UCL identified three techniques that deliver substantial energy savings without compromising the accuracy of LLM results. These are:
Smaller models: Small models tailored to a specific task, such as translation, were found to be just as smart and accurate as large, general-purpose ones while cutting energy use by up to 90%. Currently, users tend to rely on a single large model for all their needs.
Keep it concise: Shorter, more concise prompts and responses can reduce energy use by over 50%.
Compression is key: Model compression can save up to 44% in energy. Reducing the size of models through techniques such as quantization helps them use less energy while maintaining accuracy (a minimal sketch of quantization follows this list).
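To make the third technique concrete, below is a minimal sketch of symmetric 8-bit weight quantization in Python using NumPy. This is a generic illustration of the principle, not the specific method evaluated in the UNESCO/UCL study: weights stored in 8 bits take a quarter of the memory of 32-bit floats, at the cost of a small, usually tolerable, approximation error.

    import numpy as np

    # Toy weight matrix in 32-bit floats (4 bytes per value).
    weights_fp32 = np.random.randn(1024, 1024).astype(np.float32)

    # Symmetric int8 quantization: map [-max|w|, +max|w|] onto [-127, 127].
    scale = np.abs(weights_fp32).max() / 127.0
    weights_int8 = np.round(weights_fp32 / scale).astype(np.int8)

    # Dequantize at inference time to approximate the original weights.
    weights_restored = weights_int8.astype(np.float32) * scale

    print(f"Memory: {weights_fp32.nbytes / 1e6:.1f} MB -> {weights_int8.nbytes / 1e6:.1f} MB")
    print(f"Mean absolute error: {np.abs(weights_fp32 - weights_restored).mean():.5f}")

In practice, production systems quantize activations as well and use calibrated scales, but the energy saving comes from the same source: fewer bits moved through memory and compute for every token generated.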
In addition, the research found that smaller models are also more accessible. Given that most AI infrastructure is currently concentrated in high-income countries, making AI more accessible in low-income nations is important. The three techniques outlined above are particularly valuable in low-resource settings: where energy and water are scarce and connectivity is limited, small models are much more accessible.
[ii] Ibid
[iii] Data centre energy requirements to double in the next 5 years, as AI demands soar