How large could the compute demand for GenAI inference become? UBS discusses

Published 07/05/2025, 14:40
Updated 11/05/2025, 09:12

Investing.com -- Despite investor concerns about tariffs and AI infrastructure overspending, UBS sees generative AI (genAI) inference compute demand poised to expand dramatically across sectors.

According to the bank, AI remains resilient to macroeconomic uncertainty, with major U.S. tech companies reaffirming capital expenditure (capex) plans and highlighting that compute demand continues to exceed supply.

UBS argues that inference—the process of running AI models to generate answers—will become the primary driver of future AI compute needs, overtaking training.

“The amount of computation we need as a result of agentic AI and reasoning is easily 100x more than we thought we needed this time last year,” said Nvidia (NASDAQ:NVDA) CEO Jensen Huang, quoted in a UBS note.

The bank echoes this sentiment, pointing to the emergence of more complex methods like Chain of Thought (CoT) reasoning as a key source of growing computational intensity.

In its projections, UBS lays out four categories of genAI use cases: chatbots, enterprise AI, agentic AI, and physical AI.

Chatbots like ChatGPT are expected to see compute demand rise from 10 exaFLOP/s in 2024 to 200 exaFLOP/s by 2030.

For enterprise applications, such as fraud detection and contract summarization, inference needs are forecast to grow even faster—from 15 to 440 exaFLOP/s over the same period.
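To put the chatbot and enterprise projections on a common footing, the short sketch below converts each pair of figures into an overall growth multiple and an implied annual growth rate. It assumes smooth compounding over the six years from 2024 to 2030, which the UBS note does not state; the function names are illustrative only.

```python
# Figures in exaFLOP/s, as quoted from the UBS projections above.
# Assumption (not from the note): growth compounds evenly from 2024 to 2030.

def growth_multiple(start, end):
    """How many times larger the end value is than the start value."""
    return end / start

def cagr(start, end, years):
    """Implied compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1

YEARS = 2030 - 2024

segments = {
    "chatbots": (10, 200),
    "enterprise AI": (15, 440),
}

for name, (start, end) in segments.items():
    multiple = growth_multiple(start, end)
    rate = cagr(start, end, YEARS)
    print(f"{name}: {multiple:.1f}x overall, about {rate:.0%} per year")
```

On these assumptions, chatbot demand grows 20-fold and enterprise demand roughly 29-fold, which is the sense in which enterprise inference grows "even faster."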

The most dramatic growth is expected from agentic AI, which includes autonomous customer support and workflow automation. UBS estimates demand from this segment could climb to 14 zettaFLOP/s by 2030, which would mark an “enormous leap from today’s needs, which we estimate to be in the hundreds of exaFLOP/s,” the firm said in the note.

Physical AI, which includes robotics and autonomous vehicles, could eventually require compute in the yottaFLOP/s range as it evolves to replicate aspects of human cognition.

Today’s installed GPU compute capacity is estimated at around 4,000 exaFLOP/s, or about 5,000 exaFLOP/s once the Tensor Processing Units (TPUs) of Google (NASDAQ:GOOGL) are included, but UBS notes that much of it remains underutilized.

Limitations like GPU memory bottlenecks mean actual usage often falls short of nominal potential, making it unlikely that the current base can meet future demand, especially for agentic and physical AI.

“Inference is often constrained by GPU memory, meaning the actual FLOP/s a chip can deliver is well below its theoretical maximum—with memory limitations resulting in chips operating at as little as 25% of their nominal FLOP/s,” the note explains.

“Even with these limitations the available capacity might be enough for current chatbot needs, but far below what will be required for agentic and physical AI, which will demand computing power of a different order of magnitude,” it adds.
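The gap UBS describes can be checked with back-of-the-envelope arithmetic. The sketch below applies the 25% figure as a uniform utilization rate across the whole installed base. That is a simplifying assumption, since the note gives 25% only as a lower bound ("as little as"), and the variable names are illustrative.

```python
# All values in exaFLOP/s; 1 zettaFLOP/s = 1,000 exaFLOP/s.
EXA_PER_ZETTA = 1_000

nominal_with_tpu = 5_000   # installed base including Google TPUs (from the note)
utilization_floor = 0.25   # assumption: the "as little as 25%" figure applied uniformly

chatbot_2024 = 10                    # chatbot demand in 2024
agentic_2030 = 14 * EXA_PER_ZETTA    # agentic AI demand projected for 2030

def effective_capacity(nominal, utilization):
    """Capacity actually deliverable at a given utilization rate."""
    return nominal * utilization

effective = effective_capacity(nominal_with_tpu, utilization_floor)
shortfall = agentic_2030 / effective

print(f"Effective capacity: {effective:,.0f} exaFLOP/s")
print(f"Covers 2024 chatbot demand: {chatbot_2024 <= effective}")
print(f"Agentic AI demand in 2030 is {shortfall:.1f}x effective capacity")
```

Even at full nominal utilization, 5,000 exaFLOP/s would fall short of the 14,000 exaFLOP/s projected for agentic AI, so the conclusion does not hinge on the utilization assumption.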

All in all, UBS concludes that the expanding role of inference in AI adoption, combined with rising hardware requirements, supports continued investment in AI infrastructure.

For investors, the bank sees “any pullbacks in stocks linked to our ‘AI’ and ‘Power and resources’ selections as attractive entry points.”
