AI promised cost savings, but Microsoft and Uber say it’s costing more than human workers

Sayantani Biswas
Updated 25 May 2026, 12:24 PM IST

Microsoft has reportedly begun cancelling the majority of its direct Claude Code licences and redirecting its engineering workforce towards GitHub Copilot CLI instead. The reversal comes only six months after the technology giant opened access to Claude Code across thousands of its developers, project managers, designers and other staff, encouraging broad experimentation with AI-assisted coding.

Adoption was swift and enthusiastic. It was, perhaps, too swift. The sheer scale at which employees embraced the tool has now prompted the firm to pull back on technology its own engineers had grown to depend on, according to a report by The Verge.

The decision does not affect Microsoft's broader commercial relationship with Anthropic. The company's Foundry deal, which includes an investment of up to $5 billion in Anthropic and grants Foundry customers access to Claude models, remains intact, as does Anthropic's $30 billion commitment to purchase Azure compute capacity.

Uber Burned Through Its Entire 2026 AI Budget in Four Months

Microsoft is not an isolated case. Uber's chief technology officer, Praveen Neppalli Naga, told The Information in April that the ride-hailing company had exhausted its entire 2026 AI coding tools budget within the first four months of the year.

The disclosure is particularly striking given that Uber had been actively stoking adoption, deploying internal leaderboards to rank teams by their AI tool usage.

The pattern across both companies points to a tension that has received little attention in discussions about workplace AI: the harder firms push employees to use the technology, the faster costs accumulate.

AI Token Economics: Why Cheaper Prices Are Not Leading to Cheaper Bills

At the heart of the problem is how AI computing is priced. Providers of large language models charge per token, the basic unit of text the model processes and generates, according to Fortune. Under this model, the bill tracks volume alone: whether employees are using the tools more productively or simply using them more, total spend rises.

Several large technology companies have been actively pushing token consumption higher. Amazon has encouraged staff to "tokenmaxx," a term meaning to use as many AI tokens as possible. At Meta, an employee created an internal tracking tool named "Claudeonomics" to monitor which workers were using AI most heavily.

Goldman Sachs has forecast that agentic AI systems, those that act autonomously across multiple steps rather than responding to single queries, could drive a 24-fold increase in token consumption by 2030, reaching 120 quadrillion tokens per month as enterprises deploy AI agents at scale.

The unit price of those tokens is expected to fall significantly. Research firm Gartner projects that by 2030, running inference on a one-trillion-parameter large language model will cost AI providers nearly 90% less than it did in 2025. But Gartner cautioned that this price deflation will not translate into lower enterprise bills.

Agentic models require substantially more tokens per task than standard models, consumption growth can outpace falling unit costs, and AI providers are unlikely to pass through the full benefit of cost reductions to business customers.
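The arithmetic behind that warning can be sketched roughly by combining the two forecasts cited above. This is an illustration only, not a calculation from Goldman Sachs or Gartner: it assumes the 24-fold consumption figure and the "nearly 90%" cost decline can be multiplied together, and the function name and the 45% pass-through scenario are hypothetical.

```python
# Illustrative arithmetic only. It combines two separate forecasts from the
# article; neither firm published this combined figure.

def spend_multiplier(consumption_growth: float, unit_price_decline: float) -> float:
    """Return the factor by which total token spend changes.

    consumption_growth: multiple of today's token volume (24.0 = 24-fold).
    unit_price_decline: fractional drop in price per token (0.90 = 90% cheaper).
    """
    return consumption_growth * (1.0 - unit_price_decline)

# Assumption: providers pass the full ~90% cost reduction on to customers.
full_pass_through = spend_multiplier(24.0, 0.90)

# Hypothetical scenario: only half of that reduction reaches customers.
half_pass_through = spend_multiplier(24.0, 0.45)

# Baseline implied by the Goldman Sachs figures: 120 quadrillion tokens
# per month in 2030 divided by the 24-fold increase.
implied_baseline = 120 / 24  # quadrillion tokens per month

print(f"Full pass-through: spend rises about {full_pass_through:.1f}x")
print(f"Half pass-through: spend rises about {half_pass_through:.1f}x")
print(f"Implied baseline: {implied_baseline:.0f} quadrillion tokens per month")
```

Even in the most generous case, where customers receive the entire cost reduction, a 24-fold rise in volume against a 90% price cut still leaves the total bill more than twice as large.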

"Chief Product Officers should not confuse the deflation of commodity tokens with the democratisation of frontier reasoning," said Will Sommer, senior director analyst at Gartner.

The Cost of Compute Is Already Exceeding the Cost of Employees

Perhaps the most striking acknowledgement of AI's cost problem came from within the technology industry itself. Bryan Catanzaro, vice president of applied deep learning at NVIDIA, addressed the issue directly in a recent interview with Axios.

"For my team, the cost of compute is far beyond the costs of the employees," Catanzaro said.

The comment carries weight given NVIDIA's position as the primary supplier of the chips that power AI infrastructure globally. It suggests the economics of substituting or augmenting human labour with AI may be considerably more complicated than early forecasts implied.

Agentic AI Ambitions Face a Harsh Financial Reality

The cost pressures arriving now stand in contrast to the expansive visions of AI deployment that technology executives have been articulating publicly. NVIDIA Chief Executive Jensen Huang has said he anticipates 100 AI agents working alongside every human employee at his company one day.

Huang is part of a broader chorus of corporate leaders championing an agentic future in which digital workers carry out tasks across the enterprise with limited human oversight.

If token consumption continues to rise faster than unit costs decline, that future may arrive with a far heavier financial burden than executives have publicly acknowledged, Fortune predicts. The early signals from Microsoft, Uber, and others suggest that at current pricing and usage patterns, the economics of large-scale AI deployment remain deeply unresolved.

About the Author

Sayantani Biswas is an assistant editor at Livemint with seven years of experience covering geopolitics, foreign policy, international relations and global power dynamics. She reports on Indian and international politics, including elections worldwide, and specialises in historically grounded analysis of contemporary conflicts and state decisions. She joined Mint in 2021, after covering politics at publications including The Telegraph.

She holds an MPhil in Comparative Literature from Jadavpur University (2019), with a specialisation in postcolonial Latin American literature. Her research examined economic nationalism through Eduardo Galeano’s Open Veins of Latin America. She also writes on political language, cultural memory and the long shadows of conflict.

Biswas grew up in Durgapur, an industrial town in West Bengal shaped by migration, which drew families from across India to the Durgapur Steel Plant. As the only child in a joint family, she spent years listening, almost obsessively, to her grandparents’ testimonies of struggle, fear and loss as they fled Bangladesh during the Partition of 1947. This formative exposure to lived historical memory later converged with her training in Comparative Literature, equipping her to analyse socio-economic structures and their reverberations.

Outside the newsroom, she gravitates towards cultural history and critical theory, returning often to texts such as Paulo Freire’s Pedagogy of the Oppressed. As a journalist, she is committed to accuracy, intellectual rigour and fairness, and believes political reporting demands not only clarity and speed, but historical depth, contextual precision, and a disciplined resistance to spectacle.
