Bits

The cheaper AI got, the more it cost

Per-token prices are collapsing, yet the companies whose engineers adopted AI fastest are the ones now rationing it

01 Jun 2026

A new status game has taken hold in Silicon Valley. Over the past year, engineers at the largest technology companies have begun competing to consume as much artificial intelligence as they possibly can, a practice the industry has christened "tokenmaxxing," after the tokens (the metered chunks of text an AI model reads and writes, and the unit on which it bills) that keep score of the habit. At Meta, the contest acquired an internal leaderboard called Claudeonomics, which for a stretch this spring ranked the company's roughly 85,000 employees by how many tokens each had burned; in one thirty-day window the board logged more than 60 trillion of them before the company, after reporting by The Information, a technology-news outlet, quietly took it down. Andrew Bosworth, Meta's chief technology officer, had endorsed the underlying idea, pointing to his most prolific engineer, who was spending something close to his own salary on tokens, as proof of a productivity multiplier.

The logic is not confined to Meta. Jensen Huang, Nvidia's chief executive, has told engineers they ought to be burning tokens worth at least half their annual salary to count as fully productive, and Amazon built usage scores of its own. The premise is that token consumption is a proxy for output, and the rational response to any such proxy is to game it: employees at several firms have admitted to aiming the tools at work that did not need doing, simply to keep the number high. Measure the activity and you get the activity. The leaderboards, on this reading, are an expensive piece of theater.

Burn notice

The theater is the trivial version of the problem, because it is fixable; take down the board, measure something real, and the performance stops. The more instructive failure appears when the tools are used exactly as intended. Consider Uber, which rolled out Claude Code, Anthropic's agentic coding assistant, to its engineers in December. By March about 84% of them had adopted it; today something on the order of 70% of the code committed at the company originates with AI, and roughly 11% of its live backend updates ship with no human in the loop at all. Individual engineers were running between $500 and $2,000 a month in tokens. None of that is gaming; it is the tool working. And in April, Praveen Neppalli Naga, Uber's chief technology officer, told The Information that the company had exhausted its entire 2026 budget for AI coding in four months. "I'm back to the drawing board," he said, "because the budget I thought I would need is blown away already."

Microsoft reached the same wall from the opposite direction. In mid-May it began cancelling most internal Claude Code licenses inside its Experiences and Devices division, with the remaining access set to lapse on June 30, and pointed its engineers toward GitHub Copilot CLI, a cheaper coding tool the company already owns. (That Microsoft holds a stake in Copilot's maker, OpenAI, and was therefore trimming a competitor's product while steering demand toward its own, need not be the only reason for the decision to be one of them.) The pattern, as Fortune observed, exposes an uncomfortable arithmetic: at current rates of use, running the AI can cost more than the human it was meant to replace.

This is the part the industry did not advertise, and it is worth dwelling on who failed to advertise it. On January 27th, 2025, days after the Chinese lab DeepSeek shocked markets with a cut-price model, Satya Nadella, Microsoft's chief executive, greeted the news with what sounded like serenity. "Jevons paradox strikes again!" he wrote. "As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of." Cheaper AI, in this telling, was a tailwind: lower prices would drive volume, and Microsoft sold volume. Sixteen months later, Microsoft is the company rationing the commodity.

Nadella had his economics right; he simply read the consequence as a boon rather than a bill. The paradox he invoked belongs to William Stanley Jevons, a British economist who observed in 1865, in a treatise called "The Coal Question," that England's coal consumption climbed rather than fell as steam engines grew more efficient. Each engine needed less coal to do a given unit of work, which made coal cheaper to use, which put steam engines into mills and mines and locomotives that could never previously have justified one. Efficiency did not conserve the resource; it multiplied the appetite for it. The price of burning a ton fell, the number of tons burned soared, and the national coal bill went up.

Substitute tokens for coal and the rhyme is exact. Gartner, a research firm, reckons the cost of running inference on a frontier model will fall by roughly 90% between 2025 and 2030. That sounds like relief, and at the level of a single query it is. But the same efficiency that lowers the per-token price is what lets a company route 70% of its code through an agent, ship backend changes without a reviewer, or let an engineer spend a year's salary in tokens. The cheaper each unit becomes, the more units the work swallows, and the swallowing outruns the discount. Goldman Sachs, an investment bank, projects that agentic AI could push token consumption up twenty-four-fold by the end of the decade, to something on the order of 120 quadrillion tokens a month. A 90% cut in price set against a 24-fold rise in volume does not net to savings; it nets to a bill more than twice the size of today's.

Which leaves the labor-replacement story running backward. AI was sold as the conversion of expensive headcount into cheap tokens; under a meter, the more capable and more widely adopted the tool, the larger the line item it throws off, until the savings it was supposed to deliver become the thing finance reaches for first. Microsoft, Uber and the rest are not retreating because the tools disappointed them. They are retreating because the tools worked, and working, at these prices, is what nobody had budgeted for. The commodity, it turns out, is one a company can quite easily get too much of.

Burn notice

Get Vector in your inbox.

More from

The spreadsheet moment for AI

To own the customer, Microsoft rents an army

Meta sells 'excess compute' it doesn't have