🔮 The cost of tokenmaxxing

Bye, bye fixed costs. Hello token anxiety.
The bill no one budgeted for

Uber CTO Praveen Neppalli Naga shared last month that his 5,000 engineers had depleted their entire 2026 token budget in just four months. So has ServiceNow. Agentic adoption was bound to drive this kind of demand, and finance has to respond. CTOs are increasing their tech budgets this year: nearly 50% say their budgets are up by 10%. (As a side note, we believe that 10% is marginal given the token explosion we are experiencing.)

A token explosion is great news for engineering teams, but a real headache for CFOs who signed off on modest pilots months ago. 71% of companies exceeded their AI budgets in 2025, and over half of the surveyed finance bosses say cost management is their greatest concern, partly because AI costs can be highly variable and tricky to manage. The cost is not fixed, as it was with good old fixed-seat SaaS.

Is this the new normal?

Exponential token use is diffusion in action. In the US, the average monthly spend on AI by large enterprises grew 36% to $85,000 between 2024 and 2025¹. Labs in China told us that coding is where most labs are throwing their resources right now, their P0 in engineering terms. But AI exposure will increase across companies as capabilities expand. Blackstone’s Jon Gray reported a 15-fold year-over-year increase in token usage for Q1 across their 270 companies, a sign of diffusion across industries. More than 45% of surveyed organizations bear monthly AI budgets of $100,000 or more, up from 20% in 2024.

Open-source is one route to lower costs. With closed models costing ~6x as much as open ones, a shift to open-source would’ve resulted in $24.8 billion in consumer savings in 2025.
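To put two of the figures above in run-rate terms, here is a rough back-of-the-envelope sketch. The function names are ours, and both calculations rest on simplifying assumptions: a flat monthly spend for the budget projection, and smooth monthly compounding for the 15-fold year-over-year growth. Neither assumption comes from the companies cited.

```python
def projected_budget_multiple(fraction_spent: float,
                              months_elapsed: float,
                              period_months: float = 12.0) -> float:
    """Full-period spend as a multiple of budget, assuming a flat monthly run rate."""
    if months_elapsed <= 0:
        raise ValueError("months_elapsed must be positive")
    return fraction_spent * period_months / months_elapsed


def implied_monthly_growth(yoy_multiple: float) -> float:
    """Constant monthly growth rate that compounds to the given year-over-year multiple."""
    if yoy_multiple <= 0:
        raise ValueError("yoy_multiple must be positive")
    return yoy_multiple ** (1 / 12) - 1


# An annual budget fully spent in four months implies ~3x the budget over the year
# (assumption: spend stays flat; with growing usage the overrun would be larger).
print(projected_budget_multiple(fraction_spent=1.0, months_elapsed=4))  # 3.0

# A 15-fold year-over-year increase is roughly 25% compound growth per month
# (assumption: growth was smooth across the year).
print(round(implied_monthly_growth(15) * 100, 1))  # 25.3
```

Under these assumptions, a 10% budget increase covers less than half a month of that kind of growth.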
US companies are making the switch

Some US firms are moving to Chinese models. Airbnb admitted last year that it largely relied on Alibaba’s Qwen models for its customer service agents because they’re “fast and cheap.” Open-source is especially enticing for startups, where lower costs matter more, and some estimates suggest that 80% of US AI startups use Chinese models.

Pricing by task, not token

SaaS companies going through a reinvention are challenging per-token pricing. One alternative is to charge per task rather than per token. Adobe wants to charge for what an agent does, not for what it consumes. That only works if they can reliably predict costs themselves, which is still an open issue.

But not all token spend is equal. Tom Tunguz argues that token budgets are growing fastest in the very functions where companies will benefit from a higher ratio of labor spend to software spend. In other words, the bill is significant, but the tokens are used where returns are likely to be highest.

Vibe-coded for this edition: Play our token budget word quiz and win a prize.
¹ From a survey of 500 engineering professionals at the manager level and above.
Tuesday, 19 May 2026