Drilling Into AI’s Financial Sustainability

In my April column, I talked about how the opaqueness of the true cost of AI is a potentially fatal flaw for the profitable commercialization of the technology long term. Interestingly, in the two months since, we’ve seen some remarkable headlines from the tech industry potentially validating my argument at catastrophic scale.

It feels like the winds in the AI industry are changing direction so fast that it’s difficult to keep track. Just a few months ago, tech companies and even some other businesses were cracking the whip to get staff to use AI more, demanding that teams integrate it into workflows, regardless of whether they had any clear need or particular desire for the software.

Hindsight is 20–20

As anyone who thought about it could probably have predicted, when you tie people’s material livelihoods to using a thing more, a large share of people will, in fact, use the thing more. This led to “tokenmaxxing”, token usage leaderboards inside companies like Amazon, and shocking quarterly AI token expense figures at many companies, such as Uber (and others that have not been willing to be named). It’s frankly unclear to me why these companies are surprised at these results, but nonetheless, this has led to a pivot in the instructions to staff, both because this cost is unsustainable for any length of time and because the use of AI has not produced sufficiently spectacular business outcomes.

It’s possible that executive leadership believed that some semi-miraculous productivity explosion was going to come from AI usage, but if so, they really hadn’t done their homework. Lots of us in the field, as well as people in the media covering the industry, sounded warnings that AI is a tool, which can be used effectively or ineffectively, and that expecting miracles will always lead to disappointment.

I’ve used this kind of metaphor before, but consider if these companies were in construction, and electric drills were newly invented, making exceptional productivity improvements in building possible. The correct reaction would not be to buy as many drills as they can, to the point of making drill components scarce and driving up their price, to instruct staff to use a drill in every task, and to produce scoreboards displaying who was using drills for the most minutes of the day. You’d have buildings with Swiss cheese patterns of holes in them, you’d have spent exorbitantly on the drills and the electricity to power them, and you’d have about as much to show for it as tech companies do from AI now.

Money Isn’t Infinite

At any rate, reality has begun to come crashing down, and it was at least a quick return to earth. Some businesses are still buying drills, but the big players have noticed that the cost-benefit ratio here is not making sense, and are adjusting. However, as I explained in April, this is not going to be as easy as they think. Some companies are beginning to tell their teams that the use of AI needs to be for fruitful purposes, not just tokenmaxxing, to try to bring down costs while still reaping the benefits of the technology where it can generate value.

What they are not yet grasping is that budgeting for tokens and clearly defining when AI is going to help with a problem is a much more indeterminate task than it is for other kinds of technology. Let’s go back to my April article and recall the experience of using AI for the individual.

“[Y]ou can ostensibly control how many tokens you submit, and thus control your costs, but that control is limited. You can make your prompts brief, limit extraneous instructions, and keep down your costs for input as a result. However, when agentic tools get involved, and the LLM is constructing prompts to pass to other LLMs, you’re no longer in charge of the length of the prompts. Even more significantly, you have only the most minimal control over the number of tokens that any model responds with (such as by asking it to ‘be concise’). For the most part, the number of output tokens is a part of that nondeterministic unknown I described before. And, you’ll note, an output token costs 5x the price of an input token.”

To expand on this, any time you use AI, there is a chance it will fail to answer your question. So the slot-machine element compounds the problem. The tech worker doesn’t know (A) how many tokens any prompt will return or (B) how many times a prompt will need to be fed in (potentially with edits) to get a successful answer. To calculate the cost, we need to sum all the input token counts and all the output token counts (A, which is unknown) across the number of attempts required (B, which is also unknown). A and B vary indeterminately based on model architecture, the problem at hand, the randomness in the model, and other factors behind the scenes that we are probably not even aware of. Then we multiply by the price per token for whatever model or models are being used, which, as I explained in April, also varies.
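To make that arithmetic concrete, here is a minimal Python sketch of the cost of a single task. The `Attempt` class, the `task_cost` function, the token counts, and the $3 per million input tokens price are all made-up placeholders for illustration; only the 5x output-to-input price ratio comes from the quote above. The point is that the formula is trivial, but every number that goes into it is only known after the fact.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One round trip to the model: tokens sent in and tokens returned."""
    input_tokens: int
    output_tokens: int


def task_cost(attempts, input_price, output_price):
    """Dollar cost of one task: sum tokens over every attempt, then price them.

    Prices are in dollars per token. The output token counts (A) and the
    number of attempts (B) are not known in advance.
    """
    total_input = sum(a.input_tokens for a in attempts)
    total_output = sum(a.output_tokens for a in attempts)
    return total_input * input_price + total_output * output_price


# Hypothetical prices: $3 per million input tokens, output tokens at 5x that.
INPUT_PRICE = 3 / 1_000_000
OUTPUT_PRICE = 5 * INPUT_PRICE

# The same question, answered on the first try vs. on the third try.
# All token counts are invented for illustration.
lucky = [Attempt(400, 900)]
unlucky = [Attempt(400, 900), Attempt(650, 2200), Attempt(700, 1800)]

print(f"one attempt:    ${task_cost(lucky, INPUT_PRICE, OUTPUT_PRICE):.4f}")
print(f"three attempts: ${task_cost(unlucky, INPUT_PRICE, OUTPUT_PRICE):.4f}")
```

With these invented numbers, the unlucky run costs several times what the lucky one does for the same question, and nothing about the prompt itself would have told you which one you were going to get.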

So, if you’re in the finance department of a tech company, and you need to determine the budget in dollars for AI tokens for the next year, I wish you all the best of luck. Even estimating based on past usage, or with very fine detail about the company’s productivity goals, your chances of budgeting the correct amount seem pretty slim to me. However, you have to implement some kind of limit; this can’t be a blank check scenario, so you’re going to have to cut people off at some point.
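As a rough illustration of why that estimate is so fragile, the sketch below computes an expected annual budget under three sets of assumptions. Everything here is hypothetical: the function name, the task volume, the average token counts, the success rates, and the prices. It also assumes each attempt succeeds independently with a fixed probability, so the expected number of attempts per task is 1 divided by the success rate, which real usage may not follow.

```python
def expected_annual_budget(tasks_per_year, mean_input_tokens, mean_output_tokens,
                           success_rate, input_price, output_price):
    """Expected yearly token spend in dollars.

    Assumes every attempt succeeds independently with probability
    success_rate, so the expected attempts per task is 1 / success_rate.
    """
    cost_per_attempt = (mean_input_tokens * input_price
                        + mean_output_tokens * output_price)
    expected_attempts = 1 / success_rate
    return tasks_per_year * expected_attempts * cost_per_attempt


# Hypothetical prices: $3 per million input tokens, output tokens at 5x that.
INPUT_PRICE = 3 / 1_000_000
OUTPUT_PRICE = 5 * INPUT_PRICE

# Three guesses a finance team might plausibly make (all numbers invented).
scenarios = {
    "optimistic":  dict(mean_input_tokens=800,  mean_output_tokens=1_500, success_rate=0.8),
    "baseline":    dict(mean_input_tokens=1_000, mean_output_tokens=2_000, success_rate=0.5),
    "pessimistic": dict(mean_input_tokens=1_500, mean_output_tokens=4_000, success_rate=0.3),
}

budgets = {
    name: expected_annual_budget(1_000_000, input_price=INPUT_PRICE,
                                 output_price=OUTPUT_PRICE, **params)
    for name, params in scenarios.items()
}

for name, dollars in budgets.items():
    print(f"{name:>11}: ${dollars:,.0f}")
```

None of these three guesses is unreasonable on its face, yet the resulting budgets differ by a large multiple, and that is before the price per token changes partway through the year.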

Practical Implications

How’s this going to actually work? Is it “manual coding” in the second half of the year, after spending the first half using AI intensively? Are all our emails and marketing documents handwritten in Q3 and Q4? Are we shutting down our AI transcription tools and voice-to-text software after a threshold is hit? This is a fascinating question to me, because I’ve personally witnessed how different the experience of writing code with AI is from doing it without, and switching back and forth between the two processes would be incredibly disruptive.

This also brings up the question of how cost cutting on AI is going to affect the companies providing AI-based solutions. Last October I discussed how the hyperscalers (Anthropic, OpenAI, Google, etc.) are pushing startups to implement AI-based features in their products, as an attempt to earn profits to return to the investors who have sunk many billions of dollars into this industry. As the cost of providing AI features increases, and companies move more and more to a pay-per-use model, this flywheel is going to start to collapse. If companies start using AI-based tooling less because their budgets cannot accommodate the spiraling costs, the pipeline of revenues back to the hyperscalers will dry up. Anthropic and OpenAI are planning IPOs this year, both with extremely uncertain paths to profitability and hundreds of billions of dollars owed back to investors, so a slowdown in AI usage is the last thing they need.

It’s also worth mentioning that Apple announced its product foray into AI last week at WWDC, and critics are responding pretty positively so far. The new Siri, using technology from Google Gemini, will have substantial privacy protection (on-device processing, private cloud compute, and minimal data storage) and is also not going to cost users extra. With this available, and if the quality lives up to expectations, regular consumer use of ChatGPT and Claude may also be at risk.

Conclusion

Watch this space, because while the stories of “companies shocked at AI bills” and “OpenAI and Anthropic shooting for the largest IPOs in history” are often reported separately, they’re really the same narrative from different angles. Even if tech companies do feel like AI is providing them benefits and giving productivity gains, they simply do not have unlimited budgets to apply to it. If they do not have unlimited budgets (and consumers certainly don’t, with consumer packaged goods (CPG) prices straining budgets and economic sentiment the lowest it’s been in almost a century of tracking), we have to come back and ask where the billions and billions that OpenAI, Anthropic, and others are expecting to generate in revenues are going to come from. Combine this with the public pushback against data centers and negative sentiment about AI generally, and hyperscalers have a real problem on their hands.

Read more of my work at www.stephaniekirmer.com