This is Brad DeLong's Grasping Reality—my attempt to make myself, and all of you out there in SubStackLand, smarter by writing where I have Value Above Replacement and shutting up where I do not…

Is the Day of the Data Center About to Be Over?

Marco Arment's Setup as the Canary in the Coal Mine—or, Rather, as the 50 Mac Mini Server Farm Vastly More Efficient than the NVIDIA-Powered Cloud-Bound Hyperscalers...
Or, why John Giannandrea's stewardship of Apple's AI strategy may have meant that Apple has already won the AI-software race…

John Giannandrea ran search and "AI" at Google, then jumped to Apple in 2018 to lead machine learning and "AI" strategy. That move mattered: Apple had lots of silicon and plenty of privacy rhetoric, but no coherent AI leadership. He then decided to push three intertwined bets:
John Giannandrea was deeply suspicious of chasing the largest-possible GPT LLM models, spending fortunes on their training, and then deploying them as behemoths expensive in silicon, heat, and power for inference as well. Giannandrea was the anti-Sam Altman. His vision was of software as a thick middle layer in which most economically relevant inference happens on M‑series chips scattered across living rooms and backpacks, with hyperscale training and a much thinner layer of cloud inference above it. He was hired to (a) give Apple a coherent "AI" story, and (b) fix Siri. Apple's marketing arm had always overpromised what Siri would soon be able to do:
Apple could not and did not deliver. Back at the end of 2022, ChatGPT had blown the doors off. Thereafter, all Wall Street and the pundits could care about was the amazing software technology-demonstration project they could see: a flashy, cloud‑scale, natural-language, almost-useful virtual assistant. Pleasing Wall Street and pleasing pundits required that Apple be in the GPT LLM game at the frontier. So when the "we can do this in less than a year from June 2024" bet failed, either you revise the strategy or you revise the strategist. Apple did both: it started leaning harder on cloud partnerships, and it reassigned Siri and other "must ship soon" pieces to people seen as more execution‑driven and able to ship come hell or high water, with control and supervision of Apple "AI" going to Mike Rockwell, Craig Federighi, and company. (Of course, nobody else has managed to deliver either. For example, last year Nilay Patel, Joanna Stern, and John Gruber made great fun of Alexa Plus:
And so on. And do note that I am told that Microsoft has just changed its terms of service to state that Copilot is "for entertainment purposes only". To be fair, there are people who claim that OpenClaw—tagline: "the AI that actually does things" <https://openclaw.ai/>—is getting close, at least in the large-but-limited domains of programming, scheduling, editing, summarizing, and, since SEO and the chase for ad revenue have poisoned Google, searching.)

But now let us start over: There is a semi-celebrity tech influencer <https://marco.org/2026/04/01/letter-to-john-ternus>, restaurateur <https://www.instagram.com/thealbatrossob/>, auto enthusiast <https://carbuzz.com/rivian-launches-first-ad-campaign/>, podcaster <https://atp.fm/>, podcaster-spouse <https://www.tiffanyarment.com/>, and programmer <https://overcast.fm/> named Marco Arment <https://marco.org/>. As a programmer, Marco is the kind whom Steve Jobs would have called "a pirate", a word that for Jobs had a strongly positive valence, as in "it's better to be a pirate than to join the navy." Marco's first big project was as one of the two people building the original Tumblr <https://www.tumblr.com/>. His second was Instapaper <https://www.instapaper.com/>. Both demonstrated that a single person who actually touches the codebase every day can carry an astonishing amount of weight if they're ruthless about scope and opinionated about what the product is for.

And now—well, go to the landing page of pretty much any podcast and you will see something like this: As a programmer, these days Marco is the podcast player Overcast, and Overcast is Marco. Overcast is perhaps fifth, perhaps eighth, among podcast players in terms of use. It is quite probably second or third in terms of use by people vociferous enough to have and express opinions about podcast players. I guess that Overcast has a number of monthly active users somewhere in the low seven figures.
Two years ago, Apple Podcasts began to deploy transcripts for every podcast it served. Marco's reaction was "oh crap!" He saw a world in which serving transcripts was a table-stakes feature: a podcast player that did not offer transcripts was unambiguously worse than one that did, and worse by a large margin. But transcripts are serious work, and serious work in the write-once-run-everywhere mode, with your per-unit fully-amortized cost inversely proportional to your user base. With only 1/30 the scale of Apple Podcasts, and with transcript generation requiring serious "AI" work, how could Marco possibly offer transcripts as a feature and still make any money at all from Overcast?

As a podcaster, Marco is one of the triumvirate behind the Flagship Podcast of People Who Wanted to Make an Auto-Enthusiast Podcast But Found They and Their Audience Were Much Livelier When the Topic Was Tech: the Accidental Tech Podcast, ATP <https://atp.fm/>. And in episode 683 of ATP <https://atp.fm/683>, he tells the story of what he did. It starts:
To cut to the chase: Marco Arment uses Apple's lightweight on-device transcription inference models and a server farm of 50 Mac Minis—total acquisition cost $30,000, or $6,000/year amortized—drawing less than 2,000W of power ($3,000/year), about two microwaves' worth, in a Long Island data center to generate transcriptions in near real time for every podcast any one of his users subscribes to. And if you want a transcript that his server farm has not yet gotten around to? Press a button on your iPhone, and it will do the job on your device in ten minutes or so. Add in non-power server-farm colo fees, and he is doing all this for something like $10,000/year plus his programmer time and skills.

Figure that each one of Marco's Mac Minis transcribes 300,000 minutes of audio a day. Had he outsourced the job to OpenAI, which charges $0.006/minute for using its cloud through the Whisper API, the work done by each Mac Mini would cost him $1,800/day. The work done by fifty Mac Minis going flat-out for a year would cost $1,800 x 50 x 350—call it $30 million/year. Roughly 3,000 times the cost. And even charging that price, OpenAI is not making a profit these days.

No, I cannot believe the difference between $10,000/year for a Mac Mini server farm and $30,000,000/year to do it with NVIDIA chips in the cloud. I must have made a big mistake somewhere—slipped a decimal or two or three. I cannot see where I did. But even if I have: This is not a rounding error. This is not a niche edge case. This is the reality of AI inference running on the silicon of John Ternus's people using the software of John Giannandrea's. And this reality is not consistent with any belief that the Day of the Data Center is dawning...
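For readers who want to check my arithmetic, here is the back-of-the-envelope calculation spelled out. The throughput, prices, and the $10,000/year all-in figure are the round numbers above; the $0.17/kWh electricity rate is my own assumption, chosen because it lands near the $3,000/year power figure.

```python
# Back-of-the-envelope check of the Mac Mini farm vs. Whisper API cost gap.
# Throughput, API price, and the $10,000/year all-in figure are the post's
# round numbers; the $0.17/kWh electricity rate is an assumption.
MINUTES_PER_MINI_PER_DAY = 300_000   # audio-minutes transcribed per Mac Mini per day
NUM_MINIS = 50
WHISPER_PRICE_PER_MIN = 0.006        # OpenAI Whisper API, dollars per audio-minute
DAYS_PER_YEAR = 350                  # the post's round figure

# Cloud route: what the same workload would cost through the Whisper API.
cloud_per_mini_per_day = MINUTES_PER_MINI_PER_DAY * WHISPER_PRICE_PER_MIN
cloud_per_year = cloud_per_mini_per_day * NUM_MINIS * DAYS_PER_YEAR

# Self-hosted route: amortized hardware + power + colo, per the post.
self_hosted_per_year = 10_000

# Power sub-check: 2 kW continuous at an assumed $0.17/kWh.
power_per_year = 2 * 24 * 365 * 0.17

print(f"Whisper API, per Mini per day: ${cloud_per_mini_per_day:,.0f}")  # $1,800
print(f"Whisper API, whole farm/year:  ${cloud_per_year:,.0f}")          # $31,500,000
print(f"Cost ratio, cloud vs. farm:    {cloud_per_year / self_hosted_per_year:,.0f}x")
print(f"Annual power bill estimate:    ${power_per_year:,.0f}")          # $2,978
```

So the gap really is on the order of three thousand to one, and the dominant term is the Whisper per-minute price against hardware that, once bought, costs only power and rack space.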
Saturday, 11 April 2026



