
Inside the hunt for AI chips

Everyone wants Nvidia’s chips but they’re nearly impossible to get. Also: updates on the Reddit protest, Musk versus Zuck cage match, and more.


Alex Heath is a contributing writer and author of the Sources newsletter.

The most sought-after resource in the tech industry right now isn’t a specific type of engineer. It’s not even money. It’s an AI chip made by Nvidia called the H100.

These GPUs are “considerably harder to get than drugs,” Elon Musk has said. “Who’s getting how many H100s and when is top gossip of the valley rn,” OpenAI’s Andrej Karpathy posted last week.

I’ve spent this week talking with sources throughout the AI industry, from the big AI labs to cloud providers and small startups, and come away with this: everyone is operating under the assumption that H100s will be nearly impossible to get through at least the first half of next year. The lead time for new orders, if you can get them, is roughly six months, or an eternity in the AI space. Meanwhile, the cloud providers are just starting to make their H100s widely available and charging an arm and a leg for the limited capacity they have. For the most part, these hosting providers also require extremely costly, lengthy upfront commitments.

This dynamic is leading companies to get creative about securing the chips they need. Take the San Francisco Compute Group, a project I stumbled upon on Hacker News while researching for this column. Evan Conrad, one of its co-leads who previously helped run the AI Grant program, tells me the goal is to put together a cluster of more than 500 H100s that will be available in the coming weeks for short-term contracts through an auction system. The idea is that you pay for only what you need at a price that is as close to actual cost as possible.

The initiative was born out of the frustrations that Conrad and Alex Gajewski, who also helped run AI Grant, experienced at their own startup. They wanted to train a large language model and quickly realized they would need to pay several million dollars upfront for a lengthy commitment and, on top of that, pay more than double the sticker price of what it actually costs to run H100s. Their startup had raised less than a million dollars, making that idea completely infeasible. (There are, of course, competing GPUs to the H100, but Nvidia’s chip is considered the gold standard right now, particularly due to its associated CUDA software stack.)

At the same time, Conrad was hearing about similar frustrations from not only other startups but also academic researchers and scientists who are effectively priced out of the AI boom. His friends Nat Friedman, the former CEO of GitHub, and Daniel Gross recently announced their own, much larger cluster of over 2,500 H100s, but it’s reserved specifically for the startups the two invest in.

Conrad and Gajewski realized that, “holy moly, if we don’t do this, a good portion of scientific institutions in the country will just pause their research,” Conrad says. “And a lot of startups just won’t get built.” He said the goal for the San Francisco Compute Group, when it goes online, is to offer short-term H100 rentals at close to $2 per hour, which is about a quarter of what I’ve heard AWS is charging. (Google Cloud has yet to make its H100s widely accessible, though I’m hearing that will happen around October.)

To finance their cluster, Conrad says they secured a bank loan that is collateralized by the H100s they are ordering, which should cost around $24 million all-in. It’s the second example I’ve heard of these chips being used as loan collateral, with the first example being the much larger financing round announced by CoreWeave this week. Banks are realizing what the people working in AI know: these chips are going to stay in hot demand.
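As a rough sketch of the economics, here's a back-of-the-envelope using only the figures above: a ~$24 million cluster of roughly 500 H100s, rented out at $2 per hour. The utilization rate is purely a hypothetical assumption on my part, not a number from the group.

```python
# Back-of-the-envelope economics for a GPU rental cluster, using the
# figures from the text. Utilization is a hypothetical assumption.
cluster_cost = 24_000_000   # ~$24M all-in, per the text
gpu_count = 500             # "more than 500 H100s"
rate_per_hour = 2.00        # target short-term rental price, per GPU
utilization = 0.70          # hypothetical: fraction of hours actually rented

cost_per_gpu = cluster_cost / gpu_count                        # $48,000
annual_revenue_per_gpu = rate_per_hour * 24 * 365 * utilization
payback_years = cost_per_gpu / annual_revenue_per_gpu

print(f"All-in cost per GPU: ${cost_per_gpu:,.0f}")
print(f"Annual revenue per GPU at {utilization:.0%} utilization: "
      f"${annual_revenue_per_gpu:,.0f}")
print(f"Payback period: {payback_years:.1f} years")
```

At those hypothetical numbers the chips take years to pay for themselves at $2 per hour, which helps explain both why the big clouds charge several times that rate and why a collateralized loan, rather than venture money, is the financing instrument of choice here.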

Conrad tells me he has already received “more requests for access than we would ever be able to sell” since their cluster was announced less than two weeks ago, and now he and Gajewski are trying to secure more chips. “Almost everyone we know has gotten rugged” by providers that haven’t followed through with on-time delivery, he says, adding that it’s becoming more common for buyers to ask for photos and serial numbers as assurance. “It’s quite the tooth-and-nail fight out there.”


It’s not just smaller startups that are feeling the squeeze. The larger AI labs have mostly raised their massive rounds to fund their chip orders and are now being more cautious about how they manage their resources. Even OpenAI is feeling the GPU squeeze, according to a talk Sam Altman recently gave that, interestingly, the company requested be taken down after it was put online. In the interview, he said the multimodal version of ChatGPT (i.e., a version that can handle images as well as text) isn’t being widely released until next year due to limited compute availability.

The need for H100s means that Nvidia has its hands everywhere. In terms of how it’s distributing chips, its two overarching goals seem to be: 1) ensuring that the large cloud providers building competing chips don’t get their hands on most of the H100s in existence, and 2) making sure its chips are actually being used by companies that need them instead of being hoarded.

At some point, supply will catch up with demand, though there are varying opinions on when exactly that happens. Dylan Patel, a leading analyst in the space for the boutique firm SemiAnalysis, told me he expects it will likely happen in the second half of next year. But I’ve seen others speculate it could last through all of 2024. (If you want a much more thorough, technical overview of the GPU supply bottleneck, I’d recommend this excellent article that has been getting passed around a lot this week.)

The wild card is Meta, which is apparently Nvidia’s largest customer for H100s right now. The social media giant is gearing up to introduce generative AI features and chatbots across its apps next month. Given the scale of its user base, interest in these features could squeeze the world’s GPU supply even more.

“In terms of how quickly some of these new [AI] products scale, that’s one of the big unknowns for the business and one of the things that we’re debating heavily when thinking through the amount of AI CapEx to bring online,” Mark Zuckerberg said on Meta’s last earnings call. “And we want to have the capacity in place in case they scale very quickly. But because they’re kind of brand-new things, there aren’t that many precedents for things like this. It’s actually quite hard to forecast.”


Steve Huffman.
Illustration by William Joel / The Verge | Photo by Greg Doherty/Variety via Getty Images

Reddit won

The last major subreddits to protest Reddit’s recent API changes have relented, according to this story from Gizmodo, which states that “the Reddit protest is over, and Reddit won.”

As I wrote in mid-June, employees at Reddit, including CEO Steve Huffman, have known this would blow over. “The company is largely behind Steve,” a senior employee told me at the time. This was back when Huffman was warning staff about wearing Reddit swag in public because the backlash was so strong.

It turns out that, while the vocal minority certainly put up a fight, most people aren’t impacted by Reddit’s changes, which include charging third-party clients for API access. Now that Reddit has won the protest, next comes the even harder part for Huffman: getting the company to profitability and on track for its eventual IPO.


Linda Yaccarino.

Quote of the week

“What a great brand sponsorship opportunity,” said X / Twitter CEO Linda Yaccarino of the Musk versus Zuckerberg cage match.

This week, Yaccarino gave her first interview since taking the job, to CNBC. She made a point of stressing that she has autonomy. (If you have to say that, do you really have it?) I’m told her comments that over 99 percent of the content on X is “healthy” and that most users like the rebrand were met with a lot of eye rolls internally.

That said, she’s clearly the more polished voice Musk needs representing the company to advertisers. It’s wise of him to let her handle that, assuming he really is giving her the latitude to do so. (She’ll also be at the Code Conference next month.)


Cage match update

The billionaire brawl is still on, according to Elon Musk. In a series of posts on X this morning, he said he got an MRI earlier this week (I guess that’s what his quick trip to NYC was for) and that, while he may need minor surgery on his shoulder, recovery will only take “a few months.”

His proclamation that the fight will no longer be managed by the UFC suggests a falling out with Dana White, who has been mediating between Musk and Zuckerberg to date. Given that this is Musk, none of this could be true, and he may actually have no intention of fighting.

I’m told his announcement today that the fight will be livestreamed on both X and Meta, and that “everything in camera frame will be ancient Rome,” is news to Mark Zuckerberg, who has been training regularly in an octagon he had built in his backyard.

It’s also news to the Italian government. “It will not be held in Rome,” according to Italian culture minister Gennaro Sangiuliano.


People moves

  • Zachary Kirkhorn, Tesla’s CFO and “Master of Coin,” is leaving the company after 13 years. He’ll be replaced by chief accounting officer Vaibhav Taneja, who boasts a “solid understanding of US GAAP” on his LinkedIn.
  • Michael Hildebrand, a former manufacturing exec at Eli Lilly, is now leading the Gigafactory expansion in Nevada for Tesla.
  • Sandie Hawkins, TikTok’s US general manager for e-commerce, is leaving. Nicolas Le Bourgeois and Marni Levine are taking over her responsibilities and reporting to Bob Kang, ByteDance’s global e-commerce chief.
  • Brent Hyder, the chief people officer for Salesforce, is leaving. Nathalie Scardino, EVP of recruiting, is his interim replacement.
  • Lindsey Held Bolton, a former comms director at Meta, is OpenAI’s new head of public relations.
  • Angela Ahrendts, the former head of Apple’s retail business, has joined Kim Kardashian’s private equity firm SKKY Partners as a senior advisor.

The watercooler

A roundup of what else is going on inside tech companies this week:


Interesting links
