This week, the world received two new open-source models from Chinese AI developer DeepSeek that deliver benchmark performance rivaling the leading models from OpenAI and Anthropic.
We are not surprised. In fact, since last summer we have expected that China’s hardware handicap would lead to architectural breakthroughs:
“By focusing their AI research on techniques that are less reliant on NVDA-style hardware, they may find novel approaches that leapfrog the current paradigm.
Make no mistake, this will be an uphill battle. NVDA's lead is substantial, and the US sanctions are a formidable obstacle. But if there's one thing we've learned from China's rise as a technological superpower, it's that they are not to be underestimated.”
Citrindex 1-Year Anniversary, Published to CitriniResearch.com June 2024
Ironically, it is the cost efficiency of these models that has spooked investors and provided fodder for those ready to declare the death of the AI trade. But both the bears and bulls are missing the point.
Efficiency is a good thing, and NVIDIA is not AI. NVIDIA simply provides critical hardware to satisfy the immense computing demands of LLMs, and has been the most central and obvious winner of the AI arms race to date.
Since our initial coverage in May 2023, we have primarily focused investment on the “picks and shovels” trade – or what we have called “Phase 1” of the AI revolution.
Absent a clear picture of commercialization and product winners, we have opted to invest in what is known – that deep-pocketed hyperscalers will spend billions on hardware and datacenter infrastructure. They have, as our performance to date confirms. New, upsized capital expenditure guidance from MSFT and META shows that the buildout is not slowing down yet.
But it’s becoming more and more difficult to hide behind “we don’t know exactly what AI will look like in the future, so we will default to the infrastructure”. While infrastructure has been our main focus for the past two years, our thesis has never been “datacenters”. We can’t ignore the evolution of the technology out of loyalty to past performance. From the beginning, we have argued that once AI infrastructure was built out, LLMs would be commoditized and democratized, enabling the next stage of product development and real-world applications.
We have also long expected that the commoditization of LLMs would be a critical marker for the second wave of AI investment – the transition from picks and shovels to products. Here, we continue to believe China will play a significant role, whether through continued innovations in LLM efficiency or through potential challenges to CUDA and training ASICs.
DeepSeek – directly or indirectly – is an important development in this evolution, and investors should be mindful of its implications.
Don’t take my word for it; here’s Satya Nadella saying as much:
The prospect of cheaper, better models may truly be a bearish signal for companies that derive value from ever-escalating capex estimates, but it is equally bullish for AI commercialization. Indeed, the extraordinary costs of training and (to a much lesser degree) inference have been the largest barrier to profitable AI software. Perhaps the democratization of AI through a top-tier open-source model ends up expanding demand to exceed capacity yet again.
Regardless of this debate, which we will touch on later, we find ourselves with greater confidence that Phase 2 of the AI trade has begun.
But before we get into that, let’s discuss what DeepSeek actually is.