DeepSeek exposes a fundamental advantage of China's system: their whole economy is open source
Bullets:
Tech industry insiders are shocked at China's rapid progress in artificial intelligence, especially given our export bans on the fastest semiconductors. In the most recent development, DeepSeek introduced an AI model that outperforms those developed by Meta (Facebook) and OpenAI (ChatGPT).
DeepSeek's model was developed with roughly 9% of the computing hours thought necessary to build such a rich model, and at an astonishing cost of just $6 million.
DeepSeek's announcement followed those of Tencent and Alibaba, who developed their models with slower, older-generation chips, but whose AI models outperform rivals from Silicon Valley.
The difference, though, is that these models were all developed using open-source frameworks. And China's entire economy runs the same way: infrastructure, supply chains, university research, and technology are widely shared by hundreds of companies with hundreds of thousands of engineers within similar industries.
The same dynamics that propel China's industries to dominate in other sectors are at work in the fields of AI and semiconductor chip design. To understand how and why China has closed the gap so quickly in technology, knowledge of tech and semiconductors is far less important than an understanding of China's industrial clustering policy.
Report:
Good morning. The entire Chinese economy should be thought of as open-source. New breakthroughs and discoveries are widely shared, quickly, across their industrial sectors and are then quickly adapted to new products and technologies.
Economists and governments have known for centuries about Knowledge Spillover. This is what happens when we put large numbers of people and companies in the same geographic area, working in the same or adjacent industries. In these industrial clusters, innovation happens fast, because when one company does something that is revolutionary, the knowledge is quickly shared.
Silicon Valley is one of the best examples of this in the United States: new technologies and applications are discovered every day, and companies are always suing each other over who really developed what. And of course it’s impossible to prevent customers from comparing what different suppliers are doing, or to stop employees from talking to their neighbors and friends, or from quitting one company to start work at another and taking their experience with them.
The same dynamics that built Silicon Valley were put into action here, but at a scale orders of magnitude larger, and everywhere. When China was developing, just 30 years ago, its industrial planners built hundreds of clusters across China, in every single industry. These clusters share resources, logistics, and supply chains, and universities were built to supply engineering and research talent.
So China represents by far the biggest and most recent example of industrial-scale clustering, and China’s recent history is a boon for researchers and scholars who are now attempting to quantify how knowledge spillovers contribute to innovation and economic growth. Here are a few terrific papers, all recent and none paywalled, and we’ll link to them in the video description. This one covers knowledge spillover effects in China’s car manufacturing, which went from zero to the biggest in the world in about 15 years. This one studies spillovers in China’s biggest superclusters (clusters of clusters of clusters, really): Shanghai, Beijing, and Guangdong.
And this is a good one about how it works in China’s science parks. I’ll quote from it, bearing in mind that its audience is economists. Knowledge spillovers create benefits for firms other than the companies that make the first discoveries or innovations. That results in a market failure, because it is a disincentive for firms to do research and development, since they cannot capture all the revenue streams that result.
But the deliberate building of industrial clusters is how policymakers can overcome the reluctance of companies to do R&D: governments can assume the costs of building the infrastructure, for example, and of funding university departments and building supply chains. And clustering dozens, or hundreds, of fast-moving, innovative firms will result in a rising tide, so to speak, that lifts all the boats that are there.
Researchers at Company A discover a new process and publish a paper. Engineers at Company B read the paper and apply that process to a product they’re building. Managers at Company C want to compete against Company B, so they work with engineers from Company A to build new products for C that are even faster than B’s. And so on. A, B, and C are all learning from each other, and everyone is making money while driving costs down and taking markets away from everyone else NOT in that cluster.
All of China is organized this way. China’s entire economy is open source. And if we understand how China’s economy works, we shouldn’t be surprised that China’s open-source economic system is so easily getting around the semiconductor bans that are intended to deny Chinese companies the best artificial intelligence.
But industry insiders were very surprised by this news, that an unknown Chinese start-up built an AI model that is already better than Facebook’s and OpenAI’s. And the company did it in two months, and spent less than $6 million to develop it.
The AI model is called DeepSeek-V3, and it was built by DeepSeek, a company from Hangzhou. It has 671 billion parameters, and the company got there using a lot less computing power than the biggest companies have.
This analyst says that DeepSeek had “a joke of a budget”, and needed only 2048 GPUs and 2 months. It was previously thought that a comparable model would require about 16,000 GPUs. And DeepSeek required 11 times fewer GPU hours to build. This is an X post linking to DeepSeek’s report and analytics; it has 5.5 million views.
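For anyone who wants to check the arithmetic behind those figures, here is a quick back-of-the-envelope sketch in Python. It uses only the numbers quoted in this report (2048 GPUs, 2 months, a budget under $6 million, and 11 times fewer GPU hours). The 30-day month and around-the-clock utilization are assumptions made here, not figures from DeepSeek's report, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check using only figures quoted in this report.
GPUS = 2048              # GPUs quoted by the analyst
MONTHS = 2               # training time quoted by the analyst
BUDGET_USD = 6_000_000   # upper bound quoted ("less than $6 million")
RATIO = 11               # "11 times fewer GPU hours"

# ASSUMPTION (not from the source): 30-day months, GPUs busy 24 hours a day.
HOURS_PER_MONTH = 30 * 24


def gpu_hours(gpus: int, months: int) -> int:
    """Total GPU hours if every GPU runs for the whole period."""
    return gpus * months * HOURS_PER_MONTH


deepseek_hours = gpu_hours(GPUS, MONTHS)
# GPU hours a comparable model would imply at the quoted 11x ratio.
comparable_hours = deepseek_hours * RATIO
# DeepSeek's share of those hours, as a fraction.
share = 1 / RATIO
# Upper bound on cost per GPU hour, since the budget is itself an upper bound.
cost_per_gpu_hour = BUDGET_USD / deepseek_hours

print(f"DeepSeek GPU hours (approx.):   {deepseek_hours:,}")
print(f"Comparable model (at 11x):      {comparable_hours:,}")
print(f"DeepSeek share of those hours:  {share:.1%}")
print(f"Implied cost per GPU hour:      ${cost_per_gpu_hour:.2f} or less")
```

Under those assumptions the run comes to roughly 2.9 million GPU hours, about 9% of what a comparable model would need at the quoted ratio, and at most about $2 per GPU hour.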
And the secret, again, is that DeepSeek used open source. CNBC explains how Chinese companies are using open source to build LLMs, to innovate faster, and to spread their use. China’s open-source models have strong performance and a low cost to serve, and they drive innovation faster because open source opens up the models to more developers, including developers across the world. One reason these moves are so important is that the AI models will eventually become the operating systems across industry lines and applications.
What’s more, the Chinese AI models have been developed using older-generation Nvidia chips. US companies are running the latest and fastest semiconductors for our AI models, and China’s stuck using older chips, but they’re getting them to work better. In November Tencent, a giant tech company here, released their LLM using Nvidia H20 chips, which are far less powerful than the ones our companies are using. It didn’t matter. DeepSeek uses Nvidia H800 chips, which are cheaper but slower than the H100s, because Nvidia’s H100s are restricted for export to China. That didn’t matter either. Open source is what matters. China threw open the problem to hundreds of thousands of engineers in dozens of places, and now they’re ahead.
Eric Schmidt was head of Google and was a key advisor on US AI policy. In May 2024 he told policymakers that the semiconductor export restrictions would hold China back. May 2024 was 8 months ago, that’s all, and 8 months ago the former CEO of Google was sure that the US lead in AI was 2 or 3 years, which is forever, and that the new chip bans would freeze China in place, while the US would just race ahead.
Six months later Schmidt gave another speech and said to forget everything he had said before. By then, Alibaba’s Qwen model, also open source, was better than anything being done by our companies. Then came Tencent. Alibaba and Tencent are enormous companies, by the way, and if they wanted to develop proprietary AI models instead of going open-source, they could. But they went open-source and got their models faster and cheaper.
And now DeepSeek has done it faster and cheaper still, with less than $6 million. Eric Schmidt has a net worth of over 26 billion dollars, and no doubt he understands tech better than anyone he is giving policy advice to in Washington. But to understand what China did, and how, it’s not important to understand the technology. It’s much more important to understand China. And our top people don’t understand China.
Resources and links:
Chinese start-up DeepSeek launches AI model that outperforms Meta, OpenAI products
Explainer | Chinese cities offer subsidies to boost access to the computing power needed for AI
https://www.scmp.com/economy/china-economy/article/3292478/chinese-cities-offer-subsidies-boost-access-computing-power-needed-ai
https://x.com/karpathy/status/1872362712958906460?lang=en
Introducing DeepSeek-V3!
https://x.com/deepseek_ai/status/1872242657348710721
China wants to dominate in AI — and some of its models are already beating their U.S. rivals
Alibaba's new AI model rivals OpenAI's GPT-4o in coding ability amid open source competition
https://finance.yahoo.com/news/alibabas-ai-model-rivals-openais-093000571.html
The Knowledge Spillover Effect of Multi-Scale Urban Innovation Networks on Industrial Development: Evidence from the Automobile Manufacturing Industry in China
https://www.mdpi.com/2079-8954/12/1/5
A Study of Knowledge Spillovers within Chinese Mega-Economic Zones
KNOWLEDGE SPILLOVERS AND SCIENCE PARKS: EVIDENCE FROM CHINA
https://www.jstor.org/stable/48767564?seq=1
DeepSeek: The New Frontier in Chinese AI Shocks Industry Giants
https://substack.com/home/post/p-153962801?utm_campaign=post&utm_medium=web
How China Is Advancing in AI Despite U.S. Chip Restrictions
https://time.com/7204164/china-ai-advances-chips/
https://www.forbes.com/profile/eric-schmidt/
Map of China’s manufacturing distribution
https://www.berkeleysg.com/china-manufacturing-distribution-map/
You may understand your own industry better than anyone, but if you don't understand China you're doomed.