China’s DeepSeek and the Criminal World of American AI

Part 1 – The Emergence

By Larry Romanoff

The Emergence of DeepSeek

DeepSeek’s Innovative Advantages

Everybody was Surprised

Some Bad News – DeepSeek was a “side project”

More Bad News – Stock Prices

Still More Bad News – DeepSeek is free

Yet Still More Bad News – Other Chinese firms are doing this too

This is an essay in three parts. Most readers today are already familiar with the story of the emergence of DeepSeek so I won’t spend much time reiterating the event. This isn’t so much the story of DeepSeek as it is the revelation of the underlying circumstances of American AI development, details which the Western media have conveniently ignored and which no other writers seem to have noticed. This is also an essay documenting that our world is seldom as we imagine it to be. See Parts 2 and 3.

The Emergence of DeepSeek

An unknown Chinese company “ignited panic” in Silicon Valley (and the White House) after releasing a new AI model named DeepSeek that outperforms America’s best. In a recent article, Mike Whitney wrote that “DeepSeek is a nuclear bomb detonated in the heart of Silicon Valley.” He went on to say that it was a challenge (and is really a slap in the face) to the tech experts in the US who thought they were gods and that “their reign would last forever”. [1]

Why the “panic” and the “nuclear bomb” analogy? Mostly because American AI firms spent a decade or so, and hundreds of billions of dollars, developing their models on hundreds of thousands of the latest and most powerful graphics processing units (GPUs), at about $40,000 each, while DeepSeek was built in only two months, for less than $6 million, and with much less powerful GPUs than the US firms used. DeepSeek uses 97% less power and costs 50 times less to run. And maybe the worst part was that they did it entirely with Chinese talent – no Americans necessary.

Further, DeepSeek scored as high as or higher than OpenAI’s o1 on a variety of third-party benchmarks. DeepSeek’s model outperformed Meta’s Llama 3.1, OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet in accuracy on tasks ranging from complex problem-solving to math and coding. To add insult to injury, DeepSeek also quickly released R1, a reasoning model that outperformed OpenAI’s latest and best model, o1, in nearly all tests. [2]

Not only that, the American AI firms, with the exception of Facebook (Meta), considered their models “proprietary” and thus closed-source, meaning that users had to pay high or very high fees to use them. OpenAI’s o1 is available only to paying ChatGPT subscribers on the Plus tier ($20 per month) and more expensive tiers (such as Pro at $200 per month), while enterprise users who want access to the full model must pay fees that can easily run to hundreds of thousands of dollars per year.

But DeepSeek (all versions) was released as fully open source, which means anyone can download and use it free of charge, and can also adapt and amend it for their own purposes. [3] This explains why DeepSeek quickly rocketed to the top of the app download charts on both Apple’s App Store and Google Play, which is a tremendous feat for a company that no one had even heard of a few days before.

DeepSeek can produce AI models that are an order of magnitude more efficient than the current state of the art from OpenAI, Google, Anthropic, and others. “Even by the most cynical projections, it still cost them only a fraction of what others have spent. And it’s free. Instead of paying OpenAI or Google $20 or $200 per month for their advanced models, you can use DeepSeek and get exactly the same level of results free.” [4]

DeepSeek’s Innovative Advantages

DeepSeek surpasses OpenAI’s top model in math and software engineering. It doesn’t use the traditional “supervised learning” that the American models use, in which the model is given data and told how to solve problems. Instead, it uses what is called “reinforcement learning”, which is a brilliant approach that makes the model stumble around until it finds the correct solution and then “learns” from that process. It is the difference between being taught a narrow range of things, and learning independently without restrictions. This alone prompts experts to speculate that AI could evolve beyond human oversight.
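The “stumble around and learn from the result” idea can be illustrated with a toy example. The sketch below is not DeepSeek’s training method; it is a minimal epsilon-greedy “bandit”, a textbook form of reinforcement learning, in which the program is never told which action is correct and only observes a reward after each try. The function name, the reward probabilities and the parameter values are all assumptions made up for illustration.

```python
import random

def train_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn by trial and error which action pays off most often.

    reward_probs is hidden from the learner: it is only used to
    simulate the environment's response to each action.
    """
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)    # how often each action was tried
    values = [0.0] * len(reward_probs)  # running estimate of each action's reward
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(values)), key=lambda a: values[a])
        # The environment returns a reward of 1 or 0; no "right answer" is shown.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

values, counts = train_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=lambda a: values[a])
print("estimated rewards:", [round(v, 2) for v in values])
print("action judged best:", best_action)
```

In supervised learning the program would instead be handed the correct answer for every example; here it has to discover the best action from rewards alone, which is the distinction the paragraph above draws.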

A great advantage of DeepSeek is that it is open-source, permitting everyone to use and adapt it to their own needs. Also, DeepSeek reveals its thinking, which the American AI models refuse to do out of fear that others could use that information to build their own models. Now, nothing is classified; it is all in the open. [5] One of the most remarkable things about DeepSeek is that it can do what is called “chain of thought”: it “explains” its reasoning, step by step, in its responses. The program actually appears to “think through” the problems, and displays reasoning processes that are remarkably human in appearance. [6] ChatGPT does this too, but nowhere near as well. Also, DeepSeek can even be run on an ordinary computer. [7]

DeepSeek trained its LLM with a mind-boggling 671 billion parameters, but they didn’t “copy” that from OpenAI or anyone else. Partly, they used a very innovative programming approach called “Mixture of Experts”, dedicating various portions of the large model to specific tasks so that the entire huge model needn’t be accessed for every question on every topic. [8]
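The routing idea behind “Mixture of Experts” can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not DeepSeek’s implementation: the “experts” are trivial functions that scale their input, the gate weights are invented, and names such as make_expert, route_top_k and moe_forward are hypothetical. What it shows is the general technique: a small gate scores every expert, only the top k experts are run, and their outputs are blended by the gate’s weights.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    """Hypothetical expert: simply multiplies its input by a constant."""
    return lambda x: [scale * v for v in x]

def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and renormalise their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, gate_weights, experts, k=2):
    """Run only the selected experts and blend their outputs."""
    # One gate score per expert: dot product of that expert's gate row with x.
    gate_scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    chosen = route_top_k(gate_scores, k)
    out = [0.0] * len(x)
    for idx, weight in chosen:
        y = experts[idx](x)  # the other experts are never evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, [idx for idx, _ in chosen]

# Four toy experts and an invented gate; only two experts run per input.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, used = moe_forward([2.0, 1.0], gate_weights, experts, k=2)
print("experts used:", used)
print("output:", [round(v, 3) for v in output])
```

Because only k of the experts do any work for a given input, the cost per query depends on k rather than on the total size of the model, which is the saving the paragraph above describes.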

To assist the “Mixture of Experts”, they created advanced algorithms that would break down tasks into small manageable bits, so that many things could be processed simultaneously. This kind of optimization requires serious technical expertise, because it isn’t just understanding the software, but also the intricacies of the hardware itself, making the process much faster and infinitely more efficient. And this is what sets DeepSeek apart from the others. DeepSeek looked for elegance and efficiency while the Americans were focused only on raw power.

But more than this, they actually re-programmed the GPUs to accommodate this process. This means speaking directly to the hardware itself. It is extremely efficient, but requires far more expertise. Their abstract specifically mentions that the engineers at DeepSeek reconfigured the GPUs, dedicating parts of them to specific tasks. This is far beyond the abilities of most people, and there is no indication that the “experts” at OpenAI or Meta had the ability to do it. They actually re-designed how the data traffic flows within the GPU itself, which increased the efficiency by orders of magnitude.

Everybody was Surprised

The open-source availability of DeepSeek-R1, its high performance, and the fact that it seemingly “came out of nowhere” to challenge the former leader of generative AI, sent shockwaves throughout Silicon Valley and far beyond. Observers were unanimous in stating that this development was a total surprise, that no one in Silicon Valley or in the US government had any idea that China was doing anything significant in AI, and that they uniformly believed the Chinese were “years behind” the US in development. They were suddenly faced with an accomplished fact: Chinese researchers had produced an AI model that matched or exceeded the performance of the best that the US had produced, and was “cheaper, more accessible and more transparent.”

The first huge surprise was the cost. An unknown Chinese lab produced a better product with an expense of little more than $5 million, while US firms had collectively spent literally hundreds of billions of dollars. People close to OpenAI’s management claim the company spent a staggering $540 million in 2022 training ChatGPT. Even worse (if things could be worse), the research firm SemiAnalysis said OpenAI is paying as much as $700,000 per day to keep ChatGPT servers up and running, just from the amount of computing resources it requires. [9]

Google’s 2024 expenditures alone were $51 billion. Microsoft put more than $13 billion into OpenAI, an investment that may now be lost. DeepSeek’s actual running cost is also extremely low. One Chinese AI CEO said, “If you want to build a new model, you can pay OpenAI $4.40 per million tokens, or you can pay the Chinese firm only $0.10.” [10] AI users had three huge problems with OpenAI’s o1: it was (a) too slow, (b) too expensive, and (c) left end users with little control and dependent on OpenAI. DeepSeek R1 solved all of these. [11]
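To put the quoted per-token prices in concrete terms, the arithmetic can be worked through in a short sketch. The two prices are the figures quoted above; the workload of 500 million tokens and the function name cost_usd are assumptions chosen purely for illustration.

```python
def cost_usd(tokens, price_per_million_usd):
    """Cost of processing `tokens` tokens at a given price per million tokens."""
    return tokens / 1_000_000 * price_per_million_usd

# Prices per million tokens, as quoted in the text.
OPENAI_PRICE_USD = 4.40
DEEPSEEK_PRICE_USD = 0.10

# Hypothetical workload, chosen only for illustration.
workload_tokens = 500_000_000

openai_cost = cost_usd(workload_tokens, OPENAI_PRICE_USD)
deepseek_cost = cost_usd(workload_tokens, DEEPSEEK_PRICE_USD)
price_ratio = OPENAI_PRICE_USD / DEEPSEEK_PRICE_USD

print(f"OpenAI:   ${openai_cost:,.2f}")
print(f"DeepSeek: ${deepseek_cost:,.2f}")
print(f"Price ratio: about {price_ratio:.0f} to 1")
```

At these quoted prices the ratio is about 44 to 1 whatever the workload, since both costs scale linearly with the number of tokens.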

Some Bad News – DeepSeek was a “side project”

One aspect of this development that almost no one seemed to notice was that DeepSeek was not an AI firm. The model was developed by the parent company High-Flyer, a quantitative trading firm: a small hedge fund that uses mathematical and statistical models to develop stock trading strategies. In 2021 everyone laughed at DeepSeek’s founder, Liang Wenfeng, for spending millions on AI chips, calling it a “rich man’s hobby”, [12][13] assuming Liang was engaged in a pathetic attempt to enter the AI field. But he was instead using the AI chips to build a model for investment trading. Somewhere in that process, they realized they could use what they had already created to also produce a high-level AI model, so they did that. The truth is that DeepSeek was just a little side project by a small Chinese investment hedge fund. [14][15][16]

More Bad News – Stock Prices

The news from China about DeepSeek sent US tech stocks plummeting. Nvidia’s stock had the biggest single-day loss of any company in history, shedding around $600 billion in market value, and the entire US stock market lost more than $1 trillion – all this in only one day. Many AI and tech stocks were down by 10%, 15% and more, in all Western countries. The US dollar also dropped by 0.5% on the news. All because of the release of a Chinese chatbot. That’s not bad for a small Chinese company that no one had ever heard of.

“And all of this was from fear that a small Chinese company had developed a new AI model in only two months at a small fraction of the cost of the American versions, which were infinitely more expensive and required a decade or more to create. This development has ‘thrown a wrench’ into the entire works of the American investment community.” [17][18] The assumption had always been that the US had an insurmountable lead in AI, which was going to be the “next industrial revolution”, and this potentially changes everything. The Americans obviously have no lead or advantage in AI, which has huge implications not only for investment markets but in geopolitical terms as well.

The large American AI firms led everyone to believe that AI required massive computing resources and expensive hardware, but DeepSeek proved this was not true. One result of this breakthrough was the realisation that tech stocks, not only the AI firms, but companies like Nvidia, were grossly overpriced, perhaps beginning a long-term slide in the stock values of all these companies. [19]

Gary Marcus, US university professor and AI expert: “I think OpenAI is highly overvalued. I think we saw their business model blow up, with DeepSeek giving away for free what they wanted to charge for. Also, DeepSeek is much more open than OpenAI. I think a $157 billion valuation is difficult to justify when you’re losing $5 billion a year. And their product, the large language model, isn’t that reliable; we know that it hallucinates, makes stuff up, makes weird errors. We may see their valuation plummet.” [20]

Still More Bad News – DeepSeek is Free

“If you want to build powerful AI models, you need powerful AI chips, and those are made only by Nvidia at the moment. The US government prohibits Nvidia from selling those chips to Chinese firms, so the Chinese compensated by creating an infrastructure that made the training of these models extremely efficient. Thus, they needed less than 1/100th of the power to accomplish the same thing.” Moreover, the announcement of the Chinese model as “open source”, in other words free, severely threatens the long-term value of the very expensive American models, which may depreciate to nearly zero. After all, if the free Chinese model can do the same job as well or better, why would you pay the American firms their very high prices for the same thing? [21]

Yet Still More Bad News – Other Chinese firms are doing this too

To make matters worse, another Chinese company, TikTok’s parent ByteDance, released a new AI reasoning model that also outperforms OpenAI’s o1 in key benchmark tests. At the end of January 2025, Alibaba released its new AI model, Qwen 2.5, which is also sending shockwaves through Silicon Valley because it seems to be much superior to OpenAI’s best and is apparently outperforming Meta’s Llama and all the other models on benchmark tests. Like DeepSeek, it is also very inexpensive to run. [22] Another Chinese startup, Moonshot, has released its new Kimi model, which it claims is on a par with the best in AI. [23]

Mr. Romanoff’s writing has been translated into 34 languages and his articles posted on more than 150 foreign-language news and politics websites in more than 30 countries, as well as more than 100 English language platforms. Larry Romanoff is a retired management consultant and businessman. He has held senior executive positions in international consulting firms, and owned an international import-export business. He has been a visiting professor at Shanghai’s Fudan University, presenting case studies in international affairs to senior EMBA classes. Mr. Romanoff lives in Shanghai and is currently writing a series of ten books generally related to China and the West. He is one of the contributing authors to Cynthia McKinney’s new anthology ‘When China Sneezes’. (Chap. 2 — Dealing with Demons).

His full archive can be seen at

https://www.bluemoonofshanghai.com/ + https://www.moonofshanghai.com/

He can be contacted at: 2186604556@qq.com

NOTES

[1] China’s DeepSeek AI Moves the Capital of Tech from Palo Alto to Hangzhou

https://www.unz.com/mwhitney/chinas-deepseek-ai-moves-the-capital-of-tech-from-palo-alto-to-hangzhou/

[2] How China’s new AI model DeepSeek is threatening U.S. dominance

https://www.cnbc.com/2025/01/24/how-chinas-new-ai-model-deepseek-is-threatening-us-dominance.html

[3] China’s DeepSeek AI Moves the Capital of Tech from Palo Alto to Hangzhou

https://www.unz.com/mwhitney/chinas-deepseek-ai-moves-the-capital-of-tech-from-palo-alto-to-hangzhou/

[4] US needs to rethink

https://www.douyin.com/video/7464877444583935232

[5] Reasoning

https://www.douyin.com/video/7464856751351844106

[6] Reasoning

https://www.douyin.com/video/7464155074827226380

[7] Tech CEOs sound alarm on ByteDance, DeepSeek breakthroughs

https://www.cnbc.com/video/2025/01/23/tech-ceos-sound-alarm-on-bytedance-deepseek-breakthroughs.html?

[8] Some good technical information

https://www.douyin.com/video/7465331936261573928

[9] OpenAI’s ChatGPT is costing the company a shocking amount of money

https://www.tweaktown.com/news/91375/openais-chatgpt-is-costing-the-company-shocking-amount-of-money/index.html

[10] Why China’s DeepSeek is putting America’s AI lead in jeopardy

https://www.cnbc.com/video/2025/01/24/why-chinas-deepseek-is-putting-americas-ai-lead-in-jeopardy.html?

[11] Why everyone in AI is freaking out about DeepSeek

[12] Everyone laughed

https://www.douyin.com/video/7464588677419633920

[13] Rich man’s hobby

https://www.douyin.com/video/7464588677419633920

[14] Deep seek

https://www.douyin.com/video/7464679744240422203

[15] Deepseek was for trading

https://www.douyin.com/video/7464889768942390587

[16] Side project

https://www.douyin.com/video/7464585655260073274

[17] Selloff

https://www.douyin.com/video/7464758265059101964

[18] Stocks and dollar dropped

https://www.douyin.com/video/7464647992138681612

[19] Deepseek

https://www.douyin.com/video/7464779585578650895

[20] Deepseek

https://www.douyin.com/video/7465040429461900580

[21] Deep seek

https://www.douyin.com/video/7464205121250020668

[22] Alibaba model

https://www.douyin.com/video/7465919508784450874

[23] More Chinese AI

https://www.douyin.com/video/7465590719005183295

This article may contain copyrighted material, the use of which has not been specifically authorised by the copyright owner. This content is being made available under the Fair Use doctrine, and is for educational and information purposes only. There is no commercial use of this content.
