News Plus 27 Jan 2025 - 10 min read

‘Sputnik moment’: AI arms race takes new twist as China challenges US hegemony; cheap as chips DeepSeek slows Nvidia juggernaut; broader, faster LLM uptake incoming

By Andrew Birmingham - Martech | Ecom | CX Editor

Big tech and the financial markets took a major hit on Monday in the US and Europe, as a generative AI model out of China was touted as matching OpenAI's capabilities at a claimed fraction of the cost and energy requirements. The knock-on effects rippled through energy infrastructure firms and pension funds, with famed tech investor Marc Andreessen suggesting AI’s “Sputnik moment” has arrived. But others suggest cheaper, less energy-intensive large language models will open up a much broader market – enabling much more rapid and widespread adoption by businesses globally. Share market reaction was brutal, with Nvidia losing $560bn in market capitalisation, the largest rout in market history.

What you need to know:

  • A new low-cost, open-source large language model out of China wiped hundreds of billions of dollars from tech and energy firms overnight.
  • DeepSeek is far less reliant on high-grade Nvidia chips than rivals like OpenAI, but apparently offers comparable LLM capability.
  • Hence Nvidia, Microsoft, Amazon, Google, Meta and other tech majors lost value, and firms involved in energy infrastructure were likewise hit.
  • But the longer-term upshot could be net positive – cheaper AI means more people using it for more tasks – and prominent tech investors like Marc Andreessen are praising DeepSeek R1 as a significant breakthrough.
  • DeepSeek R1 is available as an open-source model under the MIT license, enabling free commercial and academic use.
  • The service outperformed OpenAI's ChatGPT, Anthropic's Claude and Meta's Llama in independent third-party testing.
  • However, DeepSeek R1 has functional limitations, including “hallucinatory experiences” and the usual restrictions on content the Chinese Government doesn't like.

DeepSeek R1 is one of the most amazing and impressive breakthroughs I’ve ever seen — and as open source, a profound gift to the world.

Marc Andreessen, General Partner, Andreessen Horowitz

Famed tech investor Marc Andreessen calls it "AI's Sputnik moment." But given the epicentre of the consequences is Silicon Valley, not Cape Canaveral, an earthquake analogy is more apt: the sands didn't just shift in Silicon Valley this week, the tectonic plates crunched hard. The lab behind DeepSeek R1, a large language model out of China, has revealed how it built OpenAI-equivalent generative capabilities at – it claims – a fraction of the cost, with far lower energy requirements and far less reliance on high-grade Nvidia chips.

For customers of generative AI the long-term outlook here is vastly cheaper technology, vastly accelerated innovation, and maybe a serious non-nuclear solution to generative AI's junkie-like addiction to energy consumption and its consequent carbon emissions.

The impacts – as seen by stock market movements that wiped hundreds of billions of dollars off the values of chipmakers, tech giants and energy infrastructure firms and will now have pension fund managers mulling defensive options – are wide. Bloomberg called Nvidia's $560bn decline "The largest rout in market history."

And the story continues to develop this morning, with DeepSeek releasing another open-source image-generating AI model, Janus-Pro-7B, which outperforms OpenAI's DALL-E and Stable Diffusion against third-party benchmarks, according to Linas Beliunas, a board member of fintech Flutterwave.

According to Beliunas, "Janus-Pro is a new series of LLMs with image and text input and image and text output. Runs conveniently in consumer GPUs with 1B and 7B parameters. The best part? It is fully free and the entire project is Open Source. This means you can fine-tune this for any multimodal environment."

And at roughly the same time Salesforce CEO Marc Benioff tweeted, "Deepseek is now #1 on the AppStore, surpassing ChatGPT—no NVIDIA supercomputers or $100M needed. The real treasure of AI isn’t the UI or the model—they’ve become commodities. The true value lies in data and metadata, the oxygen fueling AI’s potential. The future’s fortune? It’s in our data. Deepgold."

In the ensuing chat on Twitter, Benioff was asked if he believed DeepSeek's claims about costs – at which point Elon Musk inserted himself into the conversation, tweeting, "lmao, no" and sounding eerily like all those newspaper publishers and music executives circa 1999 who didn't so much miss the rise of the internet as steadfastly refuse to accept the possibility of its existence.

Deep impact

DeepSeek has released its DeepSeek-R1 model as open-source under the MIT license, permitting free commercial and academic use. For customers, that potentially translates into high double-digit savings.

But there are some functional limitations – early users Mi3 spoke with reported a “hallucinatory experience” similar to the early days of ChatGPT. Plus, there are the usual limitations of Chinese digital platforms, as a quick search on "Tiananmen Square Massacre" immediately reveals. While financial markets reacted with a flurry of selloffs, it may be more useful for now to think of DeepSeek as a marker on the road to massive LLM commoditisation.

Current market leaders like OpenAI, Anthropic, and Perplexity are primarily closed, proprietary systems that exploit some open-source capabilities.

Habibullah Khan, CEO at digital product design studio Penumbra – which helps build AI-powered products – told Mi3 the best way to understand what DeepSeek has done is to imagine "driving a Ferrari at the cost of a Toyota Corolla. This is what DeepSeek has accomplished".

Per Khan, there are essentially two types of computational tasks related to generative AI: training and inference. "Training refers to training the AI model on large data sets so that it is essentially “correct”, i.e. the difference between its predictions and actual outputs is minimal. Once AI is trained it can be used, that is, generate answers to a user’s queries. OpenAI’s GPT-4 cost US$80m to train; DeepSeek cost US$6m." (At least that's the claim, though some analysts question the numbers.)
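As a rough illustration of the distinction Khan is drawing – a minimal toy sketch, nothing like a production LLM – training iteratively adjusts a model's weights until predictions match actual outputs, while inference simply applies the frozen weights to new queries:

```python
# Toy sketch of "training vs inference" for a one-parameter linear model.
# Illustrative only -- real LLMs fit billions of parameters on web-scale text.

def predict(weight: float, x: float) -> float:
    return weight * x

def train(data: list[tuple[float, float]], steps: int = 1000, lr: float = 0.01) -> float:
    """Training: repeatedly nudge the weight so prediction error shrinks."""
    w = 0.0
    for _ in range(steps):
        for x, target in data:
            error = predict(w, x) - target  # gap between prediction and actual
            w -= lr * error * x             # gradient step to close that gap
    return w

# Training phase: expensive, done once over a large dataset.
w = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])  # learns w ~= 2

# Inference phase: cheap, repeated per user query with the frozen weight.
print(predict(w, 10.0))  # ~20.0
```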

Inference, meanwhile, is where AI models apply learned patterns from the training stage to real-world user queries. "This cost for DeepSeek is also minimal, roughly 1/70th (based on DeepSeek v2 vs ChatGPT-4 Turbo), allowing them to offer DeepSeek for free," said Khan. "Building an AI application for a marketing stack using DeepSeek AI is 27x cheaper than ChatGPT. How is this possible? Basically, this is because of US sanctions, the law of unintended consequences and classic Chinese ingenuity."
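Taken at face value – these are Khan's quoted figures, not audited numbers – the back-of-envelope arithmetic behind those claims looks something like this:

```python
# Back-of-envelope sketch of Khan's cost claims. Every input below is a
# figure as quoted in this piece, not a verified number.

gpt4_training_cost = 80_000_000     # claimed cost to train GPT-4 (US$)
deepseek_training_cost = 6_000_000  # claimed cost to train DeepSeek (US$)
print(f"Training: {gpt4_training_cost / deepseek_training_cost:.1f}x cheaper")  # ~13.3x

inference_ratio = 1 / 70  # claimed DeepSeek inference cost vs ChatGPT-4 Turbo
app_saving = 27           # claimed saving for a marketing-stack application
print(f"Inference: {inference_ratio:.4f} of the per-query cost")
print(f"Application level: {app_saving}x cheaper than building on ChatGPT")
```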

Being forced to make do with cheaper, less powerful chips pushed DeepSeek to innovate, Khan added. It ended up figuring out better architectures to train and reason with limited computing power.

"Another reason is wage costs. While the average OpenAI AI researcher cost is US$150,000-plus – and it is not unusual to find senior researchers being paid US$1,000,000-plus, DeepSeek uses mostly fresh graduates, or those early in their AI career with ability preferred over experience.

"This is as important as DeepSeek’s ability to innovate architecturally. There is a belief that “culture” related to innovation and tech is a Silicon Valley advantage over Chinese similarly innovating. It seems no longer true," Khan added. 

"Lastly, the Chinese are using OpenSource to power DeepSeek which lowers their input costs via licensing and forces them to innovate as the optimisation relies less on hardware and more on software using a collaborative community-based approach."

While DeepSeek has emerged as a talisman for the power of open-source technology, OpenAI’s ChatGPT is primarily built on proprietary software, meaning the core technologies (like its fine-tuned GPT models) are not openly shared. OpenAI does leverage some open-source capabilities – such as the Python programming language and machine learning libraries like PyTorch – but it has recently leaned more towards a closed-source, commercial model.

Deep sixed

According to Charles-Henry Monchau, chief investment officer at Syz Group, a Swiss-based independent investment firm with almost US$30bn under management, "DeepSeek ... unveiled a free, open-source large-language model in late December that it says took only two months and less than $6 million to build, using reduced-capability chips from Nvidia called H800s."

Meanwhile, a comparison published by Cody, which provides AI services to the financial industry, found: "DeepSeek R1 offers a significantly more affordable option, costing only 2 per cent of what users would spend on OpenAI O1."

The firm says when it tested both OpenAI and DeepSeek on SQL query generation for data analysis using historical SPY investments – SPY being one of the largest and most widely traded exchange-traded funds (ETFs) in the world – "both DeepSeek R1 and OpenAI O1 demonstrated high accuracy. However, R1 showed an edge in cost efficiency, sometimes providing more insightful answers, such as including ratios for better comparisons."

Cody noted that both models excelled in generating algorithmic trading strategies. "DeepSeek R1’s strategies showed promising results, outperforming the S&P 500 and maintaining superior Sharpe and Sortino ratios (risk-adjusted return calculations) compared to the market."
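For readers unfamiliar with the two ratios Cody cites: both measure return earned per unit of risk taken, with Sortino penalising only downside swings. A minimal sketch of the standard formulas (illustrative only, not Cody's actual test harness):

```python
import statistics

def sharpe_ratio(returns: list[float], risk_free: float = 0.0) -> float:
    """Excess return per unit of total volatility."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)

def sortino_ratio(returns: list[float], risk_free: float = 0.0) -> float:
    """Excess return per unit of downside volatility only --
    upside swings are not penalised as 'risk'."""
    excess = [r - risk_free for r in returns]
    downside = [min(e, 0.0) for e in excess]
    downside_dev = (sum(d * d for d in downside) / len(downside)) ** 0.5
    return statistics.mean(excess) / downside_dev

# Hypothetical monthly strategy returns, purely for illustration.
monthly = [0.02, -0.01, 0.03, 0.015, -0.005]
print(sharpe_ratio(monthly), sortino_ratio(monthly))
```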

However, there was also a cautionary note. "DeepSeek R1 isn’t without its challenges. The model occasionally generated invalid SQL queries and experienced timeouts. These issues were often mitigated by R1’s self-correcting logic, but they highlight areas where the model could be improved to match the consistency of more established competitors like OpenAI O1."

But what really seems to have caught the Silicon Valley Broligarchy off guard are third-party benchmarking results showing DeepSeek outperforming OpenAI's ChatGPT, Anthropic's Claude and Meta's Llama – and last week's release of DeepSeek R1, a reasoning model built on bargain-basement, open-source infrastructure, which likewise outperformed OpenAI's o1 model in third-party tests.

I think it's largely due to forcing function. The Chinese have had no chips — so have been forced to radically improve the data architecture itself and the cost of inference to achieve a competitive model. This is why heavily funded startups and heavily funded businesses rarely achieve innovation breakthroughs.

Henry Innis, Cofounder, Mutinex

Mutinex co-founder Henry Innis has hardwired an AI capability into the firm's market mix modelling platform to help marketers interpret data and suggest next best actions. He told Mi3 DeepSeek's innovation carries all the hallmarks of invention driven by necessity.  

"I think it's largely due to forcing function. The Chinese have had no chips — so have been forced to radically improve the data architecture itself and the cost of inference to achieve a competitive model. This is why heavily funded startups and heavily funded businesses rarely achieve innovation breakthroughs, " he said, suggesting start-ups "always do better at innovating with a culture of frugality ... I suspect Silicon Valley has forgotten this lesson in the AI hype."

$6m question

Innis questions DeepSeek's claim around cost, especially when considering the engineering work required for R&D.

"There's a few issues here I am sceptical about. I don't trust any of the nonsense that DeepSeek spent $6mn. Frankly, that sounds like bullshit. You will barely get 12 top-flight engineers at that price. AI is heavily politicised and both the US and China are fighting here. Both sides will be underwriting, to some degree, the cost."

But in terms of optimisation, Innis said DeepSeek's use of reinforcement learning (a type of machine learning) to improve the model's decisions and reward it for successful behaviour looks to be the correct approach. "This is (somewhat) an efficiency improvement versus a leap forward. Efficiency is great — it opens up the market further — but it's not a leap forward, and I think that is why it is being open-sourced (so that DeepSeek can land on the map and start pushing a non-US agenda once it has serious breakthroughs)."
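In outline, the reinforcement-learning approach Innis refers to rewards the model when its output passes a check and steers future behaviour toward whatever earned the reward. A heavily simplified toy version of that loop (a bandit-style sketch for illustration – DeepSeek's actual training pipeline is far more involved):

```python
import random

# Toy reinforcement-learning loop: the "model" picks among candidate answers
# and is rewarded when a verifiable check passes, shifting future choices
# toward rewarded behaviour.

answers = ["correct", "wrong_a", "wrong_b"]
scores = {a: 0.0 for a in answers}  # learned preference per answer

def reward(answer: str) -> float:
    return 1.0 if answer == "correct" else 0.0  # verifiable reward signal

for _ in range(500):
    # Explore occasionally; otherwise exploit the best-scored answer so far.
    if random.random() < 0.1:
        choice = random.choice(answers)
    else:
        choice = max(scores, key=scores.get)
    # Move the chosen answer's score toward the reward it just received.
    scores[choice] += 0.1 * (reward(choice) - scores[choice])

print(scores)  # "correct" ends up with by far the highest score
```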

While it's tempting to view this through the prism of USA v China in the AI arms race – and there's some fairness to doing that, given the US stock rally and big tech valuations are wrapped up in tech leadership – this is really a story as old as the tech industry itself: open source vs proprietary solutions. To that end, DeepSeek has also released a paper describing how it works.

Challenging assumptions

Peter van der Putten, Director AI Lab and Lead Scientist at Pegasystems, said that in the grand scheme of things, DeepSeek R1 along with competing reasoning models "challenge the assumption that we have run out of data to create better models as ‘we have used the whole internet’. Reasoning models such as R1, o1 and o3 make clever use of compute at test time, i.e. when you ask the question, rather than just at training time, by aiming to reason towards finding a solution".
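One common way to "use compute at test time", as van der Putten puts it, is to sample several candidate reasoning chains for a question and keep the highest-scoring one. A minimal best-of-N sketch of that generic pattern (an illustration of the idea, not the specific mechanism inside R1, o1 or o3):

```python
import random

def generate_candidate(question: str) -> tuple[str, float]:
    """Stand-in for an LLM sampling one reasoning chain.
    Returns (answer, score from some verifier or reward model)."""
    answer = f"candidate answer #{random.randint(1, 100)} to {question!r}"
    return answer, random.random()

def answer_with_test_time_compute(question: str, n: int = 8) -> str:
    # More samples (larger n) = more compute spent per question =
    # better odds of landing on a high-quality reasoning chain.
    candidates = [generate_candidate(question) for _ in range(n)]
    best_answer, _ = max(candidates, key=lambda c: c[1])
    return best_answer

print(answer_with_test_time_compute("What drives DeepSeek's efficiency?"))
```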

According to van der Putten, "From an AI research perspective, the most interesting model actually is the DeepSeek-R1-Zero model, which didn’t end up performing as well as hoped. Like any model, it is pre-trained on human output (i.e. large portions of the web), but from there on it was supposedly trained without explicit so-called supervised fine-tuning, i.e. explicit human feedback, but rather using reasoning output to further refine the model. In theory, this could be an avenue towards scaling model performance further."

"This obviously would require balancing efficiency on the other hand, through using parameters at lower precision (quantization), teaching simpler models with more expensive and capable models (distillation), breaking up the network architecture (mixture of experts) etc."

In that sense, we are far from done yet, he believes. "Even beyond reasoning at training or test time, once models get deployed in the real world, through online services, in mobile apps or even in physical objects such as cars or robots, there is endlessly more data – assuming consent – that could be used to further scale these models. And even that is just half of the problem, the other one is where could or should we use these models to drive responsible impact – what is ‘good use’ across all dimensions?"

Which is probably the $6 trillion question.

Unintended consequence

Marc Andreessen, general partner at Andreessen Horowitz, posted on X, "DeepSeek R1 is one of the most amazing and impressive breakthroughs I’ve ever seen — and as open source, a profound gift to the world."

He followed up a few hours later with another post that appeared to be an allusion to the inauguration of Donald Trump as US President, writing, "This week may have been the most important week of the decade, for two totally different reasons."

Trump can at least put his hand on his heart and honestly argue that what appears to be a massive step change out of China, rather than Silicon Valley, is not of his doing.

Instead, it's the law of unintended consequence in action.

Trump's predecessor, Joe Biden, introduced significant export controls targeting China's advanced computing and semiconductor sectors in October 2022 in an attempt to restrict China's access to the kind of critical technologies essential for artificial intelligence and supercomputing development.

As a result, Chinese companies leaned very heavily into open-source development to circumvent the ban.

According to Syz Group's Monchau, "The new developments have raised alarms on whether America’s global lead in artificial intelligence is shrinking and called into question big tech’s massive spend on building AI models and data centres."

The business model of the Bay Area is not what most see. I see regulatory entrepreneurship versus innovation. By accident, sure, some occasional, and random success occurs, but the game is to find legal grey areas, exploit them, then change the law.

Noah Gift, Founder, Pragmatic AI Labs

Stargate dampener

That big spending was in full view at the end of last week, with news of the $500bn investment in The Stargate Project by OpenAI, Oracle, SoftBank and MGX. Those are the kinds of extraordinary investment levels that the performance of DeepSeek in independent third-party benchmark testing now calls into question.

Noah Gift, the founder of Pragmatic AI Labs, told Mi3: "The business model of the Bay Area is not what most see. I see regulatory entrepreneurship vs innovation. By accident, sure, some occasional, and random success occurs, but the game is to find legal grey areas, exploit them, then change the law."

He says that for consumers many of the products that are "disrupting" business are worse than alternatives. "A good example is that you can walk to the light rail station in Seattle and take the train downtown for a few dollars, but an Uber costs $80. Is it innovation or a scam?"

"Similarly, initial 'innovation' is an attempt at regulatory capture vs value. The real value appears to be in [open source] companies like AllenAI, DeepSeek, HuggingFace, and Mistral."

A number of commentators have flagged how the experience of DeepSeek raises questions about the competitive moat for the current LLM leaders.

Per Gift: "I see no competitive advantage – and profit going to zero because anyone can do this."

Gift has been ringing the bell of the potential commoditisation of LLM infrastructure on forums such as LinkedIn for at least 18 months. "I guessed LLMs would be a commodity and see it accelerating even faster than I thought," he said. "In five years, we may not be talking about Artificial General Intelligence at all, simply how to use LLMs in automation."

The author of six books on automation over the last 20 years, Gift is also sceptical about the grandiose claims of companies like OpenAI, which is currently trying to raise billions off the back of promises of artificial general intelligence, or AGI: "All I see is just another tool in automation that is interesting, but not even close to AGI."

Energy upside

In the meantime, those claiming that the US has insufficient power to handle OpenAI's ambitions – with Google and others now scrambling to build mini nuclear reactors to deliver the power AI is predicted to require – may have found a solution. Just not the solution they had imagined.

“DeepSeek’s breakthrough signals a shift toward efficiency in AI, which will redefine both energy and AI markets. The opportunities for investors willing to act now are enormous," according to Nigel Green, CEO of global financial advisory deVere Group.

According to Green: "DeepSeek’s model combines cutting-edge algorithms to slash the energy demands of AI training and deployment. This challenges the assumption that AI’s growth is tied to ever-increasing energy consumption. While the market is reacting to short-term uncertainty, efficiency-driven AI models will expand adoption into new markets and industries."

"Efficiency doesn’t shrink demand – it diversifies it ... This means more widespread use, deeper integration and, ultimately, sustained demand for energy solutions."

Green backed renewable energy firms – particularly those "at the intersection of AI and clean energy" – to be among the ultimate winners.
