A common perception in the AI world is that developing advanced large language models (LLMs) requires significant technical and financial resources. That is one of the main reasons why the US government pledged to support the $500 billion Stargate project announced by US President Donald Trump.
But Chinese AI developer DeepSeek has upended that notion. On January 20, 2025, DeepSeek released its R1 LLM, developed at a fraction of the cost that other vendors have incurred in their own development efforts. DeepSeek has also released its R1 models under an open source license, making them free to use.
Within days of its release, the DeepSeek AI assistant, a mobile app that provides a chatbot interface for DeepSeek R1, overtook OpenAI's ChatGPT mobile app at the top of Apple's App Store. DeepSeek's surge in usage and popularity sent stock markets into a tailspin as investors questioned the value of US-based AI players, including Nvidia. Microsoft, Meta Platforms, Oracle, Broadcom and other tech giants also saw sharp declines as AI valuations were reassessed.
What is DeepSeek?
DeepSeek is an AI development company based in Hangzhou, China. The company was founded in May 2023 by Liang Wenfeng, a graduate of Zhejiang University. Liang also co-founded High-Flyer, a China-based quantitative hedge fund that owns DeepSeek. Currently, DeepSeek operates as an independent AI research lab funded by High-Flyer. DeepSeek's total funding and valuation have not been publicly disclosed.
DeepSeek focuses on developing open source LLMs. The company released its first model in November 2023 and has since iterated on its core LLM several times, building a number of variants. However, it was not until January 2025, after the release of the R1 reasoning model, that the company became known worldwide.
The company offers several ways to access its models, including web interfaces, mobile apps, and API access.
OpenAI and DeepSeek
DeepSeek represents the latest challenge to OpenAI, which established itself as an industry leader with the launch of ChatGPT in 2022. OpenAI has helped drive generative AI forward with its GPT models and its o1 reasoning model.
The two companies take different approaches to developing LLMs.
Category | OpenAI | DeepSeek |
Founded | 2015 | 2023 |
Headquarters | San Francisco, California | Hangzhou, China |
Development focus | Broad AI capabilities | Efficient, open source models |
Main models | GPT-4o, o1 | DeepSeek-V3, DeepSeek-R1 |
Specialized models | DALL-E (image generation), Whisper (speech recognition) | DeepSeek Coder (coding), Janus-Pro (vision model) |
API pricing (per million tokens) | o1: $15 (input), $60 (output) | DeepSeek-R1: $0.55 (input), $2.19 (output) |
Open source policy | Limited | Mostly open source |
Training methods | Supervised and instruction-based fine-tuning | Reinforcement learning |
Development costs | Hundreds of millions of dollars for o1 (estimated) | Just under $6 million for DeepSeek-R1, according to the company |
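To make the API price gap in the comparison above concrete, the following is a minimal Python sketch that estimates what the same workload would cost on each model. The per-million-token prices come from the comparison; the `api_cost` helper and the workload size are hypothetical and chosen only for illustration.

```python
# Per-million-token API prices in USD, taken from the comparison above.
PRICES = {
    "o1": {"input": 15.00, "output": 60.00},
    "DeepSeek-R1": {"input": 0.55, "output": 2.19},
}


def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 1 million input tokens and 1 million output tokens.
for model in PRICES:
    cost = api_cost(model, input_tokens=1_000_000, output_tokens=1_000_000)
    print(f"{model}: ${cost:.2f}")
```

For this assumed workload, the sketch works out to about $75 for o1 and about $2.74 for DeepSeek-R1. Actual bills depend on the mix of input and output tokens and on current vendor pricing.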
Training innovations at DeepSeek
DeepSeek trains its R1 models differently than OpenAI trains its own. The approach requires less training time, fewer AI accelerators, and lower development costs. DeepSeek's stated goal is to achieve artificial general intelligence, and the company's advances in reasoning capabilities represent significant progress in AI development.
In a research paper, DeepSeek described several innovations the company developed as part of the R1 model, including the following:
- Reinforcement learning. DeepSeek used a large-scale reinforcement learning approach focused on reasoning tasks.
- Reward engineering. Researchers developed a rule-based reward system for the model rather than relying on the more commonly used neural reward models. Reward engineering is the process of designing the incentive system that guides an AI model's learning during training.
- Distillation. Using efficient knowledge transfer techniques, DeepSeek researchers compressed capabilities into models as small as 1.5 billion parameters.
- Emergent behavior network. DeepSeek's emergent behavior innovation is the discovery that complex reasoning patterns can develop naturally through reinforcement learning without being explicitly programmed.
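As a rough illustration of the rule-based reward idea described above, the Python sketch below scores a model response with fixed rules instead of a learned reward model. It is not DeepSeek's implementation: the `<think>` and `<answer>` tags, the exact-match check, and the reward weights are all assumptions made for this example.

```python
import re

# Assumed output format for illustration only: reasoning inside
# <think>...</think>, then the final answer inside <answer>...</answer>.
RESPONSE_PATTERN = re.compile(
    r"^<think>(?P<reasoning>.+?)</think>\s*<answer>(?P<answer>.+?)</answer>$",
    re.DOTALL,
)

FORMAT_REWARD = 0.5    # Arbitrary weight for following the format.
ACCURACY_REWARD = 1.0  # Arbitrary weight for a correct final answer.


def rule_based_reward(response: str, reference_answer: str) -> float:
    """Score a response with deterministic rules (hypothetical helper)."""
    match = RESPONSE_PATTERN.match(response.strip())
    if match is None:
        return 0.0  # Format rule failed, so no reward at all.
    reward = FORMAT_REWARD
    # Accuracy rule: exact string match against a known reference answer.
    if match.group("answer").strip() == reference_answer.strip():
        reward += ACCURACY_REWARD
    return reward


good = "<think>2 + 2 equals 4.</think><answer>4</answer>"
wrong = "<think>2 + 2 equals 5.</think><answer>5</answer>"
print(rule_based_reward(good, "4"))
print(rule_based_reward(wrong, "4"))
print(rule_based_reward("The answer is 4.", "4"))
```

Because every rule here is a plain, deterministic check, the score is cheap to compute and cannot drift the way a learned reward model might. The tradeoff is that it only applies to tasks where correctness can be verified mechanically.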
DeepSeek large language models
Since its founding in 2023, DeepSeek has released a series of AI models. With each new generation, the company has worked to improve both performance and efficiency.
- DeepSeek Coder. Released in November 2023, this is the company's first open source model designed specifically for coding tasks.
- DeepSeek LLM. Released in December 2023, this is the first version of the company's general-purpose model.
- DeepSeek-V2. Released in May 2024, this is the second version of the company's LLM, focused on strong performance and lower training costs.
- DeepSeek-Coder-V2. Released in July 2024, this 236 billion-parameter model offers a context window of 128,000 tokens and is designed for complex coding challenges.
- DeepSeek-V3. Released in December 2024, DeepSeek-V3 uses a mixture-of-experts architecture capable of handling a range of tasks. The model has 671 billion parameters and a context length of 128,000 tokens.
- DeepSeek-R1. Released in January 2025, this model is based on DeepSeek-V3 and focuses on advanced reasoning tasks, competing directly with OpenAI's o1 model in performance while costing significantly less. Like DeepSeek-V3, the model has 671 billion parameters and a context length of 128,000 tokens.
- Janus-Pro-7B. Released in January 2025, Janus-Pro-7B is a vision model that can understand and generate images.
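The expert-based architecture noted for DeepSeek-V3 is generally known as mixture-of-experts: a router scores a set of expert sub-networks and only the highest-scoring few process each input. The toy Python sketch below shows that general routing idea with scalar inputs. It is not DeepSeek-V3's actual design; the expert functions, router scores, and `top_k` value are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, experts, router_scores, top_k=2):
    """Send x to the top_k highest-scoring experts and mix their outputs."""
    # Indices of the top_k experts, ranked by router score.
    ranked = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    # Normalize the scores of the chosen experts only.
    weights = softmax([router_scores[i] for i in chosen])
    # Only the chosen experts run; the others are skipped entirely.
    return sum(w * experts[i](x) for w, i in zip(weights, chosen))


# Toy experts: each one is a simple function of a scalar input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
# Hypothetical router scores; a real router computes these from the input.
router_scores = [0.1, 2.0, 2.0, -1.0]
output = moe_forward(3.0, experts, router_scores, top_k=2)
print(output)
```

Because only `top_k` experts run per input, a model built this way can hold a very large total parameter count while using a fraction of it for any single token, which is one reason the approach is associated with lower compute costs.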
Why is it causing alarm in the United States?
DeepSeek-R1 caused a stir in the United States and triggered a sell-off in technology stocks. On Monday, January 27, 2025, the Nasdaq Composite fell 3.4% at the market open, while Nvidia fell 17%, losing about $600 billion in market capitalization.
DeepSeek is causing alarm in the United States for several reasons, including:
- Cost disruption. DeepSeek claims to have developed its R1 model for less than $6 million. Such low-cost development threatens the business model of US tech companies that have invested billions of dollars in AI. DeepSeek is also cheaper for users than OpenAI.
- Technical achievement despite restrictions. US exports of high-performance AI accelerator chips and GPUs to China are restricted. However, DeepSeek has shown that advanced AI can be developed without access to the most advanced US technology.
- Business model threat. Unlike OpenAI's technology, which is proprietary, DeepSeek is open source and free, challenging the revenue model of US companies that charge monthly fees for AI services.
- Geopolitical concerns. China-based DeepSeek challenges US tech dominance in AI. Tech investor Marc Andreessen called it AI's "Sputnik moment," comparing it to the Soviet Union's early space race successes in the 1950s.
DeepSeek cyberattack
DeepSeek's popularity has not gone unnoticed by cyberattackers.
On January 27, 2025, DeepSeek reported large-scale malicious attacks on its services, forcing the company to temporarily restrict new user registrations. The attacks came as DeepSeek's AI assistant app surpassed ChatGPT to become the most downloaded app on the Apple App Store.
Despite the attacks, DeepSeek maintained service for existing users. The problem persisted into January 28, when the company said it had identified the issue and deployed a fix.
Public reports have suggested that the incident was some form of DDoS attack targeting DeepSeek's API and web chat platform, but DeepSeek has not specified the exact nature of the attack.
DeepSeek data exposed
On January 29, 2025, Wiz Research, a team within cloud security provider Wiz Inc., reported the discovery of a publicly accessible back-end database that exposed sensitive information on the web. The information included DeepSeek chat history, back-end data, log streams, API keys, and operational details. Shortly after the announcement, DeepSeek took the database offline. It is unclear how long the database had been exposed.
Sean Michael Kerner is an IT consultant, technology enthusiast, and tinkerer. He has pulled Token Ring, configured NetWare, and been known to compile his own Linux kernel. He consults with industry and media organizations on technology issues.