DeepSeek has gone viral.
Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts (and Google Play, as well). DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists alike to question whether the U.S. can maintain its lead in the AI race and whether the demand for AI chips will hold up.
But where did DeepSeek come from, and how did it rise to international fame so quickly?
DeepSeek’s trader origins
DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.
AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Wenfeng, who reportedly began dabbling in trading while a student at Zhejiang University, launched High-Flyer Capital Management in 2019 as a hedge fund focused on developing and deploying AI algorithms.
In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools, separate from its financial business. With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek.
From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100, a chip available to U.S. companies.
DeepSeek’s technical team is said to skew young. The company reportedly recruits doctorate AI researchers from top Chinese universities aggressively. DeepSeek also hires people without any computer science background to help its tech better understand a wide range of subjects, per The New York Times.
DeepSeek’s strong models
DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. But it wasn’t until last spring, when the startup released its next-gen DeepSeek-V2 family of models, that the AI industry started to take notice.
DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well in various AI benchmarks and was far cheaper to run than comparable models at the time. It forced DeepSeek’s domestic competition, including ByteDance and Alibaba, to cut the usage prices for some of their models and make others completely free.
DeepSeek-V3, launched in December 2024, only added to DeepSeek’s notoriety.
According to DeepSeek’s internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models like Meta’s Llama and “closed” models that can only be accessed through an API, like OpenAI’s GPT-4o.
Equally impressive is DeepSeek’s R1 “reasoning” model. DeepSeek claims that R1, released in January, performs as well as OpenAI’s o1 model on key benchmarks.
Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer, usually seconds to minutes longer, to arrive at solutions compared with a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.
There is a downside to R1, DeepSeek V3, and DeepSeek’s other models, however. Being Chinese-developed AI, they’re subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 won’t answer questions about Tiananmen Square or Taiwan’s autonomy.
A disruptive approach
If DeepSeek has a business model, it’s not clear what that model is, exactly. The company prices its products and services well below market value and gives others away for free. It’s also not taking investor money, despite a ton of VC interest.
The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts dispute the figures the company has supplied, however.
Whatever the case may be, developers have taken to DeepSeek’s models, which aren’t open source as the phrase is commonly understood but are available under permissive licenses that allow for commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.
DeepSeek’s success against larger and more established rivals has been described as both “upending AI” and “over-hyped.” The company’s success was at least in part responsible for causing Nvidia’s stock price to drop by 18% in January, and for eliciting a public response from OpenAI CEO Sam Altman.
Microsoft announced that DeepSeek is available on its Azure AI Foundry service, Microsoft’s platform that brings together AI services for enterprises under a single banner. When asked about DeepSeek’s impact on Meta’s AI spending during its first-quarter earnings call, CEO Mark Zuckerberg said spending on AI infrastructure will continue to be a “strategic advantage” for Meta. In March, OpenAI called DeepSeek “state-subsidized” and “state-controlled,” and recommended that the U.S. government consider banning models from DeepSeek.
During Nvidia’s fourth-quarter earnings call, CEO Jensen Huang emphasized DeepSeek’s “excellent innovation,” saying that it and other “reasoning” models are great for Nvidia because they need so much more compute.
At the same time, some companies are banning DeepSeek, and so are entire countries and governments, including South Korea. New York state also banned DeepSeek from being used on government devices.
As for what DeepSeek’s future might hold, it’s not clear. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence. In March, The Wall Street Journal reported that the U.S. will likely ban DeepSeek on government devices.
This story was originally published January 28, 2025, and will be updated regularly.