DeepSeek: Everything you need to know about the AI chatbot app

DeepSeek has gone viral.

Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts. DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists alike to question whether the U.S. can maintain its lead in the AI race and whether the demand for AI chips will hold up.

But where did DeepSeek come from, and how did it rise to international fame so quickly?

DeepSeek’s trader origins

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Wenfeng, who reportedly began dabbling in trading while a student at Zhejiang University, launched High-Flyer Capital Management as a hedge fund in 2019, centered on developing and deploying AI algorithms.

In 2023, High-Flyer launched DeepSeek as a lab dedicated to researching AI tools, separate from its financial business. With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek.

From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100, a chip available to U.S. companies.

DeepSeek’s technical team is said to skew young. The company reportedly recruits doctorate AI researchers aggressively from top Chinese universities. DeepSeek also hires people without any computer science background to help its tech better understand a wide range of subjects, per The New York Times.

DeepSeek’s strong models

DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. But it wasn’t until last spring, when the startup released its next-gen DeepSeek-V2 family of models, that the AI industry started to take notice.

DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well in various AI benchmarks and was far cheaper to run than comparable models at the time. It forced DeepSeek’s domestic competition, including ByteDance and Alibaba, to cut the usage prices for some of their models and make others completely free.

DeepSeek-V3, launched in December 2024, only added to DeepSeek’s notoriety.

According to DeepSeek’s internal benchmark testing, DeepSeek-V3 outperforms both downloadable, openly available models like Meta’s Llama and “closed” models that can only be accessed through an API, like OpenAI’s GPT-4o.

Equally impressive is DeepSeek’s R1 “reasoning” model. Released in January, R1 performs as well as OpenAI’s o1 model on key benchmarks, DeepSeek claims.

Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer, usually seconds to minutes longer, to arrive at solutions compared to a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

There is a downside to R1, DeepSeek-V3, and DeepSeek’s other models, however. Being Chinese-developed AI, they’re subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 won’t answer questions about Tiananmen Square or Taiwan’s autonomy.

A disruptive approach

If DeepSeek has a business model, it’s not clear what that model is, exactly. The company prices its products and services well below market value and gives others away for free.

The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts dispute the figures the company has supplied, however.

Whatever the case may be, developers have taken to DeepSeek’s models, which aren’t open source as the phrase is commonly understood but are available under permissive licenses that allow for commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.

DeepSeek’s success against larger and more established rivals has been described as “upending AI” and ushering in “a new era of AI brinkmanship.” The company’s success was at least in part responsible for causing Nvidia’s stock price to drop by 18% on Monday, and for eliciting a public response from OpenAI CEO Sam Altman.

As for what DeepSeek’s future might hold, it’s not clear. Improved models are a given. But the U.S. government appears to be growing wary of what it sees as harmful foreign influence.
