DeepSeek: Chinese AI masterstroke

DeepSeek, a relatively unknown Chinese startup based in Hangzhou, has made significant waves in the artificial intelligence world with its DeepSeek-R1 AI model.

This model was reportedly trained in just two months for under $6 million, a remarkably low figure compared with what rival labs have spent.

According to reports, DeepSeek spent around $5.5 million to train its V3 model, a fraction of what US Big Tech companies like Google and OpenAI have invested, with estimates ranging from $70 million to $100 million.

The DeepSeek app has become a sensation, topping the charts as the most downloaded free app on Apple’s App Store in the US, as well as in China.

This rapid rise to prominence has sparked both admiration and concern, with some experts hailing it as a significant breakthrough in AI development.

The model’s capabilities are said to rival those of top US models, and its API costs are substantially lower, at $0.55 per million input tokens and $2.19 per million output tokens, compared to OpenAI’s $15 and $60 respectively.
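Using the per-million-token prices quoted above, the gap compounds quickly at scale. As a rough illustration (the workload figures below are hypothetical, chosen only to show the arithmetic):

```python
def api_cost(input_tokens, output_tokens, in_price, out_price):
    """Cost in USD, given per-million-token input and output prices."""
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

# Hypothetical monthly workload: 10M input tokens, 2M output tokens.
deepseek = api_cost(10_000_000, 2_000_000, 0.55, 2.19)
openai = api_cost(10_000_000, 2_000_000, 15.0, 60.0)

print(f"DeepSeek: ${deepseek:.2f}")  # $9.88
print(f"OpenAI:   ${openai:.2f}")    # $270.00
```

At these quoted rates, the same workload costs more than 25 times as much on the pricier API.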

DeepSeek’s achievement has also triggered a reaction in the US tech industry, with some analysts warning that the company’s success could be a wake-up call for US AI developers.

News of DeepSeek’s progress triggered a selloff that wiped roughly $1 trillion from US and European tech stocks, with Nvidia’s market value taking a significant hit.

However, others see DeepSeek’s success as a demonstration of the potential for more efficient AI development, with some experts noting that the true value of AI lies in data and metadata.

The company’s approach to AI development has been described as innovative, utilizing techniques such as reinforcement learning and multi-token prediction that enable faster and more efficient training and inference.

DeepSeek’s Mixture-of-Experts language model is also notable, activating only the most relevant parameters for each token, rather than keeping all parameters active.

This approach has allowed DeepSeek to achieve impressive results while minimizing memory usage and costs.
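The routing idea behind a Mixture-of-Experts layer can be sketched in a few lines. This is a minimal illustration of top-k expert gating in general, not DeepSeek’s actual architecture; the dimensions and random weights are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Gating network and expert weights (random placeholders for illustration).
gate_w = rng.standard_normal((d_model, n_experts))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

def moe_layer(token):
    """Route one token through only its top-k experts."""
    scores = token @ gate_w                   # gating score per expert
    chosen = np.argsort(scores)[-top_k:]      # indices of the top-k experts
    weights = np.exp(scores[chosen])
    weights /= weights.sum()                  # softmax over the chosen experts
    # Only the chosen experts' parameters are touched for this token;
    # the other experts stay inactive, which is where the savings come from.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, chosen))

out = moe_layer(rng.standard_normal(d_model))
print(out.shape)  # (8,)
```

Because each token activates only `top_k` of the `n_experts` expert networks, compute and memory traffic per token scale with the active subset rather than the full parameter count.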
