“Deep chic model development cost 8.2 billion won exaggerated … The entire cost might be 100 times greater than 100 times.”

-

(Photo = Shutterstock)

Analysts have shown that the event cost of ‘Deep Chic-V3’, which is barely $ 55.7 million, is definitely over $ 500 million (about 72.9 billion won), which is 100 times that of. This is similar amount as Meta’s $ 500 million in developing Rama 3.1.

Semi -Annalism, a semiconductor specialist, announced on Wednesday that the fee of the Deep Chic was exaggerated, and the entire cost of the V3 was “confident that it will exceed $ 500 million.”

Initially, he identified that the fee of investing within the GPU repeatedly by Deep Chic’s parent company, High-Flyer Quantum.

The corporate purchased 10,000 NVIDIA ‘A100’ GPUs in 2021. This was before US export control began, and the A100 was the best performance chip. ‘H100’ was released in 2022 the next yr.

It also said that it has trained the model with the ‘H800’, but Semi -Annalisys will have the option to access various NVIDIA Hopper chips, including 10,000 H800s, 10,000 H100s, that are banned from exports to China, and H20, the present Chinese flagship export chip. I saw it. Actually, Deep Chic boasted that 10,000 GPUs may be used without restrictions in job ads.

The cumulative costs invested by Deep Chic on the server amounted to about $ 1.69 billion, and it costs $ 944 million to operate it.

Deep chic estimates server investment costs for 4 years. The hardware purchase and operation amounted to $ 2.573 billion. (Photo = Sammi Annalrysis)
Deep chic estimates server investment costs for 4 years. The hardware purchase and operation amounted to $ 2.573 billion. (Photo = Sammi Annalrysis)

Labor costs are usually not easy, but promising applicants are known to supply greater than $ 1.3 million. This is way higher than China’s promising startups resembling Moonshot AI. Deep Chic currently has greater than 150 employees and is expanding rapidly.

He also identified that the fee of architecture development for model development was also lost. For instance, ‘Multihead potential Attention (MLA)’ technology, a key technical element of the V3, took just a few months to develop, and the workforce and GPUs are significant.

Thus, the $ 55.7 million that Deep Chic revealed is just the fee of preliminary training of the model, which is barely a part of the model development. He identified that every one vital parts resembling GPU introduction and upgrades, maintenance, R & D, and labor costs are missing. All of that is concluded that it cost greater than $ 500 million.

The Deep Chic-R1, which surpassed O1, identified that the fee itself was not mentioned in any respect.

The R1 was caught up in O1 in three months, and the post -training was decisive, and Deep Chic also focused on RL (reinforcement training). Specifically, it is understood to have used ‘distillation’ technology using O1’s synthetic data. But this will not be mentioned in any respect.

Following the model development cost, the corporate also analyzed the fee of reasoning that fell to 10 minutes and 1 level.

Even though it is a powerful model, there is no such thing as a doubt, but the looks of the V3 is the flow of the best performance model every few months. ‘GPT-4O’, which is in comparison with the V3, is a model that was released seven months ago. As well as, ‘O1’, which is in comparison with ‘R1’, was released in September last yr, three months ago.

It is not uncommon sense that reasoning costs decrease over time attributable to the event of AI technology, and in keeping with estimates, the explanation for reasoning is 4 times per yr by improving algorithms and optimization. Actually, the explanation for reasoning of the GPT-3, which was released in May 2020, has declined 1200 times after 4 years and eight months. As well as, ‘Rama 3.1’, which appeared only two months later than the GPT-4O, had only half of the explanation for reasoning.

It is not uncommon that the fee of reasoning falls half the extent every few months, however the evaluation is emphasized due to the indisputable fact that deep chic got here from toxic China.

MMLU reasoning cost trend (photo = Sammi Annalrysis)
MMLU reasoning cost trend (photo = Sammi Annalrysis)

The exaggeration said that the performance of the V3 can be severe. There may be a greater model in america.

Initially, R1 itself is tough to say that it surpasses O1. The benchmarks released by Deep Chic are a part of the benchmarks that O1 have a superb performance, and R1 has not been ahead of all fields.

The more noticeable model than the R1 emphasized that Google was ‘Considering’, which was released in December. This model is for much longer than the R1 and can be ahead of the benchmark. It was also released without cost in the course of the beta test.

As well as, the ‘O3’ of the open AI, which was announced sooner than R1, is a special class than O1 or R1.

Sammi Annlyseys emphasized that “all of this will not be to scale back the amazing achievements of Deep Chic,” Deep Chic is a start -up that’s fast, clever and sufficient.

Specifically, he added that there’s a reason for the ahead of the large tech -like big tech within the reasoning model.

By Dae -jun Lim, reporter ydj@aitimes.com

ASK ANA

What are your thoughts on this topic?
Let us know in the comments below.

0 0 votes
Article Rating
guest
0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Share this article

Recent posts

0
Would love your thoughts, please comment.x
()
x