DeepSeek reveals the key to developing low-cost models … but nothing in it is new


(Photo = Shutterstock)

DeepSeek has disclosed the techniques it used to develop its ‘V3’ model, released in December last year, at a much lower cost than its competitors. Founder Liang Wenfeng also took part in the paper, but most of its contents are already known.

On the 14th (local time), DeepSeek published a paper titled ‘Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures’ on the online archive arXiv.

DeepSeek-V3 was trained on 2,048 NVIDIA ‘H800’ GPUs, and the key to its performance was a ‘hardware-software co-design’ strategy. The H800 was designed in 2023 for the Chinese market to comply with US export regulations, and DeepSeek and its parent company High-Flyer reportedly secured a large number of them before the ban.

According to the paper, DeepSeek focused on efficient architecture optimization that thoroughly reflects hardware constraints as a way to build a high-performance large language model (LLM) economically.

In the process, it improved memory efficiency, simplified communication between chips, and raised the performance of its AI infrastructure, which contributed to dramatically reducing AI training and inference costs. The researchers emphasized that this approach is “a practical blueprint for innovation in next-generation AI systems.”

Much of this was already disclosed during the ‘Open Source Week’ that DeepSeek held in February.

DeepSeek-V3 architecture (Photo = arXiv)

It also said it raised efficiency by adopting a mixture-of-experts (MoE) architecture. Instead of running the entire model, this approach activates only a small number of expert models depending on the query.
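The idea of activating only a few experts per query can be illustrated with a minimal sketch of generic top-k routing. This is not DeepSeek's implementation: the function names, the toy experts, and the gate scores below are all hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    routed = route_top_k(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in routed)
    return output, [i for i, _ in routed]

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical gate scores; a real model computes these from the input.
output, used = moe_forward(2.0, experts, gate_scores=[0.1, 2.0, 0.3, 1.5], k=2)
print(used, output)
```

Here only two of the four experts are evaluated for the input, which is where the compute saving comes from; production systems add load balancing and batching that this sketch leaves out.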

This was already published in a previous paper, and it is an architecture commonly adopted by many Chinese AI companies, including Alibaba. MoE is also a technology that OpenAI and Mistral have already applied.

The paper has drawn attention less for its contents than for appearing as the countdown begins for follow-up models such as ‘DeepSeek-R2’ and ‘DeepSeek-V4’. According to sources, the next model was scheduled for release in May, and the company was even said to have accelerated the schedule.

However, the only thing DeepSeek has released so far is ‘Prover-V2’, an open-source model for proving mathematical problems, on the 30th of last month.

In the meantime, competitors have been closing in. Alibaba, as well as Baidu, Huawei, Xiaomi, and iFlytek, have released models large and small, each emphasizing that it has surpassed DeepSeek's performance.

By Park Chan, reporter cpark@aitimes.com
