Google also turns to post-training as LLM gains slow…attempts to adjust ‘hyperparameters’


(Photo = Shutterstock)

Following OpenAI, news emerged that Google is also unable to improve the performance of its ‘Gemini’ model at the same rate as before and is searching for other ways to enhance it. Having failed to gain performance through pre-training, like OpenAI, the company is reportedly focusing instead on new approaches such as reinforcement learning and improved inference.

The Information reported on the 13th (local time) that although Google invested large-scale computing power and training data, it was unable to improve the performance of the Gemini model as expected.

The reason is that although pre-training was conducted with more data and computing resources than before, there was little difference in performance. Considering that higher execution costs were incurred, it can reasonably be viewed as a step backward.

This problem is attributed to the ‘scaling law’ reaching its limit. The scaling law is based on the assumption that performance will continue to improve as an LLM is given more data and more computing resources, but it has recently been pointed out that its effectiveness is diminishing for large language models (LLMs).
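As a rough illustration of the relationship the scaling law assumes (the form below is a generic Chinchilla-style power law; the symbols are placeholders, not figures from the article or from Google), loss is often modeled as falling with parameter count N and training tokens D:

```latex
% Illustrative power-law scaling: loss L falls as parameters N and data D grow,
% but with diminishing returns; E, A, B, \alpha, \beta are fitted constants.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

As N and D grow, the two fractional terms shrink toward zero, so each additional unit of data or compute buys a smaller improvement, which matches the diminishing returns described above.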

For Google, this is a particularly serious problem, because Gemini has a lower adoption rate than OpenAI’s ChatGPT. Google hoped to catch up with OpenAI by leveraging its superior computing resources, but it did not work out as planned.

OpenAI, which is known to face the same problem, is attempting to improve LLM performance with new inference technology rather than pre-training. ‘o1’ applies ‘test-time compute’ technology, which improves response quality by allocating additional computing resources and time when the user asks a question, without changing the model’s pre-training.
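One common way to spend extra compute at inference time is best-of-n sampling: generate several candidate answers and keep the one a scorer rates highest. The sketch below only illustrates that general idea; the `generate` and `score` functions are hypothetical stand-ins, not OpenAI’s actual o1 mechanism.

```python
# Minimal best-of-n sketch of "test-time compute": spend extra inference
# budget generating several candidate answers and keep the highest-scored one.
# `generate` and `score` are hypothetical stand-ins for a model's sampling
# call and an answer-quality scorer.
import random

def generate(prompt: str, temperature: float = 0.8) -> str:
    # Placeholder for an LLM sampling call.
    return f"answer to '{prompt}' (sample #{random.randint(0, 9999)})"

def score(prompt: str, answer: str) -> float:
    # Placeholder for a verifier / reward model that rates answer quality.
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    """More compute at inference time (larger n) -> better expected answer."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda a: score(prompt, a))

if __name__ == "__main__":
    print(best_of_n("What is the capital of Australia?", n=8))
```

Raising `n` trades more inference-time compute and latency for a better expected answer, which is the basic idea behind allocating extra resources at question time.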

Google is also moving in a similar direction. Google DeepMind is known to have recently formed a new team within its Gemini division to develop features similar to o1.

DeepMind is also particularly focused on manually improving its models. This process involves adjusting the ‘hyperparameters’ that must be set to train an optimal model.

These determine the learning rate, the number of training repetitions (epochs), weight initialization, and so on, and the optimal values are found by trying out various settings, as in the sketch below.
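A minimal sketch of that kind of sweep follows; the search space, the values, and the `train_and_evaluate` stand-in are hypothetical and only illustrate the idea, not DeepMind’s actual tuning process.

```python
# Minimal grid-search sketch over the hyperparameters mentioned above:
# learning rate, number of epochs, and weight-initialization scheme.
# `train_and_evaluate` is a hypothetical stand-in for a real training run.
from itertools import product

def train_and_evaluate(learning_rate: float, epochs: int, init: str) -> float:
    # Placeholder: would train a model and return a validation score.
    return 1.0 / (1.0 + abs(learning_rate - 3e-4) * 1000) + 0.01 * epochs

search_space = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "epochs": [1, 2, 3],
    "init": ["xavier", "he"],
}

best_score, best_config = float("-inf"), None
for lr, ep, init in product(*search_space.values()):
    result = train_and_evaluate(lr, ep, init)
    if result > best_score:
        best_score, best_config = result, (lr, ep, init)

print("best config (learning_rate, epochs, init):", best_config)
```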

Meanwhile, Google also expected to improve performance by adding AI-generated synthetic data to Gemini’s training data, but it was reported that this did not have much of an effect.

OpenAI and other developers are also using synthetic data, but have found that this approach has limits when it comes to significantly improving AI model performance.

Reporter Park Chan cpark@aitimes.com
