LG AI Research Institute (President Bae Kyung-hoon) announced on the 10th that it had released three models based on ‘EXAONE 3.5’ as open source.
This is an update coming four months after the release of the ‘EXAONE 3.0’-based 7.8B model in August.
The update reflects feedback collected from companies, institutions, and academia since the release of EXAONE 3.0. Reportedly, there were many requests to release ‘models of various sizes’ that could be used efficiently depending on the purpose of use.
The EXAONE 3.5-based models released this time come in three sizes. The institute emphasized that all three models demonstrated superior performance compared with ‘global models of the same size’.
The first is the ‘2.4B model’, an ultra-lightweight model for on-device use. It enables training and inference even in on-device environments or on low-end GPUs.
Next is the 7.8B lightweight model, which can be used for general purposes depending on the user’s needs. Its size is the same as the previous version’s (3.0) open-source model, but its performance has been further improved.
Lastly, the 32B model is a high-performance, frontier-level AI model. It is described as a powerful model for users who prioritize performance.
In particular, the institute emphasized that these three EXAONE 3.5 models combine excellent performance with cost-effectiveness. First, in the pre-training stage, the focus was on improving the quality of model answers and reducing infrastructure costs, including by removing duplicate data and personally identifiable information.
In the post-training stage, the focus was on increasing the model’s usability and its ability to perform new tasks. Through supervised fine-tuning (SFT) and direct preference optimization (DPO), instruction-following ability was strengthened to reflect user preferences as much as possible.
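DPO optimizes the model directly on human preference pairs without training a separate reward model. The article does not disclose EXAONE 3.5’s exact training setup; as a rough illustration of the idea, a minimal sketch of the standard DPO loss for a single preference pair (all inputs here are made-up toy numbers) might look like:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the preferred ("chosen") and
    dispreferred ("rejected") responses under the policy being trained
    and under a frozen reference model. beta controls how far the policy
    may drift from the reference.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(x)) computed stably as log(1 + e^{-x})
    return math.log1p(math.exp(-logits))

# Toy numbers: the policy already favors the chosen response slightly,
# so the loss falls below log(2) (the value at indifference).
loss = dpo_loss(-12.0, -15.0, -13.0, -14.0, beta=0.1)
```

The loss decreases as the policy raises the likelihood of chosen responses relative to rejected ones, compared against the reference model.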
A thorough ‘data decontamination’ process was also carried out to increase the reliability of the performance evaluation results. Borrowing the decontamination method used by global models, the process of comparing the evaluation dataset against the training data was repeated 10 times before a rigorous benchmark evaluation was conducted.
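The article does not specify the matching procedure used. One common decontamination approach, shown here purely as an illustrative sketch (the function names and the n-gram size are assumptions, not LG’s actual pipeline), is to drop any training document that shares a word n-gram with the evaluation set:

```python
def ngrams(text, n=8):
    """Set of lowercased word n-grams of a document."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def decontaminate(train_docs, eval_docs, n=8):
    """Drop training documents that share any word n-gram with the eval set."""
    eval_grams = set()
    for doc in eval_docs:
        eval_grams |= ngrams(doc, n)
    return [doc for doc in train_docs
            if not (ngrams(doc, n) & eval_grams)]
```

Repeating such a comparison, as the report describes, guards against benchmark examples leaking into the training data and inflating scores.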
All models released this time have a 32K-token context window. Each model was said to demonstrate the best ‘long context’ processing performance compared with global models of similar size.
In particular, long-document context understanding was confirmed to be at the highest level not only in English but also in Korean.
Usability was also a focus. In the EXAONE 3.5 technical report, performance related to actual usability is described under ‘real-world use cases’. Across a total of seven benchmarks, all three models ranked first in average score for instruction-following ability, surpassing global models of the same size by a large margin. This also means excellent performance was secured not only in English but also in Korean.
Math and programming skills also reportedly showed excellent performance. Across a total of nine benchmarks, the 2.4B model in particular ranked first in average score, showing superior performance compared with global models of the same size. The 7.8B and 32B models also recorded the highest average scores.
Meanwhile, in Korea, the institute is reportedly discussing applying EXAONE 3.5-based AI solutions to corporate services with domestic software companies such as Polaris Office and Hancom. In particular, a technology proof-of-concept (PoC) project is underway to implement EXAONE 3.5-based AI services in Hancom Office, which is widely used by public institutions.
LG AI Research announced that it plans to continue releasing open-source models going forward.
Bae Kyung-hoon, director of LG AI Research Institute, said, “In order to identify and prevent potential risks in advance, we have conducted an AI ethics impact assessment process to review risks across the entire AI life cycle, and have continued research and development in compliance with LG’s AI ethics principles. We will continue AI ethics research going forward.”
Reporter Jang Se-min semim99@aitimes.com