Evaluation

[New Year's Address] Kim Se-yeop, CEO of Selectstar: “We will grow into a total service company focused on AI reliability evaluation.”

Selectstar announced that it will grow into a 'total AI service company' responsible for every stage of artificial intelligence (AI) adoption, from data design to large language model (LLM) verification. The core...

[January, Week 1] Leaderboard Season 2: evaluation 86% complete… Overseas developers lead with ‘Gemma 2’

'Open Ko-LLM Leaderboard Season 2' has entered the countdown to its official opening, with evaluation complete for 86% of all target models. Among these, the latest ‘Gemma 2’-based models from overseas developers took...

Jeonnam Superintendent of Education Kim Dae-jung ranked first in job performance evaluation for 4 consecutive months

Kim Dae-jung, superintendent of education for South Jeolla Province, recorded a 61.4% positive rating in the October 2024 superintendent job performance evaluation, ranking first for 4 consecutive months with an approval rating above 60%, the...

How to Create a RAG Evaluation Dataset From Documents

Automatically create domain-specific datasets in any language using LLMs

However, there are many parameters we need to set in a RAG pipeline, and researchers are always suggesting new improvements. How do we...

LLM Evaluation Skills Are Easy to Pick Up (Yet Costly to Practice)

Here’s how not to waste your budget on evaluating models and systems

You can build a fortress in two ways: start stacking bricks one on top of the other, or draw a picture of the fortress...

Gwangwoon AI Autonomous Driving Competition Held… Coding and Autonomous Driving Evaluation for Middle School Students

Gwangwoon Artificial Intelligence High School and RoboLink announced on the 12th that they had held the '2024 Gwangwoon AI Autonomous Driving Competition' for middle school students. The event, sponsored by Kwangwoon Academy, Seoul Metropolitan City, Seoul...

Human Rights Commission: Self-Evaluation Guidelines for AI Development… “Expecting AI Human Rights Impact Assessment Legislation”

The National Human Rights Commission of Korea (Chairperson Song Doo-hwan) announced on the 9th that it had conveyed its opinion to the Minister of Science and ICT that, in order to prevent human...

Top Evaluation Metrics for RAG Failures

If you've been experimenting with large language models (LLMs) for search and retrieval tasks, you've likely come across retrieval augmented generation (RAG) as a technique for adding relevant contextual information to LLM...
