benchmark

AI2 unveils open source LMM ‘Molmo’… “Outperforms GPT-4o by learning from 100 times less data”

https://www.youtube.com/watch?v=spBxYa3eAlA The Allen Institute for AI (AI2) has launched ‘Molmo’, an open source family of large multimodal models (LMMs). AI2 claimed that Molmo was trained on high-quality data and outperformed OpenAI's 'GPT-4o' on benchmarks. Enterprise Beat...

Apple Releases Benchmark Tool to Determine LLMs’ Real Capabilities… “Open Source, Closed, and Far Insufficient”

Apple has released a new benchmark tool that measures the actual capabilities of large language models (LLMs). Tests of major models showed that open source models are...

“AI Agent Benchmarks Are Different from Model Evaluation…Cost is the Key”

A new benchmark proposal for artificial intelligence (AI) agents has emerged. The researchers claim that it is difficult to measure agent performance using existing AI model benchmarks, and that a crucial variable called 'cost'...

Rakuten launches LLM with the strongest Japanese performance…recording 69 points in benchmark

Rakuten has launched a large language model (LLM) trained on a large-scale Japanese dataset. The tokenizer vocabulary was significantly expanded to process complex Japanese characters, and the common rating...

The Death of the Static AI Benchmark

Benchmarking as a Measure of Success: Benchmarks are sometimes hailed as a hallmark of success. They're a celebrated way of measuring progress, whether it’s achieving the sub 4-minute mile or the ability to excel...

Cerebras Systems Sets New Benchmark in AI Innovation with Launch of the Fastest AI Chip Ever

Cerebras Systems, known for building massive computer clusters used for all kinds of scientific and other tasks, has once again shattered records in the AI industry by unveiling its latest technological marvel, the...

Inflection AI launches chatbot 'Pi' with IQ added to EQ… “Performance matches that of GPT-4”

Inflection AI, which aims to create emotional and human-like artificial intelligence (AI), has released a new large language model (LLM), 'Inflection-2.5'. The company emphasized that this model comes close to the performance of OpenAI's...

Benchmark, the storied venture firm, sees “traps” in today’s AI funding frenzy: “Don’t be Microsoft”

Yesterday in Helsinki, this editor interviewed four of the six general partners at Benchmark, the nearly 30-year-old Silicon Valley firm that’s known for some notable bets (Uber, Dropbox), paying each general partner the exact...
