Meta, ‘Rama 3.1’ released… “Beyond the strongest open source, performance matches GPT-4 and Claude”

-

(Photo = Meta)

Meta has released the most important open source AI model ever with 405 billion parameters, ‘Rama 3.1’. It emphasized that this model is comparable to the present best performing models akin to OpenAI’s ‘GPT-4o’ and Antropic’s ‘Claude 3.5 Sonnet’.

VentureBeat reported on the twenty third (local time) that Meta has released the ‘Rama 3.1’ product line, the very best performing open source AI model of all time. The biggest model, ‘Rama 3.1 405B’ with 405 billion parameters, which was first introduced that day, is the most important open source model of all time.

It has been only three months for the reason that small version of ‘Rama 3’ was introduced last April. The ‘Rama 3 8B’ and ‘Rama 3 70B’ models released at the moment were also upgraded to ‘Rama 3.1 8B’ and ‘Rama 3.1 70B’.

The three Rama 3.1 relations have eight common language conversation capabilities, including English, Arabic, Bengali, German, Hindi, Portuguese, Thai, and Spanish.

It was also introduced that it might not only write high-level computer codes, but in addition solve more complex mathematical problems than previous versions.

Greater than 15 trillion tokens were used to coach the Rama 3.1 model, and the context window size was increased by 16 times in comparison with the previous version to 128,000 tokens. 128,000 tokens is roughly the length of a 50-page book.

The Rama 3.1 405B model has been relicensed to permit firms to create synthetic datasets that may be used to coach or fine-tune small open source models.

Rama 3.1 405B model and major AI model benchmark results (Photo = Meta)
Rama 3.1 405B model and major AI model benchmark results (Photo = Meta)

The most important strength of the brand new 405B model is that it has essentially the most powerful performance among the many open source models.

In response to benchmark data, Llama 3.1 405B recorded an accuracy rate of 88.6% on MMLU, a benchmark for measuring reasoning ability, which is on par with OpenAI GPT-4o’s 88.7%.

That is 88.3% ahead of Anthropic’s Claude 3.5 Sonnet. The lightweight model, the Lama 3.1 8B model, also recorded superior performance in comparison with the open source models of the identical class, Google’s ‘Gemma 2 9B’ and Mistral’s ‘Mistral 7B’.

Rama 3.1 8B, 70B models and major AI model benchmark results (Photo = Meta)
Rama 3.1 8B, 70B models and major AI model benchmark results (Photo = Meta)

Nonetheless, despite its high performance, Rama 3.1 has some issues.

To start with, this model is simply too large to simply run on a single GPU or perhaps a dozen GPUs. That is why the collaboration with Nvidia was said to have been helpful within the announcement of Rama 3.1.

NVIDIA is Meta’s principal partner, supplying GPUs. Rama 3.1 was trained on 16,000 NVIDIA H100 GPUs.

Nonetheless, Meta said that the price of using Rama 3.1 is barely half that of GPT-4o, and that they’re working with about 20 firms including Microsoft (MS), Amazon, Google, and Nvidia to make it available to more developers.

It also cannot understand or input images since it shouldn’t be a multimodal model. As an alternative, Meta said it plans to release a multimodal model later this yr.

In fact, it is probably going that its performance will soon be surpassed by higher proprietary models. Google is developing an AI agent called ‘Project Astra’ that handles complex tasks beyond simply generating text or images.

OpenAI can be preparing to release ‘GPT-5’, which can change the usual for big language models (LLMs) depending on the content. Antropic can be training ‘Claude 3.5 Opus’ and developing AI agents.

Additionally it is not known exactly what business license Meta granted for the usage of Llama 3.1. Prior to now, Meta has been criticized for distorting the meaning of open source resulting from the restrictions placed on the license of the Llama model.

For that reason, the term ‘open model’ is increasingly used as an alternative of ‘open source model’. Meta didn’t release the training dataset this time either.

Users can experience Rama 3.1 on Meta’s mobile messenger, WhatsApp (US) and the Meta AI website.

“The Rama 3.1 is a product that may compete with essentially the most advanced models on the market,” said Mark Zuckerberg, CEO of Meta. “We expect future Rama models, starting next yr, to be essentially the most advanced models within the industry.”

Reporter Park Chan cpark@aitimes.com

ASK DUKE

What are your thoughts on this topic?
Let us know in the comments below.

0 0 votes
Article Rating
guest
0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Share this article

Recent posts

0
Would love your thoughts, please comment.x
()
x