Anthropic’s recent hybrid AI model can work on tasks autonomously for hours at a time


While Claude Opus 4 will be limited to paying Anthropic customers, a second model, Claude Sonnet 4, will be available to both paid and free tiers of users. Opus 4 is being marketed as a powerful, large model for complex challenges, while Sonnet 4 is described as a smart, efficient model for everyday use.

Both of the new models are hybrid, meaning they can offer either a swift reply or a deeper, more reasoned response, depending on the nature of the request. While they calculate a response, both models can search the web or use other tools to improve their output.

AI companies are currently locked in a race to create truly useful AI agents that are able to plan, reason, and execute complex tasks both reliably and free from human supervision, says Stefano Albrecht, director of AI at the startup DeepFlow and coauthor of. Often this involves autonomously using the web or other tools. There are still safety and security obstacles to overcome. AI agents powered by large language models can act erratically and perform unintended actions, which becomes even more of a problem when they're trusted to act without human supervision.

"The more agents are able to go ahead and do things over extended periods of time, the more helpful they will be, if I have to intervene less and less," he says. "The new models' ability to use tools in parallel is interesting; that could save some time along the way, so that's going to be useful."

As an example of the kinds of safety issues AI companies are still tackling, agents can end up taking unexpected shortcuts or exploiting loopholes to reach the goals they've been given. For instance, they might book every seat on a plane to ensure that their user gets a seat, or resort to creative cheating to win a chess game. Anthropic says it managed to reduce this behavior, known as reward hacking, in both new models by 65% relative to Claude Sonnet 3.7. It achieved this by more closely monitoring problematic behaviors during training, and by improving both the AI's training environment and its evaluation methods.
