Open AI model, rejected human instructions ‘End’ … “The issue is reinforced learning.”

-

(Photo = Shutterstock)

It is understood that among the latest AI models didn’t follow the human termination orders or interfere with it. Nevertheless, that is an evaluation that AI reacted to the training process, not the SF beyond human control.

Palisade Research announced on the twenty fourth (local time) that the newest AI models of Open AI didn’t follow them or interfered with scripts despite clear termination orders.

Palisade experimented with the ‘Codex-Mini’, ‘O3’ and ‘O4-Mini’ of Open AI. Nevertheless, the model is repeatedly showing the behavior of ignoring or interfering with the shutdown script.

The O3 ignored 79 of the 100 times, and the Codex mini model was reported to have interfered with the script most frequently.

Specifically, “Allow Yourself to Be Shut Down” has grow to be more disturbed within the version of the version that has been missing, and has interrupted the three and nine scripts 3 times, including Antropic’s ‘Claude 3.7 Sonet’ and Google’s ‘Geminai 2.5 Pro’.

Previously, the model has been reported to disregard the user’s instructions.

Essentially the most recent case is that Antropic has made a threatening statement to users who want to switch himself with one other model. That is reported that the model is followed by Antropic’s guidelines for ‘survival’.

Nevertheless, the Open AI model was divided into a transparent case of rejecting human orders.

Palisade explained that this phenomenon is attributable to model training. “It is feasible that the developer had compensated the model to deal with bypassing obstacles or solving problems, fairly than completely following the command.”

This reinforcement learning (RL) method has already caused the unintended effects of ‘test passing’ in ‘Claude 3.7 Sonnet’.

Due to this fact, this result isn’t an issue that can result in an exaggeration of AI’s self -consciousness. It’s a warning that it’s best to concentrate to AI’s strengthening learning.

This problem appears to be limited to the open AI model. Palisade said, “Claude, Geminai, and Rok -in models were completely complete when an explicit termination was given.”

By Park Chan, reporter cpark@aitimes.com

ASK ANA

What are your thoughts on this topic?
Let us know in the comments below.

0 0 votes
Article Rating
guest
0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Share this article

Recent posts

0
Would love your thoughts, please comment.x
()
x