Microsoft unveils ‘Magentic-UI’, a next-generation web automation AI agent that collaborates with people

Microsoft (MS) unveiled ‘Magentic-UI’, an open source artificial intelligence (AI) agent that collaborates with people, on the 22nd (local time).

One of the big problems with existing AI agents is that their work process is opaque. Because an agent does not share what plan it has made or which steps it is running, the user has little opportunity to intervene or make changes.

In particular, opacity can lead to critical errors in tasks such as entering payment information or running code. In addition, agents that focus on complete autonomy and exclude user control have often produced results contrary to expectations.

To solve this problem, Magentic-UI supports real-time joint planning, shared execution, and step-by-step user supervision. Magentic-UI is an interface designed for collaborative web work, and it provides four features: Co-Planning, Co-Tasking, Action Guards, and Plan Learning.

Co-Planning shows the user the series of steps proposed by the AI before any work is executed, and lets the user edit or delete them directly. This allows users to fully control the AI’s plan and adjust it the way they want.

Co-Tasking shows the user the process in real time while the work is under way. If necessary, the user can pause or modify a particular step, or perform it directly in place of the agent. This provides an environment in which the user can intervene flexibly during the work.

Action Guards require user confirmation before risky actions that could cause problems through mistakes or malfunctions, such as closing a browser tab or submitting a form. This prevents unexpected results and helps ensure user safety.
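The approval step described above can be illustrated with a minimal sketch. This is not Magentic-UI's actual API: the `RISKY_ACTIONS` set, the action names, and the `run_with_guard` function are all hypothetical and exist only to show the idea of gating risky actions behind a user decision.

```python
from typing import Callable

# Hypothetical set of action names treated as risky; not taken from Magentic-UI.
RISKY_ACTIONS = {"close_tab", "submit_form"}


def run_with_guard(action: str,
                   execute: Callable[[], str],
                   approve: Callable[[str], bool]) -> str:
    """Run `execute` only if the action is not risky or the user approves it."""
    if action in RISKY_ACTIONS and not approve(action):
        return f"skipped: {action}"
    return execute()


# A user callback that declines every approval request.
deny_all = lambda action: False

print(run_with_guard("click_link", lambda: "clicked", deny_all))
print(run_with_guard("submit_form", lambda: "submitted", deny_all))
```

In this sketch a plain click runs without asking, while the form submission is skipped because the user declined it.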

Finally, Plan Learning draws on the user’s work history and feedback to help build more sophisticated and efficient plans when similar tasks are performed in the future. As a result, Magentic-UI becomes increasingly tailored to the user over time.

Magentic-UI architecture (photo = MS)

These features are driven by a modular team of agents, each playing a different role.

The ‘Orchestrator’ plays the central role in planning and decision-making for the overall task and coordinates the various sub-agents. ‘WebSurfer’ is responsible for operating the web browser and performs page navigation, clicks, and form input.

‘Coder’ is in charge of running code in a sandbox environment and safely handles the scripts or calculations needed for automation. ‘FileSurfer’ analyzes and interprets documents and data, reading various types of files and extracting the necessary information. These agents work together systematically to carry out the user’s commands.

When a user submits a request, the Orchestrator agent creates a step-by-step plan. The user can edit or delete steps through the graphical user interface (GUI) and regenerate a new plan if necessary.

Once the plan is confirmed, its steps are assigned to the specialized agents responsible for each task. Agents report their results, and the Orchestrator decides whether to proceed to the next step, retry, or ask the user for feedback. The whole process is displayed to the user in real time, and the user can stop execution at any time.

This structure goes beyond simple automation and enables adaptive task flows that can recover from failure. For example, if a step fails because of a link error, the Orchestrator can immediately revise the plan and continue the work with the user’s consent.
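The plan, execute, and revise loop described above can be sketched roughly as follows. This is an illustrative simplification, not Magentic-UI's implementation: the `orchestrate` function, the `Step` tuple, the `replan` and `user_consents` callbacks, and the toy agents are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (agent name, task description); hypothetical shape


def orchestrate(plan: List[Step],
                agents: Dict[str, Callable[[str], bool]],
                replan: Callable[[Step], List[Step]],
                user_consents: Callable[[List[Step]], bool],
                max_replans: int = 3) -> List[str]:
    """Run steps in order; on failure, propose revised steps and ask the user."""
    log: List[str] = []
    queue = list(plan)
    replans = 0
    while queue:
        step = queue.pop(0)
        agent, task = step
        if agents[agent](task):  # each agent reports success as True/False
            log.append(f"ok: {agent}: {task}")
            continue
        log.append(f"failed: {agent}: {task}")
        revised = replan(step)
        # Stop if the replan budget is used up or the user rejects the revision.
        if replans >= max_replans or not user_consents(revised):
            log.append("stopped")
            break
        replans += 1
        queue = revised + queue  # revised steps run before the remaining plan
    return log


# Toy agents: WebSurfer fails on any task that mentions a broken link.
agents = {
    "WebSurfer": lambda task: "broken" not in task,
    "Coder": lambda task: True,
}
plan = [("WebSurfer", "open broken link"), ("Coder", "summarize results")]
log = orchestrate(plan, agents,
                  replan=lambda step: [(step[0], "open mirror link")],
                  user_consents=lambda revised: True)
print(log)
```

Here the failed step is replaced by a revised one only after the user callback agrees; if the user declines, the loop stops instead of continuing on its own.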

GAIA Benchmark (Photo = MS)

Magentic-UI was tested on the GAIA benchmark, which includes complex document interpretation and web navigation tasks.

Running alone, it completed 30.3% of the 162 tasks, but with user help the success rate rose to 51.9%, a 71% relative increase. The success rate also rose to 42.6% with a more complex simulated user.
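The 71% figure is a relative increase, not a difference in percentage points. A quick check using only the two success rates quoted above (the variable names are illustrative):

```python
autonomous = 30.3  # success rate (%) when running alone
with_user = 51.9   # success rate (%) with user help

# Relative increase compared with the autonomous baseline.
relative_gain = (with_user - autonomous) / autonomous * 100
# Absolute difference in percentage points.
point_gain = with_user - autonomous

print(f"{relative_gain:.0f}% relative, {point_gain:.1f} points absolute")
```

The gain is about 21.6 percentage points in absolute terms, which corresponds to roughly 71% relative to the 30.3% baseline.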

Interestingly, user intervention occurred in only 10% of all tasks, and help was requested just 1.1 times on average. This shows that performance improves even with minimal intervention.

Magentic-UI provides a ‘Saved Plans’ gallery so that previously used strategies can be reused, which is up to three times faster than generating a new plan. This feature is particularly useful for repetitive tasks such as filling out forms.

Security has also been strengthened. Browser operation and code execution all run inside Docker containers, so user credentials are not exposed. Site access can be restricted to an allow list, and every action can be made subject to approval. In Microsoft’s red team testing, phishing attacks and prompt injection attempts were appropriately blocked.
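The allow-list idea can be sketched with a simple host check. This is a hypothetical illustration rather than Magentic-UI's own mechanism: the `ALLOWED_HOSTS` set and the `is_allowed` function are assumptions, and only the standard library's `urllib.parse.urlparse` is used.

```python
from urllib.parse import urlparse

# Hypothetical allow list, for illustration only.
ALLOWED_HOSTS = {"example.com", "docs.example.com"}


def is_allowed(url: str) -> bool:
    """Return True if the URL's host is on the allow list or is a subdomain of an entry."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in ALLOWED_HOSTS)


for url in ("https://example.com/pay",
            "https://evil-example.com/",
            "https://example.com.evil.io/login"):
    print(url, is_allowed(url))
```

Matching on the parsed host name, rather than searching the raw URL string, keeps look-alike addresses such as `example.com.evil.io` from passing the check.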

Magentic-UI is built on Magentic-One, a multi-agent system released last year, and is based on Microsoft’s agent framework.

The project is released as open source on GitHub and can also be used in Azure AI Foundry Labs.

By Park Chan, reporter cpark@aitimes.com
