H Company’s recent Holo2 model takes the lead in UI Localization



Ramzi De Coster

Two months after releasing our first batch of Holo2 models, H Company is back with our largest UI localization model yet: Holo2-235B-A22B Preview. The model sets a new state-of-the-art (SOTA) record of 78.5% on ScreenSpot-Pro and 79.0% on OSWorld-G.

Available on Hugging Face, Holo2-235B-A22B Preview is a research release focused on UI element localization.

[Figure: benchmark table]

Agentic Localization

High-resolution 4K interfaces are challenging for localization models: small UI elements can be hard to pinpoint on a large display. With agentic localization, however, Holo2 can iteratively refine its predictions, improving accuracy with each step and unlocking 10-20% relative gains across all Holo2 model sizes.
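The article does not say how Holo2 carries out its refinement steps, so the following is only a minimal sketch of one plausible scheme: re-querying a localizer on a progressively smaller crop centered on its previous guess. Every name here (`refine_click`, `toy_predict`, `SCREEN`, `TARGET`, the `zoom` factor) is hypothetical, and the toy predictor is a stand-in whose error simply shrinks with the crop size. It is not the Holo2 model or its API.

```python
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in screen pixels
Point = Tuple[float, float]              # (x, y) in screen pixels


def refine_click(
    predict: Callable[[Box], Point],
    screen: Box,
    steps: int = 3,
    zoom: float = 0.5,
) -> Point:
    """Query `predict` once on the full screen, then on shrinking crops.

    ASSUMPTION: crop-and-zoom is an illustrative refinement strategy,
    not the method documented for Holo2.
    """
    region = screen
    point = predict(region)
    for _ in range(steps - 1):
        left, top, right, bottom = region
        half_w = (right - left) * zoom / 2
        half_h = (bottom - top) * zoom / 2
        # Center the next crop on the current guess, clamped to the screen.
        cx = min(max(point[0], screen[0] + half_w), screen[2] - half_w)
        cy = min(max(point[1], screen[1] + half_h), screen[3] - half_h)
        region = (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
        point = predict(region)
    return point


# Hypothetical stand-in for a localization model: its error is
# proportional to the size of the region it is shown.
TARGET: Point = (3000.0, 500.0)
SCREEN: Box = (0.0, 0.0, 3840.0, 2160.0)  # a 4K display


def toy_predict(region: Box) -> Point:
    left, top, right, bottom = region
    return (TARGET[0] + 0.02 * (right - left), TARGET[1] + 0.02 * (bottom - top))


if __name__ == "__main__":
    print("1 step :", refine_click(toy_predict, SCREEN, steps=1))
    print("3 steps:", refine_click(toy_predict, SCREEN, steps=3))
```

With this toy predictor, each extra step halves the crop and therefore the error, which mirrors the idea of accuracy improving with each step. A real model's behavior would depend on how it handles zoomed-in crops.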

Holo2-235B-A22B’s Performance on ScreenSpot-Pro

Holo2-235B-A22B Preview reaches 70.6% accuracy on ScreenSpot-Pro in a single step. In agentic mode, it achieves 78.5% within 3 steps, setting a new state of the art on one of the most challenging GUI grounding benchmarks.
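To relate these two figures to the 10-20% relative gains quoted for agentic localization, the short calculation below turns the reported accuracies into an absolute and a relative improvement. The helper name `relative_gain` is illustrative; the only inputs are the two numbers stated above.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


single_step = 70.6  # ScreenSpot-Pro accuracy (%) in a single step
agentic = 78.5      # ScreenSpot-Pro accuracy (%) within 3 steps

gain = relative_gain(single_step, agentic)
print(f"absolute: {agentic - single_step:.1f} points, relative: {gain:.1%}")
```

The gap is 7.9 percentage points, or roughly an 11.2% relative gain, which falls inside the quoted 10-20% range.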

[Figure: cost vs. performance on ScreenSpot-Pro]


