arxiv:2602.09082

UI-Venus-1.5 Technical Report

Published on Feb 9 · Submitted by Zhangxuan Gu on Feb 11 · #3 Paper of the day

Abstract

AI-generated summary: UI-Venus-1.5 is a unified GUI agent with improved performance through mid-training stages, online reinforcement learning, and model merging techniques.

GUI agents have emerged as a powerful paradigm for automating interactions in digital environments, yet achieving both broad generality and consistently strong task performance remains challenging. In this report, we present UI-Venus-1.5, a unified, end-to-end GUI Agent designed for robust real-world applications. The proposed model family comprises two dense variants (2B and 8B) and one mixture-of-experts variant (30B-A3B) to meet various downstream application scenarios. Compared to our previous version, UI-Venus-1.5 introduces three key technical advances: (1) a comprehensive Mid-Training stage leveraging 10 billion tokens across 30+ datasets to establish foundational GUI semantics; (2) Online Reinforcement Learning with full-trajectory rollouts, aligning training objectives with long-horizon, dynamic navigation in large-scale environments; and (3) a single unified GUI Agent constructed via Model Merging, which synthesizes domain-specific models (grounding, web, and mobile) into one cohesive checkpoint. Extensive evaluations demonstrate that UI-Venus-1.5 establishes new state-of-the-art performance on benchmarks such as ScreenSpot-Pro (69.6%), VenusBench-GD (75.0%), and AndroidWorld (77.6%), significantly outperforming previous strong baselines. In addition, UI-Venus-1.5 demonstrates robust navigation capabilities across a variety of Chinese mobile apps, effectively executing user instructions in real-world scenarios. Code: https://github.com/inclusionAI/UI-Venus; Model: https://huggingface.co/collections/inclusionAI/ui-venus
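To make the Model Merging idea in point (3) concrete, here is a minimal sketch of one common merging technique, linear weight averaging of checkpoints that share the same parameter names and shapes. The abstract does not say which merging method UI-Venus-1.5 uses, so this is an illustration under that assumption and not the authors' procedure. The function name `merge_checkpoints`, the toy parameter name `layer.weight`, and the equal weights are all hypothetical.

```python
from typing import Dict, List, Sequence

# A toy "checkpoint": parameter name -> flat list of floats.
# Real checkpoints hold tensors; plain lists keep the sketch dependency-free.
StateDict = Dict[str, List[float]]


def merge_checkpoints(
    checkpoints: Sequence[StateDict], weights: Sequence[float]
) -> StateDict:
    """Linearly average checkpoints with identical parameter names and sizes.

    Assumption: simple weighted averaging. The report may use another method.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    if len(checkpoints) != len(weights):
        raise ValueError("one weight per checkpoint is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so coefficients sum to 1

    merged: StateDict = {}
    for name, ref in checkpoints[0].items():
        for ckpt in checkpoints:
            if name not in ckpt or len(ckpt[name]) != len(ref):
                raise KeyError(f"parameter {name!r} is missing or mismatched")
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(norm, checkpoints))
            for i in range(len(ref))
        ]
    return merged


# Hypothetical domain-specific models (grounding, web, mobile) with one parameter.
grounding = {"layer.weight": [1.0, 2.0]}
web = {"layer.weight": [3.0, 4.0]}
mobile = {"layer.weight": [5.0, 6.0]}

merged = merge_checkpoints([grounding, web, mobile], [1.0, 1.0, 1.0])
```

With equal weights, each merged value is the mean of the three domain models' values for that parameter. Unequal weights would bias the merged checkpoint toward one domain.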

Community

Paper author Paper submitter
edited about 10 hours ago

Is your GUI Agent ready for real work? 🔥

We’ve seen many great GUI Agents, but building a "stable assistant" for phones and websites is still hard. There are three main problems:

1️⃣ Knowledge Gap: AI often misses less common icons and doesn't know how specialized apps work.
2️⃣ The Reality Gap: Models that work well in tests often fail during real-life tasks.
3️⃣ Too Complex: Using a multi-agent framework usually costs too much.

Enter UI-Venus-1.5 🚀 — The new high-performance, end-to-end GUI Agent from Ant Group!

Unlike earlier approaches, UI-Venus-1.5 is built for real-world use:
📱 All-in-One: One single model for Grounding, Mobile, and Web tasks.
🇨🇳 Real App Support: Full support for 40+ popular Chinese apps, making AI part of daily life.
⚡ Simple & Fast: A clean, end-to-end design for faster and more reliable work.

Check it out and see how AI can truly help you! 🐜✨
