GUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play Annotation

Source: arXiv. Submitted on 27 Mar 2026. Posted March 30, 2026.

arXiv:2603.26266v1. Announce type: new. Authors: Rui Xie, Zhi Gao, Chenrui Shi, Zirui Shang, Lu Chen, Qing Li

Abstract: Large vision-language models have endowed GUI agents with strong general capabilities for interface understanding and interaction. However, due to insufficient exposure to domain-specific software operation data during training, these agents exhibit significant domain bias: they lack familiarity with the specific operation workflows (planning) and UI element layouts (grounding) of particular applications, which limits their real-world task performance. In this paper, we present GUIDE (GUI Unbiasing via Instructional-Video Driven Expertise), a training-free, plug-and-play framework that resolves GUI agent domain bias by autonomously acquiring domain-specific expertise from web tutorial videos through a retrieval-augmented automated annotation pipeline. GUIDE introduces two key innovations. First, a subtitle-driven Video-RAG pipeline unlocks video semantics through subtitle analysis, performing progressive three-stage retrieval (domain classification, topic extraction, and relevance matching) to identify task-relevant tutorial videos. Second, a fully automated annotation pipeline built on an inverse dynamics paradigm feeds consecutive keyframes enhanced with UI element detection into VLMs, inferring the required planning and grounding knowledge, which is then injected into the agent's corresponding modules to address both manifestations of domain bias. Extensive experiments on OSWorld demonstrate GUIDE's generality as a plug-and-play component for both multi-agent systems and single-model agents. It consistently yields improvements of over 5% and reduces execution steps without modifying any model parameters or architecture, validating GUIDE as an architecture-agnostic enhancement to bridge GUI agent domain bias.
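
To make the progressive three-stage retrieval idea concrete, here is a minimal sketch of narrowing a video pool by domain, then topic, then relevance. It is an illustration only and not the authors' implementation: the `Video` class, the function names, the stopword list, and the keyword-overlap scoring are all assumptions introduced here, and the abstract does not say how each stage is actually computed.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Video:
    """Hypothetical record for one tutorial video (not from the paper)."""
    title: str
    domain: str     # assumed single-word application label, e.g. "gimp"
    subtitles: str  # full subtitle text


def tokenize(text: str) -> Set[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,:;!?()").lower() for w in text.split())
    return {w for w in words if w}


def classify_domain(task: str, domains: List[str]) -> Optional[str]:
    """Stage 1 (toy): return the first known domain named in the task."""
    words = tokenize(task)
    for domain in domains:
        if domain.lower() in words:
            return domain
    return None


def extract_topics(task: str, stopwords: Set[str]) -> Set[str]:
    """Stage 2 (toy): treat non-stopword task words as topic terms."""
    return tokenize(task) - stopwords


def relevance(topics: Set[str], video: Video) -> float:
    """Stage 3 (toy): fraction of topic terms found in the subtitles."""
    if not topics:
        return 0.0
    return len(topics & tokenize(video.subtitles)) / len(topics)


def retrieve(task: str, videos: List[Video], stopwords: Set[str],
             top_k: int = 1) -> List[Video]:
    """Run the three stages in order and return the top_k videos."""
    domains = sorted({v.domain for v in videos})
    domain = classify_domain(task, domains)
    # If no domain is recognised, fall back to the whole pool.
    pool = [v for v in videos if v.domain == domain] if domain else videos
    topics = extract_topics(task, stopwords)
    if domain:
        topics.discard(domain.lower())  # the domain word is not a topic
    ranked = sorted(pool, key=lambda v: relevance(topics, v), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    library = [
        Video("Crop an image in GIMP", "gimp",
              "Select the crop tool then drag over the image and press enter"),
        Video("GIMP layers basics", "gimp",
              "Open the layers panel and add a new layer"),
        Video("Sum a column in Calc", "calc",
              "Type the sum formula in the cell below the column"),
    ]
    stop = {"the", "in", "a", "an", "to", "and"}
    for hit in retrieve("Crop the image in GIMP", library, stop):
        print(hit.title)
```

The staging is the point of the sketch: each stage shrinks the candidate set before the next, more specific comparison runs. A real system would presumably replace the keyword matching with model-based classification and matching over subtitles.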

Comments: 28 pages, 8 figures, 7 tables

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.26266 [cs.AI] (or arXiv:2603.26266v1 [cs.AI] for this version)

DOI: https://doi.org/10.48550/arXiv.2603.26266 (arXiv-issued DOI via DataCite, pending registration)

Submission history

From: Rui Xie. [v1] Fri, 27 Mar 2026 10:33:08 UTC (4,985 KB)
