Read More, Think More: Revisiting Observation Reduction for Web Agents

arXiv cs.CL | by Masafumi Enomoto, Ryoma Obara, Haochen Zhang, Masafumi Oyamada | April 4, 2026

Abstract: Web agents based on large language models (LLMs) rely on observations of web pages -- commonly represented as HTML -- as the basis for identifying available actions and planning subsequent steps. Prior work has treated the verbosity of HTML as an obstacle to performance and adopted observation reduction as a standard practice. We revisit this trend and demonstrate that the optimal observation representation depends on model capability and thinking token budget: (1) compact observations (accessibility trees) are preferable for lower-capability models, while detailed observations (HTML) are advantageous for higher-capability models; moreover, increasing thinking tokens further amplifies the benefit of HTML. (2) Our error analysis suggests that higher-capability models exploit layout information in HTML for better action grounding, while lower-capability models suffer from increased hallucination under longer inputs. We also find that incorporating observation history improves performance across most models and settings, and a diff-based representation offers a token-efficient alternative. Based on these findings, we suggest practical guidelines: adaptively select observation representations based on model capability and thinking token budget, and incorporate observation history using diff-based representations.
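The two guidelines in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the boolean capability flag, the `min_thinking_tokens` parameter, and the use of a line-based unified diff are all assumptions made for illustration. The abstract does not say how capability is measured or how the diff is computed.

```python
import difflib


def choose_representation(is_high_capability: bool,
                          thinking_tokens: int,
                          min_thinking_tokens: int = 0) -> str:
    """Pick an observation format (hypothetical rule of thumb).

    Assumption: HTML is used only for models flagged as high-capability
    whose thinking budget meets a caller-chosen minimum; everything else
    falls back to the compact accessibility tree.
    """
    if is_high_capability and thinking_tokens >= min_thinking_tokens:
        return "html"
    return "accessibility_tree"


def observation_diff(previous: str, current: str) -> str:
    """Return only the lines that changed between two observations.

    Assumption: a line-based unified diff with zero context lines stands
    in for the diff-based history representation named in the abstract.
    """
    diff = difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
        n=0,  # no unchanged context lines, to keep the output short
    )
    return "\n".join(diff)


if __name__ == "__main__":
    before = "<ul>\n<li>Cart (0)</li>\n</ul>"
    after = "<ul>\n<li>Cart (1)</li>\n</ul>"
    print(choose_representation(True, thinking_tokens=4096))
    print(observation_diff(before, after))
```

With the default `min_thinking_tokens=0`, the choice depends on the capability flag alone; raising it is one hypothetical way to let the thinking budget influence the decision.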

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2604.01535 [cs.CL]

(or arXiv:2604.01535v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2604.01535

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Masafumi Enomoto
[v1] Thu, 2 Apr 2026 02:14:47 UTC (325 KB)

Original source

arXiv cs.CL

https://arxiv.org/abs/2604.01535
