Parser-Oriented Structural Refinement for a Stable Layout Interface in Document Parsing

arXiv cs.CV, by Fuyuan Liu, Dianyu Yu, He Ren, Nayu Liu, Xiaomian Kang, Delai Qiu, Fa Zhang, Genpeng Zhen, Shengping Liu, Jiaen Liang, Wei Huang, Yining Wang, Junnan Zhu. April 6, 2026

Abstract: Accurate document parsing requires both robust content recognition and a stable parser interface. In explicit Document Layout Analysis (DLA) pipelines, downstream parsers do not consume the full detector output. Instead, they operate on a retained and serialized set of layout instances. However, on dense pages with overlapping regions and ambiguous boundaries, unstable layout hypotheses can make the retained instance set inconsistent with its parser input order, leading to severe downstream parsing errors. To address this issue, we introduce a lightweight structural refinement stage between a DETR-style detector and the parser to stabilize the parser interface. Treating raw detector outputs as a compact hypothesis pool, the proposed module performs set-level reasoning over query features, semantic cues, box geometry, and visual evidence. From a shared refined structural state, it jointly determines instance retention, refines box localization, and predicts parser input order before handoff. We further introduce retention-oriented supervision and a difficulty-aware ordering objective to better align the retained instance set and its order with the final parser input, especially on structurally complex pages. Extensive experiments on public benchmarks show that our method consistently improves page-level layout quality. When integrated into a standard end-to-end parsing pipeline, the stabilized parser interface also substantially reduces sequence mismatch, achieving a Reading Order Edit of 0.024 on OmniDocBench.
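
To make the "retained and serialized set" idea concrete, here is a minimal Python sketch of the handoff step and of an order-mismatch score. It is not the authors' implementation. The `Hypothesis` fields, the `keep_threshold` value, and the helper names are hypothetical, and the score assumes that a reading-order edit metric is a Levenshtein distance normalized by the longer sequence; the benchmark's exact definition may differ.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Hypothesis:
    """One refined layout hypothesis. All field names are illustrative."""
    label: str
    box: Tuple[float, float, float, float]  # (x0, y0, x1, y1)
    keep_score: float   # hypothetical retention probability
    order_score: float  # hypothetical ordering key; lower is read earlier


def serialize_for_parser(pool: Sequence[Hypothesis],
                         keep_threshold: float = 0.5) -> List[Hypothesis]:
    """Retain confident hypotheses, then sort them into parser input order."""
    retained = [h for h in pool if h.keep_score >= keep_threshold]
    return sorted(retained, key=lambda h: h.order_score)


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_order_edit(pred: Sequence[str], ref: Sequence[str]) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(pred), len(ref))
    return edit_distance(pred, ref) / longest if longest else 0.0


# A toy hypothesis pool: the third entry is a low-confidence duplicate.
pool = [
    Hypothesis("figure", (50, 400, 550, 700), keep_score=0.7, order_score=0.9),
    Hypothesis("title", (50, 20, 550, 60), keep_score=0.9, order_score=0.1),
    Hypothesis("text", (52, 82, 548, 378), keep_score=0.2, order_score=0.45),
    Hypothesis("text", (50, 80, 550, 380), keep_score=0.8, order_score=0.4),
]
parser_input = [h.label for h in serialize_for_parser(pool)]
reference = ["title", "text", "figure"]
score_clean = normalized_order_edit(parser_input, reference)
score_swapped = normalized_order_edit(["title", "figure", "text"], reference)
```

In this toy case the duplicate text box is dropped and the remaining three instances match the reference order, so `score_clean` is zero, while swapping the last two blocks costs two edits out of three positions. The point of the sketch is only that retention and ordering are decided together before the parser sees anything, which is the interface the abstract describes stabilizing.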

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2604.02692 [cs.CV] (or arXiv:2604.02692v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2604.02692

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Fuyuan Liu. [v1] Fri, 3 Apr 2026 03:36:36 UTC (19,087 KB)
