Dual-Stage LLM Framework for Scenario-Centric Semantic Interpretation in Driving Assistance
Hey there, little explorer! Imagine your toy car is super smart, like a robot! 🚗💨
Sometimes, this smart toy car tries to understand what's happening around it, like if a ball rolls into the street or a friend is walking.
Scientists are teaching these smart car brains (we call them "LLMs," like super-duper talking robots!) to be extra careful. They want to make sure the car knows when something is a little bit risky or very, very risky.
This paper is like a special game where they test the smart car brains. They show them pictures and tell them stories about driving, then ask, "Is this safe or not?" 🚦
They found that sometimes, even smart robot brains don't always agree on what's risky. So, they're working hard to teach them to be super clear and safe, especially when people are nearby! It's like making sure your toy car always knows to stop for your teddy bear! 🐻🛑
arXiv:2603.27536v1 Announce Type: new Abstract: Advanced Driver Assistance Systems (ADAS) increasingly rely on learning-based perception, yet safety-relevant failures often arise without component malfunction, driven instead by partial observability and semantic ambiguity in how risk is interpreted and communicated. This paper presents a scenario-centric framework for reproducible auditing of LLM-based risk reasoning in urban driving contexts. Deterministic, temporally bounded scenario windows are constructed from multimodal driving data and evaluated under fixed prompt constraints and a close — Jean Douglas Carvalho, Hugo Taciro Kenji, Ahmad Mohammad Saber, Glaucia Melo, Max Mauro Dias Santos, Deepa Kundur
View PDF HTML (experimental)
Abstract:Advanced Driver Assistance Systems (ADAS) increasingly rely on learning-based perception, yet safety-relevant failures often arise without component malfunction, driven instead by partial observability and semantic ambiguity in how risk is interpreted and communicated. This paper presents a scenario-centric framework for reproducible auditing of LLM-based risk reasoning in urban driving contexts. Deterministic, temporally bounded scenario windows are constructed from multimodal driving data and evaluated under fixed prompt constraints and a closed numeric risk schema, ensuring structured and comparable outputs across models. Experiments on a curated near-people scenario set compare two text-only models and one multimodal model under identical inputs and prompts. Results reveal systematic inter-model divergence in severity assignment, high-risk escalation, evidence use, and causal attribution. Disagreement extends to the interpretation of vulnerable road user presence, indicating that variability often reflects intrinsic semantic indeterminacy rather than isolated model failure. These findings highlight the importance of scenario-centric auditing and explicit ambiguity management when integrating LLM-based reasoning into safety-aligned driver assistance systems.
Subjects:
Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.27536 [cs.AI]
(or arXiv:2603.27536v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2603.27536
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Jean Carvalho [view email] [v1] Sun, 29 Mar 2026 06:20:08 UTC (4,637 KB)
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
researchpaperarxiv
Delaunay Canopy: Building Wireframe Reconstruction from Airborne LiDAR Point Clouds via Delaunay Graph
arXiv:2604.02497v1 Announce Type: new Abstract: Reconstructing building wireframe from airborne LiDAR point clouds yields a compact, topology-centric representation that enables structural understanding beyond dense meshes. Yet a key limitation persists: conventional methods have failed to achieve accurate wireframe reconstruction in regions afflicted by significant noise, sparsity, or internal corners. This failure stems from the inability to establish an adaptive search space to effectively leverage the rich 3D geometry of large, sparse building point clouds. In this work, we address this challenge with Delaunay Canopy, which utilizes the Delaunay graph as a geometric prior to define a geometrically adaptive search space. Central to our approach is Delaunay Graph Scoring, which not only

SocioEval: A Template-Based Framework for Evaluating Socioeconomic Status Bias in Foundation Models
arXiv:2604.02660v1 Announce Type: new Abstract: As Large Language Models (LLMs) increasingly power decision-making systems across critical domains, understanding and mitigating their biases becomes essential for responsible AI deployment. Although bias assessment frameworks have proliferated for attributes such as race and gender, socioeconomic status bias remains significantly underexplored despite its widespread implications in the real world. We introduce SocioEval, a template-based framework for systematically evaluating socioeconomic bias in foundation models through decision-making tasks. Our hierarchical framework encompasses 8 themes and 18 topics, generating 240 prompts across 6 class-pair combinations. We evaluated 13 frontier LLMs on 3,120 responses using a rigorous three-stage

Speaking of Language: Reflections on Metalanguage Research in NLP
arXiv:2604.02645v1 Announce Type: new Abstract: This work aims to shine a spotlight on the topic of metalanguage. We first define metalanguage, link it to NLP and LLMs, and then discuss our two labs' metalanguage-centered efforts. Finally, we discuss four dimensions of metalanguage and metalinguistic tasks, offering a list of understudied future research directions.
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Research Papers

Let's Have a Conversation: Designing and Evaluating LLM Agents for Interactive Optimization
arXiv:2604.02666v1 Announce Type: new Abstract: Optimization is as much about modeling the right problem as solving it. Identifying the right objectives, constraints, and trade-offs demands extensive interaction between researchers and stakeholders. Large language models can empower decision-makers with optimization capabilities through interactive optimization agents that can propose, interpret and refine solutions. However, it is fundamentally harder to evaluate a conversation-based interaction than traditional one-shot approaches. This paper proposes a scalable and replicable methodology for evaluating optimization agents through conversations. We build LLM-powered decision agents that role-play diverse stakeholders, each governed by an internal utility function but communicating like a

Speaking of Language: Reflections on Metalanguage Research in NLP
arXiv:2604.02645v1 Announce Type: new Abstract: This work aims to shine a spotlight on the topic of metalanguage. We first define metalanguage, link it to NLP and LLMs, and then discuss our two labs' metalanguage-centered efforts. Finally, we discuss four dimensions of metalanguage and metalinguistic tasks, offering a list of understudied future research directions.

Neural correlates of perceptual consciousness from within: a narrative review of human intracranial research
arXiv:2510.08736v2 Announce Type: replace Abstract: Despite many years of research, the quest to identify neural correlates of perceptual consciousness (NCC) remains unresolved. One major obstacle lies in methodological limitations: most studies rely on non-invasive neural measures with limited spatial or temporal resolution making it difficult to disentangle proper NCCs from concurrent cognitive processes. Additionally, the relatively low sensitivity of non-invasive neural measures limits the interpretation of null findings in studies targeting proper NCCs. In this review, we discuss how human intracranial recordings can advance the search for NCCs, by offering high spatiotemporal resolution, improved signal sensitivity, and broad cortical and subcortical coverage. We review studies that

Size-structured populations with growth fluctuations: Feynman--Kac formula and decoupling
arXiv:2508.14680v2 Announce Type: replace-cross Abstract: We study a size-structured population model in which individual cells grow at a rate determined by a fluctuating internal variable (e.g., gene expression levels). Many previous models of phenotypically heterogeneous populations can be viewed as special cases of this model, and it has previously been observed that the internal variable decouples from cell size under certain conditions. In this work, we generalize these results and connect them to the Feynman-Kac formula, which yields relationships between the lineage dynamics and population distribution in branching processes. To this end, we derive conditions for decoupling, both in the lineage and population ensemble. When decoupling occurs in both ensembles, the size dynamics can


Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!