Decision-Centric Design for LLM Systems
arXiv:2604.00414v1 Announce Type: new
Abstract: LLM systems must make control decisions in addition to generating outputs: whether to answer, clarify, retrieve, call tools, repair, or escalate. In many current architectures, these decisions remain implicit within generation, entangling assessment and action in a single model call and making failures hard to inspect, constrain, or repair. We propose a decision-centric framework that separates decision-relevant signals from the policy that maps them to actions, turning control into an explicit and inspectable layer of the system. This separation supports attribution of failures to signal estimation, decision policy, or execution, and enables modular improvement of each component. It unifies familiar single-step settings such as routing and adaptive inference, and extends naturally to sequential settings in which actions alter the information available before acting. Across three controlled experiments, the framework reduces futile actions, improves task success, and reveals interpretable failure modes. More broadly, it offers a general architectural principle for building more reliable, controllable, and diagnosable LLM systems.
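The separation the abstract proposes — decision-relevant signals estimated apart from an explicit policy that maps them to actions — can be sketched in code. This is a minimal illustrative sketch, not the paper's implementation: the signal names, thresholds, and action set here are assumptions chosen to mirror the actions the abstract lists.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Action(Enum):
    """Control actions named in the abstract (a subset, for illustration)."""
    ANSWER = auto()
    CLARIFY = auto()
    RETRIEVE = auto()
    ESCALATE = auto()

@dataclass
class Signals:
    """Decision-relevant signals, estimated separately from generation.
    These names and their meanings are hypothetical, not from the paper."""
    answer_confidence: float  # estimated chance the model can answer correctly
    query_ambiguity: float    # estimated ambiguity of the user request
    knowledge_gap: float      # estimated need for external retrieval

def decide(s: Signals) -> Action:
    """Explicit, inspectable policy mapping signals to actions.
    Thresholds are illustrative placeholders. Because this mapping is a
    plain function rather than implicit in a model call, a failure can be
    attributed either to signal estimation (bad inputs to `decide`) or to
    the policy itself (a bad mapping), as the abstract describes."""
    if s.query_ambiguity > 0.6:
        return Action.CLARIFY
    if s.knowledge_gap > 0.5:
        return Action.RETRIEVE
    if s.answer_confidence < 0.3:
        return Action.ESCALATE
    return Action.ANSWER

if __name__ == "__main__":
    print(decide(Signals(answer_confidence=0.9,
                         query_ambiguity=0.1,
                         knowledge_gap=0.1)).name)
```

In a sequential setting, the same policy would be re-invoked after each action, since (as the abstract notes) actions like retrieval alter the signals available before the next decision.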
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2604.00414 [cs.AI]
(or arXiv:2604.00414v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2604.00414
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Wei Sun [v1] Wed, 1 Apr 2026 02:57:23 UTC (54 KB)