Posterior Optimization with Clipped Objective for Bridging Efficiency and Stability in Generative Policy Learning
arXiv:2604.01860v1 Announce Type: new
Abstract: Expressive generative models have advanced robotic manipulation by capturing complex, multi-modal action distributions over temporally extended trajectories. However, fine-tuning these policies via RL remains challenging due to instability and sample inefficiency. We introduce Posterior Optimization with Clipped Objective (POCO), a principled RL framework that formulates policy improvement as a posterior inference problem tailored for temporal action chunks. Through an Expectation-Maximization procedure, POCO distills a reward-weighted implicit posterior into the policy without likelihood estimation. Furthermore, POCO adopts an offline-to-online paradigm that anchors online exploration to pre-trained priors, and its model-agnostic design scales to fine-tune large VLA models without architectural modifications. Evaluations across 7 simulation benchmarks and 4 contact-rich real-world tasks demonstrate that POCO prevents catastrophic policy collapse, outperforms SOTA baselines, and achieves a 96.7% success rate on real-world tasks. Videos are available at our project website: this https URL.
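The abstract's EM procedure (distilling a reward-weighted implicit posterior into the policy without likelihood estimation) can be illustrated with a minimal sketch. This is NOT the paper's actual POCO algorithm: the exp(Q/tau) weighting, the squared-error distillation loss, and all function names below are assumptions chosen to show the generic reward-weighted regression pattern the abstract alludes to.

```python
# Illustrative EM-style reward-weighted distillation step. Assumptions
# (not from the paper): softmax weighting of critic values, quadratic
# distillation loss over sampled action chunks.
import numpy as np

rng = np.random.default_rng(0)

def e_step_weights(q_values, tau=1.0):
    """E-step: turn critic values for K sampled action chunks into
    self-normalized posterior weights w_k proportional to exp(Q_k / tau)."""
    z = q_values / tau
    z = z - z.max()          # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def m_step_loss(policy_chunk, sampled_chunks, weights):
    """M-step: reward-weighted squared distillation loss. Note that no
    explicit policy likelihood is evaluated; only samples drawn from the
    generative prior are reused, matching the 'without likelihood
    estimation' property the abstract claims."""
    diffs = ((sampled_chunks - policy_chunk) ** 2).sum(axis=1)
    return float((weights * diffs).sum())

# Toy example: K=4 candidate action chunks, each flattened to length 3.
chunks = rng.normal(size=(4, 3))
q = np.array([1.0, 2.0, 0.5, -1.0])
w = e_step_weights(q, tau=0.5)
loss = m_step_loss(chunks.mean(axis=0), chunks, w)
```

Under this weighting, higher-value chunks dominate the distillation target as tau shrinks, while large tau recovers uniform behavior cloning of the prior's samples.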
Subjects:
Robotics (cs.RO)
Cite as: arXiv:2604.01860 [cs.RO]
(or arXiv:2604.01860v1 [cs.RO] for this version)
https://doi.org/10.48550/arXiv.2604.01860
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Yuhui Chen [v1] Thu, 2 Apr 2026 10:15:47 UTC (11,831 KB)