
Open-Domain Safety Policy Construction

arXiv cs.CL · Di Wu, Siyue Liu, Zixiang Ji, Ya-Liang Chang, Zhe-Yu Liu, Andrew Pleffer, Kai-Wei Chang · April 4, 2026



Abstract: Moderation layers are increasingly a core component of many products built on user- or model-generated content. However, drafting and maintaining domain-specific safety policies remains costly. We present Deep Policy Research (DPR), a minimal agentic system that drafts a full content moderation policy based on only human-written seed domain information. DPR uses a single web search tool and lightweight scaffolding to iteratively propose search queries, distill diverse web sources into policy rules, and organize rules into an indexed document. We evaluate DPR on (1) the OpenAI undesired content benchmark across five domains with two compact reader LLMs and (2) an in-house multimodal advertisement moderation benchmark. DPR consistently outperforms definition-only and in-context learning baselines, and in our end-to-end setting it is competitive with expert-written policy sections in several domains. Moreover, under the same seed specification and evaluation protocol, DPR outperforms a general-purpose deep research system, suggesting that a task-specific, structured research loop can be more effective than generic web research for policy drafting. We release our experiment code at this https URL.
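The loop the abstract describes (propose search queries, distill web sources into policy rules, organize rules into an indexed document) can be sketched in miniature as below. This is a hedged illustration only: the `search_tool` and `distill` callables, the `PolicyRule` type, and the round-based stopping rule are assumptions made for the example, not the paper's actual implementation (which would call a real web search API and use an LLM to distill sources).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PolicyRule:
    topic: str   # index key in the final policy document (hypothetical field)
    text: str    # the distilled moderation rule

def dpr_loop(seed_query, search_tool, distill, num_rounds=3):
    """Sketch of a DPR-style research loop: starting from a seed query,
    repeatedly search, distill hits into rules, and derive follow-up
    queries from the topics of newly found rules."""
    rules = []
    queries = [seed_query]
    for _ in range(num_rounds):
        follow_ups = []
        for q in queries:
            for doc in search_tool(q):
                rule = distill(doc)
                if rule and rule not in rules:
                    rules.append(rule)
                    follow_ups.append(rule.topic)  # expand on the new topic
        if not follow_ups:  # no new information this round; stop early
            break
        queries = follow_ups
    # Organize the accumulated rules into an indexed document keyed by topic.
    index = {}
    for r in rules:
        index.setdefault(r.topic, []).append(r.text)
    return index

# Toy stand-in for the single web search tool: query -> (topic, snippet) hits.
CORPUS = {
    "ad moderation": [("deceptive claims", "Ads must not make deceptive claims.")],
    "deceptive claims": [("pricing", "Displayed prices must match checkout prices.")],
}

def toy_search(query):
    return CORPUS.get(query, [])

def toy_distill(doc):
    topic, snippet = doc
    return PolicyRule(topic=topic, text=snippet)

policy = dpr_loop("ad moderation", toy_search, toy_distill, num_rounds=3)
```

Here the second round's query ("deceptive claims") came from a rule found in the first round, which is the sense in which the research is iterative rather than a single search pass.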

Comments: EACL 2026 (Findings)

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2604.01354 [cs.CL]

(or arXiv:2604.01354v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2604.01354

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Di Wu [v1] Wed, 1 Apr 2026 20:07:34 UTC (640 KB)
