Search AI News

Find articles across all categories and topics

Filter:

Filter by tag

392 results for "alignment"

Sort by:

Fusion and Alignment Enhancement with Large Language Models for Tail-item Sequential Recommendation

arXiv:2604.03688v1 Announce Type: new Abstract: Sequential Recommendation (SR) learns user preferences from their historical interaction sequences and provides personalized suggestions. In real-world scenarios, most items exhibit sparse interactions, known as the tail-item problem. This issue limits the model's ability to accurately capture item transition patterns. To tackle this, large language models (LLMs) offer a promising solution by capturing semantic relationships between items. Despite previous efforts to leverage LLM-derived embeddings for enriching tail items, they still face the following limitations: 1) They struggle to effectively fuse collaborative signals with semantic knowledge, leading to suboptimal item embedding quality. 2) Existing methods overlook the structural incon

arXiv cs.IR

2mabout 2 hours ago

ModelsLive

XAttnRes: Cross-Stage Attention Residuals for Medical Image Segmentation

arXiv:2604.03297v1 Announce Type: new Abstract: In the field of Large Language Models (LLMs), Attention Residuals have recently demonstrated that learned, selective aggregation over all preceding layer outputs can outperform fixed residual connections. We propose Cross-Stage Attention Residuals (XAttnRes), a mechanism that maintains a global feature history pool accumulating both encoder and decoder stage outputs. Through lightweight pseudo-query attention, each stage selectively aggregates from all preceding representations. To bridge the gap between the same-dimensional Transformer layers in LLMs and the multi-scale encoder-decoder stages in segmentation networks, XAttnRes introduces spatial alignment and channel projection steps that handle cross-resolution features with negligible over

arXiv cs.CV

1mabout 2 hours ago

ModelsLive

Structural Rigidity and the 57-Token Predictive Window: A Physical Framework for Inference-Layer Governability in Large Language Models

arXiv:2604.03524v1 Announce Type: new Abstract: Current AI safety relies on behavioral monitoring and post-training alignment, yet empirical measurement shows these approaches produce no detectable pre-commitment signal in a majority of instruction-tuned models tested. We present an energy-based governance framework connecting transformer inference dynamics to constraint-satisfaction models of neural computation, and apply it to a seven-model cohort across five geometric regimes. Using trajectory tension (rho = ||a|| / ||v||), we identify a 57-token pre-commitment window in Phi-3-mini-4k-instruct under greedy decoding on arithmetic constraint probes. This result is model-specific, task-specific, and configuration-specific, demonstrating that pre-commitment signals can exist but are not uni

arXiv cs.AI

2mabout 1 hour ago

ModelsLive

Evaluating Artificial Intelligence Through a Christian Understanding of Human Flourishing

arXiv:2604.03356v1 Announce Type: new Abstract: Artificial intelligence (AI) alignment is fundamentally a formation problem, not only a safety problem. As Large Language Models (LLMs) increasingly mediate moral deliberation and spiritual inquiry, they do more than provide information; they function as instruments of digital catechesis, actively shaping and ordering human understanding, decision-making, and moral reflection. To make this formative influence visible and measurable, we introduce the Flourishing AI Benchmark: Christian Single-Turn (FAI-C-ST), a framework designed to evaluate Frontier Model responses against a Christian understanding of human flourishing across seven dimensions. By comparing 20 Frontier Models against both pluralistic and Christian-specific criteria, we show th

ArXiv CS.AI

1mabout 2 hours ago

ReleasesLive

Decomposing Communication Gain and Delay Cost Under Cross-Timestep Delays in Cooperative Multi-Agent Reinforcement Learning

arXiv:2604.03785v1 Announce Type: cross Abstract: Communication is essential for coordination in \emph{cooperative} multi-agent reinforcement learning under partial observability, yet \emph{cross-timestep} delays cause messages to arrive multiple timesteps after generation, inducing temporal misalignment and making information stale when consumed. We formalize this setting as a delayed-communication partially observable Markov game (DeComm-POMG) and decompose a message's effect into \emph{communication gain} and \emph{delay cost}, yielding the Communication Gain and Delay Cost (CGDC) metric. We further establish a value-loss bound showing that the degradation induced by delayed messages is upper-bounded by a discounted accumulation of an information gap between the action distributions ind

arXiv cs.MA

1m43 minutes ago

ModelsLive

The Geometric Alignment Tax: Tokenization vs. Continuous Geometry in Scientific Foundation Models

arXiv:2604.04155v1 Announce Type: cross Abstract: Foundation models for biology and physics optimize predictive accuracy, but their internal representations systematically fail to preserve the continuous geometry of the systems they model. We identify the root cause: the Geometric Alignment Tax, an intrinsic cost of forcing continuous manifolds through discrete categorical bottlenecks. Controlled ablations on synthetic dynamical systems demonstrate that replacing cross-entropy with a continuous head on an identical encoder reduces geometric distortion by up to 8.5x, while learned codebooks exhibit a non-monotonic double bind where finer quantization worsens geometry despite improving reconstruction. Under continuous objectives, three architectures differ by 1.3x; under discrete tokenizatio

arXiv stat.ML

2m1about 1 hour ago

ProductsLive

Diffusion Path Alignment for Long-Range Motion Generation and Domain Transitions

arXiv:2604.03310v1 Announce Type: new Abstract: Long-range human movement generation remains a central challenge in computer vision and graphics. Generating coherent transitions across semantically distinct motion domains remains largely unexplored. This capability is particularly important for applications such as dance choreography, where movements must fluidly transition across diverse stylistic and semantic motifs. We propose a simple and effective inference-time optimization framework inspired by diffusion-based stochastic optimal control. Specifically, a control-energy objective that explicitly regularizes the transition trajectories of a pretrained diffusion model. We show that optimizing this objective at inference time yields transitions with fidelity and temporal coherence. This

arXiv cs.CV

1m44 minutes ago

ModelsLive

Generative Unsupervised Downscaling of Climate Models via Domain Alignment: Application to Wind Fields

arXiv:2604.03341v1 Announce Type: cross Abstract: General Circulation Models (GCMs) are widely used for future climate projections, but their coarse spatial resolution and systematic biases limit their direct use for impact studies. This limitation is particularly critical for wind-related applications, such as wind energy, which require spatially coherent, multivariate, and physically plausible near-surface wind fields. Classical statistical downscaling and bias correction methods partly address this issue. Still, they struggle to preserve spatial structure, inter-variable consistency, and robustness under climate change, especially in high-dimensional settings. Recent advances in generative machine learning offer new opportunities for downscaling and bias correction, eliminating the need

arXiv stat.ML

2mabout 1 hour ago

ReleasesLive

Causality-Based Scores Alignment in Explainable Data Management

arXiv:2503.14469v5 Announce Type: replace Abstract: Different attribution scores have been proposed to quantify the relevance of database tuples for query answering in databases; e.g. Causal Responsibility, the Shapley Value, the Banzhaf Power-Index, and the Causal Effect. They have been analyzed in isolation. This work is a first investigation of score alignment depending on the query and the database; i.e. on whether they induce compatible rankings of tuples. We concentrate mostly on causality-based scores; and provide a syntactic dichotomy result for queries: on one side, pairs of scores are always aligned, on the other, they are not always aligned. It turns out that the presence of exogenous tuples makes a crucial difference in this regard.

arXiv cs.DB

1m9 minutes ago

Frontier ResearchLive

DRIFT: Deep Restoration, ISP Fusion, and Tone-mapping

arXiv:2604.03402v1 Announce Type: new Abstract: Smartphone cameras have gained immense popularity with the adoption of high-resolution and high-dynamic range imaging. As a result, high-performance camera Image Signal Processors (ISPs) are crucial in generating high-quality images for the end user while keeping computational costs low. In this paper, we propose DRIFT (Deep Restoration, ISP Fusion, and Tone-mapping): an efficient AI mobile camera pipeline that generates high quality RGB images from hand-held raw captures. The first stage of DRIFT is a Multi-Frame Processing (MFP) network that is trained using a adversarial perceptual loss to perform multi-frame alignment, denoising, demosaicing, and super-resolution. Then, the output of DRIFT-MFP is processed by a novel deep-learning based t

arXiv eess.IV

1mabout 2 hours ago

Analyst NewsFresh

Inside Omega

This is a philosophical thought experiment which aims to explore what I consider to be the crux of many alignment problems: That of the unrescuability of moral internalism , which basically says we have not been able to rescue the philosophical view that a necessary, intrinsic connection exists between moral judgments and motivation. If one could rescue moral internalism, in theory, they would have a perfectly good argument for any rational self-interested intelligence to not engage in broad scale moral harm. Therefore I think it is a linchpin meta-philosophical challenge. I don't claim to have a theorem, but I believe that one potential domain worth investigating is arguments which induce indexical uncertainty in an agent. Essentially, forms of leveraging undecidability to cause an agent

LessWrong AI

8mabout 11 hours ago

CountriesFresh

US-UAE AI working group's first meeting 'deepens alignment' - thenationalnews.com

US-UAE AI working group's first meeting 'deepens alignment' thenationalnews.com

GNews AI UAE

1mabout 7 hours ago

Frontier ResearchFresh

OpenAI launches safety fellowship to advance AI alignment research and assess effects on society - Firstpost

OpenAI launches safety fellowship to advance AI alignment research and assess effects on society Firstpost

Google News: AI Safety

1mabout 11 hours ago

ReleasesFresh

AIs can now often do massive easy-to-verify SWE tasks and I've updated towards shorter timelines

I've recently updated towards substantially shorter AI timelines and much faster progress in some areas. [1] The largest updates I've made are (1) an almost 2x higher probability of full AI R&D automation by EOY 2028 (I'm now a bit below 30% [2] while I was previously expecting around 15% ; my guesses are pretty reflectively unstable) and (2) I expect much stronger short-term performance on massive and pretty difficult but easy-and-cheap-to-verify software engineering (SWE) tasks that don't require that much novel ideation [3] . For instance, I expect that by EOY 2026, AIs will have a 50%-reliability [4] time horizon of years to decades on reasonably difficult easy-and-cheap-to-verify SWE tasks that don't require much ideation (while the high reliability—for instance, 90%—time horizon will

AI Alignment Forum

26mabout 6 hours ago

Frontier ResearchRecent

Announcing the OpenAI Safety Fellowship

A pilot program to support independent safety and alignment research and develop the next generation of talent

OpenAI Blog

1mabout 13 hours ago

Research PapersFresh

Relative Density Ratio Optimization for Stable and Statistically Consistent Model Alignment

Aligning language models with human preferences is essential for ensuring their safety and reliability. Although most existing approaches assume specific human preference models such as the Bradley-Terry model, this assumption may fail to accurately capture true human preferences, and consequently, these methods lack statistical consistency, i.e., the guarantee that language models converge to the true human preference as the number of samples increases. In contrast, direct density ratio optimization (DDRO) achieves statistical consistency without assuming any human preference models. DDRO mod — Hiroshi Takahashi, Tomoharu Iwata, Atsutoshi Kumagai

arXiv

10mabout 2 hours ago

Page 1 of 25