The Computational Complexity of Avoiding Strict Saddle Points in Constrained Optimization
arXiv:2604.02285v1 Announce Type: cross
Abstract: While first-order stationary points (FOSPs) are the traditional targets of non-convex optimization, they often correspond to undesirable strict saddle points. To circumvent this, attention has shifted towards second-order stationary points (SOSPs). In unconstrained settings, finding approximate SOSPs is PLS-complete (Kontogiannis et al.), matching the complexity of finding unconstrained FOSPs (Hollender and Zampetakis). However, the complexity of finding SOSPs in constrained settings remained notoriously unclear and was highlighted as an important open question by both aforementioned works. Under one strict definition, even verifying whether a point is an approximate SOSP is NP-hard (Murty and Kabadi). Under another widely adopted, relaxed definition, where non-negative curvature is required only along the null space of the active constraints, the problem lies in TFNP, and algorithms with O(poly(1/epsilon)) running times have been proposed (Lu et al.). In this work, we settle the complexity of constrained SOSPs by proving that computing an epsilon-approximate SOSP under the tractable definition is PLS-complete. We demonstrate that our result holds even in the 2D unit square [0,1]^2, and, remarkably, even when stationary points are isolated at a distance of Omega(1) from the domain's boundary. Our result establishes a fundamental barrier: unless PLS is a subset of PPAD (implying PLS = CLS), no deterministic, iterative algorithm with an efficient, continuous update rule can exist for finding approximate SOSPs. This contrasts with the constrained first-order counterpart, for which Fearnley et al. showed that finding an approximate KKT point is CLS-complete. Finally, our result yields the first problem defined on a compact domain shown to be PLS-complete beyond the canonical Real-LocalOpt (Daskalakis and Papadimitriou).
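To make the distinction the abstract draws concrete: a point can satisfy the first-order condition (vanishing gradient) while still being a strict saddle, which the second-order condition rules out. The sketch below, which is illustrative and not taken from the paper, checks the standard unconstrained epsilon-FOSP and epsilon-SOSP conditions (gradient norm at most epsilon; smallest Hessian eigenvalue at least -sqrt(epsilon)) on the classic strict saddle f(x, y) = x^2 - y^2 at the origin. The function names and the specific -sqrt(epsilon) curvature threshold are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative example (not from the paper): f(x, y) = x^2 - y^2
# has a strict saddle at the origin -- an FOSP but not an SOSP.

def grad(p):
    x, y = p
    return np.array([2.0 * x, -2.0 * y])

def hessian(p):
    # Constant Hessian for this quadratic: eigenvalues 2 and -2.
    return np.array([[2.0, 0.0], [0.0, -2.0]])

def is_approx_fosp(p, eps):
    # epsilon-FOSP: gradient norm at most epsilon.
    return np.linalg.norm(grad(p)) <= eps

def is_approx_sosp(p, eps):
    # epsilon-SOSP (one common convention): additionally require
    # lambda_min(Hessian) >= -sqrt(epsilon).
    lam_min = np.linalg.eigvalsh(hessian(p))[0]  # ascending order
    return is_approx_fosp(p, eps) and lam_min >= -np.sqrt(eps)

origin = np.array([0.0, 0.0])
print(is_approx_fosp(origin, 0.1))  # True: the gradient vanishes
print(is_approx_sosp(origin, 0.1))  # False: lambda_min = -2 < -sqrt(0.1)
```

In the constrained setting the paper studies, the relaxed definition only tests curvature along the null space of the active constraints, which is what makes verification tractable there while the strict definition is NP-hard to verify.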
Comments: Abstract shortened to meet arXiv requirements
Subjects:
Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS)
Cite as: arXiv:2604.02285 [cs.CC]
(or arXiv:2604.02285v1 [cs.CC] for this version)
https://doi.org/10.48550/arXiv.2604.02285
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Ioannis Panageas [view email] [v1] Thu, 2 Apr 2026 17:26:44 UTC (7,582 KB)