CliffSearch: Structured Agentic Co-Evolution over Theory and Code for Scientific Algorithm Discovery
Summary: arXiv:2604.01210v1 Announce Type: cross
Abstract: Scientific algorithm discovery is iterative: hypotheses are proposed, implemented, stress-tested, and revised. Current large language model (LLM)-guided search systems speed up proposal generation, but they often under-represent scientific structure because they focus on code-only artifacts that may lack robust correctness and originality criteria. We introduce CliffSearch, an agentic evolutionary framework in which LLM agents implement the core evolutionary operators: pair selection, crossover, mutation, and review. The framework is built on three guiding principles:
- Structured Scientific Artifacts: Each node within the framework represents a structured scientific artifact that can be instantiated in either a theory + code format or a code-only mode.
- Reviewer Judgments: The assessments of reviewers regarding correctness and originality are treated as primary selection criteria, in addition to the optimization of the benchmark metric of interest.
- Split Mutation Pathways: Mutation is bifurcated into exploration and correction pathways, each with distinct objectives. Exploration mutation incorporates ideas from related scientific domains to enhance novelty, while correction mutation focuses on targeted, evidence-guided repairs based on reviewer feedback concerning theory, code, benchmark results, and runtime errors.
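The three principles above can be made concrete with a small sketch. This is not the CliffSearch implementation: the class names, field names, and the routing rule in `choose_mutation` are all assumptions chosen for illustration, and the LLM agents themselves are left out. It only shows how a node with optional theory, a reviewer gate, and a two-way mutation split might fit together.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Review:
    # Hypothetical reviewer verdict; the field names are illustrative only.
    correct: bool
    original: bool
    feedback: str = ""


@dataclass
class Node:
    # One structured artifact. Code is always present; theory is optional,
    # so theory=None models the code-only mode.
    code: str
    theory: Optional[str] = None
    metric: Optional[float] = None
    review: Optional[Review] = None
    runtime_error: Optional[str] = None


def passes_gate(node: Node) -> bool:
    """A node is eligible for selection only if it ran and reviewers accept it."""
    return (
        node.runtime_error is None
        and node.metric is not None
        and node.review is not None
        and node.review.correct
        and node.review.original
    )


def choose_mutation(node: Node) -> str:
    """Assumed routing rule: repair broken nodes, explore from sound ones."""
    if node.runtime_error is not None:
        return "correction"
    if node.review is not None and not node.review.correct:
        return "correction"
    return "exploration"


def select_survivors(population: List[Node], k: int, minimize: bool = True) -> List[Node]:
    """Keep the k best reviewer-approved nodes, ranked by benchmark metric."""
    eligible = [n for n in population if passes_gate(n)]
    return sorted(eligible, key=lambda n: n.metric, reverse=not minimize)[:k]
```

In this sketch the reviewer gate is applied before the metric ranking, so a node with the best metric but a failed correctness review never survives selection; it is routed to the correction pathway instead.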
To demonstrate the effectiveness of the CliffSearch framework, we present findings from three benchmark-based studies:
- Transformer Hyper-Connection Evolution: This study explores the evolution of hyper-connection mechanisms within transformer architectures.
- Optimizer Discovery: This research focuses on discovering optimizers for a specified nanoGPT stack.
- Native-Optimizer Ablation: This study investigates a smaller-scale ablation of native optimizers to analyze their performance and characteristics.
Across these settings, CliffSearch uses the same discovery loop, which supports explicit metric direction, reproducible persistence, and reviewer-gated comparison of discovery outcomes under controlled search conditions. The aim is a discovery workflow that prioritizes scientific interpretability and correctness while optimizing task metrics within defined novelty constraints, rather than merely maximizing the throughput of candidate solutions.
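As a rough illustration of what explicit metric direction and reproducible persistence could look like, the sketch below stores the direction alongside the nodes in a JSON file and uses it for every comparison. The abstract does not describe the actual storage format, so the file layout, the `"min"`/`"max"` labels, and the `approved` flag are assumptions made for this example.

```python
import json
from typing import Dict, List, Optional


def is_better(candidate: float, incumbent: float, direction: str) -> bool:
    """Compare two metric values under an explicit direction ("min" or "max")."""
    if direction == "min":
        return candidate < incumbent
    if direction == "max":
        return candidate > incumbent
    raise ValueError(f"unknown metric direction: {direction!r}")


def save_run(path: str, direction: str, nodes: List[Dict]) -> None:
    """Persist the run so that later comparisons use the same direction."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"direction": direction, "nodes": nodes}, f, indent=2, sort_keys=True)


def load_run(path: str) -> Dict:
    """Reload a persisted run."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def best_node(run: Dict) -> Optional[Dict]:
    """Return the best reviewer-approved node, or None if none was approved."""
    approved = [n for n in run["nodes"] if n["approved"]]
    if not approved:
        return None
    best = approved[0]
    for node in approved[1:]:
        if is_better(node["metric"], best["metric"], run["direction"]):
            best = node
    return best
```

Storing the direction with the run, rather than hard-coding it in the comparison, keeps a loss-style benchmark and an accuracy-style benchmark on the same code path.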
For those interested in exploring the full run artifacts, interactive visualizations, and the best node exports from the reported studies, please visit CliffSearch.
