DeepReviewer 2.0: Auditable AI System for Scientific Peer Review

DeepReviewer 2.0: A Traceable Agentic System for Auditable Scientific Peer Review

Source: arXiv:2604.09590v1

Abstract

Automated peer review is often framed as generating fluent critique, but reviewers and area chairs need judgments they can audit: where a concern applies, what evidence supports it, and what concrete follow-up is required. DeepReviewer 2.0 is a process-controlled agentic review system built around an output contract. It produces a traceable review package with anchored annotations, localized evidence, and executable follow-up actions, and it exports that package only after meeting minimum traceability and coverage budgets.
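To make the output contract more concrete, here is a minimal Python sketch of what a single item in a traceable review package might look like. The class names, field names, and the `is_traceable` check are hypothetical; the abstract only states that the package contains anchored annotations, localized evidence, and executable follow-up actions, not how they are represented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Anchor:
    """Where a concern applies in the manuscript (hypothetical fields)."""
    section: str
    quote: str  # verbatim span copied from the manuscript


@dataclass
class ReviewItem:
    """One critique plus what a reader needs to audit it (hypothetical schema)."""
    concern: str
    anchor: Optional[Anchor] = None
    evidence: list = field(default_factory=list)  # e.g. ["Table 2", "Eq. 4"]
    follow_up: str = ""                           # concrete action for authors

    def is_traceable(self) -> bool:
        # Auditable only if it says where, on what evidence, and what to do next.
        return (
            self.anchor is not None
            and len(self.evidence) > 0
            and bool(self.follow_up.strip())
        )


anchored = ReviewItem(
    concern="Ablation omits the retrieval component.",
    anchor=Anchor(section="5.2", quote="we ablate each module"),
    evidence=["Table 3"],
    follow_up="Add a row to Table 3 with retrieval disabled.",
)
vague = ReviewItem(concern="The experiments are not convincing.")
```

In this sketch `anchored.is_traceable()` is true and `vague.is_traceable()` is false, which is the distinction the abstract draws between an auditable judgment and a merely fluent critique.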

Key Features of DeepReviewer 2.0

DeepReviewer 2.0 structures the review around five components:

Manuscript-only Claim-Evidence-Risk Ledger: The system first constructs a ledger that maps claims made in the manuscript to supporting evidence and associated risks.
Verification Agenda: DeepReviewer 2.0 creates a verification agenda that guides the review process, ensuring that the critiques are focused and relevant.
Agenda-driven Retrieval: The system performs targeted retrieval of information based on the verification agenda, enhancing the accuracy and relevance of critiques.
Anchored Critiques: Critiques are generated with anchored references to evidence, making it easier for reviewers to track and audit the review process.
Export Gate: The system only exports the review package once it meets predefined traceability and coverage standards, ensuring high-quality outputs.
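The ledger and export gate described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the `LedgerEntry` and `Critique` types, the way the two budgets are measured (share of anchored critiques, share of ledger claims addressed), and the threshold values are all hypothetical, since the source does not define them.

```python
from dataclasses import dataclass


@dataclass
class LedgerEntry:
    """One row of a claim-evidence-risk ledger (hypothetical schema)."""
    claim_id: str
    claim: str
    evidence: list  # pointers inside the manuscript, e.g. ["Table 2"]
    risk: str


@dataclass
class Critique:
    claim_id: str  # ledger claim this critique addresses
    anchor: str    # manuscript location; empty string means unanchored
    text: str


def export_gate(ledger, critiques, min_anchored=0.9, min_coverage=0.5):
    """Return (ok, report). Thresholds are illustrative, not from the paper."""
    if not ledger or not critiques:
        return False, {"anchored": 0.0, "coverage": 0.0}
    # Traceability budget: fraction of critiques that carry an anchor.
    anchored = sum(1 for c in critiques if c.anchor) / len(critiques)
    # Coverage budget: fraction of ledger claims addressed by some critique.
    claim_ids = {e.claim_id for e in ledger}
    covered = {c.claim_id for c in critiques if c.claim_id in claim_ids}
    coverage = len(covered) / len(claim_ids)
    ok = anchored >= min_anchored and coverage >= min_coverage
    return ok, {"anchored": anchored, "coverage": coverage}


ledger = [
    LedgerEntry("C1", "Method beats baseline", ["Table 2"], "single seed"),
    LedgerEntry("C2", "Scales linearly", ["Fig. 4"], "only two sizes tested"),
]
critiques = [
    Critique("C1", "Sec. 4.1, Table 2", "Report variance over seeds."),
    Critique("C1", "", "Results seem weak."),
]
ok, report = export_gate(ledger, critiques)
```

With this toy input the gate refuses to export: only half of the critiques are anchored, so the traceability budget fails even though the coverage budget is met.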

Performance Analysis

In an evaluation on 134 ICLR 2025 submissions under three fixed protocols, an un-finetuned 196B model running DeepReviewer 2.0 was compared with Gemini-3.1-Pro-preview, other automatic systems, and a human review committee. Key findings include:

Strict major-issue coverage: 37.26%, compared with 23.57% for Gemini-3.1-Pro-preview.
Blind comparisons against a human review committee: DeepReviewer 2.0 won 71.63% of micro-averaged comparisons.
Ranking: first among the automatic systems in the tested pool.
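The term "micro-averaged" matters when reading the 71.63% figure. A micro-average pools every individual comparison before computing the win rate, whereas a macro-average computes a rate per submission and then averages those rates. The sketch below shows the difference on toy numbers; the data are invented for illustration, and the summary does not state what unit of comparison the study pooled.

```python
def micro_win_rate(per_paper):
    """per_paper: list of (wins, comparisons) tuples, one per submission."""
    wins = sum(w for w, _ in per_paper)
    total = sum(n for _, n in per_paper)
    return wins / total


def macro_win_rate(per_paper):
    """Average of per-submission win rates, ignoring empty submissions."""
    rates = [w / n for w, n in per_paper if n > 0]
    return sum(rates) / len(rates)


# Toy data, not from the study: one submission with many comparisons,
# one with few.
toy = [(9, 10), (1, 2)]

micro = micro_win_rate(toy)  # 10 wins out of 12 comparisons
macro = macro_win_rate(toy)  # mean of 0.9 and 0.5
```

Here the micro-average is 10/12 (about 0.83) while the macro-average is 0.70, because micro-averaging gives more weight to submissions with more comparisons.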

Positioning and Future Directions

DeepReviewer 2.0 is positioned as an assistive tool for peer review, not a decision-making proxy, so human oversight remains part of the process. The researchers also acknowledge remaining gaps, particularly in ethics-sensitive checks.

In conclusion, DeepReviewer 2.0 offers a structured, auditable approach to automated scientific critique. Its reported results come from a single evaluation on 134 ICLR 2025 submissions, and the acknowledged gaps in ethics-sensitive checks are a reminder that it is meant to support human reviewers rather than replace their judgment.
