FactReview: AI-Powered Evidence-Based Peer Review System

FactReview: Evidence-Grounded Reviews with Literature Positioning and Execution-Based Claim Verification

Source: arXiv:2604.04074v1

Abstract: Peer review in machine learning is under growing pressure from rising submission volume and limited reviewer time. Most LLM-based reviewing systems read only the manuscript and generate comments from the paper’s own narrative. This makes their outputs sensitive to presentation quality and leaves them weak when the evidence needed for review lies in related work or released code. We present FactReview, an evidence-grounded reviewing system that combines claim extraction, literature positioning, and execution-based claim verification.

Introduction

Peer review in machine learning is under strain: submission volume keeps rising while reviewer time stays limited. FactReview responds with an evidence-based approach to manuscript evaluation, one that looks beyond the manuscript's own narrative to related work and released code.

Overview of FactReview

FactReview operates through three critical components:

Claim Extraction: The system identifies major claims and reported results within the submitted manuscript.
Literature Positioning: FactReview retrieves and analyzes related work to clarify the technical position of the paper in the broader research context.
Execution-Based Claim Verification: When code is available, the system executes the released code under defined parameters to verify central empirical claims.
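The way the three components fit together can be sketched as a small pipeline. This is only an illustrative sketch, not FactReview's actual code: the `Claim` class, the `review` function, and the stub stages are hypothetical names introduced here. The one behavior taken from the description above is that execution-based verification runs only when code is available.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Claim:
    """One major claim extracted from a manuscript (hypothetical structure)."""
    text: str
    related_work: List[str] = field(default_factory=list)
    execution_note: Optional[str] = None  # stays None when no code is released


def review(
    manuscript: str,
    extract_claims: Callable[[str], List[Claim]],
    position_in_literature: Callable[[Claim], List[str]],
    verify_by_execution: Optional[Callable[[Claim], str]] = None,
) -> List[Claim]:
    """Run the three stages in order; the third is skipped without code."""
    claims = extract_claims(manuscript)
    for claim in claims:
        claim.related_work = position_in_literature(claim)
        if verify_by_execution is not None:
            claim.execution_note = verify_by_execution(claim)
    return claims


# Stub stages standing in for the real components (assumptions for the demo).
def stub_extract(manuscript: str) -> List[Claim]:
    return [Claim(text=line.strip()) for line in manuscript.splitlines() if line.strip()]


def stub_position(claim: Claim) -> List[str]:
    return ["placeholder related work"]


# No verifier is passed, which models a submission without released code.
claims_without_code = review("Claim A\nClaim B", stub_extract, stub_position)
for claim in claims_without_code:
    print(claim.text, claim.related_work, claim.execution_note)
```

Passing the stages in as functions keeps the sketch independent of any particular extraction model, retrieval backend, or execution sandbox.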

Review Process

Upon receiving a manuscript, FactReview generates a concise review along with an evidence report. Each major claim is assigned one of five labels:

Supported
Supported by the paper
Partially supported
In conflict
Inconclusive

These labels give a graded picture of how well each claim holds up against the available evidence, rather than a single accept-or-reject verdict.
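One way to picture the evidence report is as a list of claims, each carrying one of the five labels listed above. The sketch below is an assumption about how such a report could be represented, not the system's real data model: `Label`, `EvidenceEntry`, and `format_report` are hypothetical names, and only the five label strings come from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Label(Enum):
    """The five labels named in the review process."""
    SUPPORTED = "Supported"
    SUPPORTED_BY_PAPER = "Supported by the paper"
    PARTIALLY_SUPPORTED = "Partially supported"
    IN_CONFLICT = "In conflict"
    INCONCLUSIVE = "Inconclusive"


@dataclass
class EvidenceEntry:
    """One row of a hypothetical evidence report."""
    claim: str
    label: Label
    evidence: str


def format_report(entries: List[EvidenceEntry]) -> str:
    """Render each entry as '[label] claim: evidence', one per line."""
    return "\n".join(f"[{e.label.value}] {e.claim}: {e.evidence}" for e in entries)


# Placeholder entry; the claim and evidence strings are illustrative only.
report = format_report(
    [EvidenceEntry("example claim", Label.INCONCLUSIVE, "no code released")]
)
print(report)
```

Using an enumeration rather than free-form strings means a typo in a label fails immediately instead of silently creating a sixth category.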

Case Study: CompGCN

In a case study on CompGCN, FactReview reproduced results that closely matched those reported for link prediction and node classification. It also showed that the paper's broader performance claims did not fully hold. On the MUTAG graph classification task, the reproduced result was 88.4%, below the strongest baseline reported in the paper at 92.6%. The claim was therefore labeled only partially supported.
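The MUTAG comparison comes down to simple arithmetic, shown below using only the two figures quoted in the case study. The `holds_on_task` rule is an assumption made for illustration (a superiority claim holds on a task only if the reproduced result reaches the strongest baseline); the source does not state how FactReview actually derives its labels.

```python
# Accuracy figures quoted in the case study, in percent.
REPRODUCED_MUTAG = 88.4
STRONGEST_BASELINE_MUTAG = 92.6


def gap_to_baseline(reproduced: float, baseline: float) -> float:
    """Return baseline minus reproduced, in percentage points (1 decimal)."""
    return round(baseline - reproduced, 1)


def holds_on_task(reproduced: float, baseline: float) -> bool:
    """Hypothetical rule: the claim holds only if reproduced >= baseline."""
    return reproduced >= baseline


gap = gap_to_baseline(REPRODUCED_MUTAG, STRONGEST_BASELINE_MUTAG)
print(f"MUTAG gap to strongest baseline: {gap} percentage points")
print("Superiority claim holds on MUTAG:",
      holds_on_task(REPRODUCED_MUTAG, STRONGEST_BASELINE_MUTAG))
```

Under this assumed rule the claim fails on MUTAG by 4.2 percentage points while the link prediction and node classification results were reproduced closely, which is consistent with a "Partially supported" label.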

Implications for AI in Peer Review

The CompGCN case suggests that AI is most useful in peer review as a tool for gathering evidence, not as a final arbiter. By helping reviewers produce better-grounded assessments, FactReview can improve the quality and reliability of the review process.

Conclusion

As the demands on peer review continue to grow, systems like FactReview offer a promising way to make evaluation in machine learning more efficient and better grounded. FactReview does this by combining claim extraction and literature positioning with execution-based verification of released code.

For more information and access to the code, visit the FactReview GitHub repository.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!

