Code Sharing In Prediction Model Research: A Scoping Review
Analytical code is central to reproducing diagnostic and prognostic prediction model research. A recent preprint on arXiv (arXiv:2604.06212v1) reports that, despite this, code is rarely made available in the published literature. This article summarizes the findings of that scoping review, which quantifies current code-sharing practices and informs the development of TRIPOD-Code, an extension of the TRIPOD reporting guideline focused specifically on code sharing.
Key Findings of the Scoping Review
The scoping review covered PubMed-indexed articles that cited TRIPOD or TRIPOD+AI as of August 11, 2025. It included articles that developed, updated, or validated multivariable prediction models. A large language model-assisted pipeline was used to screen articles and to extract code availability statements and links to repositories.
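As a rough illustration of the link-extraction step, the sketch below pulls repository-like URLs out of a code availability statement with a regular expression. It is not the review's pipeline, which was assisted by a large language model. The host list, the pattern, and the function name `extract_repo_links` are assumptions made for this example.

```python
import re

# Hypothetical host list chosen for illustration; it is not taken from the review.
REPO_URL = re.compile(
    r"https?://(?:www\.)?"
    r"(?:github\.com|gitlab\.com|bitbucket\.org|osf\.io|zenodo\.org)"
    r"/[^\s)\]>,;]+",
    re.IGNORECASE,
)


def extract_repo_links(statement: str) -> list[str]:
    """Return unique repository-like URLs in order of first appearance."""
    links: list[str] = []
    for match in REPO_URL.finditer(statement):
        # Drop a sentence-ending period that the pattern may have captured.
        url = match.group(0).rstrip(".")
        if url not in links:
            links.append(url)
    return links


if __name__ == "__main__":
    example = "Code is available at https://github.com/example-lab/model-code."
    print(extract_repo_links(example))
```

A pattern like this only finds explicit links on the listed hosts. Statements such as "code available on request" return an empty list and would need separate handling.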
The findings from the review revealed the following:
- Out of 3,967 eligible articles, only 12.2% included statements related to code sharing.
- The rate of code sharing has increased over time, reaching 15.8% in 2025.
- Studies citing TRIPOD+AI demonstrated a higher prevalence of code sharing compared to those citing TRIPOD alone.
- The frequency of code sharing varied considerably across journals and countries.
Assessment of Repositories
The review also examined the quality and structure of the repositories associated with the shared code. The reproducibility features of these repositories varied substantially:
- 80.5% of the repositories contained a README file, which is essential for understanding the code.
- Only 37.6% specified dependencies required for running the code, and a mere 21.6% of these were version-constrained.
- 42.4% of repositories were modular, making it easier for users to reuse specific parts of the code.
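Some of the features listed above can be checked mechanically, at least roughly. The sketch below assumes a locally cloned repository and looks for a README, a dependency file, a version constraint, and a license file in its top-level directory. It is not the assessment procedure used in the review. The file names, the version-constraint pattern, and the function name `audit_repository` are hypothetical choices for this example.

```python
import re
from pathlib import Path

# Assumed dependency file names; real repositories may declare dependencies elsewhere.
DEPENDENCY_FILES = ("requirements.txt", "environment.yml", "pyproject.toml")

# Rough heuristic for a constraint such as "numpy==1.26.4", "pandas>=2.0" or "scipy=1.11".
VERSION_CONSTRAINT = re.compile(r"[=<>~]=?\s*\d")


def audit_repository(root: str) -> dict[str, bool]:
    """Check the top-level directory of a repository for basic reproducibility features."""
    repo = Path(root)
    names = [p.name for p in repo.iterdir() if p.is_file()]

    dependency_paths = [repo / n for n in names if n in DEPENDENCY_FILES]
    version_constrained = any(
        VERSION_CONSTRAINT.search(path.read_text(encoding="utf-8", errors="ignore"))
        for path in dependency_paths
    )

    return {
        "readme": any(n.lower().startswith("readme") for n in names),
        "dependencies": bool(dependency_paths),
        "version_constrained": version_constrained,
        "license": any(n.lower().startswith(("license", "licence")) for n in names),
    }


if __name__ == "__main__":
    # Audit the current working directory as a usage example.
    print(audit_repository("."))
```

The check is deliberately shallow: it does not judge whether a README is informative, and it cannot detect modular structure, which would require inspecting the code itself.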
Conclusion and Future Directions
The scoping review shows that code sharing in prediction model research is gradually increasing but remains uncommon. Moreover, when code is shared, it often does not meet the standards necessary for reuse. These findings serve as an empirical baseline for the development of the TRIPOD-Code extension and highlight the need for clearer expectations that extend beyond mere code availability. Key areas for improvement include comprehensive documentation, clear specification of dependencies, proper licensing, and well-structured, executable code.
As the field of predictive modeling continues to evolve, robust code-sharing practices will be needed to improve reproducibility and support collaborative research.
