MedStruct-S Benchmark for OCR Clinical Report Extraction

MedStruct-S: A Benchmark for Key Discovery, Key-Conditioned QA and Semi-Structured Extraction from OCR Clinical Reports

Extracting semi-structured information from clinical reports that have passed through Optical Character Recognition (OCR) is a critical task in healthcare technology. Reconstructing a patient's longitudinal medical history from such reports depends on three tasks: field-header (key) discovery, key-conditioned question answering (QA), and end-to-end key-value pair extraction. Existing evaluation methods, however, often fail to account for two challenges: key representations that are heterogeneous and incompletely known, and the noise introduced by OCR. As a result, they give an unreliable picture of how robust a model is in real-world applications.
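To make the three tasks concrete, here is a minimal sketch of their inputs and outputs on one made-up page. The `KeyValue` class, the sample text, and both helper functions are illustrative assumptions, not the benchmark's actual schema or API; the helpers read from a hand-written annotation rather than a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyValue:
    key: str    # field header as it appears on the page
    value: str  # text attached to that header ("" when the field is blank)


# Hypothetical OCR output for one page; "Hemoglobln" mimics an OCR misread.
page_text = "Patient Name: Jane Doe\nHemoglobln: 13.2 g/dL\nAllergies:"

# Task 3 (end-to-end extraction) target: every key-value pair on the page.
gold_pairs = [
    KeyValue("Patient Name", "Jane Doe"),
    KeyValue("Hemoglobln", "13.2 g/dL"),
    KeyValue("Allergies", ""),
]


def discover_keys(pairs):
    """Task 1 (key discovery): the field headers present on the page."""
    return [p.key for p in pairs]


def answer_for_key(pairs, key):
    """Task 2 (key-conditioned QA): the value for one key, or None if blank or absent."""
    for p in pairs:
        if p.key == key:
            return p.value or None
    return None
```

A model under evaluation would have to produce these outputs from `page_text` alone, without being told in advance which keys exist or how they are spelled.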

Introducing MedStruct-S

To address these challenges, a team of researchers has introduced MedStruct-S, a benchmark designed to evaluate these three tasks when the keys are unknown and the text contains OCR noise. It comprises 3,582 annotated real-world clinical report pages, providing a solid foundation for assessing models in semi-structured information extraction scenarios.

Key Features of MedStruct-S

Comprehensive Dataset: The dataset consists of 3,582 annotated pages from real-world clinical reports, which keeps it relevant to practical applications.
Focus on OCR Noise: MedStruct-S is tailored to evaluate the effects of OCR-induced errors, a common issue in clinical document processing.
Evaluation of Multiple Paradigms: The benchmark supports direct comparison of encoder-only and decoder-only models across the three tasks.
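One way to see why OCR noise complicates scoring is to compare exact and tolerant matching of predicted pairs against gold pairs. The sketch below uses Python's standard `difflib.SequenceMatcher`; the `0.85` threshold, the function names, and the greedy matching strategy are assumptions chosen for illustration and are not the metric MedStruct-S itself defines.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so layout differences do not count as errors."""
    return " ".join(text.lower().split())


def similar(a, b, threshold=0.85):
    """True when two strings are close enough after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def pair_f1(predicted, gold, threshold=0.85):
    """Greedy one-to-one matching of (key, value) pairs, then precision, recall, F1."""
    unmatched = list(gold)
    hits = 0
    for pred_key, pred_value in predicted:
        for gold_pair in unmatched:
            if similar(pred_key, gold_pair[0], threshold) and similar(
                pred_value, gold_pair[1], threshold
            ):
                unmatched.remove(gold_pair)  # each gold pair can be matched once
                hits += 1
                break
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


predicted = [("Hemoglobln", "13.2 g/dL"), ("Patient Name", "John Doe")]
gold = [("Hemoglobin", "13.2 g/dL"), ("Patient Name", "Jane Doe")]
precision, recall, f1 = pair_f1(predicted, gold)
```

With this threshold, the misread key "Hemoglobln" still matches "Hemoglobin", while the wrong value "John Doe" does not match "Jane Doe". Under exact string matching, both predictions would count as errors, which conflates OCR noise with genuine extraction mistakes.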

Benchmarking Results

Using the MedStruct-S benchmark, the team evaluated two representative paradigms: encoder-only sequence labeling with post-processing, and decoder-only structured generation. The evaluation covered four encoder-only and five decoder-only models, ranging in size from 0.11 billion to 103 billion parameters.
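The two paradigms differ mainly in how model output becomes key-value pairs. As a rough sketch, assume an encoder-only model emits one BIO tag per token (with hypothetical `KEY` and `VALUE` labels) and a decoder-only model emits a JSON object; the tag set, the output format, and both function names are assumptions for illustration, not details taken from the benchmark.

```python
import json


def decode_bio(tokens, tags):
    """Encoder side: merge B-/I- tagged tokens into spans, then pair each KEY
    span with the VALUE span that follows it."""
    spans = []
    current = None  # [label, tokens] for the span currently being built
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B":
            current = [label, [token]]
            spans.append(current)
        elif prefix == "I" and current is not None and current[0] == label:
            current[1].append(token)
        else:
            current = None  # "O" or an inconsistent I- tag closes the span

    pairs = []
    pending_key = None
    for label, span_tokens in spans:
        text = " ".join(span_tokens)
        if label == "KEY":
            if pending_key is not None:
                pairs.append((pending_key, ""))  # previous key had no value
            pending_key = text
        elif label == "VALUE" and pending_key is not None:
            pairs.append((pending_key, text))
            pending_key = None
    if pending_key is not None:
        pairs.append((pending_key, ""))
    return pairs


def parse_generated(output_text):
    """Decoder side: parse generated text that should be a JSON object of key -> value."""
    try:
        data = json.loads(output_text)
    except json.JSONDecodeError:
        return []  # malformed generations yield no pairs
    if not isinstance(data, dict):
        return []
    return [(str(k), "" if v is None else str(v)) for k, v in data.items()]


tokens = ["Patient", "Name", ":", "Jane", "Doe", "Allergies", ":"]
tags = ["B-KEY", "I-KEY", "O", "B-VALUE", "I-VALUE", "B-KEY", "O"]
encoder_pairs = decode_bio(tokens, tags)
decoder_pairs = parse_generated('{"Patient Name": "Jane Doe", "Allergies": null}')
```

Both routes end in the same list of pairs here, but they fail differently: the labeling route depends on tag consistency and pairing heuristics, while the generation route depends on the model producing well-formed output.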

The results revealed several key findings:

Performance of Encoder-Only Models: Encoder-only models demonstrated superior performance in non-null-value key-conditioned QA tasks, despite their smaller size compared to decoder-only models.
Comparison of Similar Scales: When comparing models of comparable parameter sizes, encoder-only models consistently outperformed their decoder-only counterparts.
Overall Results: Without controlling for model scale, fine-tuned decoder-only models achieved the strongest overall results.

Conclusion

MedStruct-S gives the field a benchmark for semi-structured information extraction from clinical reports under real-world conditions, namely unknown keys and OCR noise. The benchmarking results highlight the strengths of each model architecture and offer a practical basis for selecting and comparing models across semi-structured information extraction applications. As healthcare continues to adopt AI, benchmarks like MedStruct-S will help measure and advance what these systems can do in clinical settings.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
