RadLite: Efficient CPU Radiology AI with LoRA Fine-Tuning

RadLite: Multi-Task LoRA Fine-Tuning of Small Language Models for CPU-Deployable Radiology AI

Large language models (LLMs) have shown promise in radiology, but their computational requirements make them hard to deploy, particularly in resource-constrained clinical environments. RadLite is a framework that addresses this with small language models (SLMs) of 3-4 billion parameters, fine-tuned using Low-Rank Adaptation (LoRA). The resulting models handle multiple radiology-related tasks while remaining compatible with consumer-grade CPUs.
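
To make the LoRA idea concrete: rather than updating a full weight matrix W, LoRA keeps W frozen and trains two small matrices A and B of rank r, adding their scaled product to W. The sketch below illustrates this in plain Python. The shapes, rank, and scaling value are assumptions chosen for readability and are not taken from the RadLite configuration.

```python
# Minimal LoRA sketch in plain Python (illustrative only; not RadLite's code).
# Shapes, rank r, and alpha below are assumptions chosen for readability.

def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha, r):
    """Compute W x + (alpha / r) * B (A x).

    W is the frozen d_out x d_in base weight, A is r x d_in,
    and B is d_out x r. Only A and B would be trained.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

def lora_trainable_params(d_out, d_in, r):
    """Number of trainable parameters in A and B for one adapted matrix."""
    return r * (d_in + d_out)

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (2 x 2)
A = [[1.0, 1.0]]               # rank-1 down-projection (1 x 2)
B_zero = [[0.0], [0.0]]        # B starts at zero, so the adapter is a no-op
x = [1.0, 2.0]

print(lora_forward(W, A, B_zero, x, alpha=2.0, r=1))  # identical to W x
print(lora_trainable_params(2048, 2048, r=8))         # compare with 2048 * 2048
```

For a hypothetical 2048 x 2048 matrix at rank 8, the adapter has 32,768 trainable parameters against roughly 4.2 million in the full matrix, which is why this style of fine-tuning is comparatively cheap.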

Overview of the Research

The researchers fine-tuned two models, Qwen2.5-3B-Instruct and Qwen3-4B, on a dataset of 162,000 samples spanning nine radiology tasks. These tasks included:

RADS classification across 10 systems
Impression generation
Temporal comparison
Radiology Natural Language Inference (NLI)
Named Entity Recognition (NER)
Abnormality detection
N/M staging
Radiology question answering (Q&A)

The dataset was compiled from 12 public sources. Each model was tested on up to 500 held-out samples per task using standardized metrics, so that results are comparable across models and tasks.
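
The per-task evaluation cap can be sketched as follows. This is a minimal illustration, not the authors' evaluation harness: the sample layout (a dict with a "task" key), the function names, the fixed seed, and the use of exact-match accuracy as the metric are all assumptions made for the example.

```python
import random
from collections import defaultdict

def sample_eval_set(samples, cap=500, seed=0):
    """Group held-out samples by task and keep at most `cap` per task.

    `samples` is assumed to be a list of dicts with a "task" key
    (a hypothetical layout, not the RadLite data format).
    """
    by_task = defaultdict(list)
    for sample in samples:
        by_task[sample["task"]].append(sample)

    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    capped = {}
    for task, items in by_task.items():
        # Tasks with fewer than `cap` samples are used in full.
        capped[task] = items if len(items) <= cap else rng.sample(items, cap)
    return capped

def accuracy(predictions, references):
    """Exact-match accuracy; one possible metric for classification tasks."""
    if not references:
        raise ValueError("references must not be empty")
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)

# Synthetic held-out pool: one large task and one small task.
held_out = [{"task": "rads", "id": i} for i in range(600)]
held_out += [{"task": "nli", "id": i} for i in range(20)]

eval_set = sample_eval_set(held_out)
print({task: len(items) for task, items in eval_set.items()})
```

Capping each task keeps evaluation cost bounded while still using every available sample for tasks that have fewer than 500.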

Key Findings

The evaluation produced several findings:

LoRA Fine-Tuning Effectiveness: The application of LoRA fine-tuning resulted in substantial performance improvements compared to zero-shot baselines. Notably, RADS accuracy increased by 53%, NLI performance improved by 60%, and N-staging accuracy rose by 89%.
Model Complementarity: The two models exhibited complementary strengths, with Qwen2.5 showcasing superior performance in structured generation tasks, while Qwen3 excelled in extractive tasks.
Oracle Ensemble Performance: A task-routed oracle ensemble that combined both models achieved the highest performance across all evaluated tasks, which suggests that routing each task to the stronger model is worthwhile.
Few-Shot Prompting Limitations: The study revealed that few-shot prompting with fine-tuned models could negatively impact performance, indicating that LoRA adaptation is more effective than in-context learning for specialized applications.
CPU Deployment Viability: The models were quantized to GGUF format, resulting in sizes ranging from 1.8 to 2.4 GB, allowing for CPU deployment at speeds of 4-8 tokens per second on standard consumer hardware.
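
One way to read the task-routed oracle ensemble mentioned in the findings above is as a per-task lookup: for each task, use whichever model scored higher on that task. The sketch below shows that interpretation. It is an assumption about how such routing could work, not the authors' implementation, and the scores and model labels are made-up placeholders rather than results from the study.

```python
def build_oracle_router(scores):
    """Map each task to the model with the highest score on that task.

    `scores` has the form {task: {model_name: score}}. The routing is an
    "oracle" in the sense that it is chosen using the evaluation scores.
    """
    return {task: max(by_model, key=by_model.get)
            for task, by_model in scores.items()}

def ensemble_scores(scores, router):
    """Score of the routed ensemble on each task."""
    return {task: scores[task][router[task]] for task in scores}

# Placeholder numbers for illustration only; NOT results from the paper.
placeholder_scores = {
    "impression_generation": {"model_a": 0.70, "model_b": 0.60},
    "ner": {"model_a": 0.50, "model_b": 0.80},
}

router = build_oracle_router(placeholder_scores)
print(router)
print(ensemble_scores(placeholder_scores, router))
```

By construction, a router of this kind can never score below either single model on any task, which is consistent with the ensemble leading on every task. Because the routing is selected with evaluation scores, it is best read as an upper bound on what a real router would achieve.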

Conclusion

The RadLite framework shows that small, efficiently fine-tuned models can serve as practical multi-task radiology AI assistants that run entirely on consumer hardware, with no GPU required. This could make such tools more accessible in clinical environments, and it demonstrates that SLMs can handle complex radiological tasks. Researchers and practitioners interested in using these models can access the code and additional resources at https://github.com/RadioX-Labs/RadLite.
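
To put the reported CPU throughput of 4-8 tokens per second in practical terms, the short calculation below estimates how long generating an output of a given length would take. The 120-token output length is a hypothetical example, and the estimate covers token generation only; it ignores prompt processing and model load time.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Time to generate `num_tokens` at a steady decoding rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second

def latency_range(num_tokens, slow_tps=4.0, fast_tps=8.0):
    """Best-case and worst-case generation time for the reported 4-8 tokens/s."""
    best = generation_seconds(num_tokens, fast_tps)
    worst = generation_seconds(num_tokens, slow_tps)
    return best, worst

# Hypothetical 120-token output (length chosen for illustration).
best, worst = latency_range(120)
print(f"{best:.0f}-{worst:.0f} seconds")
```

Under these assumptions, a 120-token output would take between 15 and 30 seconds to generate.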
