From Physician Expertise to Clinical Agents: Preserving, Standardizing, and Scaling Physicians’ Medical Expertise with Lightweight LLM
Medicine is an empirical discipline that evolves through extensive observation and the complex realities of clinical practice. Physicians develop their diagnostic and therapeutic skills through repeated cycles of application, reflection, and improvement. The results of this process vary widely from one physician to another, which makes the knowledge systems of master physicians hard to transmit and hard to scale. This contributes to a scarcity of the high-quality clinical expertise that effective patient care depends on.
To address this problem, a framework named Med-Shicheng has been proposed. It is designed to let large language models (LLMs) learn and reproduce the diagnostic and therapeutic philosophies of distinguished physicians. By standardizing these methodologies, Med-Shicheng aims to bridge the gap between individual physician expertise and broader clinical application.
The Med-Shicheng Framework
Built on the Tianyi model, Med-Shicheng comprises five stages for systematically learning and transferring the knowledge of five National Masters of Chinese Medicine, who are distinguished Traditional Chinese Medicine (TCM) physicians. Key components of this process include:
- Multi-source Material Curation: The framework begins with the collection and curation of diverse materials that reflect the expertise of the five selected physicians.
- Model Training: A single model is trained to internalize the unique knowledge systems of these physicians across various tasks.
- Task Diversity: Seven distinct tasks are covered:
  - Etiology-pathogenesis analysis
  - Syndrome diagnosis
  - Treatment principle selection
  - Prescription generation
  - Prescription explanation
  - Symptom evolution with regimen adjustment
  - Clinical advice
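As an illustration of how one model might serve several physicians across the seven tasks, the sketch below tags each input with a physician identifier and a task name. The names `Task`, `Sample`, and `build_prompt`, the bracketed tag format, and the placeholder physician identifier are all assumptions made for this example. The source does not describe how Med-Shicheng actually formats its training inputs.

```python
from dataclasses import dataclass
from enum import Enum


class Task(Enum):
    """The seven tasks named in the text; the enum itself is hypothetical."""
    ETIOLOGY_PATHOGENESIS = "etiology-pathogenesis analysis"
    SYNDROME_DIAGNOSIS = "syndrome diagnosis"
    TREATMENT_PRINCIPLE = "treatment principle selection"
    PRESCRIPTION_GENERATION = "prescription generation"
    PRESCRIPTION_EXPLANATION = "prescription explanation"
    SYMPTOM_EVOLUTION = "symptom evolution with regimen adjustment"
    CLINICAL_ADVICE = "clinical advice"


@dataclass(frozen=True)
class Sample:
    physician: str   # hypothetical identifier for one of the five physicians
    task: Task       # which of the seven tasks this sample belongs to
    case_text: str   # clinical case description (input)
    target: str      # reference answer in that physician's style (output)


def build_prompt(sample: Sample) -> str:
    """Prefix the case with physician and task tags (assumed format)."""
    return (
        f"[physician: {sample.physician}]\n"
        f"[task: {sample.task.value}]\n"
        f"{sample.case_text}"
    )
```

Putting the physician and task in the input, rather than training a separate model for each pair, is one common way to let a single set of weights cover every combination. Whether Med-Shicheng uses this particular conditioning scheme is not stated in the text.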
Technical Implementation and Performance
Med-Shicheng is implemented on the Qwen2.5-1.5B-Base model, which allows it to run on resource-constrained GPUs. Its performance is reported to be comparable to that of DeepSeek-R1 and GPT-5, which suggests that a lightweight model can support high-quality clinical applications.
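A rough calculation shows why a model of this size fits on modest hardware. The sketch below estimates the memory needed for the weights alone at a few common precisions. The 1.5 billion figure is the nominal size taken from the model name, not an exact parameter count, and the helper `weight_memory_gib` is a hypothetical name. Activations, the key-value cache, and optimizer state during training all add to this figure.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory for the model weights only, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


# Nominal size from the model name; an assumption, not an exact count.
NOMINAL_PARAMS = 1.5e9

if __name__ == "__main__":
    for precision, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
        gib = weight_memory_gib(NOMINAL_PARAMS, nbytes)
        print(f"{precision:>9}: {gib:.2f} GiB")
```

At 16-bit precision the weights come to a little under 3 GiB, which leaves room for inference on GPUs with far less memory than the multi-GPU servers typically required for models hundreds of times larger.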
Reliability and Evaluation
The work also examines how reliable LLMs are as evaluators compared with physician assessment. Automated judging follows the overall trends of physician evaluation, but it shows bias when making fine-grained, individualized distinctions. This finding highlights two needs: physician involvement when ground truth is not available, and domain-adapted judge models to make automated medical evaluation more reliable.
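The difference between tracking overall trends and making fine-grained distinctions can be measured with two separate statistics. The sketch below is one possible way to do this, not the evaluation protocol used in the work: Spearman rank correlation captures whether judge scores rise and fall with physician scores overall, while pairwise agreement counts how often the judge prefers the same item as the physician in head-to-head comparisons. All function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(judge, physician):
    """Overall trend: correlation between the two rank orderings."""
    return pearson(average_ranks(judge), average_ranks(physician))


def pairwise_agreement(judge, physician):
    """Fine-grained view: among item pairs the physician scores differently,
    the fraction where the judge strictly prefers the same item."""
    agree = total = 0
    for i, j in combinations(range(len(judge)), 2):
        diff = physician[i] - physician[j]
        if diff == 0:
            continue  # physician expresses no preference for this pair
        total += 1
        if (judge[i] - judge[j]) * diff > 0:
            agree += 1
    return agree / total if total else float("nan")
```

A judge can score well on the first statistic and poorly on the second when it separates clearly good from clearly bad outputs but fails to order outputs of similar quality, which is the pattern the text describes.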
In summary, Med-Shicheng offers a way to bring the expertise of master physicians into AI-assisted medical practice. By preserving and standardizing that expertise in a lightweight model, it aims both to raise the quality of care and to widen access to high-level medical knowledge across diverse clinical settings.
