DERM-3R: A Resource-Efficient Multimodal Agents Framework for Dermatologic Diagnosis and Treatment in Real-World Clinical Settings
Dermatologic diseases pose a significant and escalating global health challenge, impacting billions of individuals and severely diminishing their quality of life. Although contemporary treatments can effectively manage acute symptoms, they often fall short of favorable long-term outcomes because they rely on single-target therapies and recurrent treatment regimens and give insufficient consideration to accompanying systemic comorbidities.
Traditional Chinese Medicine (TCM) offers a holistic approach through the principles of syndrome differentiation and personalized treatment. However, the application of TCM in dermatology is hindered by various factors, including non-standardized knowledge, incomplete multimodal patient records, and the limited scalability of expert reasoning.
Introducing DERM-3R
In response to these challenges, we introduce DERM-3R, a resource-efficient multimodal agent framework designed to support TCM dermatologic diagnosis and treatment under constrained data and computational resources. Grounded in real-world clinical workflows, DERM-3R reformulates clinical decision-making as three core problems:
- Fine-Grained Lesion Recognition: accurately identifying dermatological lesions to support diagnosis.
- Multi-View Lesion Representation: modeling pathogenesis at a specialist level by integrating multiple perspectives on the lesions.
- Holistic Reasoning for Syndrome Differentiation and Treatment Planning: synthesizing the available information into comprehensive treatment plans that account for individual patient needs.
The Components of DERM-3R
DERM-3R consists of three collaborative agents, each targeting a specific component of the proposed diagnostic and treatment pipeline:
- DERM-Rec: Focuses on lesion recognition and identification.
- DERM-Rep: Handles the representation of lesions from multiple perspectives to enhance understanding of their nature.
- DERM-Reason: Engages in holistic reasoning to differentiate syndromes and formulate tailored treatment plans.
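The division of labor among the three agents can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the data classes, the `run_pipeline` function, the stub agents, and the strictly sequential hand-off from recognition to representation to reasoning are all hypothetical, introduced here for illustration. In a real system each stub would be replaced by a call to the corresponding fine-tuned model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PatientCase:
    """Hypothetical container for one patient record."""
    image_paths: List[str]
    clinical_notes: str


@dataclass
class LesionFindings:
    """Hypothetical output of the recognition agent (DERM-Rec)."""
    descriptors: List[str]


@dataclass
class LesionRepresentation:
    """Hypothetical output of the representation agent (DERM-Rep)."""
    views: Dict[str, object]


@dataclass
class TreatmentPlan:
    """Hypothetical output of the reasoning agent (DERM-Reason)."""
    syndrome: str
    plan: str


# Each agent is modeled as a plain callable so that any backend can be plugged in.
RecAgent = Callable[[PatientCase], LesionFindings]
RepAgent = Callable[[PatientCase, LesionFindings], LesionRepresentation]
ReasonAgent = Callable[[PatientCase, LesionRepresentation], TreatmentPlan]


def run_pipeline(case: PatientCase, rec: RecAgent, rep: RepAgent,
                 reason: ReasonAgent) -> TreatmentPlan:
    """Chain the three agents in order (sequential hand-off is an assumption)."""
    findings = rec(case)
    representation = rep(case, findings)
    return reason(case, representation)


# Stub agents standing in for model calls; all outputs are placeholders.
def stub_rec(case: PatientCase) -> LesionFindings:
    return LesionFindings([f"lesion described from {p}" for p in case.image_paths])


def stub_rep(case: PatientCase, findings: LesionFindings) -> LesionRepresentation:
    return LesionRepresentation({
        "morphology": findings.descriptors,
        "history": case.clinical_notes,
    })


def stub_reason(case: PatientCase, rep: LesionRepresentation) -> TreatmentPlan:
    n = len(rep.views["morphology"])
    return TreatmentPlan(
        syndrome="PLACEHOLDER_SYNDROME",
        plan=f"placeholder plan based on {n} lesion description(s)",
    )
```

Calling `run_pipeline(PatientCase(["a.jpg"], "note"), stub_rec, stub_rep, stub_reason)` returns a `TreatmentPlan` built from the placeholder values. Keeping each agent behind a plain callable means one stage can be swapped or re-tuned without touching the other two.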
Performance and Evaluation
Built on a lightweight multimodal large language model and partially fine-tuned on 103 real-world TCM psoriasis cases, DERM-3R performs strongly across a range of dermatologic reasoning tasks. Evaluations using automatic metrics, LLM-as-a-judge scoring, and physician assessment indicate that DERM-3R matches or surpasses larger general-purpose multimodal models, despite using minimal data and parameter updates.
Conclusion
These results suggest that structured, domain-aware multi-agent modeling can provide a practical and efficient alternative to brute-force scaling for complex clinical tasks in dermatology and integrative medicine. The framework has the potential to support the diagnosis and treatment of dermatologic diseases and, in turn, to improve patient outcomes.
