A Parameter-Efficient Transfer Learning Approach through Multitask Prompt Distillation and Decomposition for Clinical NLP
Summary: arXiv:2604.06650v1
Abstract: Existing prompt-based fine-tuning methods typically learn task-specific prompts independently, imposing significant computing and storage overhead at scale when deploying multiple clinical natural language processing (NLP) systems. We present a multitask prompt distillation and decomposition framework that learns a single shared metaprompt from 21 diverse clinical source tasks and adapts it to unseen target tasks with fewer than 0.05% trainable parameters.
Introduction
Clinical natural language processing (NLP) has advanced considerably, yet deploying multiple systems remains difficult for researchers and practitioners because training task-specific models is computationally expensive. Conventional prompt-based fine-tuning learns a separate prompt for each task, so compute and storage overhead grows as more clinical NLP systems are deployed. The multitask prompt distillation and decomposition framework summarized here is designed to address this.
Framework Overview
The proposed framework streamlines the adaptation of NLP models to clinical tasks by learning a single shared metaprompt. The metaprompt is learned from 21 diverse clinical source tasks and then adapted to unseen target tasks with fewer than 0.05% trainable parameters. This reduces resource requirements and improves scalability.
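To give a sense of scale for the "fewer than 0.05%" figure, the following minimal sketch computes the trainable fraction when only a soft prompt is tuned on a frozen backbone. The prompt length (100 tokens), hidden size (4096), and backbone size (8 billion parameters) are illustrative assumptions, not values reported in the source, and the function name is hypothetical.

```python
def prompt_param_fraction(prompt_length: int, hidden_size: int, backbone_params: int) -> float:
    """Fraction of all parameters that are trainable when only a soft prompt is tuned."""
    trainable = prompt_length * hidden_size  # one embedding vector per prompt token
    return trainable / (backbone_params + trainable)


# Assumed values for illustration only; the source does not report them.
PROMPT_LENGTH = 100
HIDDEN_SIZE = 4096
BACKBONE_PARAMS = 8_000_000_000

fraction = prompt_param_fraction(PROMPT_LENGTH, HIDDEN_SIZE, BACKBONE_PARAMS)
print(f"trainable fraction: {fraction:.5%}")  # roughly 0.005% under these assumptions
print("below 0.05% budget:", fraction < 0.0005)
```

Under these assumed sizes, a soft prompt sits about an order of magnitude below the 0.05% ceiling, which leaves room for longer prompts or small additional task-specific components.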
Evaluation and Results
The framework was evaluated across five clinical NLP task types:
- Named Entity Recognition
- Relation Extraction
- Question Answering
- Natural Language Inference
- Summarization
Ten held-out target datasets were used to assess performance with three backbone models: LLaMA 3.1 8B, Meditron3 8B, and gpt-oss 20B. The proposed framework consistently outperformed existing methods:
- Outperformed LoRA by 1.5% to 1.7% while leveraging significantly fewer parameters.
- Surpassed single-task prompt tuning by 6.1% to 6.6%.
- The gpt-oss 20B model achieved the highest overall performance, particularly excelling in clinical reasoning tasks.
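The parameter gap behind the LoRA comparison can be illustrated with a rough count. The source does not report the LoRA rank, the adapted modules, or the prompt length, so every constant below is a hypothetical configuration chosen only to show the arithmetic. The one fixed fact is that a rank-r LoRA adapter on a d_out x d_in weight matrix adds r * (d_in + d_out) trainable parameters.

```python
def lora_params(rank: int, d_in: int, d_out: int) -> int:
    """Trainable parameters of one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def soft_prompt_params(prompt_length: int, hidden_size: int) -> int:
    """Trainable parameters of a soft prompt: one vector per prompt token."""
    return prompt_length * hidden_size


# Hypothetical configuration, not taken from the source.
NUM_LAYERS = 32          # assumed transformer depth
ADAPTED_PER_LAYER = 2    # assumed number of square projections adapted per layer
HIDDEN = 4096            # assumed hidden size
RANK = 8                 # assumed LoRA rank
PROMPT_LENGTH = 100      # assumed soft prompt length

lora_total = NUM_LAYERS * ADAPTED_PER_LAYER * lora_params(RANK, HIDDEN, HIDDEN)
prompt_total = soft_prompt_params(PROMPT_LENGTH, HIDDEN)

print("LoRA trainable parameters:  ", lora_total)
print("Prompt trainable parameters:", prompt_total)
print("ratio:", lora_total / prompt_total)
```

With these assumed settings the LoRA adapters hold roughly ten times as many trainable parameters as the prompt. The actual ratio in the paper depends on configurations not given here.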
Transferability and Implications
The framework also shows strong zero- and few-shot performance, which points to improved transferability of the shared prompt representation. This suggests that the shared prompt generalizes across clinical tasks, which makes the approach useful for clinical NLP practitioners.
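One way to picture how a shared prompt supports zero- and few-shot transfer is the rank-one decomposition used in earlier multitask prompt tuning work, where a task prompt is the element-wise product of the shared prompt and an outer product of two small task-specific vectors. The source does not specify the decomposition it uses, so the sketch below is an assumption for illustration only. The function name and toy values are hypothetical.

```python
def compose_task_prompt(shared, u, v):
    """Scale the shared prompt element-wise by the rank-one matrix u v^T."""
    return [[shared[i][j] * u[i] * v[j] for j in range(len(v))] for i in range(len(u))]


# Toy shared metaprompt: 2 prompt tokens, hidden size 3 (illustrative values).
shared = [[0.5, -1.0, 2.0],
          [1.5, 0.25, -0.75]]

# Zero-shot: with all-ones vectors the task prompt is the shared prompt unchanged.
zero_shot = compose_task_prompt(shared, [1.0, 1.0], [1.0, 1.0, 1.0])

# Few-shot: only u and v would be updated on the target task; the shared prompt stays fixed.
u = [2.0, 1.0]
v = [1.0, 0.5, 1.0]
adapted = compose_task_prompt(shared, u, v)

# Per-task storage is len(u) + len(v) values instead of a full prompt matrix.
per_task_params = len(u) + len(v)
full_prompt_params = len(shared) * len(shared[0])
print(zero_shot == shared, adapted, per_task_params, full_prompt_params)
```

In this form, each new task stores only the two vectors, so the per-task cost grows with prompt length plus hidden size rather than their product.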
Conclusion
The multitask prompt distillation and decomposition framework reduces the number of trainable parameters while improving performance across multiple clinical NLP tasks, which eases deployment in clinical settings. Future work will involve further refinement of the framework and exploring its application to additional healthcare-related tasks.
