CRAFT: Reducing Forgetting in Continual Learning for LLMs

CRAFT: Forgetting-Aware Intervention-Based Adaptation for Continual Learning

Large language models (LLMs) can process and generate human-like text, but they are prone to catastrophic forgetting: when a model is trained on new tasks, it can lose knowledge it acquired earlier. A recent paper titled “CRAFT: Forgetting-Aware Intervention-Based Adaptation for Continual Learning” proposes an approach to this problem.

Posted on arXiv under the identifier 2605.05732v1, CRAFT is a continual learning framework that learns low-rank interventions on hidden representations instead of updating model weights directly. The goal is to let the model adapt to new tasks while limiting how much previously learned information it loses.
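To make "low-rank intervention on a hidden representation" concrete, here is a minimal sketch in plain Python. It assumes a generic additive form, h' = h + up · (down · h), which is one common way to write a low-rank edit; the post does not give CRAFT's exact parameterization, and the names `down`, `up`, and `low_rank_intervention` are hypothetical.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def low_rank_intervention(hidden, down, up):
    """Return hidden + up @ (down @ hidden); the base model's weights are not touched.

    `down` has shape (rank, dim) and `up` has shape (dim, rank). Both are
    hypothetical names for the small trainable matrices (an assumption,
    not the paper's notation).
    """
    compressed = matvec(down, hidden)  # project into the rank-r subspace
    delta = matvec(up, compressed)     # map the edit back to the hidden width
    return [h + d for h, d in zip(hidden, delta)]

dim, rank = 8, 2
rng = random.Random(0)
hidden = [rng.gauss(0.0, 1.0) for _ in range(dim)]
down = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(rank)]
up_zero = [[0.0] * rank for _ in range(dim)]

# With `up` initialised to zero the intervention is the identity, so
# training would start from the frozen model's behaviour.
edited = low_rank_intervention(hidden, down, up_zero)

# Trainable parameters: rank*dim + dim*rank = 32, versus dim*dim = 64
# for a full-rank edit of the same hidden state.
trainable = rank * dim + dim * rank
```

The point of the sketch is the parameter count: only the two small matrices would be trained, while the underlying model stays frozen.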

The Three Stages of CRAFT

CRAFT proceeds in three stages:

Task Routing: The first stage routes each task to a group of similar tasks. Similarity is measured by output-distribution divergence, so tasks that induce similar output distributions are grouped together.
Fine-Tuning with KL Divergence: In the second stage, the model is fine-tuned under a Kullback-Leibler (KL) divergence term measured against the group’s prior state. This term directly controls the extent of forgetting and influences how well the model converges on the new task.
Merging Interventions: The final stage merges the interventions for the updated task into the shared representation. The merge is also guided by the KL signal, so new knowledge is integrated without sacrificing prior learning.
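The first two stages can be sketched with a few lines of plain Python. This is an illustration under stated assumptions, not the paper's implementation: the post does not specify the direction of the KL term, how output distributions are summarized per task, or the weighting, so `route_task`, `regularized_loss`, and `beta` are hypothetical. The merging rule is not sketched because the post does not describe it beyond its use of the KL signal.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def route_task(task_dist, group_dists):
    """Return (group_name, divergence) for the group closest to the task.

    Assumption: each task and group is summarized by one output distribution.
    """
    best_name, best_kl = None, float("inf")
    for name, dist in group_dists.items():
        kl = kl_divergence(task_dist, dist)
        if kl < best_kl:
            best_name, best_kl = name, kl
    return best_name, best_kl

def regularized_loss(task_loss, current_dist, prior_dist, beta):
    """Task loss plus beta times the drift from the group's prior outputs.

    Assumption: drift is KL(prior || current); `beta` is a hypothetical weight.
    """
    return task_loss + beta * kl_divergence(prior_dist, current_dist)

# Toy output distributions over three classes (made-up numbers).
groups = {"sentiment": [0.7, 0.2, 0.1], "qa": [0.1, 0.1, 0.8]}
new_task = [0.6, 0.3, 0.1]

group_name, divergence = route_task(new_task, groups)  # picks "sentiment"
loss = regularized_loss(1.0, [0.9, 0.1], [0.5, 0.5], beta=0.5)
```

A larger `beta` would keep the model closer to the group's prior outputs at the cost of slower adaptation to the new task, which is the trade-off the second stage is described as controlling.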

Benefits of the CRAFT Framework

CRAFT integrates routing, regularization, and merging into a single KL-based objective. The paper reports several benefits from this design:

Improved Performance: CRAFT achieves better overall performance than existing LoRA-based approaches across a variety of benchmarks and model scales.
Reduced Forgetting: CRAFT significantly reduces catastrophic forgetting, allowing LLMs to retain previously acquired knowledge while learning new tasks.
Robustness to Task Ordering: Performance remains stable regardless of the order in which tasks are presented. Sensitivity to task order is a common challenge in continual learning.
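For readers who want to quantify "reduced forgetting", the sketch below computes a forgetting score commonly used in continual learning: for each earlier task, the gap between its best accuracy at any earlier point and its accuracy after the final task. The post does not say which metric the paper uses, so this is an assumption, and the accuracy values are made up for illustration.

```python
def average_forgetting(accuracy_history):
    """Average drop from best past accuracy to final accuracy over earlier tasks.

    accuracy_history[i][j] is the accuracy on task j measured after training
    on task i, for j <= i. Requires at least two tasks.
    """
    final = accuracy_history[-1]
    num_tasks = len(final)
    drops = []
    for j in range(num_tasks - 1):  # the last task has had no chance to be forgotten
        best_past = max(accuracy_history[i][j] for i in range(j, num_tasks - 1))
        drops.append(best_past - final[j])
    return sum(drops) / len(drops)

# Illustrative numbers only: three tasks learned in sequence.
history = [
    [0.90],              # after task 0
    [0.80, 0.85],        # after task 1
    [0.70, 0.75, 0.95],  # after task 2
]
score = average_forgetting(history)
```

Running the same computation over several task orderings and comparing the scores is one simple way to check the robustness-to-ordering claim.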

Conclusion

The CRAFT framework presents a scalable and principled approach to continual learning in large language models: it controls adaptation in representation space and is guided by output-space divergence. As LLMs continue to expand their capabilities, frameworks like CRAFT could help them learn continuously without compromising their foundational knowledge.

For researchers and practitioners, CRAFT is both a technical contribution and a possible blueprint for more resilient and adaptable AI systems.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
