Selective Forgetting in Large Reasoning Models for Privacy

Selective Forgetting for Large Reasoning Models

Summary: arXiv:2604.03571v1

Large Reasoning Models (LRMs) generate structured chains of thought (CoTs) before arriving at a final answer. This strengthens their reasoning, but it also opens a new leakage path: sensitive knowledge can surface in the intermediate reasoning steps, not only in the answer. Because these models can retain sensitive information from their training data, including copyrighted and private content, this raises serious ethical and legal concerns.

Selective forgetting, also known as machine unlearning, has emerged as a potential solution for LRMs. Existing unlearning methods, however, mainly target the final answers these models produce, and they can degrade an LRM's overall reasoning ability after unlearning. Applying unlearning directly to the entire CoT is not a fix either, since it can likewise impair the model's general reasoning capabilities.

Challenges in LRM Unlearning

The main challenge in LRM unlearning is to remove the targeted knowledge precisely while preserving general reasoning capabilities. Any method has to balance forgetting sensitive information against maintaining the model’s performance.

Proposed Framework for Selective Forgetting

To close this gap, we introduce a framework that selectively removes sensitive reasoning components without compromising general reasoning capabilities. Its key features are:

  • Multiple LLMs with Retrieval-Augmented Generation (RAG): Our framework leverages multiple large language models equipped with RAG to analyze CoT traces effectively.
  • Identification of Forget-Relevant Segments: The framework identifies segments within the reasoning chains that require unlearning, focusing on sensitive content.
  • Replacement with Benign Placeholders: Sensitive components are replaced with benign placeholders that maintain the logical structure of the reasoning chain.
  • Feature Replacement Unlearning Loss: We introduce a new loss function that suppresses the probability of generating forgotten content while reinforcing the generation of structurally valid replacements.
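
The summary does not give the exact form of the feature replacement unlearning loss, so the Python sketch below only illustrates the idea in the last two bullets. It rests on stated assumptions. The flagged step indices are taken as given, standing in for the output of the multi-LLM RAG analysis. The placeholder string `[REDACTED]`, the function names, and the `alpha` weight are hypothetical. The loss form, an unlikelihood-style suppression term plus a cross-entropy term on the replacement token, is a common pattern and not necessarily the one used in the paper.

```python
import math

# Hypothetical placeholder; the source only says "benign placeholders".
PLACEHOLDER = "[REDACTED]"


def replace_segments(cot_steps, flagged):
    """Return a copy of the reasoning chain in which every step whose index
    is in `flagged` is swapped for the placeholder. The number and order of
    steps is unchanged, so the chain keeps its logical structure."""
    return [PLACEHOLDER if i in flagged else step
            for i, step in enumerate(cot_steps)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def replacement_unlearning_loss(logits, forget_id, replace_id,
                                alpha=1.0, eps=1e-12):
    """Assumed loss for one token position (not the paper's formula):
    - suppress term:  -log(1 - p(forgotten token)), large when the model
      still assigns high probability to the forgotten token;
    - reinforce term: -log p(replacement token), the usual cross-entropy
      on the placeholder token.
    `alpha` (hypothetical) weights suppression against reinforcement."""
    probs = softmax(logits)
    suppress = -math.log(max(1.0 - probs[forget_id], eps))
    reinforce = -math.log(max(probs[replace_id], eps))
    return alpha * suppress + reinforce


if __name__ == "__main__":
    chain = [
        "Step 1: read the question.",
        "Step 2: recall the patient's record.",  # assumed to be flagged
        "Step 3: combine the facts and answer.",
    ]
    print(replace_segments(chain, {1}))

    # Toy vocabulary of three tokens: id 0 = forgotten, id 1 = placeholder.
    before = replacement_unlearning_loss([5.0, 0.0, 0.0], 0, 1)
    after = replacement_unlearning_loss([0.0, 5.0, 0.0], 0, 1)
    print(before > after)
```

In this toy setting the loss is high while the model still prefers the forgotten token and drops once probability mass moves to the placeholder, which is the behavior the bullet describes. A real implementation would sum such terms over the token positions of the flagged segments and would presumably be combined with a term that preserves behavior on retained data; neither detail is specified in the summary.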

Empirical Validation

To validate the efficacy of our proposed method, we conducted extensive experiments on both synthetic and medical datasets. The results confirm the desired properties of our selective forgetting framework, demonstrating that it effectively removes sensitive information while preserving the model’s reasoning capabilities.

In conclusion, selective forgetting addresses some of the ethical and legal concerns raised by Large Reasoning Models: it offers a way to remove sensitive content from these models while retaining their reasoning capabilities, which supports more responsible deployment.


Lazarus Omolua
https://richlyai.com/blog
