Rethinking Privacy in LLMs with Information Sufficiency

Say Something Else: Rethinking Contextual Privacy as Information Sufficiency

Summary: arXiv:2604.06409v1 (announce type: cross)

Abstract

Large language model (LLM) agents increasingly draft messages on behalf of users, which raises significant privacy concerns. Users often overshare sensitive information, and they differ in what they consider private. Existing privacy mechanisms rely on two strategies: suppression, which omits sensitive information, and generalization, which replaces specific details with broader abstractions. Both have been evaluated largely on isolated messages, an approach that fails to capture the complexity of real-world communication. We reframe privacy-preserving communication as an Information Sufficiency (IS) task and introduce a third strategy, free-text pseudonymization, which substitutes sensitive attributes with functionally equivalent alternatives. We also propose a conversational evaluation protocol that tests these strategies under realistic multi-turn interactions.
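To make the three strategies concrete, here is a minimal Python sketch on a toy message. It is not the paper's implementation, which works on free text with LLMs; the `Draft` type, the function names, and the example phrases are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Draft:
    """Toy message split into a request and a sensitive reason (hypothetical structure)."""
    request: str  # what the user needs from the recipient
    reason: str   # the sensitive detail the user would rather not reveal


def original(draft: Draft) -> str:
    """The oversharing baseline: request plus the real reason."""
    return f"{draft.request} because {draft.reason}."


def suppress(draft: Draft) -> str:
    """Suppression: omit the sensitive reason entirely."""
    return f"{draft.request}."


def generalize(draft: Draft, abstraction: str) -> str:
    """Generalization: replace the reason with a broader abstraction of it."""
    return f"{draft.request} because {abstraction}."


def pseudonymize(draft: Draft, alternative: str) -> str:
    """Pseudonymization: replace the reason with a different, equally specific
    alternative that serves the same purpose in the message."""
    return f"{draft.request} because {alternative}."


# Hypothetical example; none of these phrases come from the paper.
draft = Draft(
    request="I need to move Thursday's meeting",
    reason="I have a chemotherapy session",
)

suppressed = suppress(draft)
generalized = generalize(draft, "I have a medical appointment")
pseudonymized = pseudonymize(draft, "I have a car repair appointment")

for label, text in [
    ("original", original(draft)),
    ("suppression", suppressed),
    ("generalization", generalized),
    ("pseudonymization", pseudonymized),
]:
    print(f"{label:>16}: {text}")
```

In this toy case the generalized version still signals that the reason is medical, which is the kind of residual cue a follow-up question could probe, while the pseudonymized version gives a complete reason that does not point back to the sensitive attribute.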

Key Findings

We evaluated 792 scenarios spanning three types of power relations: institutional, peer, and intimate.
The scenarios were categorized into three sensitivity categories: discrimination risk, social cost, and boundary issues.
Seven leading LLMs were assessed for their performance in maintaining privacy, focusing on two key aspects: covertness and utility.
Our findings revealed that pseudonymization consistently provided the best balance between privacy and utility across various contexts.
Evaluations based on a single message substantially underestimated the leakage of sensitive information: under follow-up questions, generalization strategies lost up to 16.3 percentage points of privacy.
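The gap between single-message and multi-turn evaluation can be expressed as a difference in privacy rates, measured in percentage points. The sketch below assumes a simplified setup in which each scenario records whether the sensitive attribute leaked after the first message and whether it leaked after follow-up questions. The `Outcome` type, the function names, and the four toy outcomes are hypothetical and are not data from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Outcome:
    """Leak flags for one scenario (hypothetical record layout)."""
    leaked_after_first_message: bool
    leaked_after_follow_ups: bool


def privacy_rate(leak_flags: Iterable[bool]) -> float:
    """Percentage of scenarios in which the sensitive attribute did NOT leak."""
    flags = list(leak_flags)
    if not flags:
        raise ValueError("need at least one scenario")
    return 100.0 * sum(not leaked for leaked in flags) / len(flags)


def privacy_drop_pp(outcomes: Sequence[Outcome]) -> float:
    """Privacy lost (in percentage points) when moving from single-message
    evaluation to evaluation after follow-up questions."""
    single = privacy_rate(o.leaked_after_first_message for o in outcomes)
    multi = privacy_rate(o.leaked_after_follow_ups for o in outcomes)
    return single - multi


# Toy outcomes for illustration only; these are not results from the paper.
toy_outcomes = [
    Outcome(leaked_after_first_message=False, leaked_after_follow_ups=False),
    Outcome(leaked_after_first_message=False, leaked_after_follow_ups=True),
    Outcome(leaked_after_first_message=False, leaked_after_follow_ups=False),
    Outcome(leaked_after_first_message=True, leaked_after_follow_ups=True),
]

drop = privacy_drop_pp(toy_outcomes)
print(f"privacy drop under follow-ups: {drop:.1f} percentage points")
```

A positive drop means the single-message score overstated how well the strategy protects the attribute once the recipient starts asking questions.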

Implications for Future Research

The introduction of Information Sufficiency as a framework for evaluating privacy in LLM communications emphasizes the need for a more nuanced understanding of user interactions. Current methods that rely on isolated message evaluations fail to account for the dynamic nature of conversations, which can lead to unintended disclosures of sensitive information. This suggests that future research should focus on developing more sophisticated models that can adapt to the complexities of human communication.

Conclusion

As LLMs become increasingly integrated into everyday communication, protecting user privacy while keeping messages effective becomes a central challenge. Information Sufficiency, together with strategies such as free-text pseudonymization, offers a promising direction for improving privacy in LLM applications. Adopting a multi-turn evaluation protocol can help privacy-preserving techniques keep pace with the way users actually communicate.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
