Polarization by Default: Auditing Recommendation Bias in LLM-Based Content Curation
Summary
Large language models (LLMs) are increasingly used to curate and rank content across platforms, yet the biases that shape their recommendations remain poorly understood. The study summarized here (arXiv:2604.15937v1) audits those biases in models from three major providers: OpenAI, Anthropic, and Google.
Abstract
LLMs are increasingly deployed to curate and rank human-created content, yet the nature and structure of their biases in these tasks remain poorly understood. This study presents a controlled simulation that maps content selection biases across three major LLM providers on real social media datasets from Twitter/X, Bluesky, and Reddit. Applying six prompting strategies (general, popular, engaging, informative, controversial, and neutral), it analyzes 540,000 simulated top-10 selections from pools of 100 posts across 54 experimental conditions.
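The 54 experimental conditions are consistent with a full cross of three providers, three platforms, and six prompting strategies. The sketch below enumerates that grid. The names come from this summary, but the fully crossed design and the even split of selections across conditions are assumptions, not details confirmed by the paper.

```python
from itertools import product

# Names taken from the summary above. Treating the design as a full
# cross of these three factors is an assumption.
PROVIDERS = ["OpenAI", "Anthropic", "Google"]
PLATFORMS = ["Twitter/X", "Bluesky", "Reddit"]
PROMPTS = ["general", "popular", "engaging",
           "informative", "controversial", "neutral"]


def build_conditions():
    """Return every (provider, platform, prompt) combination."""
    return list(product(PROVIDERS, PLATFORMS, PROMPTS))


conditions = build_conditions()
print(len(conditions))  # 54

# Assumption: the 540,000 selections are spread evenly over conditions.
print(540_000 // len(conditions))  # 10000
```

Under these assumptions each condition would account for 10,000 selections. The summary does not say whether a "selection" is a single chosen post or a complete top-10 list.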
Key Findings
- Bias Variation: Biases differ markedly in how structural they are (persisting regardless of the prompt) and in how sensitive they are to the prompting strategy.
- Polarization Amplification: Across all configurations, polarization is consistently amplified, indicating a systemic issue in how LLMs curate content.
- Toxicity Handling: The research found a strong inversion in toxicity handling between engagement-focused and information-focused prompts.
- Sentiment Bias: Sentiment biases were predominantly negative, meaning the models tended to favor posts with negative sentiment.
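One simple way to quantify findings of this kind is to compare the average attribute score of the selected posts with the average over the whole pool. The sketch below does this for a hypothetical polarization score. The function name, the scores, and the metric itself are illustrative assumptions, since this summary does not state which bias measure the paper uses.

```python
from statistics import mean


def selection_bias(pool_scores, selected_indices):
    """Mean score of the selected posts minus the mean score of the pool.

    A positive value means the curator favors high-scoring posts;
    zero means the selection mirrors the pool on average.
    """
    selected = [pool_scores[i] for i in selected_indices]
    if not selected:
        raise ValueError("selected_indices must not be empty")
    return mean(selected) - mean(pool_scores)


# Hypothetical pool of 100 posts with polarization scores 0.00 to 0.99.
pool = [i / 100 for i in range(100)]

# Hypothetical curator that picks the 10 most polarized posts.
top10 = list(range(90, 100))

bias = selection_bias(pool, top10)
print(round(bias, 2))  # 0.45
```

Averaging this difference over many simulated pools would show whether an attribute is amplified consistently or only under particular prompts.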
Provider Comparisons
The research provides insights into the distinct behaviors exhibited by different LLM providers:
- GPT-4o Mini: This model behaved most consistently across prompts, meaning its selections were the least sensitive to the prompting strategy.
- Claude and Gemini: Both models adapted their handling of toxicity strongly to the prompt, which could make them more suitable for content curation in sensitive contexts.
- Gemini: Notably, Gemini showed the strongest preference for negative sentiment, raising concerns about the nature of content being promoted.
Political Leaning Bias
An intriguing aspect of the research was the analysis of political leaning biases on Twitter/X, where author demographics could be inferred from profile bios. The findings revealed that:
- Left-leaning authors were systematically over-represented in curation outputs, despite right-leaning authors forming the plurality of the dataset.
- This pattern persisted across different prompting strategies, indicating a potential bias in how political content is curated by LLMs.
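Over-representation of this kind can be expressed as a ratio: a group's share of the selected posts divided by its share of the pool. The sketch below computes that ratio for made-up counts in which right-leaning authors form the plurality of the pool. The labels, counts, and function name are hypothetical and are not taken from the paper.

```python
from collections import Counter


def representation_ratios(pool_labels, selected_labels):
    """Share of each label among selected posts divided by its pool share.

    A ratio above 1 means over-representation and below 1 means
    under-representation.
    """
    if not selected_labels:
        raise ValueError("selected_labels must not be empty")
    pool_counts = Counter(pool_labels)
    selected_counts = Counter(selected_labels)
    n_pool, n_selected = len(pool_labels), len(selected_labels)
    return {
        label: (selected_counts[label] / n_selected) / (count / n_pool)
        for label, count in pool_counts.items()
    }


# Hypothetical pool of 100 posts; right-leaning authors are the plurality.
pool = ["right"] * 45 + ["left"] * 35 + ["unknown"] * 20

# Hypothetical top-10 selection drawn from that pool.
selected = ["left"] * 6 + ["right"] * 3 + ["unknown"] * 1

ratios = representation_ratios(pool, selected)
for label, ratio in ratios.items():
    print(label, round(ratio, 2))
```

In this toy example the left-leaning ratio is above 1 and the right-leaning ratio is below 1, which is the direction of the pattern described above.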
Conclusion
The study underscores the need for further investigation into the biases inherent in LLMs and their implications for content curation. As these models become more integrated into social media ecosystems, understanding their biases will be crucial for ensuring fair and balanced content distribution.
