How Source Labels Bias Trust in Humans and LLM Judges

Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge

Source: arXiv:2604.05593v1

Abstract: Large language models (LLMs) are increasingly used as automated evaluators (LLM-as-a-Judge). This work challenges its reliability by showing that trust judgments by LLMs are biased by disclosed source labels. Using a counterfactual design, we find that both humans and LLM judges assign higher trust to information labeled as human-authored than to the same content labeled as AI-generated. Eye-tracking data reveal that humans rely heavily on source labels as heuristic cues for judgments. We analyze LLM internal states during judgment. Across label conditions, models allocate denser attention to the label region than the content region, and this label dominance is stronger under Human labels than AI labels, consistent with the human gaze patterns. Besides, decision uncertainty measured by logits is higher under AI labels than Human labels. These results indicate that the source label is a salient heuristic cue for both humans and LLMs. It raises validity concerns for label-sensitive LLM-as-a-Judge evaluation, and we cautiously raise that aligning models with human preferences may propagate human heuristic reliance into models, motivating debiased evaluation and alignment.

Introduction

Large language models (LLMs) are now widely used as automated evaluators, an approach known as LLM-as-a-Judge. Their reliability in that role is under scrutiny. This study asks a specific question: does an LLM judge's trust in a piece of content change when only the disclosed source label changes?

Key Findings

  • Heuristic Reliance: Both humans and LLM judges rely on the disclosed source label as a heuristic cue when assessing trust.
  • Label Impact: Using a counterfactual design, the study found that content labeled as human-authored received higher trust than the same content labeled as AI-generated, from both humans and LLM judges. Because the content is identical across conditions, the difference is attributable to the label.
  • Eye-Tracking Insights: Eye-tracking data show that human judges rely heavily on source labels as heuristic cues when forming trust judgments.
  • LLM Attention Analysis: Analysis of LLM internal states shows that, across label conditions, models allocate denser attention to the label region than to the content region. This label dominance is stronger under Human labels than under AI labels, which is consistent with the human gaze patterns.
  • Decision Uncertainty: Decision uncertainty, measured from the models' logits, is higher under AI labels than under Human labels, further evidence that the source label shapes the trust judgment.
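
The two model-side measurements in the list above can be illustrated with a short Python sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption: "attention density" is taken to be mean attention weight per token in a region, "label dominance" the ratio of label density to content density, and "decision uncertainty" the entropy of the softmax over the answer logits. The function names and the toy numbers are hypothetical.

```python
import math

def attention_density(attn_weights, region):
    """Mean attention weight per token over a half-open token span [start, end).

    Assumed definition; the paper may normalize differently.
    """
    start, end = region
    return sum(attn_weights[start:end]) / (end - start)

def label_dominance(attn_weights, label_region, content_region):
    """Ratio of label-region density to content-region density.

    A value above 1 means the label tokens receive more attention per token.
    """
    return (attention_density(attn_weights, label_region)
            / attention_density(attn_weights, content_region))

def decision_entropy(logits):
    """Entropy (in nats) of the softmax distribution over answer logits.

    Assumed proxy for decision uncertainty: higher entropy, less certain.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)

# Toy attention row: 2 label tokens followed by 6 content tokens (made-up values).
attn = [0.2, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
dominance = label_dominance(attn, label_region=(0, 2), content_region=(2, 8))

# Toy logits for a two-way "trust" / "distrust" decision (made-up values).
confident = decision_entropy([3.0, 0.0])
uncertain = decision_entropy([0.5, 0.0])
```

Density is computed per token rather than as a raw sum because the label is usually much shorter than the content; a raw sum would favor the longer region regardless of where the model actually concentrates attention.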

Implications

The findings raise validity concerns for LLM-as-a-Judge evaluation in label-sensitive settings, where the judge can see who or what produced the content. Reliance on source labels as a heuristic cue is not limited to human judgment; it appears in LLM judgments as well.
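
One way to probe this concern for a given judge is a counterfactual label audit: score each passage twice, once under each label, and look at the paired difference. The sketch below is a minimal illustration, not the paper's protocol. The `judge(text, label)` callable, the `"Human"` and `"AI"` label strings, and the toy judge are all hypothetical stand-ins for a real LLM call.

```python
from statistics import mean

def label_gap(judge, contents):
    """Mean trust difference (Human-labeled minus AI-labeled) on identical content.

    `judge` is any callable (text, label) -> numeric trust score.
    A gap near zero suggests the judge is not sensitive to the label.
    """
    diffs = [judge(text, "Human") - judge(text, "AI") for text in contents]
    return mean(diffs)

def share_shifted(judge, contents, tol=0.0):
    """Fraction of passages whose score moves by more than `tol` when the label flips."""
    shifted = [abs(judge(t, "Human") - judge(t, "AI")) > tol for t in contents]
    return sum(shifted) / len(contents)

def toy_biased_judge(text, label):
    """Hypothetical judge: a content-dependent base score plus a bonus for Human labels."""
    base = 3.0 + (len(text) % 2)
    bonus = 0.5 if label == "Human" else 0.0
    return base + bonus

def toy_unbiased_judge(text, label):
    """Hypothetical judge that ignores the label entirely."""
    return 3.0 + (len(text) % 2)

contents = ["first passage", "second passage!"]
gap = label_gap(toy_biased_judge, contents)
shifted = share_shifted(toy_biased_judge, contents)
```

Swapping `toy_biased_judge` for a function that wraps an actual model call would turn this into a quick check before using that model as an evaluator on labeled content.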

The authors also cautiously suggest that aligning models with human preferences may propagate human heuristic reliance into the models themselves. This motivates debiasing in both evaluation and alignment.

Conclusion

The study underscores the importance of examining how source labels influence trust assessments by humans and LLMs. As LLM judges take on more automated evaluation, accounting for label effects is necessary if scores are to reflect the content rather than its perceived authorship.


Lazarus Omolua