Tracing Fake Citations to Neurons in Large Language Models

Where Fake Citations Are Made: Tracing Field-Level Hallucination to Specific Neurons in LLMs

Source: arXiv:2604.18880v1 (announce type: cross)

Abstract

Large Language Models (LLMs) frequently generate fictitious yet convincing citations, often expressing high confidence even when the underlying reference is incorrect. This phenomenon, commonly called “citation hallucination,” is a particular problem in academic writing, where credibility is paramount. In this article, we explore the nature and mechanics of citation hallucination in LLMs, focusing on the findings of our study, which spans nine models and 108,000 generated references.

Key Findings

  • Author names are the most frequently hallucinated element across all models and settings, significantly more so than other citation fields.
  • Contrary to expectations, citation style has no measurable effect on the frequency of hallucinated citations.
  • Reasoning-oriented distillation tends to degrade recall, which exacerbates hallucination.
  • Probes trained on one citation field perform near chance when applied to other fields, indicating that hallucination signals do not transfer across fields.
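
The probe-transfer finding can be illustrated with a toy experiment. The sketch below is not the study's code: it uses synthetic activations, and it assumes (purely for illustration) that the hallucination signal for each field lies along its own orthogonal direction. The field names, dimensions, and the helpers `make_field_data`, `fit_logistic_probe`, and `transfer_matrix` are hypothetical.

```python
import numpy as np

def make_field_data(rng, direction, n=2000, noise=1.0):
    """Synthetic activations: label 1 (hallucinated) shifts along `direction`."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(0.0, noise, size=(n, direction.size))
    X += np.outer(2 * y - 1, direction)  # shift by +direction or -direction
    return X, y

def fit_logistic_probe(X, y, lr=0.1, steps=500):
    """Linear probe trained with plain gradient descent on the logistic loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def accuracy(w, b, X, y):
    """Fraction of examples where the probe's sign matches the label."""
    return float((((X @ w + b) > 0).astype(int) == y).mean())

def transfer_matrix(seed=0, dim=32):
    """Train a probe per field, then score it on every field."""
    rng = np.random.default_rng(seed)
    # Toy assumption: each field's signal lives on its own orthogonal axis.
    directions = {
        "author": np.eye(dim)[0] * 1.5,
        "title": np.eye(dim)[1] * 1.5,
    }
    train = {f: make_field_data(rng, d) for f, d in directions.items()}
    test = {f: make_field_data(rng, d) for f, d in directions.items()}
    results = {}
    for src in directions:
        w, b = fit_logistic_probe(*train[src])
        for tgt in directions:
            results[(src, tgt)] = accuracy(w, b, *test[tgt])
    return results

if __name__ == "__main__":
    for (src, tgt), acc in transfer_matrix().items():
        print(f"train on {src:6s} -> test on {tgt:6s}: accuracy {acc:.3f}")
```

Under this assumption, each probe scores well on its own field and close to 50% on the other, which is the pattern the finding describes. Real activations need not be this cleanly separated.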

Methodology

Building on our findings regarding field-specific hallucination, we applied elastic-net regularization with stability selection to neuron-level CETT values from the Qwen2.5-32B-Instruct model. This analysis identified a sparse set of field-specific hallucination neurons, termed “FH-neurons.”
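
To make the selection procedure concrete, here is a minimal sketch of elastic-net logistic regression combined with stability selection. It is an illustration under stated assumptions, not the study's pipeline: random Gaussian features stand in for neuron-level CETT values, and the penalty strength, L1 ratio, number of subsampling rounds, and selection threshold are all arbitrary choices made for the example. The solver is a simple proximal-gradient loop written in NumPy.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of the L1 norm."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_elastic_net_logistic(X, y, lam=0.1, l1_ratio=0.7, lr=0.1, steps=300):
    """Proximal gradient descent on logistic loss plus an elastic-net penalty."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y
        # Smooth part: logistic loss gradient plus the L2 (ridge) term.
        grad_w = X.T @ g / n + lam * (1.0 - l1_ratio) * w
        # Non-smooth part: L1 term handled by soft-thresholding.
        w = soft_threshold(w - lr * grad_w, lr * lam * l1_ratio)
        b -= lr * g.mean()
    return w

def stability_selection(X, y, n_rounds=50, frac=0.5, threshold=0.8, seed=0):
    """Refit on random subsamples; keep features selected in most rounds."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    counts = np.zeros(d)
    for _ in range(n_rounds):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        w = fit_elastic_net_logistic(X[idx], y[idx])
        counts += (w != 0)
    freq = counts / n_rounds
    return np.flatnonzero(freq >= threshold), freq

def demo(seed=0, n=600, d=200, informative=(3, 57, 120)):
    """Synthetic stand-in: only the `informative` columns drive the label."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    logits = 2.0 * X[:, list(informative)].sum(axis=1)
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)
    return stability_selection(X, y)

if __name__ == "__main__":
    selected, freq = demo()
    print("stably selected feature indices:", selected.tolist())
```

The L1 term drives most weights to exactly zero, while refitting on subsamples filters out features that are picked up only by chance in one particular draw. Together these yield a sparse, reproducible feature set.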

Causal Interventions

To validate the role of these FH-neurons, we conducted causal interventions. Amplifying the activity of these neurons increased hallucination rates, while suppressing their activity improved performance across citation fields. The gains from suppression were more pronounced in some fields than in others, suggesting that the impact of these neurons depends on the citation field.
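
The intervention itself amounts to rescaling the activations of a chosen set of neurons during the forward pass. The sketch below shows that mechanic on a toy ReLU network in NumPy. The network, the neuron indices, and the scalar "hallucination logit" output are hypothetical stand-ins, and the output weights of the chosen neurons are set by hand so that they push the logit upward.

```python
import numpy as np

def mlp_forward(x, W_in, W_out, neuron_ids=(), alpha=1.0):
    """ReLU MLP whose hidden activations at `neuron_ids` are scaled by `alpha`.

    alpha = 0 suppresses the neurons, alpha = 1 leaves them untouched,
    and alpha > 1 amplifies them.
    """
    h = np.maximum(x @ W_in, 0.0)
    scale = np.ones(h.shape[-1])
    scale[list(neuron_ids)] = alpha
    return (h * scale) @ W_out

def demo(seed=0, d_in=16, d_hidden=64, fh_neurons=(5, 17, 40)):
    """Mean toy 'hallucination logit' under suppression, baseline, amplification."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(d_in, d_hidden))
    W_out = rng.normal(scale=0.1, size=d_hidden)
    # Toy assumption: the chosen neurons push the hallucination logit up.
    W_out[list(fh_neurons)] = 1.0
    x = rng.normal(size=(256, d_in))
    return {
        alpha: float(mlp_forward(x, W_in, W_out, fh_neurons, alpha).mean())
        for alpha in (0.0, 1.0, 2.0)
    }

if __name__ == "__main__":
    for alpha, logit in demo().items():
        print(f"alpha={alpha:.1f}: mean hallucination logit {logit:.3f}")
```

In a real transformer the same scaling would typically be applied inside the MLP layers, for example with a forward hook such as PyTorch's `register_forward_hook`. Its effect would then be measured on citation accuracy per field rather than on a toy logit.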

Implications for Future Research

The results of our study have implications for the development of LLMs and their use in academic and professional writing. Identifying FH-neurons and mitigating their influence offers a lightweight approach to detecting and reducing citation hallucination. Because this strategy relies on internal model signals, it could lead to more reliable and accurate citations in generated text.

Conclusion

As LLMs continue to evolve and integrate into various domains, understanding the mechanics of citation hallucination is crucial. Our research highlights the importance of investigating specific neurons within these models to improve their performance and reliability. By focusing on field-specific hallucination neurons, we open new avenues for enhancing the integrity of information generated by LLMs, ultimately fostering trust and credibility in automated writing systems.


Lazarus Omolua (https://richlyai.com/blog)