Spectral Dynamics of Hallucination in Whisper ASR Models

From Dispersion to Attraction: Spectral Dynamics of Hallucination Across Whisper Model Scales

Source: arXiv:2604.08591v1 (cross-listed)

Abstract

Hallucinations in large Automatic Speech Recognition (ASR) models present a critical safety risk. In this work, we propose the Spectral Sensitivity Theorem, which predicts a phase transition in deep networks from a dispersive regime (signal decay) to an attractor regime (rank-1 collapse) governed by layer-wise gain and alignment. We validate this theory by analyzing the eigenspectra of activation graphs in Whisper models (Tiny to Large-v3-Turbo) under adversarial stress. Our results confirm the theoretical prediction: intermediate models exhibit Structural Disintegration (Regime I), characterized by a 13.4% collapse in Cross-Attention rank. Conversely, large models enter a Compression-Seeking Attractor state (Regime II), where Self-Attention actively compresses rank (-2.34%) and hardens the spectral slope, decoupling the model from acoustic evidence.

Introduction

Large-scale ASR models have substantially improved speech recognition. They can still hallucinate, however: they emit text that has no support in the input audio. This poses a significant safety risk, so it is important to understand the mechanisms that produce it.

The Spectral Sensitivity Theorem

Our research introduces the Spectral Sensitivity Theorem, which predicts a phase transition in deep networks, governed by layer-wise gain and alignment, between two regimes:

  • Regime I: Structural Disintegration. This is the dispersive regime (signal decay). It appears in intermediate-sized models and is characterized by a 13.4% collapse in Cross-Attention rank.
  • Regime II: Compression-Seeking Attractor. This is the attractor regime (rank-1 collapse). It appears in large models, where Self-Attention actively compresses rank (-2.34%) and hardens the spectral slope, decoupling the model from acoustic evidence.
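The two regimes can be illustrated with a toy stack of identical linear layers. This is a minimal sketch and not the theorem itself, whose formal statement is not reproduced in this summary. The helper names (`make_layer`, `propagate`, `top_share`), the specific gains, and the use of a symmetric layer with one privileged direction are all assumptions made for illustration. With uniform gain below 1 the signal norm decays (dispersion); with gain above 1 along a single aligned direction, distinct inputs collapse onto that direction and the batch approaches rank 1 (attraction).

```python
import numpy as np

def make_layer(Q, top_gain, rest_gain):
    """Symmetric layer with gain `top_gain` along Q[:, 0] and `rest_gain` elsewhere."""
    gains = np.full(Q.shape[0], rest_gain)
    gains[0] = top_gain
    return Q @ np.diag(gains) @ Q.T

def propagate(W, X, depth):
    """Apply the same linear layer `depth` times to every column of X."""
    H = X.copy()
    for _ in range(depth):
        H = W @ H
    return H

def top_share(H):
    """Share of squared singular-value mass held by the largest singular value."""
    s = np.linalg.svd(H, compute_uv=False)
    return float(s[0] ** 2 / np.sum(s ** 2))

rng = np.random.default_rng(0)
d, n = 16, 8                                      # hidden width, number of inputs (toy sizes)
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthonormal basis
X = rng.standard_normal((d, n))                   # columns are distinct inputs

# Dispersive toy: every direction has gain 0.5, so the signal norm decays.
H_disp = propagate(make_layer(Q, 0.5, 0.5), X, depth=20)
norm_ratio = np.linalg.norm(H_disp) / np.linalg.norm(X)

# Attractor toy: one aligned direction has gain above 1, the rest below 1.
H_attr = propagate(make_layer(Q, 1.1, 0.9), X, depth=60)

print(f"dispersive norm ratio: {norm_ratio:.2e}")
print(f"top share, input:      {top_share(X):.3f}")
print(f"top share, attractor:  {top_share(H_attr):.6f}")
```

In the attractor case the output is almost entirely determined by the privileged direction rather than by the input, which is the toy analogue of a model decoupling from its evidence.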

Methodology

To validate the theory, we analyzed the eigenspectra of activation graphs in Whisper models ranging from Tiny to Large-v3-Turbo. The models were evaluated under adversarial stress, a condition chosen to expose vulnerabilities.
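The kind of measurement described above can be sketched as follows. This summary does not give the paper's exact construction, so everything here is an assumption: the activation graph is taken to be a cosine-similarity matrix between token activations, rank is measured as an entropy-based effective rank, and the spectral slope is a least-squares fit of log eigenvalue against log index. The function names are hypothetical, and random matrices stand in for real Whisper activations.

```python
import numpy as np

def activation_graph(H):
    """Cosine-similarity graph between the rows (tokens) of an activation matrix H."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    Hn = H / np.clip(norms, 1e-12, None)
    return Hn @ Hn.T

def eigenspectrum(G):
    """Eigenvalues of a symmetric matrix, sorted descending and clipped at zero."""
    vals = np.linalg.eigvalsh(G)[::-1]
    return np.clip(vals, 0.0, None)

def effective_rank(vals, eps=1e-12):
    """Exponential of the Shannon entropy of the normalized eigenvalues."""
    p = vals / max(vals.sum(), eps)
    p = p[p > eps]
    return float(np.exp(-np.sum(p * np.log(p))))

def spectral_slope(vals):
    """Least-squares slope of log(eigenvalue) against log(index)."""
    keep = vals > 1e-10 * vals[0]          # ignore numerically zero eigenvalues
    idx = np.arange(1, len(vals) + 1)[keep]
    slope, _ = np.polyfit(np.log(idx), np.log(vals[keep]), 1)
    return float(slope)

rng = np.random.default_rng(0)
tokens, width = 32, 64                     # toy sizes, not Whisper dimensions

# Stand-in for healthy activations: every token points in its own direction.
H_full = rng.standard_normal((tokens, width))
# Stand-in for collapsed activations: every token is nearly the same vector.
H_collapsed = rng.standard_normal(width) + 0.01 * rng.standard_normal((tokens, width))

rank_full = effective_rank(eigenspectrum(activation_graph(H_full)))
rank_collapsed = effective_rank(eigenspectrum(activation_graph(H_collapsed)))
print(f"effective rank, full:      {rank_full:.2f}")
print(f"effective rank, collapsed: {rank_collapsed:.2f}")
```

With a real model, `H` would be replaced by per-layer activations or attention outputs captured during decoding, and the two metrics would be compared between clean and stressed inputs.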

Results and Discussion

Our findings confirm the prediction of the Spectral Sensitivity Theorem. Intermediate-sized models exhibit Structural Disintegration (Regime I): Cross-Attention rank collapses by 13.4%. Large models instead enter the Compression-Seeking Attractor state (Regime II): Self-Attention actively compresses rank (-2.34%) and hardens the spectral slope, which decouples the model from acoustic evidence. The failure mode therefore changes with scale, from loss of structure in intermediate models to collapse onto an attractor in large ones.
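The reported percentages can be read as relative changes in rank between a baseline and a stressed condition. That reading is an assumption, since this summary does not state how the figures were computed. The function name and the rank values below are hypothetical; the values were picked only so that the arithmetic reproduces the reported -13.4% and -2.34%.

```python
def relative_change_pct(baseline, stressed):
    """Relative change from baseline to stressed, in percent (negative means a drop)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (stressed - baseline) / baseline

# Hypothetical rank values, chosen only to reproduce the reported percentages.
cross_attn_change = relative_change_pct(baseline=50.0, stressed=43.3)
self_attn_change = relative_change_pct(baseline=100.0, stressed=97.66)

print(f"Cross-Attention rank change: {cross_attn_change:.1f}%")
print(f"Self-Attention rank change:  {self_attn_change:.2f}%")
```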

Conclusion

Beyond the theory, these results offer a framework for designing safer ASR systems. By understanding the spectral dynamics involved, developers can better mitigate the risk of hallucination in large models and improve the reliability of speech recognition technologies.


Lazarus Omolua
https://richlyai.com/blog
My mission is to make sure that people in Africa are not left behind in the global AI revolution. RichlyAI exists to give everyone — students, founders, creators, and businesses — the tools to compete globally.
