Entropy and Attention Dynamics in Small Language Models: A Trace-Level Structural Analysis on the TruthfulQA Benchmark
arXiv:2604.03589v1
Abstract: Small language models (SLMs) are increasingly deployed on edge devices and in other resource-constrained settings. However, these models make confident mispredictions and produce unstable output, which makes them risky for factual and decision-critical tasks. Current evaluation methodology relies on final accuracy or hallucination rates without explaining how internal model behavior affects outputs. In particular, how entropy evolves during decoding, how attention is distributed across layers, and how hidden representations contribute to uncertainty, logical inconsistencies, and misinformation propagation are often overlooked. This study therefore introduces a trace-level analysis of entropy and attention dynamics in SLMs evaluated on the TruthfulQA dataset. Four models ranging from 1B to 1.7B parameters were examined via token-level output entropy, attention entropy, head dispersion, and hidden-state representations. The results reveal three model classes distinguished by their entropy patterns: deterministic models (DeepSeek-1.5B and LLaMA-1B), whose output entropy decreases over time; exploratory models (Gemma-1B), whose entropy increases; and balanced models (Qwen-1.7B), whose entropy remains moderate and stable. Each group also shows distinct hidden-state movement and attention dispersion patterns. The analysis demonstrates that truthfulness in SLMs emerges from structured entropy and attention dynamics. Monitoring and optimizing these internal uncertainty patterns can guide the design of more reliable, hallucination-aware, and application-specific edge SLMs.
Introduction
In recent years, small language models (SLMs) have gained traction because they are lightweight enough to deploy on edge devices. However, their tendency to make confident yet incorrect predictions is a problem in scenarios that demand high reliability and accuracy. This article summarizes a study that investigates the internal mechanics of SLMs, focusing on entropy and attention dynamics.
Key Findings
The study ran a trace-level structural analysis of four SLMs on the TruthfulQA benchmark, tracking token-level output entropy, attention entropy, head dispersion, and hidden-state representations. The models analyzed, with the class each was assigned, are:
- DeepSeek-1.5B (Deterministic)
- LLaMA-1B (Deterministic)
- Gemma-1B (Exploratory)
- Qwen-1.7B (Balanced)
The results group the models into three classes based on their entropy patterns:
- Deterministic Models: Both DeepSeek-1.5B and LLaMA-1B exhibit a decrease in output entropy over time, suggesting a more stable output as decoding progresses.
- Exploratory Models: Gemma-1B shows an increasing trend in output entropy, reflecting a more uncertain and exploratory approach to language generation.
- Balanced Models: Qwen-1.7B maintains a moderate and stable entropy, striking a balance between determinism and exploration.
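The three entropy patterns above can be illustrated with a minimal sketch. It computes the Shannon entropy of the next-token distribution at each decoding step and labels the trend from the slope of a straight-line fit. The function names, the linear fit, and the `slope_tol` threshold are assumptions made for illustration; the source does not state how the study assigned models to classes.

```python
import numpy as np

def token_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution at each decoding step.

    logits: array of shape (steps, vocab_size) holding raw next-token scores.
    """
    logits = np.asarray(logits, dtype=np.float64)
    # Subtract the per-step maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    probs = np.exp(log_probs)
    return -(probs * log_probs).sum(axis=-1)

def classify_trend(entropies, slope_tol=0.01):
    """Label an entropy trace by the slope of a least-squares line.

    slope_tol is a hypothetical threshold (nats per step), not a value from the study.
    """
    entropies = np.asarray(entropies, dtype=np.float64)
    steps = np.arange(len(entropies))
    slope = np.polyfit(steps, entropies, deg=1)[0]
    if slope < -slope_tol:
        return "deterministic"  # entropy falls as decoding progresses
    if slope > slope_tol:
        return "exploratory"    # entropy rises as decoding progresses
    return "balanced"           # entropy stays roughly flat
```

In practice the `logits` array would come from the per-step scores a model returns during generation, and a single trace is noisy, so averaging slopes over many TruthfulQA prompts would likely be needed before labeling a model.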
Implications for Model Design
The findings from this analysis emphasize the importance of understanding internal uncertainty patterns within SLMs. By monitoring and optimizing entropy and attention dynamics, developers can enhance the reliability of these models, particularly in applications where factual accuracy is paramount.
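As a rough illustration of what monitoring attention dynamics could involve, the sketch below computes an entropy per attention head and a simple dispersion score across heads. Both definitions are assumptions: attention entropy is taken as the Shannon entropy of each query's attention row, averaged over queries, and head dispersion is taken as the standard deviation of those per-head values. The study may define these quantities differently.

```python
import numpy as np

def attention_entropy(attn, eps=1e-12):
    """Mean attention entropy (in nats) for each head of one layer.

    attn: array of shape (heads, query_len, key_len) whose rows sum to 1.
    eps guards against log(0) for masked or zero-weight positions.
    """
    attn = np.asarray(attn, dtype=np.float64)
    row_entropy = -(attn * np.log(attn + eps)).sum(axis=-1)  # (heads, query_len)
    return row_entropy.mean(axis=-1)                          # (heads,)

def head_dispersion(attn):
    """Spread of per-head entropies within a layer (assumed definition)."""
    return float(attention_entropy(attn).std())
```

A head that spreads its weight evenly over the keys scores high and a head that locks onto one token scores near zero, so a large dispersion indicates that the heads in a layer behave very differently from one another.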
The study is presented as a step toward designing more reliable, hallucination-aware, and application-specific SLMs for edge deployment. Its central claim is that truthfulness in SLMs emerges from structured entropy and attention dynamics, which suggests that these internal signals, and not only final accuracy or hallucination rates, are worth tracking during evaluation.
