Geometric Evidence of Agent Identity in LLM Activation Space

Identity as Attractor: Geometric Evidence for Persistent Agent Architecture in LLM Activation Space

Source: arXiv:2604.12016v1 (announce type: new)

Abstract: Large language models map semantically related prompts to similar internal representations — a phenomenon interpretable as attractor-like dynamics. We ask whether the identity document of a persistent cognitive agent (its cognitive_core) exhibits analogous attractor-like behavior. We present a controlled experiment on Llama 3.1 8B Instruct, comparing hidden states of an original cognitive_core (Condition A), seven paraphrases (Condition B), and seven structurally matched controls (Condition C). Mean-pooled states at layers 8, 16, and 24 show that paraphrases converge to a tighter cluster than controls (Cohen’s d > 1.88, p < 10^{-27}, Bonferroni-corrected). Replication on Gemma 2 9B confirms cross-architecture generalizability. Ablations suggest the effect is primarily semantic rather than structural, and that structural completeness appears necessary to reach the attractor region. An exploratory experiment shows that reading a scientific description of the agent shifts internal state toward the attractor -- closer than a sham preprint -- distinguishing knowing about an identity from operating as that identity. These results provide representational evidence that agent identity documents induce attractor-like geometry in LLM activation space.

Introduction

How large language models (LLMs) represent identity internally has drawn growing attention, in particular how a model's hidden states respond to different prompts. This article summarizes a recent paper that treats the identity document of a persistent agent as an attractor in the model's activation space, meaning a region toward which semantically equivalent versions of that document converge.

Research Overview

The researchers ran a controlled experiment on Llama 3.1 8B Instruct, comparing the hidden states produced under three conditions:

Condition A: Original cognitive_core.
Condition B: Seven paraphrases of the cognitive_core.
Condition C: Seven structurally matched controls.

The aim was to test whether the identity document of a persistent cognitive agent (its cognitive_core) exhibits attractor-like behavior, analogous to the way LLMs map semantically related prompts to similar internal representations. Using mean-pooled hidden states at layers 8, 16, and 24, the authors report that paraphrases converge to a tighter cluster than controls, with Cohen’s d > 1.88 and p < 10^{-27} after Bonferroni correction.
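The comparison above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic vectors, not the authors' pipeline: the cosine distance metric, the choice to measure each document's distance to the Condition A vector, and all function names are assumptions, since the summary does not specify them.

```python
import numpy as np

def mean_pool(hidden, mask):
    """Average token vectors over non-padding positions.
    hidden: (tokens, dim) array; mask: (tokens,) array of 0/1."""
    weights = np.asarray(mask, dtype=float)[:, None]
    return (np.asarray(hidden, dtype=float) * weights).sum(axis=0) / weights.sum()

def cosine_distance(u, v):
    """1 - cosine similarity (assumed metric, not confirmed by the source)."""
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def cohens_d(x, y):
    """Cohen's d with a pooled standard deviation; positive when mean(x) > mean(y)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled = np.sqrt(((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2))
    return (x.mean() - y.mean()) / pooled

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Synthetic stand-ins for one layer's mean-pooled states (not real model output).
rng = np.random.default_rng(0)
dim = 64
original = rng.normal(size=dim)                             # Condition A
paraphrases = original + 0.1 * rng.normal(size=(7, dim))    # Condition B
controls = rng.normal(size=(7, dim))                        # Condition C

dist_b = [cosine_distance(original, v) for v in paraphrases]
dist_c = [cosine_distance(original, v) for v in controls]

# A large positive d means controls sit farther from the original than paraphrases.
effect = cohens_d(dist_c, dist_b)
print(f"Cohen's d (controls vs. paraphrases): {effect:.2f}")
```

In a real replication, the synthetic arrays would be replaced by hidden states extracted from the model at each of the three layers, and the per-layer p-values would be passed through `bonferroni` before being reported.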

Key Findings

The paper reports three further results:

A replication on Gemma 2 9B reproduced the effect, which suggests the behavior is not specific to one architecture.
Ablations suggest the effect is primarily semantic rather than structural: what the document says matters more than how it is laid out.
Structural completeness nonetheless appears necessary to reach the attractor region, so an incomplete identity document may not land in the same part of activation space.

Exploratory Experiment

In an exploratory experiment, the authors had the model read a scientific description of the agent and found that its internal state shifted toward the attractor, ending closer to it than after reading a sham preprint. The shift separates knowing about an identity from operating as that identity. Because the experiment is exploratory, the result is best read as suggestive rather than confirmatory.
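One way to picture the "shift toward the attractor" is as a drop in distance to the centroid of the identity cluster. The sketch below uses synthetic vectors and assumes cosine distance and a centroid-based definition of the attractor region; the summary confirms neither choice, and the names `centroid` and `shift_toward` are hypothetical.

```python
import numpy as np

def centroid(vectors):
    """Mean vector of a set of pooled hidden states."""
    return np.asarray(vectors, dtype=float).mean(axis=0)

def cosine_distance(u, v):
    """1 - cosine similarity (assumed metric)."""
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def shift_toward(center, before, after):
    """Reduction in distance to `center`; positive when `after` is closer than `before`."""
    return cosine_distance(before, center) - cosine_distance(after, center)

# Synthetic stand-ins, not real model activations.
rng = np.random.default_rng(1)
dim = 32
core = rng.normal(size=dim)
identity_states = core + 0.1 * rng.normal(size=(8, dim))   # original plus paraphrases
attractor = centroid(identity_states)

baseline = rng.normal(size=dim)                             # state before reading anything
after_description = 0.5 * baseline + 0.5 * core             # pulled toward the identity
after_sham = baseline + 0.1 * rng.normal(size=dim)          # small unrelated perturbation

shift_description = shift_toward(attractor, baseline, after_description)
shift_sham = shift_toward(attractor, baseline, after_sham)
print(f"description: {shift_description:.3f}, sham: {shift_sham:.3f}")
```

Comparing the description against a sham preprint, rather than against no text at all, controls for the generic effect of reading any scientific-style document.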

Conclusion

The paper provides representational evidence that agent identity documents induce attractor-like geometry in LLM activation space. The effect holds across two model families, survives Bonferroni correction, and appears to be driven mainly by semantics. The exploratory result on knowing about an identity versus operating as it is a natural target for follow-up work.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
