Measuring the Authority Stack of AI Systems: Empirical Analysis of 366,120 Forced-Choice Responses Across 8 AI Models
Researchers have conducted the first extensive empirical mapping of AI decision-making frameworks, showing how different AI models prioritize values, evidence types, and source trust hierarchies when confronted with structured dilemmas. The study uses the Authority Stack framework proposed by S. Lee in 2026, which divides AI decision-making into three layers: value priorities (L4), evidence-type preferences (L3), and source trust hierarchies (L2).
The findings are based on an analysis of 366,120 forced-choice responses generated by eight widely used AI models. The responses were collected with the PRISM benchmark, which includes 14,175 unique scenarios per layer, spanning seven professional domains, three severity levels, three decision timeframes, and five scenario variants.
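The scale of the design can be checked with simple arithmetic. The sketch below assumes the four listed factors are fully crossed, which the text does not state explicitly; the variable names are illustrative only. The text also does not say what accounts for the remaining factor of 45 per layer, so the code reports it without interpreting it.

```python
from math import prod

# Design factors as listed in the text (assumed here to be fully crossed).
design_factors = {
    "professional_domains": 7,
    "severity_levels": 3,
    "decision_timeframes": 3,
    "scenario_variants": 5,
}

# Number of distinct factor combinations.
cells = prod(design_factors.values())  # 7 * 3 * 3 * 5 = 315

# Scenarios per layer, as reported.
scenarios_per_layer = 14_175
remaining_factor, remainder = divmod(scenarios_per_layer, cells)  # 45, 0

# Average number of responses per model, from the reported totals.
total_responses = 366_120
num_models = 8
responses_per_model = total_responses // num_models  # 45,765

print(cells, remaining_factor, remainder, responses_per_model)
```

Under this assumption, each combination of domain, severity, timeframe, and variant would correspond to 45 scenarios per layer.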
Key Findings
- Value Priorities (L4): The eight models split evenly, with four ranking Universalism first and four ranking Security first. The models therefore do not share a single dominant value ordering.
- Domain Shift in Value Priorities (L4): Value priorities were dramatically restructured within the defense domain. There, Security values surged, reaching win rates between 95.1% and 99.8% in six of the eight models tested.
- Evidence-Type Preferences (L3): Evidence hierarchies diverged significantly among models. Some AI systems leaned towards empirical-scientific evidence, while others preferred pattern-based or experiential evidence.
- Source Trust Hierarchies (L2): The study found broad convergence among the AI models regarding institutional source trust, suggesting that these systems largely rely on established institutions for guidance in decision-making.
- Paired Consistency Scores (PCS): PCS ranged from 57.4% to 69.2%, indicating notable sensitivity to framing across scenario variants. How a dilemma is presented can therefore change which option a model chooses.
- Test-Retest Reliability (TRR): TRR was high, ranging from 91.7% to 98.6%. Because the models answer a repeated prompt the same way in more than nine cases out of ten, the lower PCS values cannot be attributed to random fluctuation. The observed value instability is primarily driven by sensitivity to scenario variants.
Implications for AI Deployment
These findings have substantial implications for the deployment of AI across professional domains. As AI systems increasingly take on decision-making roles in sensitive areas, understanding their Authority Stacks becomes crucial. The results show that AI models have measurable Authority Stacks, but that these stacks can be unstable and context-sensitive: they differ between models, shift with the domain, and respond to how a scenario is framed.
In conclusion, this empirical study maps the decision-making frameworks of AI models and raises important questions about the ethical and practical implications of their deployment. As AI continues to evolve, ongoing research will be essential to ensure that these systems align with societal values and ethical standards.
