Empirical Analysis of AI Authority Stacks Across 8 Models

Measuring the Authority Stack of AI Systems: Empirical Analysis of 366,120 Forced-Choice Responses Across 8 AI Models

Researchers have conducted the first extensive empirical mapping of AI decision-making frameworks, examining how different AI models prioritize values, evidence types, and sources when confronted with structured dilemmas. The study uses the Authority Stack framework proposed by S. Lee in 2026, which divides AI decision-making into three layers: value priorities (L4), evidence-type preferences (L3), and source trust hierarchies (L2).

The findings draw on 366,120 forced-choice responses from eight widely used AI models. The responses were collected with the PRISM benchmark, which includes 14,175 unique scenarios per layer, spanning seven professional domains, three severity levels, three decision timeframes, and five scenario variants.
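The benchmark figures above can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this article. The even split of scenarios across design cells is an assumption, and the article does not say how the total response count breaks down, so the last figure is simply the part of the total that a single pass over every scenario would not cover.

```python
from math import prod

# Figures quoted in the article
factors = {"domains": 7, "severity_levels": 3, "timeframes": 3, "variants": 5}
scenarios_per_layer = 14_175
layers = 3
models = 8
total_responses = 366_120

# Number of distinct combinations of the four design factors
cells = prod(factors.values())

# ASSUMPTION: scenarios are spread evenly across the cells
per_cell, remainder = divmod(scenarios_per_layer, cells)

# Responses needed for one pass of every model over every scenario
single_pass = scenarios_per_layer * layers * models

# Part of the reported total not covered by a single pass
unaccounted = total_responses - single_pass

print(cells, per_cell, remainder)   # 315 45 0
print(single_pass, unaccounted)     # 340200 25920
```

Under the even-split assumption, each of the 315 factor combinations would hold 45 scenarios per layer. A single pass accounts for 340,200 responses, and the article does not break down the remaining 25,920.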

Key Findings

  • Value Priorities (L4): The models split evenly, 4:4, between Universalism-first and Security-first, so no single top value dominates across the eight systems.
  • Defense-Domain Value Restructuring (L4): Value priorities shifted sharply within the defense domain. There, Security values surged to win rates between 95.1% and 99.8% in six of the eight models tested.
  • Evidence-Type Preferences (L3): Evidence hierarchies diverged across models. Some AI systems favored empirical-scientific evidence, while others preferred pattern-based or experiential evidence.
  • Source Trust Hierarchies (L2): The models broadly converged on institutional source trust, suggesting that they largely favor established institutions as sources.
  • Paired Consistency Scores (PCS): PCS ranged from 57.4% to 69.2%, indicating substantial sensitivity to framing across scenario variants: how a scenario is presented can change which option a model chooses.
  • Test-Retest Reliability (TRR): TRR was high, ranging from 91.7% to 98.6%. Because the models answer consistently when the same prompt is repeated, the observed value instability stems mainly from sensitivity to scenario variants rather than from random fluctuation.

Implications for AI Deployment

These results bear directly on how AI is deployed across professional domains. As AI systems increasingly take on decision-making roles in sensitive areas, understanding their Authority Stacks becomes crucial. The findings show that AI models have measurable Authority Stacks, but that these stacks can be unstable and context-sensitive.

In conclusion, the study maps the decision-making frameworks of AI models and raises questions about the ethical and practical implications of deploying them. As AI continues to evolve, further research will be needed to check that these systems align with societal values and ethical standards.


Lazarus Omolua
https://richlyai.com/blog
