Automated Analysis of Global AI Safety Initiatives: A Taxonomy-Driven LLM Approach
arXiv:2604.03533v1
Abstract
We present an automated crosswalk framework that compares an AI safety policy document pair under a shared taxonomy of activities.
Using the activity categories defined in Activity Map on AI Safety as fixed aspects, the system extracts and maps relevant activities,
then produces for each aspect a short summary for each document, a brief comparison, and a similarity score.
We assess the stability and validity of LLM-based crosswalk analysis across public policy documents.
Using five large language models, we perform crosswalks on ten publicly available documents and visualize mean similarity scores with a heatmap.
The results show that model choice substantially affects the crosswalk outcomes, and that some document pairs yield high disagreements across models.
A human evaluation by three experts on two document pairs shows high inter-annotator agreement,
while model scores still differ from human judgments. These findings support comparative inspection of policy documents.
Introduction
The growing complexity of artificial intelligence (AI) technologies necessitates robust safety policies to mitigate potential risks.
Recent initiatives have sought to standardize safety protocols across various jurisdictions. However, comparative analysis of these policies
remains challenging due to the diverse frameworks and terminologies employed.
This article describes an approach that uses large language models (LLMs) to automate the comparison of AI safety initiatives under a shared taxonomy.
Methodology
Our study employs a crosswalk framework that leverages a fixed taxonomy derived from the Activity Map on AI Safety.
The methodology involves several key steps:
- Document Selection: Ten publicly available AI safety policy documents were selected for analysis.
- Activity Mapping: Relevant activities from each document were extracted and categorized according to the predefined taxonomy.
- Similarity Scoring: For each category, each of five LLMs generated a short summary per document, a brief comparison, and a similarity score.
- Human Evaluation: Three experts evaluated two document pairs to assess the accuracy of the model outputs.
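The steps above can be sketched as a small data model plus an aggregation step. The following is a minimal Python sketch, not the authors' implementation: the AspectResult record, the mean_similarity_matrix helper, the document, model, and aspect labels, and the 0-to-1 score scale are all assumptions made for illustration, since the source does not specify them. It shows how per-model, per-aspect scores could be averaged into the document-by-document matrix that a heatmap would display.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class AspectResult:
    """One crosswalk cell: one model's output for one aspect of one document pair.

    Hypothetical record; field names are illustrative, not from the paper.
    """
    model: str
    doc_a: str
    doc_b: str
    aspect: str
    summary_a: str
    summary_b: str
    comparison: str
    similarity: float  # assumed scale: 0.0 (unrelated) to 1.0 (equivalent)


def mean_similarity_matrix(results, docs):
    """Average similarity over all models and aspects for each document pair.

    Returns a square list-of-lists indexed like `docs`. Cells with no scores,
    including the diagonal, are None.
    """
    buckets = {}
    for r in results:
        # Treat (A, B) and (B, A) as the same pair.
        buckets.setdefault(frozenset((r.doc_a, r.doc_b)), []).append(r.similarity)

    matrix = []
    for a in docs:
        row = []
        for b in docs:
            scores = buckets.get(frozenset((a, b))) if a != b else None
            row.append(mean(scores) if scores else None)
        matrix.append(row)
    return matrix


# Placeholder data: two hypothetical models scoring one aspect of one pair.
results = [
    AspectResult("model-1", "doc-A", "doc-B", "aspect-1", "...", "...", "...", 0.75),
    AspectResult("model-2", "doc-A", "doc-B", "aspect-1", "...", "...", "...", 0.25),
]
matrix = mean_similarity_matrix(results, ["doc-A", "doc-B"])
print(matrix)
```

Keeping the summaries and the comparison text alongside each score means a reviewer can trace any heatmap cell back to the text that produced it.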
Results
The analysis showed that the choice of language model substantially affected the outcomes of the crosswalk analysis.
Some document pairs yielded high disagreement in similarity scores across the five models.
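One way to make cross-model disagreement concrete is to compute, for each document pair, the spread of the scores that different models assign. The sketch below is an illustration under stated assumptions, not the paper's procedure: the model_disagreement helper, the choice of population standard deviation as the spread measure, the 0.2 flagging threshold, and the placeholder scores are all hypothetical.

```python
from statistics import mean, pstdev


def model_disagreement(scores_by_model):
    """Summarize cross-model agreement per document pair.

    scores_by_model: {model_name: {pair: similarity_score}}
    Returns {pair: (mean_score, population_std_dev)} for pairs scored by every model.
    """
    shared_pairs = set.intersection(*(set(s) for s in scores_by_model.values()))
    summary = {}
    for pair in sorted(shared_pairs):
        values = [scores[pair] for scores in scores_by_model.values()]
        summary[pair] = (mean(values), pstdev(values))
    return summary


# Placeholder scores from two hypothetical models (not results from the paper).
scores_by_model = {
    "model-1": {("doc-A", "doc-B"): 0.5, ("doc-A", "doc-C"): 0.25},
    "model-2": {("doc-A", "doc-B"): 0.5, ("doc-A", "doc-C"): 0.75},
}
summary = model_disagreement(scores_by_model)

THRESHOLD = 0.2  # hypothetical cut-off for "high disagreement"
flagged = [pair for pair, (_, spread) in summary.items() if spread > THRESHOLD]
print(flagged)
```

Note that both pairs in the placeholder data have the same mean score, so a heatmap of means alone would hide the fact that the models disagree on one of them.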
The human evaluation showed high inter-annotator agreement among the three experts, yet the model scores still differed from the human judgments.
The models therefore provide useful input for comparison but may not fully capture human-level understanding of policy nuances.
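The two quantities discussed here, agreement among experts and the gap between model and human scores, can be illustrated with a short Python sketch. The source does not name the agreement statistic used, so the mean pairwise absolute difference below is a simple stand-in chosen for illustration; the annotator_spread and model_human_gap helpers, the shared 0-to-1 scale, and all ratings are hypothetical.

```python
from itertools import combinations
from statistics import mean


def annotator_spread(ratings):
    """Mean pairwise absolute difference between annotators (lower = more agreement).

    ratings: {annotator_name: [score per aspect]}, all lists the same length.
    """
    diffs = [
        abs(x - y)
        for a, b in combinations(ratings.values(), 2)
        for x, y in zip(a, b)
    ]
    return mean(diffs)


def model_human_gap(model_scores, ratings):
    """Mean absolute difference between model scores and the mean human rating."""
    human_mean = [mean(column) for column in zip(*ratings.values())]
    return mean([abs(m - h) for m, h in zip(model_scores, human_mean)])


# Placeholder ratings for two aspects of one document pair (not data from the paper).
ratings = {
    "expert-1": [0.5, 0.25],
    "expert-2": [0.5, 0.25],
    "expert-3": [0.5, 0.5],
}
model_scores = [1.0, 0.0]  # one hypothetical model's scores for the same aspects

spread = annotator_spread(ratings)
gap = model_human_gap(model_scores, ratings)
print(f"annotator spread: {spread:.3f}, model-human gap: {gap:.3f}")
```

In this toy data the experts are close to one another while the model is far from their consensus, which is the pattern the evaluation reports: high agreement among humans does not imply that model scores match them.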
Conclusion
This study underscores the potential of LLMs in automating the comparative analysis of AI safety policies.
Despite variation across models, the findings support the use of LLM-based tools for comparative inspection of policy documents.
Future research should focus on refining these models and expanding the taxonomy to encompass emerging AI safety concerns.
Implications for AI Policy
As AI technologies continue to evolve, the need for coherent safety policies becomes increasingly urgent.
By employing automated systems for policy comparison, stakeholders can take a more systematic approach to AI governance.
This work sets a foundation for further exploration into how AI can enhance policy analysis and compliance monitoring.
