Resource-constrained Amazons Chess Decision Framework Integrating Large Language Models and Graph Attention
Abstract: Intelligent game-playing systems have driven significant advances in artificial intelligence by providing rigorous testbeds for decision-making, strategic planning, and adaptive learning. However, resource-constrained environments pose critical challenges, because conventional deep learning methods rely heavily on extensive datasets and computational resources.
In this paper, we propose a lightweight hybrid framework for the Game of the Amazons, which explores the paradigm of weak-to-strong generalization by integrating the structural reasoning of graph-based learning with the generative capabilities of large language models.
Framework Overview
Specifically, we leverage a Graph Attention Autoencoder to inform a multi-step Monte Carlo Tree Search, utilize a Stochastic Graph Genetic Algorithm to optimize evaluation signals, and harness GPT-4o-mini to generate synthetic training data. Unlike traditional approaches that rely on expert demonstrations, our framework learns from noisy and imperfect supervision.
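As background for the graph attention component, the update rule of a standard graph attention layer is reproduced below. The abstract does not state which attention variant the autoencoder uses, so this is an assumption offered for orientation only. Here h_i is the feature vector of node i (for example, a board square), N(i) is its neighborhood, W and a are learnable parameters, and σ is a nonlinearity.

```latex
e_{ij} = \mathrm{LeakyReLU}\!\left(\mathbf{a}^{\top}\left[\mathbf{W}\mathbf{h}_i \,\Vert\, \mathbf{W}\mathbf{h}_j\right]\right),
\qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik})},
\qquad
\mathbf{h}_i' = \sigma\!\Big(\sum_{j \in \mathcal{N}(i)} \alpha_{ij}\,\mathbf{W}\mathbf{h}_j\Big)
```

In an autoencoder arrangement, the attended embeddings h_i' would then be decoded back into node features or graph structure, and the reconstruction error would serve as the training signal.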
Key Components
- Graph Attention Autoencoder: This component serves as a structural filter, effectively denoising the outputs generated by the large language model.
- Multi-step Monte Carlo Tree Search: This search looks several moves ahead before committing to a move.
- Stochastic Graph Genetic Algorithm: This algorithm optimizes the evaluation signals that the framework relies on.
- GPT-4o-mini: This model generates synthetic training data, so the system can be trained without expert demonstrations or large datasets.
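To make the search component concrete, the following is a minimal sketch of a plain Monte Carlo Tree Search for Amazons that stops after creating a fixed number of tree nodes. It is not the authors' multi-step variant. The state layout, the function names (`slides`, `legal_moves`, `play`, `evaluate`, `search`), and the mobility-based `evaluate` are all hypothetical; `evaluate` merely stands in for the learned evaluation signal described above.

```python
import math
import random

DIRS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def slides(blocked, start, size):
    """Squares reachable from `start` by a queen-style slide over free squares."""
    out = []
    for dr, dc in DIRS:
        r, c = start[0] + dr, start[1] + dc
        while 0 <= r < size and 0 <= c < size and (r, c) not in blocked:
            out.append((r, c))
            r, c = r + dr, c + dc
    return out

def legal_moves(state):
    """All (from, to, arrow) triples for the side to move."""
    own, opp, arrows, size = state
    moves = []
    for src in own:
        blocked = (own | opp | arrows) - {src}  # the vacated square is free again
        for dst in slides(blocked, src, size):
            for arrow in slides(blocked, dst, size):
                moves.append((src, dst, arrow))
    return moves

def play(state, move):
    """Apply a move and swap sides, so `own` is always the side to move."""
    own, opp, arrows, size = state
    src, dst, arrow = move
    return (opp, (own - {src}) | {dst}, arrows | {arrow}, size)

def evaluate(state):
    """Placeholder evaluator (assumption): mobility share in [0, 1] for the side to move."""
    own, opp, arrows, size = state
    blocked = own | opp | arrows
    mine = sum(len(slides(blocked, a, size)) for a in own)
    theirs = sum(len(slides(blocked, a, size)) for a in opp)
    return mine / (mine + theirs) if mine else 0.0  # no mobility means a loss

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.untried = legal_moves(state)
        self.visits = 0
        self.value = 0.0  # summed reward for the player who moved into this node

def search(root_state, node_budget, c=1.4, rng=None):
    """Return a move after creating at most `node_budget` tree nodes (root included)."""
    rng = rng or random.Random(0)
    root = Node(root_state)
    created = 1
    for _ in range(20 * node_budget):  # iteration cap guards tiny, exhausted trees
        if created >= node_budget:
            break
        node = root
        # Selection: follow UCB1 while the current node is fully expanded.
        while not node.untried and node.children:
            parent = node
            node = max(parent.children, key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(parent.visits) / ch.visits))
        # Expansion: add one untried child, which counts against the budget.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(play(node.state, move), node, move)
            node.children.append(child)
            node, created = child, created + 1
        # Evaluation replaces a random rollout; flip to the mover's viewpoint.
        reward = 1.0 - evaluate(node.state)
        # Backpropagation: alternate the viewpoint at each level.
        while node is not None:
            node.visits += 1
            node.value += reward
            reward = 1.0 - reward
            node = node.parent
    if not root.children:
        return None
    return max(root.children, key=lambda ch: (ch.visits, ch.value / ch.visits)).move

# Standard 10x10 starting position as (row, col), with row 0 as the first rank.
WHITE = frozenset({(3, 0), (0, 3), (0, 6), (3, 9)})
BLACK = frozenset({(6, 0), (9, 3), (9, 6), (6, 9)})
START = (WHITE, BLACK, frozenset(), 10)

if __name__ == "__main__":
    print(search(START, node_budget=50))
```

With a budget this small relative to the branching factor of the opening position, the sketch expands only a handful of root moves and scores each once, which illustrates why the quality of the evaluation signal matters so much when the node count is tightly limited.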
Experimental Results
Experiments on a 10×10 Amazons board show that our hybrid approach improves decision accuracy by 15% to 56% over baseline models. The framework also significantly outperforms its teacher model, GPT-4o-mini, achieving a competitive win rate of 45.0% at N=30 nodes and 66.5% at only N=50 nodes.
Conclusion
These results confirm the feasibility of evolving specialized, high-performance game AI from general-purpose foundation models, even under stringent computational constraints. The approach also points toward future research on resource-efficient AI development.
