LudoBench evaluates large language models' strategic reasoning using 480 spot-based Ludo scenarios, revealing key insights into AI decision-making behavior...
Discover CuraLight, an AI framework that uses reinforcement learning and LLMs to optimize traffic signals, reducing congestion and travel time.
Explore how source labels influence trust assessments by humans and large language models, revealing shared biases and the need for debiased evaluations.