Discover HM-Bench, the first benchmark evaluating multimodal large language models on hyperspectral remote sensing data for advanced spectral analysis.
Discover HiL-Bench, a benchmark that measures whether AI agents know when to ask for help on uncertain tasks, with the aim of improving their decision-making and performance.