Discover SciMDR, a large-scale dataset that enhances AI's ability to reason across scientific multimodal documents, featuring 300K QA pairs and expert benchmarks.
Discover AdaRubric, a dynamic rubric system that adapts to tasks for accurate evaluation and improved training of LLM agents across diverse benchmarks.