Discover how the REL benchmark evaluates relational reasoning in large language models, revealing insights into how they handle complex relations.
Discover a novel permutation-invariant approach to table reasoning that enhances retrieval stability and overcomes layout biases in large language models.
Discover how agent identity acts as an attractor in LLM activation space, revealing persistent cognitive architecture across models like Llama and Gemma.
This study assesses large language models as AI tutors for Nepal's K-10 curriculum, highlighting gaps in pedagogy and cultural relevance for low-resource educati...