Reduce biases in reward models using causally motivated inference-time interventions to improve alignment with human preferences without losing performance...
Explore how instruction complexity drives positional collapse in LLMs during adversarial evaluations, impacting model response strategies and accuracy.
Discover Comet-H, a system that synchronizes language models to improve research software development and reduce errors like hallucination and desynchroniz...