Discover Slash, a training-free method to boost structural attention in LLMs, improving graph reasoning without costly fine-tuning or complex adapters.
Discover Rubric-based On-policy Distillation (ROPD), a scalable AI model alignment method that outperforms traditional logit-based techniques with 10x effi...