Discover Spectral Compact Training, a memory-efficient method that enables large language model training on consumer hardware using truncated SVD and Stiefel QR...
Discover TED, a training-free knowledge distillation method that boosts multimodal reasoning performance without costly model updates or large datasets.