InstrAct improves action-centric understanding in instructional videos by addressing static bias and noisy data, boosting Video Foundation Models' accuracy...
Discover VideoStir, a novel framework that enhances long-video analysis via spatio-temporal structure and intent-aware retrieval-augmented generation (RAG).