Discover Dynin-Omni, the first unified omnimodal diffusion model integrating text, image, speech, and video for both understanding and generation.
Discover JAL-Turn, a lightweight model for real-time, robust turn-taking detection in full-duplex spoken dialogue systems using acoustic and linguistic cue...
Explore how dialectal variation in Newcastle English impacts ASR accuracy, revealing social patterns and phonetic challenges in speech recognition technolo...