Exploring the Benefits of Tokenization of Discrete Acoustic Units. Avihu Dekel, Raul Fernandez. 2024. INTERSPEECH 2024
Exploring the limits of decoder-only models trained on public speech recognition corpora. Ankit Gupta, George Saon, et al. 2024. INTERSPEECH 2024
M2 ASR: Multilingual Multi-task Automatic Speech Recognition via Multi-objective Optimization. A Saif, Lisha Chen, et al. 2024. INTERSPEECH 2024
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation. Andrew Rouditchenko, Yuan Gong, et al. 2024. INTERSPEECH 2024
Low Bitrate High-Quality RVQGAN-based Discrete Speech Tokenizer. Slava Shechtman, Avihu Dekel. 2024. INTERSPEECH 2024