Cascaded multilingual audio-visual learning from videos
Andrew Rouditchenko, Angie Boggust, et al.
INTERSPEECH 2021
In this paper we describe our submission to the SMART 2021 Answer Type Prediction task. We propose a BERT based solution to the problem. The proposed approach relies on type embeddings obtained based on the type names. It allows our model to predict types at test time that were not seen during training. Analysis of the training dataset reveals the presence of noise in the labels. Therefore, we develop a label augmentation scheme to reduce the noise in the annotations and increase the quality of the training data. Our model trained on the de-noised data achieves 0.986 accuracy on the answer category prediction task and 0.825 and 0.790 NDCG@5 and NDCG@10 respectively on the test sets.
Andrew Rouditchenko, Angie Boggust, et al.
INTERSPEECH 2021
Saiteja Utpala, Alex Gu, et al.
NAACL 2024
Gosia Lazuka, Andreea Simona Anghel, et al.
SC 2024
Gabriele Picco, Lam Thanh Hoang, et al.
EMNLP 2021