Active learning for BERT: An empirical study
Liat Ein-Dor, Alon Halfon, et al.
EMNLP 2020
Obtaining high-quality labeled data for natural language understanding tasks is often slow, error-prone, complicated, and expensive. With the widespread use of neural networks, this issue becomes more acute, since these networks require large amounts of labeled data to produce satisfactory results. We propose a methodology to blend high-quality but scarce labeled data with noisy but abundant weakly labeled data during the training of neural networks. Experiments in the context of topic-dependent evidence detection with two forms of weakly labeled data show the advantages of the blending scheme. In addition, we provide a manually annotated dataset for the task of topic-dependent evidence detection.