Using error decay prediction to overcome practical issues of deep active learning for named entity recognition

MACHINE LEARNING (2020)

Journal

MACHINE LEARNING

Volume 109, Issue 9-10, Pages 1749-1778

Publisher

SPRINGER

DOI: 10.1007/s10994-020-05897-1

Keywords

Active learning; Transparency; Robustness to labeling noise; Black-box models; Clustering; Named entity recognition

Funding

Center for Data Science
Center for Intelligent Information Retrieval
Chan Zuckerberg Initiative
Collaborative R&D Fund
National Science Foundation (NSF) [DMR-1534431, IIS-1514053]

Abstract

Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to labeling noise, and (c) lack of transparency. In response, we propose a transparent batch active sampling framework by estimating the error decay curves of multiple feature-defined subsets of the data. Experiments on four named entity recognition (NER) tasks demonstrate that the proposed methods significantly outperform diversification-based methods for black-box NER taggers, and can make the sampling process more robust to labeling noise when combined with uncertainty-based methods. Furthermore, the analysis of experimental results sheds light on the weaknesses of different active sampling strategies, and when traditional uncertainty-based or diversification-based methods can be expected to work well.
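
The idea of estimating an error decay curve per data subset and using it to guide batch sampling can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the curve family or the allocation rule, so the power-law form err(n) ≈ a · n^(-b), the greedy one-label-at-a-time allocation, and every name below (`fit_power_law`, `allocate_batch`, the `history` layout, the pool weights) are assumptions made purely for illustration.

```python
import math


def fit_power_law(sizes, errors):
    """Fit err(n) ~ a * n**(-b) by ordinary least squares in log-log space.

    Assumed curve family (not taken from the paper). All sizes and errors
    must be strictly positive so that their logarithms exist.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope  # b is the negated log-log slope


def predicted_error(a, b, n):
    """Error predicted by the fitted curve after n labeled samples."""
    return a * n ** (-b)


def allocate_batch(history, batch_size):
    """Greedily split a labeling batch across subsets.

    history maps a subset name to (sizes, errors, pool_weight), where
    sizes[i] is the number of labeled samples from that subset when
    errors[i] was measured, and pool_weight is the (hypothetical) share
    of the unlabeled pool that the subset covers.
    """
    curves = {name: fit_power_law(s, e) for name, (s, e, _) in history.items()}
    labeled = {name: s[-1] for name, (s, _, _) in history.items()}
    allocation = {name: 0 for name in history}

    def gain(name):
        # Weighted error reduction predicted for one more label in this subset.
        a, b = curves[name]
        n = labeled[name] + allocation[name]
        weight = history[name][2]
        return weight * (predicted_error(a, b, n) - predicted_error(a, b, n + 1))

    for _ in range(batch_size):
        best = max(history, key=gain)
        allocation[best] += 1
    return allocation
```

Because the allocation is driven by a fitted curve per named subset, one can inspect why a subset received labels (its error is still predicted to fall quickly), which is the kind of transparency the abstract refers to. A subset whose error has plateaued gets a near-zero predicted gain and is skipped.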
