Article

Zero-shot cross-lingual transfer language selection using linguistic similarity

Journal

INFORMATION PROCESSING & MANAGEMENT
Volume 60, Issue 3

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.ipm.2022.103250

Keywords

Multilingual natural language processing; Zero-shot learning; Transfer learning; Linguistics; Language similarity

This study examines how to select transfer languages for three NLP tasks: sentiment analysis, named entity recognition, and dependency parsing. It proposes measuring the distance between languages with linguistic similarity metrics and choosing the transfer language from those measurements rather than by intuition. Linguistic similarity correlates with cross-lingual transfer performance on all three tasks, and choosing the optimal transfer language instead of English makes a statistically significant difference. The results point to better use of knowledge from high-resource languages to improve applications in languages with limited data.
We study the selection of transfer languages for different Natural Language Processing tasks, specifically sentiment analysis, named entity recognition and dependency parsing. In order to select an optimal transfer language, we propose to utilize different linguistic similarity metrics to measure the distance between languages and make the choice of transfer language based on this information instead of relying on intuition. We demonstrate that linguistic similarity correlates with cross-lingual transfer performance for all of the proposed tasks. We also show that there is a statistically significant difference in choosing the optimal language as the transfer source instead of English. This allows us to select a more suitable transfer language which can be used to better leverage knowledge from high-resource languages in order to improve the performance of language applications lacking data. For the study, we used datasets from eight different languages from three language families.
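
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the language labels (`tgt`, `src_a`, `src_b`, `src_c`), the binary feature vectors, and the use of cosine distance as the similarity metric are all assumptions made for illustration. The idea shown is only that candidate transfer languages are ranked by their measured distance to the target language, and the closest one is chosen.

```python
import math

# Hypothetical toy data (NOT from the paper): each vector stands in for a set
# of binary linguistic features of one language. Labels are placeholders.
FEATURES = {
    "tgt":   [0, 1, 0, 1, 0, 0],  # target (low-resource) language
    "src_a": [1, 0, 1, 0, 1, 0],  # candidate transfer languages
    "src_b": [1, 0, 1, 1, 1, 0],
    "src_c": [0, 1, 0, 1, 0, 1],
}


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_transfer_languages(target, candidates, features=FEATURES):
    """Rank candidate languages from closest to farthest from the target."""
    distances = {
        lang: cosine_distance(features[target], features[lang])
        for lang in candidates
        if lang != target
    }
    return sorted(distances.items(), key=lambda item: item[1])


# Assumed distance metric: cosine distance over the toy vectors above.
ranking = rank_transfer_languages("tgt", ["src_a", "src_b", "src_c"])
best_language, best_distance = ranking[0]
print(best_language, round(best_distance, 3))
```

In practice the feature vectors would come from a linguistic resource rather than being written by hand, and the distance function could be swapped for any other similarity metric without changing the ranking logic.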
