Journal
SENSORS
Volume 22, Issue 12
Publisher
MDPI
DOI: 10.3390/s22124420
Keywords
text classification; few-shot learning; news categorization; feature selection
Funding
- Key-Area Research and Development Program of Guangdong Province [2019B010153002]
- Key Program of NSFC-Guangdong Joint Funds [U1801263]
- Science and Technology Projects of Guangzhou [202007040006]
- Program of Marine Economy Development (Six Marine Industries) Special Foundation of Department of Natural Resources of Guangdong Province [GDNRC [2020]056]
- Top Youth Talent Project of Zhujiang Talent Program [2019QN01X516]
- National Key R&D project [2019YFB1705503]
- R&D projects in key areas of Guangdong Province [2018B010109007]
- Guangdong Provincial Key Laboratory of Cyber-Physical Systems [2020B1212060069]
In this paper, the authors propose SumFS, a meta-learning framework for text classification that utilizes extractive summarization and improved local vocabulary features to achieve domain adaptation with limited labeled data. Experimental results show that SumFS can reduce input features while maintaining or improving accuracy, and significantly decrease training time.
Meta-learning frameworks have been proposed to generalize machine learning models for domain adaptation without sufficient labeled data in computer vision. However, meta-learning for text classification has been less investigated. In this paper, we propose SumFS, which finds globally top-ranked sentences by extractive summarization and improves the local vocabulary category features. SumFS consists of three modules: (1) an unsupervised text summarizer that removes redundant information; (2) a weighting generator that associates feature words with attention scores to weight the lexical representations of words; (3) a regular meta-learning framework that trains with limited labeled data using a ridge regression classifier. In addition, a marine news dataset with limited labeled data was established. The performance of the algorithm was tested on the THUCnews, Fudan, and marine news datasets. Experiments show that SumFS can maintain or even improve accuracy while reducing input features. Moreover, the training time of each epoch is reduced by more than 50%.
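To make the three-module structure concrete, the following is a minimal sketch of a single few-shot episode. It is not the authors' implementation. The frequency-based `summarize`, the IDF-based `idf_weights` (standing in for the attention-based weighting generator), the bag-of-words `embed`, the helper `run_episode`, and the toy documents are all assumptions introduced for illustration. Only the closed-form ridge regression classifier corresponds directly to a component named in the abstract, and nothing here is meta-trained across episodes.

```python
import math
from collections import Counter

import numpy as np


def summarize(sentences, k):
    """Stand-in extractive summarizer (assumption): keep the k sentences whose
    words are most frequent in the document, in their original order."""
    tokens = [s.lower().split() for s in sentences]
    freq = Counter(w for t in tokens for w in t)
    scores = [sum(freq[w] for w in t) / max(len(t), 1) for t in tokens]
    top = sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]
    return [sentences[i] for i in sorted(top)]


def idf_weights(docs):
    """Stand-in word weighting (assumption): inverse document frequency
    computed over the support documents."""
    df = Counter(w for d in docs for w in {w for s in d for w in s.lower().split()})
    vocab = {w: i for i, w in enumerate(sorted(df))}
    weights = np.array([math.log(len(docs) / df[w]) for w in vocab])
    return vocab, weights


def embed(doc, vocab, weights):
    """Weighted bag-of-words vector, L2-normalised; unknown words are ignored."""
    v = np.zeros(len(vocab))
    for s in doc:
        for w in s.lower().split():
            if w in vocab:
                v[vocab[w]] += weights[vocab[w]]
    n = np.linalg.norm(v)
    return v / n if n > 0 else v


def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X^T X + lam * I)^{-1} X^T Y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)


def run_episode(support, labels, query, n_classes, k=2):
    """Summarize, weight, fit on the support set, then classify the queries."""
    support = [summarize(d, k) for d in support]
    query = [summarize(d, k) for d in query]
    vocab, weights = idf_weights(support)
    X = np.stack([embed(d, vocab, weights) for d in support])
    Y = np.eye(n_classes)[labels]  # one-hot targets
    W = ridge_fit(X, Y)
    Q = np.stack([embed(d, vocab, weights) for d in query])
    return (Q @ W).argmax(axis=1)


# Toy 2-way, 2-shot episode; each document is a list of sentences.
support = [
    ["ship leaves port", "port handles fish cargo", "crew sleeps"],
    ["fish boats fill port", "port ships fish abroad", "gulls circle"],
    ["bank raises rates", "bank reports profit growth", "clerks nap"],
    ["profit lifts bank shares", "bank rates hit profit", "traders wait"],
]
labels = [0, 0, 1, 1]
query = [
    ["ship enters port", "fish cargo arrives"],
    ["bank profit falls", "rates stay flat"],
]
preds = run_episode(support, labels, query, n_classes=2)
print(preds)
```

The ridge classifier has a closed-form solution, so fitting on a small support set costs one linear solve per episode. In this sketch, dropping low-scoring sentences before embedding shrinks the vocabulary, which illustrates how summarization can reduce the number of input features.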