Journal
SOFT COMPUTING
Volume 23, Issue 5, Pages 1557-1572
Publisher
SPRINGER
DOI: 10.1007/s00500-017-2879-x
Keywords
Active learning; Classification; Cost; k-Nearest neighbors; Tri-partition
Funding
- National Natural Science Foundation of China [61379089]
- Natural Science Foundation of Department of Education of Sichuan Province [16ZA0060]
Active learning differs from the training-testing scenario in that class labels can be obtained upon request. It is widely employed in applications where the labeling of instances incurs a heavy manual cost. In this paper, we propose a new algorithm called tri-partition active learning through k-nearest neighbors (TALK). The optimization objective is to minimize the total teacher and misclassification costs. First, a k-nearest neighbors classifier is employed to divide unlabeled instances into three disjoint regions. Region I contains instances for which the expected misclassification cost is lower than the teacher cost, Region II contains instances to be labeled by human experts, and Region III contains the remaining instances. Various strategies are designed to determine which instances are in Region II. Second, instances in Regions I and II are labeled and added to the training set, and the tri-partition process is repeated until all instances have been labeled. Experiments are undertaken on eight University of California, Irvine, datasets using different cost settings. Compared with the state-of-the-art cost-sensitive classification and active learning algorithms, our new algorithm generally exhibits a lower total cost.
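The tri-partition loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' TALK implementation: the abstract does not specify how the expected misclassification cost is estimated or which strategies select Region II, so the vote-fraction estimate, the single uniform `misclass_cost`, the "highest expected cost first" rule, and the per-round `budget` parameter are all assumptions, as are the function names.

```python
import math
from collections import Counter


def knn_votes(x, labeled, k):
    """Count the labels of the k labeled points closest to x (Euclidean distance)."""
    nearest = sorted(labeled, key=lambda item: math.dist(x, item[0]))[:k]
    return Counter(label for _, label in nearest)


def tri_partition(pool, labeled, k, teacher_cost, misclass_cost, budget):
    """Split the unlabeled pool into Regions I, II, and III.

    Assumption: expected misclassification cost is estimated as
    (fraction of neighbors disagreeing with the majority) * misclass_cost.
    """
    scored = []
    for x in pool:
        votes = knn_votes(x, labeled, k)
        label, count = votes.most_common(1)[0]
        expected_cost = (1 - count / sum(votes.values())) * misclass_cost
        scored.append((expected_cost, x, label))

    # Region I: predicting is expected to be cheaper than asking the teacher.
    region_1 = [(x, label) for cost, x, label in scored if cost < teacher_cost]
    # Assumed Region II strategy: query the `budget` most uncertain instances.
    rest = sorted((s for s in scored if s[0] >= teacher_cost),
                  key=lambda s: s[0], reverse=True)
    region_2 = [x for _, x, _ in rest[:budget]]
    region_3 = [x for _, x, _ in rest[budget:]]  # deferred to the next round
    return region_1, region_2, region_3


def talk_sketch(pool, labeled, oracle, k, teacher_cost, misclass_cost, budget=1):
    """Repeat the tri-partition until every instance is labeled (budget >= 1)."""
    labeled, pool, queries = list(labeled), list(pool), 0
    while pool:
        r1, r2, r3 = tri_partition(pool, labeled, k, teacher_cost,
                                   misclass_cost, budget)
        labeled.extend(r1)                  # Region I: keep kNN predictions
        for x in r2:                        # Region II: ask the human expert
            labeled.append((x, oracle(x)))
            queries += 1
        pool = r3                           # Region III: try again next round
    return labeled, queries
```

Under these assumptions, the teacher cost incurred is `queries * teacher_cost`; the misclassification cost depends on how many Region I predictions turn out to be wrong, which can only be measured when ground-truth labels are available for evaluation.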