Article

Multi-label thresholding for cost-sensitive classification

Journal

Neurocomputing
Volume 436, Pages 232-247

Publisher

Elsevier
DOI: 10.1016/j.neucom.2020.12.004

Keywords

Multi-label classification; Cost-sensitive learning; Threshold choice methods; Global threshold; Context; Misclassification costs

Funding

  1. Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, Saudi Arabia [J1596121440]

Abstract

This paper investigates cost-sensitive classification methods for multi-label classification, adopting a simple but general thresholding method that is applicable to most classification algorithms. It explores the choice of single and multiple thresholds and proposes cost curves and scatter diagrams for performance evaluation. Experimental evaluation on 13 multi-label datasets demonstrates that adjusting a global threshold instead of per-label thresholds does not lead to a significant loss in performance.
Multi-label classification associates each instance with a set of labels, which reflects the nature of a wide range of real-world applications. However, existing approaches assume that all labels have the same misclassification cost, whereas in real-world problems different types of misclassification errors have different costs, which are generally unknown in the training context or might change from one context to another. Thus, there is a demand for cost-sensitive classification methods that minimise the average misclassification cost rather than error rates or counts. In this paper, we adopt a simple yet general method, called thresholding, which applies to most classification algorithms to adapt them to cost-sensitive multi-label classification. This paper investigates current threshold choice approaches for multi-label classification. It explores the choice of single and multiple thresholds and extends some of the current techniques to support multi-label problems. Moreover, it proposes cost curves and scatter diagrams for performance evaluation in the multi-label setting. Experimental evaluation on 13 multi-label datasets demonstrates that there is no significant loss by adjusting a global threshold rather than a per-label threshold considering different misclassification costs across labels. Although tuning multiple thresholds is the obvious solution, the global threshold can also be valid. (c) 2020 Elsevier B.V. All rights reserved.
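
The thresholding idea described in the abstract is straightforward to prototype. Below is a minimal, hypothetical sketch, not the authors' implementation: it assumes a trained multi-label model that outputs per-label scores in [0, 1], a single misclassification-cost pair (c_fn for false negatives, c_fp for false positives) shared across labels, and a simple grid search on a validation set. All function and variable names are illustrative.

```python
import numpy as np

def average_cost(Y_true, Y_pred, c_fn, c_fp):
    """Average misclassification cost over all instance-label pairs.

    c_fn / c_fp are the costs of a false negative / false positive.
    """
    fn_rate = np.logical_and(Y_true == 1, Y_pred == 0).mean()
    fp_rate = np.logical_and(Y_true == 0, Y_pred == 1).mean()
    return c_fn * fn_rate + c_fp * fp_rate

def tune_global_threshold(scores, Y_true, c_fn, c_fp, grid=None):
    """Pick one threshold, shared by all labels, that minimises
    average cost on a validation set."""
    grid = np.linspace(0.0, 1.0, 101) if grid is None else grid
    costs = [average_cost(Y_true, (scores >= t).astype(int), c_fn, c_fp)
             for t in grid]
    return grid[int(np.argmin(costs))]

def tune_per_label_thresholds(scores, Y_true, c_fn, c_fp, grid=None):
    """Tune one threshold per label, each column independently."""
    grid = np.linspace(0.0, 1.0, 101) if grid is None else grid
    thresholds = np.empty(scores.shape[1])
    for j in range(scores.shape[1]):
        costs = [average_cost(Y_true[:, j],
                              (scores[:, j] >= t).astype(int),
                              c_fn, c_fp)
                 for t in grid]
        thresholds[j] = grid[int(np.argmin(costs))]
    return thresholds

# Illustrative usage with synthetic stand-ins for model outputs.
rng = np.random.default_rng(0)
scores = rng.random((200, 5))                    # per-label scores
Y_true = (rng.random((200, 5)) < 0.3).astype(int)  # ground-truth labels
t_global = tune_global_threshold(scores, Y_true, c_fn=5.0, c_fp=1.0)
t_per_label = tune_per_label_thresholds(scores, Y_true, c_fn=5.0, c_fp=1.0)
```

As a sanity check, for well-calibrated probabilities the cost-minimising threshold is known to be c_fp / (c_fp + c_fn), so the grid search matters mainly when scores are miscalibrated or when costs differ across labels, which is where per-label tuning can pay off. Sweeping the grid and plotting the resulting average costs also yields the kind of cost curve the paper uses for evaluation.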
