Article

The importance of the label hierarchy in hierarchical multi-label classification

Journal

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS
Volume 45, Issue 2, Pages 247-271

Publisher

SPRINGER
DOI: 10.1007/s10844-014-0347-y

Keywords

Predictive clustering trees; Ensemble methods; Hierarchical multi-label classification; Habitat modelling; Text classification; Image classification; Gene function prediction

Funding

  1. European Commission [ICT-2013-612944]

We address the task of hierarchical multi-label classification (HMC), a structured output prediction task in which the classes are organized into a hierarchy and an instance may belong to multiple classes. In many problems, such as gene function prediction or prediction of ecological community structure, the classes inherently follow these constraints. Many researchers have recognized the potential of HMC, and several HMC methods have been proposed and shown to achieve good predictive performance. However, there is no clear understanding of when it is favorable to consider such relationships (hierarchical and multi-label) among classes, and when doing so places an unnecessary burden on classification methods. To this end, we perform a detailed comparative study over 8 datasets that have HMC properties. We investigate two important influences in HMC: the multiple labels per example and the information about the hierarchy. More specifically, we consider four machine learning tasks: multi-label classification, hierarchical multi-label classification, single-label classification, and hierarchical single-label classification. To construct the predictive models, we use predictive clustering trees (a generalized form of decision trees), which can address each of these modelling tasks. Moreover, we investigate whether the influence of the hierarchy and the multiple labels carries over to ensemble models. For each of the tasks, we construct a single tree and two ensembles (random forest and bagging). The results reveal that the hierarchy and the multiple labels do help to obtain a better single-tree model, but that this advantage is not preserved for the ensemble models.
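
To illustrate the hierarchy constraint described in the abstract, the following minimal Python sketch shows how a flat multi-label target differs from a hierarchical one. The toy hierarchy, the class names, and the helper functions (`ancestors`, `hierarchical_closure`, `to_vector`) are hypothetical and chosen only for illustration; the abstract does not state how the paper encodes its targets, so this should not be read as the authors' implementation.

```python
# Hypothetical toy hierarchy (child -> parent); not taken from the paper.
PARENT = {
    "1": None,
    "2": None,
    "1.1": "1",
    "1.2": "1",
    "1.1.1": "1.1",
}

# Fixed class order used for the binary target vectors.
CLASSES = sorted(PARENT)  # ['1', '1.1', '1.1.1', '1.2', '2']


def ancestors(label, parent=PARENT):
    """Return all proper ancestors of a label, nearest first."""
    found = []
    current = parent[label]
    while current is not None:
        found.append(current)
        current = parent[current]
    return found


def hierarchical_closure(labels, parent=PARENT):
    """Expand a label set so that every label's ancestors are also included."""
    closed = set(labels)
    for label in labels:
        closed.update(ancestors(label, parent))
    return closed


def to_vector(labels, classes=CLASSES):
    """Encode a label set as a 0/1 vector over the fixed class order."""
    return [1 if c in labels else 0 for c in classes]


# One example annotated with two labels (the multi-label setting).
example_labels = {"1.1.1", "2"}

# Flat multi-label target: only the annotated classes are marked.
flat_target = to_vector(example_labels)

# Hierarchical multi-label target: ancestors of each label are marked too.
hmc_target = to_vector(hierarchical_closure(example_labels))
```

Under these assumptions, the flat target marks only `1.1.1` and `2`, while the hierarchical target also marks `1` and `1.1`, which is the extra information about the hierarchy whose influence the study investigates.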
