Boruta - A System for Feature Selection

FUNDAMENTA INFORMATICAE (2010)

Journal

FUNDAMENTA INFORMATICAE

Volume 101, Issue 4, Pages 271-286

Publisher

IOS PRESS

DOI: 10.3233/FI-2010-288

Funding

ICM, University of Warsaw [G34-5]

Abstract

Machine learning methods are often used to classify objects described by hundreds of attributes; in many such applications a large fraction of the attributes may be entirely irrelevant to the classification problem. Moreover, one usually cannot decide a priori which attributes are relevant. In this paper we present an improved version of the algorithm for identifying the full set of truly important variables in an information system. It is an extension of the random forest method that utilises the importance measure generated by the original algorithm. It compares, in an iterative fashion, the importances of the original attributes with the importances of their randomised copies. We analyse the performance of the algorithm on several examples of synthetic data, as well as on a biologically important problem, namely the identification of sequence motifs that are important for the aptameric activity of short RNA sequences.
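
The comparison between original attributes and their randomised copies can be illustrated with a minimal sketch. This is not the authors' implementation. To stay dependency-free it swaps the random forest importance for a stand-in measure (absolute Pearson correlation with the label), and it replaces the statistical test of the full method with a simple hit-rate threshold. The function names, the parameters n_iter and min_hit_rate, and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean


def abs_correlation(xs, ys):
    """Stand-in importance (assumption): absolute Pearson correlation with the label."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def shadow_selection(columns, labels, n_iter=50, min_hit_rate=0.9, seed=0):
    """Return indices of columns that beat the best randomised copy in most rounds."""
    rng = random.Random(seed)
    hits = [0] * len(columns)
    for _ in range(n_iter):
        # Randomised copies: each column is shuffled, which breaks its link to the labels.
        shadows = [rng.sample(col, len(col)) for col in columns]
        best_shadow = max(abs_correlation(s, labels) for s in shadows)
        for j, col in enumerate(columns):
            if abs_correlation(col, labels) > best_shadow:
                hits[j] += 1
    # Hypothetical acceptance rule: a fixed hit-rate threshold, not a statistical test.
    return [j for j, h in enumerate(hits) if h / n_iter >= min_hit_rate]


def make_toy_data(n_rows=200, n_noise=5, seed=1):
    """Synthetic data: column 0 depends on the label, the rest are pure noise."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_rows)]
    informative = [y + rng.gauss(0, 0.5) for y in labels]
    noise = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_noise)]
    return [informative] + noise, labels


columns, labels = make_toy_data()
selected = shadow_selection(columns, labels)
print(selected)
```

In this toy setting the informative column should be selected consistently, while a noise column only passes if it happens to beat the strongest randomised copy in nearly every round. With a random forest, the importances of the originals and the copies would instead be computed jointly on the extended data set, so that interactions between attributes can contribute.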
