Article

Boruta - A System for Feature Selection

FUNDAMENTA INFORMATICAE (2010)

Journal

FUNDAMENTA INFORMATICAE

Volume 101, Issue 4, Pages 271-286

Publisher

IOS PRESS

DOI: 10.3233/FI-2010-288

Categories

Computer Science, Software Engineering; Mathematics, Applied

Funding

ICM, University of Warsaw [G34-5]

Abstract

Machine learning methods are often used to classify objects described by hundreds of attributes; in many applications of this kind a large fraction of the attributes may be entirely irrelevant to the classification problem. Moreover, one usually cannot decide a priori which attributes are relevant. In this paper we present an improved version of the algorithm for identifying the full set of truly important variables in an information system. It is an extension of the random forest method which utilises the importance measure generated by the original algorithm. It compares, in an iterative fashion, the importances of the original attributes with the importances of their randomised copies. We analyse the performance of the algorithm on several examples of synthetic data, as well as on a biologically important problem, namely the identification of sequence motifs that are important for the aptameric activity of short RNA sequences.
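
The comparison of original attributes with their randomised copies can be illustrated with a small sketch. This is not the authors' implementation: to stay within the Python standard library it swaps the random forest importance measure for a plain absolute Pearson correlation, and it replaces any statistical test with a fixed hit-rate threshold. The names `abs_correlation`, `shadow_selection`, `n_iter` and `min_hit_rate`, and the toy data, are all assumptions made for illustration.

```python
import random
import statistics


def abs_correlation(xs, ys):
    """Stand-in importance (assumption): |Pearson correlation| with the labels."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return abs(cov) / (vx * vy) ** 0.5


def shadow_selection(columns, labels, n_iter=50, min_hit_rate=0.9, seed=0):
    """Keep attributes that beat the best randomised copy in most iterations."""
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_iter):
        # One shuffled ("randomised") copy per attribute: shuffling breaks any
        # link with the labels but keeps the attribute's value distribution.
        shadow_scores = []
        for values in columns.values():
            shadow = list(values)
            rng.shuffle(shadow)
            shadow_scores.append(abs_correlation(shadow, labels))
        threshold = max(shadow_scores)
        # An original attribute scores a hit when it beats every shadow.
        for name, values in columns.items():
            if abs_correlation(values, labels) > threshold:
                hits[name] += 1
    # Hypothetical decision rule: a fixed hit-rate cut-off, not a formal test.
    return [name for name in columns if hits[name] >= min_hit_rate * n_iter]


# Toy data (made up): one attribute tied to the label, two pure-noise attributes.
data_rng = random.Random(42)
n = 200
labels = [data_rng.randint(0, 1) for _ in range(n)]
columns = {
    "signal": [y + data_rng.gauss(0, 0.5) for y in labels],
    "noise_a": [data_rng.gauss(0, 1) for _ in range(n)],
    "noise_b": [data_rng.gauss(0, 1) for _ in range(n)],
}
selected = shadow_selection(columns, labels)
print(selected)
```

In this toy setting the `signal` attribute should be retained, while the noise attributes are expected to be rejected in most runs because they rarely outscore the best shuffled copy. A faithful version would use importances from a random forest in place of `abs_correlation`.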
