4.7 Article

A Cascade Deep Forest Model for Breast Cancer Subtype Classification Using Multi-Omics Data

Journal

MATHEMATICS
Volume 9, Issue 13

Publisher

MDPI
DOI: 10.3390/math9131574

Keywords

METABRIC dataset; breast cancer subtyping; deep forest; multi-omics data

Automated diagnosis systems aim to reduce diagnosis costs without loss of efficiency. The cascade Deep Forest ensemble model, which learns hyper-representations through cascaded ensembles of decision trees, achieves classification accuracy competitive with other techniques, particularly on imbalanced training sets. Using gene expression data alone, the cascade Deep Forest classifier reaches accuracy comparable to other techniques at lower computational cost, with recorded times of about 5 to 7 seconds.
Automated diagnosis systems aim to reduce the cost of diagnosis while maintaining the same efficiency. Many methods have been used for breast cancer subtype classification. Some use a single data source, while others integrate many data sources, which trades computational performance for accuracy. Breast cancer data, especially biological data, is known to be imbalanced, and large collections of histopathological images are lacking. Recent studies have shown that the cascade Deep Forest ensemble model achieves classification accuracy competitive with alternatives such as general ensemble learning methods and conventional deep neural networks (DNNs), especially on imbalanced training sets, by learning hyper-representations through cascaded ensembles of decision trees. In this work, a cascade Deep Forest is employed to classify breast cancer subtypes, IntClust and Pam50, using multi-omics datasets and different configurations. The results show an accuracy of 83.45% for 5 subtypes and 77.55% for 10 subtypes. The significance of this work is that gene expression data alone, used with the cascade Deep Forest classifier, achieves accuracy comparable to other techniques with higher computational performance: the recorded time is about 5 s for 10 subtypes and 7 s for 5 subtypes.
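
The cascade idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn forests as the ensemble members, uses a synthetic imbalanced 5-class dataset from `make_classification` as a stand-in for gene expression data (no METABRIC data is used), and the helper names `fit_cascade` and `predict_cascade`, the tree counts, and the stopping rule are all illustrative choices. Each level passes its class-probability vectors, concatenated with the original features, to the next level, and the cascade stops growing when validation accuracy no longer improves.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict, train_test_split


def fit_cascade(X_train, y_train, X_val, y_val, max_levels=4, n_trees=50, seed=0):
    """Grow cascade levels until validation accuracy stops improving."""
    levels, history = [], []
    aug_train, aug_val = X_train, X_val
    for depth in range(max_levels):
        forests = [
            RandomForestClassifier(n_estimators=n_trees, random_state=seed + depth),
            ExtraTreesClassifier(n_estimators=n_trees, random_state=seed + depth),
        ]
        train_vecs, val_vecs = [], []
        for forest in forests:
            # Out-of-fold class probabilities: no training sample is scored
            # by a model that saw it during fitting.
            train_vecs.append(
                cross_val_predict(forest, aug_train, y_train, cv=3, method="predict_proba")
            )
            forest.fit(aug_train, y_train)
            val_vecs.append(forest.predict_proba(aug_val))
        # Level prediction: average the forests' class vectors, take the argmax.
        val_pred = forests[0].classes_[np.mean(val_vecs, axis=0).argmax(axis=1)]
        acc = float(np.mean(val_pred == y_val))
        if history and acc <= max(history):
            break  # no validation gain: discard this level and stop growing
        levels.append(forests)
        history.append(acc)
        # The next level sees the original features plus this level's class vectors.
        aug_train = np.hstack([X_train] + train_vecs)
        aug_val = np.hstack([X_val] + val_vecs)
    return levels, history


def predict_cascade(levels, X):
    """Push samples through every fitted level and average the last class vectors."""
    aug, vecs = X, []
    for forests in levels:
        vecs = [forest.predict_proba(aug) for forest in forests]
        aug = np.hstack([X] + vecs)
    return levels[-1][0].classes_[np.mean(vecs, axis=0).argmax(axis=1)]


# Synthetic, imbalanced 5-class data (an assumption made for illustration only).
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=10, n_classes=5,
    weights=[0.4, 0.3, 0.15, 0.1, 0.05], random_state=0,
)
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0
)

levels, history = fit_cascade(X_train, y_train, X_val, y_val)
test_pred = predict_cascade(levels, X_test)
test_acc = float(np.mean(test_pred == y_test))
print(f"levels kept: {len(levels)}, validation accuracy per level: {history}")
print(f"test accuracy: {test_acc:.3f}")
```

Pairing a random forest with an extremely randomized forest at each level follows the usual Deep Forest motivation of encouraging diversity among ensemble members. The numbers this sketch prints come from synthetic data and are not comparable to the accuracies or timings reported in the abstract.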
