Article

Development of an absolute assignment predictor for triple-negative breast cancer subtyping using machine learning approaches

COMPUTERS IN BIOLOGY AND MEDICINE (2021)

Journal

COMPUTERS IN BIOLOGY AND MEDICINE

Volume 129, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.compbiomed.2020.104171

Keywords

Machine learning; Prediction models; TNBC subtype; Transcriptomics data; Variable selection

Categories

Biology; Computer Science, Interdisciplinary Applications; Engineering, Biomedical; Mathematical & Computational Biology

Funding

Programme operationnel regional Fonds europeen de developpement regional (FEDER) - Fonds social europeen (FSE) Pays de la Loire [PL0015129]

Smart Summary

Recent studies have shown that TNBC can be divided into at least three subtypes, but current prediction methods are limited by batch effects and by dependence on a given dataset. This study built an absolute predictor (intra-patient diagnosis) using machine learning algorithms. A subset of indicators was selected and the best model was chosen through cross-validation; the gradient boosting (GB) model based on these indicators performed best.

Abstract

Triple-negative breast cancer (TNBC) heterogeneity represents one of the main obstacles to precision medicine for this disease. Recent concordant transcriptomics studies have shown that TNBC could be divided into at least three subtypes with potential therapeutic implications. Although a few studies have been conducted to predict TNBC subtype using transcriptomics data, the subtyping was partially sensitive and limited by batch effect and dependence on a given dataset, which may penalize the switch to routine diagnostic testing. Therefore, we sought to build an absolute predictor (i.e., intra-patient diagnosis) based on machine learning algorithms with a limited number of probes. To that end, we started by introducing binary comparisons between probes for each patient (indicators), and we based the predictive analysis on this transformed data. Probe selection first combined filter and wrapper methods for variable selection using cross-validation. We then tested three prediction models (random forest, gradient boosting [GB], and extreme gradient boosting) using the resulting optimal subset of indicators as inputs. Nested cross-validation consistently allowed us to choose the best model. The results showed that the fifty selected indicators highlighted the biological characteristics associated with each TNBC subtype. The GB model based on this subset of indicators performed better than the other models.
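The indicator transformation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition, so the comparison direction (`a < b`), the use of all probe pairs, and the names `pairwise_indicators`, `probe_A`, `probe_B`, and `probe_C` are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def pairwise_indicators(expression):
    """Map one patient's probe values to binary within-sample comparisons.

    expression: dict of probe name -> expression value for a single patient.
    Returns a dict keyed by (probe_a, probe_b) holding 1 if probe_a < probe_b,
    else 0. The comparison direction is an assumption for this sketch.
    """
    probes = sorted(expression)  # fixed order so every patient gets the same keys
    return {
        (a, b): int(expression[a] < expression[b])
        for a, b in combinations(probes, 2)
    }


# Hypothetical probe names and values, for illustration only.
patient = {"probe_A": 5.2, "probe_B": 7.9, "probe_C": 3.1}
indicators = pairwise_indicators(patient)

# A sample-wide shift and positive rescaling (a crude stand-in for a batch
# effect) leaves every indicator unchanged, because only the within-patient
# ordering of probes is used.
shifted = {probe: 2.0 * value + 10.0 for probe, value in patient.items()}
assert pairwise_indicators(shifted) == indicators
```

Because each indicator depends only on the ordering of two probes within a single sample, a new patient can be encoded without reference to any other sample, which is consistent with the "absolute" (intra-patient) framing in the abstract.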
