Article

A Machine Learning Classifier for Assigning Individual Patients With Systemic Sclerosis to Intrinsic Molecular Subsets

Journal

ARTHRITIS & RHEUMATOLOGY
Volume 71, Issue 10, Pages 1701-1710

Publisher

WILEY
DOI: 10.1002/art.40898

Funding

  1. Scleroderma Research Foundation
  2. Burroughs-Wellcome Big Data in the Life Sciences Training Program
  3. NIH [5T32LM012204-03]
  4. Dr. Ralph and Marian Falk Medical Research Trust

Objective: High-throughput gene expression profiling of tissue samples from patients with systemic sclerosis (SSc) has identified 4 intrinsic gene expression subsets: inflammatory, fibroproliferative, normal-like, and limited. Prior methods required agglomerative clustering of many samples. In order to classify individual patients in clinical trials or for diagnostic purposes, supervised methods that can assign single samples to molecular subsets are required. We undertook this study to introduce a novel machine learning classifier as a robust, accurate intrinsic subset predictor.

Methods: Three independent gene expression cohorts were curated and merged to create a data set covering 297 skin biopsy samples from 102 unique patients and controls, which was used to train a machine learning algorithm. We performed external validation using 3 independent SSc cohorts, including a gene expression data set generated by an independent laboratory on a different microarray platform. In total, 413 skin biopsy samples from 213 individuals were analyzed in the training and testing cohorts.

Results: Repeated cross-fold validation identified consistent and discriminative markers using multinomial elastic net, performing with an average classification accuracy of 87.1% with high sensitivity and specificity. In external validation, the classifier achieved an average accuracy of 85.4%. Reanalyzing data from a previous study, we identified subsets of patients that represent the canonical inflammatory, fibroproliferative, and normal-like subsets.

Conclusion: We developed a highly accurate classifier for SSc molecular subsets for individual patient samples. The method can be used in SSc clinical trials to identify an intrinsic subset on individual samples. Our method provides a robust data-driven approach to aid clinical decision-making and interpretation of heterogeneous molecular information in SSc patients.
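To make the idea of a multinomial elastic net evaluated by repeated cross-fold validation more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the toy data are synthetic (one informative "gene" per subset plus two noise features), the optimizer is a simple proximal gradient descent, and every name and hyperparameter (`fit`, `classify_sample`, `lam`, `alpha`, `lr`, `epochs`, the fold counts) is an assumption chosen only for illustration.

```python
import math
import random

SUBSETS = ["inflammatory", "fibroproliferative", "normal-like", "limited"]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def predict_proba(W, b, x):
    return softmax([sum(wj * xj for wj, xj in zip(Wk, x)) + bk
                    for Wk, bk in zip(W, b)])

def fit(X, y, n_classes, lam=0.05, alpha=0.5, lr=0.2, epochs=150):
    """Softmax regression with an elastic net penalty (hypothetical settings).

    alpha mixes L1 (alpha=1) and L2 (alpha=0); lam is the overall strength.
    """
    n, p = len(X), len(X[0])
    W = [[0.0] * p for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * p for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for xi, yi in zip(X, y):
            probs = predict_proba(W, b, xi)
            for k in range(n_classes):
                err = (probs[k] - (k == yi)) / n  # log-loss gradient term
                gb[k] += err
                for j in range(p):
                    gW[k][j] += err * xi[j]
        for k in range(n_classes):
            b[k] -= lr * gb[k]  # intercepts are not penalized
            for j in range(p):
                # Gradient step on log-loss + L2 part, then L1 proximal step.
                step = W[k][j] - lr * (gW[k][j] + lam * (1 - alpha) * W[k][j])
                W[k][j] = soft_threshold(step, lr * lam * alpha)
    return W, b

def classify_sample(W, b, x):
    """Assign one sample to a subset without clustering it with others."""
    probs = predict_proba(W, b, x)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SUBSETS[best], probs

def make_toy_data(rng, per_class=20, n_noise=2):
    """Synthetic stand-in for expression data; not real patient data."""
    X, y = [], []
    for k in range(len(SUBSETS)):
        for _ in range(per_class):
            row = [rng.gauss(3.0 if j == k else 0.0, 0.5)
                   for j in range(len(SUBSETS))]
            row += [rng.gauss(0.0, 0.5) for _ in range(n_noise)]
            X.append(row)
            y.append(k)
    return X, y

def repeated_cv_accuracy(X, y, rng, repeats=2, folds=3):
    """Mean held-out accuracy over `repeats` reshuffled k-fold splits."""
    accs = []
    idx = list(range(len(X)))
    for _ in range(repeats):
        rng.shuffle(idx)
        for f in range(folds):
            test = idx[f::folds]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            W, b = fit([X[i] for i in train], [y[i] for i in train], len(SUBSETS))
            hits = sum(classify_sample(W, b, X[i])[0] == SUBSETS[y[i]] for i in test)
            accs.append(hits / len(test))
    return sum(accs) / len(accs)

rng = random.Random(0)
X, y = make_toy_data(rng)
mean_acc = repeated_cv_accuracy(X, y, rng)
W, b = fit(X, y, len(SUBSETS))
label, probs = classify_sample(W, b, X[0])
print(f"mean CV accuracy: {mean_acc:.3f}")
print("single-sample call:", label)
```

The point of the sketch is the last step: once the weights are fitted on a training cohort, `classify_sample` scores a single new sample on its own, which is what distinguishes a supervised classifier from the clustering-based subset assignment described in the Objective. The L1 part of the penalty tends to drive weights of uninformative features toward zero, which is one way such a model can surface a compact set of discriminative markers.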
