Article, Proceedings Paper

Complementary feature selection from alternative splicing events and gene expression for phenotype prediction

Journal

BIOINFORMATICS
Volume 32, Issue 17, Pages 421-429

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btw430

Funding

  1. NCI NIH HHS [P30 CA138313] Funding Source: Medline
  2. NINDS NIH HHS [R01 NS085161] Funding Source: Medline

Motivation: A central task of bioinformatics is to develop sensitive and specific means of providing medical prognoses from biomarker patterns. Common methods for predicting phenotypes from RNA-Seq datasets use machine learning algorithms trained on gene expression. Isoforms generated by alternative splicing, however, may provide a novel and complementary set of transcripts for phenotype prediction. Because of the numerous alternative splicing patterns, there are significantly more isoforms than genes, which creates a prioritization problem for many machine learning algorithms. This study identifies the empirically optimal methods of transcript quantification, feature engineering and filtering, using phenotype prediction accuracy as the metric. It also analyzes the complementary nature of gene and isoform data and examines the feasibility of identifying isoforms as biomarker candidates.

Results: Isoform features are complementary to gene features, providing non-redundant information and enhanced predictive power when prioritized and filtered. A univariate filtering algorithm, which selects up to the N highest-ranking features for phenotype prediction, is described and evaluated in this study. An empirical comparison of pipelines for isoform quantification is reported, based on cross-validation prediction tests with datasets from human non-small cell lung cancer (NSCLC) patients, human patients with chronic obstructive pulmonary disease (COPD) and amyotrophic lateral sclerosis (ALS) transgenic mice, each including samples of diseased and non-diseased phenotypes.
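The abstract does not say how the univariate filter ranks features, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a two-class phenotype (1 = diseased, 0 = non-diseased), scores each feature independently by the absolute value of Welch's t-statistic, and keeps at most N features whose score exceeds a cutoff. The names `welch_t`, `select_top_n` and `min_abs_t`, the choice of statistic and the toy data are all hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples (each needs at least 2 values)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        # Zero variance in both groups: uninformative if the means agree,
        # otherwise the feature separates the groups perfectly.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def select_top_n(rows, labels, n, min_abs_t=0.0):
    """Return indices of up to n features, ranked by |t| in descending order.

    rows      : list of samples, each a list of feature values
                (gene and isoform features can simply share the columns)
    labels    : 1 for diseased, 0 for non-diseased (assumed encoding)
    min_abs_t : hypothetical cutoff; features scoring at or below it are dropped
    """
    scored = []
    for j in range(len(rows[0])):
        case = [r[j] for r, lab in zip(rows, labels) if lab == 1]
        ctrl = [r[j] for r, lab in zip(rows, labels) if lab == 0]
        score = abs(welch_t(case, ctrl))
        if score > min_abs_t:
            scored.append((score, j))
    # Highest score first; ties broken by column index for determinism.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scored[:n]]


# Toy data (invented for illustration): column 0 differs strongly between
# phenotypes, column 1 is identical in both, column 2 differs only slightly.
rows = [
    [5.0, 1.0, 3.0],
    [5.2, 1.1, 2.0],
    [4.9, 0.9, 3.1],
    [1.0, 1.0, 2.9],
    [1.2, 1.1, 2.1],
    [0.9, 0.9, 3.0],
]
labels = [1, 1, 1, 0, 0, 0]

print(select_top_n(rows, labels, n=2))
```

When a filter like this is combined with cross-validation, it is generally safer to run the selection on the training portion of each fold only, so that the held-out samples do not influence which features are kept.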
