Article

Factorial study of the RNA-seq computational workflow identifies biases as technical gene signatures

Journal

NAR GENOMICS AND BIOINFORMATICS
Volume 2, Issue 2

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nargab/lqaa043

Funding

  1. Natural Sciences and Engineering Research Council of Canada (NSERC) [RGPIN-2018-05412]
  2. FRQNT
  3. NSERC
  4. Fonds de Recherche du Quebec Sante (FRQS)

Abstract

RNA-seq is a modular experimental and computational approach that aims to identify and quantify RNA molecules. The modularity of RNA-seq technology allows the protocol to be adapted to explore RNA biology in new ways, but it also makes methodological thoroughness more important. Liberty of approach comes with the responsibility of choices, and such choices must be informed. Here, we present an approach that identifies gene group-specific quantification biases in current RNA-seq software and references by processing datasets with diverse RNA-seq computational pipelines and decomposing the resulting expression datasets with independent component analysis, a matrix factorization method. By exploring the RNA-seq pipeline with this systemic approach, we identify genome annotation as a design choice that affects quantification results to the same extent as the choice of aligner and quantifier. We also show that the choices in RNA-seq methodology are not independent: genome annotations interact with quantification software. Genes were mainly affected by differences in their sequence, by overlapping genes and by genes with similar sequences. Our approach offers an explanation for the observed biases by identifying the common features that the software and references use differently, thereby providing leads for improving RNA-seq methodology.
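
To make the decomposition step concrete, the following is a minimal sketch of independent component analysis applied to an expression matrix. It is not the authors' implementation: the matrix layout (rows are runs, meaning one sample processed by one pipeline, and columns are genes), the simulated data, the mixing values and the function names `fast_ica` and `sym_decorrelate` are all assumptions made for illustration. The algorithm shown is a plain symmetric FastICA with a tanh nonlinearity, written with NumPy only.

```python
import numpy as np


def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W, which makes the rows of W orthonormal."""
    eigval, eigvec = np.linalg.eigh(W @ W.T)
    return eigvec @ np.diag(1.0 / np.sqrt(eigval)) @ eigvec.T @ W


def fast_ica(X, n_components, n_iter=500, tol=1e-8, seed=0):
    """Symmetric FastICA (tanh nonlinearity).

    X has shape (n_runs, n_genes). Returns the component matrix S with
    shape (n_components, n_genes) and the estimated mixing matrix with
    shape (n_runs, n_components).
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=1, keepdims=True)          # centre each run

    # Whiten: project onto the leading eigenvectors of the covariance
    # and rescale so the projected data has identity covariance.
    cov = Xc @ Xc.T / Xc.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    order = np.argsort(eigval)[::-1][:n_components]
    K = (eigvec[:, order] / np.sqrt(eigval[order])).T
    Z = K @ Xc

    W = sym_decorrelate(rng.standard_normal((n_components, n_components)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        G_prime = 1.0 - G ** 2
        # Fixed-point update: E[g(y) z^T] - E[g'(y)] * W, row by row.
        W_new = G @ Z.T / Z.shape[1] - G_prime.mean(axis=1)[:, None] * W
        W_new = sym_decorrelate(W_new)
        # Converged when every row is (anti)parallel to its previous value.
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break

    S = W @ Z
    mixing = np.linalg.pinv(W @ K)
    return S, mixing


# Simulated example (hypothetical data, not from the paper): two hidden
# gene signatures shared in different proportions by six runs.
rng = np.random.default_rng(42)
n_genes = 3000
true_signatures = rng.laplace(size=(2, n_genes))
true_mixing = np.array([
    [1.0, 0.2],
    [1.0, 0.3],
    [0.9, 0.8],
    [1.0, 0.9],
    [0.4, 1.0],
    [0.5, 0.2],
])
expression = true_mixing @ true_signatures
expression += 0.01 * rng.standard_normal(expression.shape)

signatures, run_weights = fast_ica(expression, n_components=2)

# Genes with the largest absolute weight in each recovered component are
# the candidates for a "signature" in this toy setting.
top_genes = np.argsort(-np.abs(signatures), axis=1)[:, :10]
```

In this layout, each row of `signatures` assigns a weight to every gene, and each column of `run_weights` indicates how strongly a run carries that component. Components whose run weights track a pipeline choice rather than a biological condition would be the ones to inspect as technical signatures. ICA recovers components only up to sign and order, so any comparison with known factors should use absolute correlations.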
