Article

A community effort to identify and correct mislabeled samples in proteogenomic studies

Journal

PATTERNS
Volume 2, Issue 5

Publisher

CELL PRESS
DOI: 10.1016/j.patter.2021.100245

Funding

  1. National Cancer Institute CPTAC awards [U24CA210954, U24CA210993]
  2. Cancer Prevention & Research Institutes of Texas (CPRIT) [RR160027]
  3. McNair Medical Institute at The Robert and Janice McNair Foundation
  4. Wright State University

Sample mislabeling or misannotation is a common problem in scientific research, especially in large-scale, multi-omic studies. A crowdsourced challenge was organized to identify and correct mislabels, resulting in the development of an open-source software named COSMO with high accuracy and robustness in multi-omic datasets.
Sample mislabeling or misannotation has been a long-standing problem in scientific research, particularly prevalent in large-scale, multi-omic studies due to the complexity of multi-omic workflows. There exists an urgent need for implementing quality controls to automatically screen for and correct sample mislabels or misannotations in multi-omic studies. Here, we describe a crowdsourced precisionFDA NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge, which provides a framework for systematic benchmarking and evaluation of mislabel identification and correction methods for integrative proteogenomic studies. The challenge received a large number of submissions from domestic and international data scientists, with highly variable performance observed across the submitted methods. Post-challenge collaboration between the top-performing teams and the challenge organizers has created an open-source software, COSMO, with demonstrated high accuracy and robustness in mislabeling identification and correction in simulated and real multi-omic datasets.
