Article

Protein function prediction is improved by creating synthetic feature samples with generative adversarial networks

Journal

NATURE MACHINE INTELLIGENCE
Volume 2, Issue 9, Pages 540-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s42256-020-0222-1

Funding

  1. Biotechnology and Biological Sciences Research Council [BB/L002817/1]
  2. European Research Council Advanced Grant 'ProCovar' [695558]
  3. Francis Crick Institute
  4. Cancer Research UK [FC001002]
  5. UK Medical Research Council [FC001002]
  6. Wellcome Trust [FC001002]

Protein function prediction is a challenging but important task in bioinformatics. Many prediction methods have been developed, but they remain limited by the small number of available training samples. It is therefore valuable to develop a data augmentation method that generates high-quality synthetic samples to further improve the accuracy of prediction methods. In this work, we propose a novel generative adversarial network-based method, FFPred-GAN, which accurately learns the high-dimensional distributions of protein sequence-based biophysical features and generates high-quality synthetic protein feature samples. The experimental results suggest that augmenting the original training protein feature samples with the synthetic samples improves prediction accuracy for all three domains of Gene Ontology.

Training machine learning models to predict the function of proteins is limited by the small amount of labelled training data available. Training can be improved by employing generative adversarial networks to generate additional synthetic protein samples.
