Article

Applications of random forest feature selection for fine-scale genetic population assignment

Journal

EVOLUTIONARY APPLICATIONS
Volume 11, Issue 2, Pages 153-165

Publisher

WILEY
DOI: 10.1111/eva.12524

Keywords

conservation genetics; fisheries management; individual assignment; random forest; SNP selection

Funding

  1. Natural Sciences and Engineering Research Council of Canada (NSERC)
  2. Nova Scotia Graduate Scholarship (NSGS)
  3. Canada Graduate Scholarship (CGS-M)
  4. Labrador Institute (Atlantic Canada Opportunities Agency)
  5. Labrador Institute (Department of Business, Tourism, Culture and Rural Development)
  6. Olin Fellowships (Atlantic Salmon Federation)

Genetic population assignment used to inform wildlife management and conservation efforts requires panels of highly informative genetic markers and sensitive assignment tests. We explored the utility of machine learning algorithms (random forest, regularized random forest, and guided regularized random forest) compared with F_ST ranking for selecting single nucleotide polymorphisms (SNPs) for fine-scale population assignment. We applied these methods to an unpublished SNP data set for Atlantic salmon (Salmo salar) and a published SNP data set for Alaskan Chinook salmon (Oncorhynchus tshawytscha). In each species, we used each method to create panels of 50-700 markers and identified the minimum panel size required to obtain a self-assignment accuracy of at least 90%. Panels of SNPs identified using random forest-based methods performed up to 7.8 and 11.2 percentage points better than F_ST-selected panels of similar size for the Atlantic salmon and Chinook salmon data, respectively. Self-assignment accuracy of at least 90% was obtained with panels of 670 and 384 SNPs for the Atlantic salmon and Chinook salmon data sets, respectively, a level of accuracy never reached for these species using F_ST-selected panels. Our results demonstrate a role for machine learning approaches in marker selection across large genomic data sets to improve assignment for management and conservation of exploited populations.
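
The F_ST-ranking baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes biallelic loci, equally weighted populations, and a simplified Nei-style estimator, (H_T - H_S) / H_T, which may differ from the estimator used in the paper. The function names and the toy allele frequencies are hypothetical.

```python
from statistics import mean


def fst_per_locus(freqs):
    """Simplified Nei-style F_ST for one biallelic locus.

    freqs: reference-allele frequency in each population
    (populations are weighted equally, an assumption of this sketch).
    """
    p_bar = mean(freqs)
    h_t = 2 * p_bar * (1 - p_bar)               # expected heterozygosity of the pooled sample
    h_s = mean(2 * p * (1 - p) for p in freqs)  # mean within-population heterozygosity
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def top_k_by_fst(allele_freqs, k):
    """Return the names of the k loci with the highest F_ST.

    allele_freqs: dict mapping locus name -> list of per-population frequencies.
    """
    ranked = sorted(
        allele_freqs,
        key=lambda locus: fst_per_locus(allele_freqs[locus]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: three SNPs scored in two populations.
toy = {
    "snp_a": [0.10, 0.90],  # strongly differentiated
    "snp_b": [0.50, 0.50],  # no differentiation
    "snp_c": [0.40, 0.60],  # weakly differentiated
}
panel = top_k_by_fst(toy, 2)
print(panel)
```

A ranking like this scores each locus independently. The random forest-based methods compared in the abstract instead rank markers by their contribution to a classifier that assigns individuals to populations.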
