Article

ddRADseq-mediated detection of genetic variants in sugarcane

Journal

PLANT MOLECULAR BIOLOGY
Volume 111, Issue 1-2, Pages 205-219

Publisher

SPRINGER
DOI: 10.1007/s11103-022-01322-4

Keywords

Genotyping by sequencing; Single nucleotide polymorphism; Saccharum hybrids; Polyploid genome; Sugarcane sequencing

The study optimized SNP identification in sugarcane and recommends longer, paired-end reads, medium sequencing coverage, and the Illumina NovaSeq6000 platform. Functional analysis showed that more than half of the SNPs fell within regulatory regions. The protocol proved robust when replicated genotypes were analysed.
Key message: The article presents an optimization of the key parameters for identifying SNPs in sugarcane using a GBS protocol on two Illumina platforms, NextSeq and NovaSeq.

Sugarcane (Saccharum sp.), a feedstock known worldwide for sugar, bioethanol, and energy production, has an extremely complex genome that is highly polyploid and aneuploid. A double-digestion restriction site-associated DNA sequencing (ddRADseq) protocol was tested in four commercial sugarcane hybrids and one high-fibre biotype for the detection of single nucleotide polymorphisms (SNPs). In this work we tested two Illumina sequencing platforms, two read sizes (70 vs. 150 bp), two levels of sequencing coverage per individual (medium and high), and single-end versus paired-end reads. We also explored different variant calling strategies (with and without a reference genome) and filtering schemes [combining two minor allele frequencies (MAFs) with three depth-of-coverage thresholds]. For the discovery of a large number of novel SNPs in sugarcane, we recommend longer, paired-end reads, medium sequencing coverage per individual and the Illumina NovaSeq6000 platform for a cost-effective approach, together with filtering at the lower MAF and the higher depth-of-coverage thresholds. Although the de novo analysis retrieved more SNPs, the reference-based method allows downstream characterization of variants. For the two best-performing matrices, the number of SNPs per chromosome correlated positively with chromosome length, demonstrating the presence of variants throughout the genome. Multivariate comparisons with both matrices showed closer relationships among the commercial hybrids than between the hybrids and the high-fibre biotype. Functional analysis of the SNPs demonstrated that more than half of them fell within regulatory regions, whereas the remainder affected coding, intergenic and intronic regions. Allelic distance values were lower than 0.07 when two replicated genotypes were analysed, confirming the robustness of the protocol.
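
The filtering scheme described above (two MAFs combined with three depth-of-coverage thresholds, giving six SNP matrices) can be illustrated with a minimal Python sketch. The abstract does not state the threshold values, so the numbers below are placeholders, and the SNP records, function names and the diploid-style MAF calculation are hypothetical simplifications, not the authors' pipeline.

```python
from itertools import product

# Hypothetical toy SNP records: (snp_id, alternate-allele frequency, read depth).
snps = [
    ("snp1", 0.02, 12),
    ("snp2", 0.08, 25),
    ("snp3", 0.30, 8),
    ("snp4", 0.75, 40),
    ("snp5", 0.50, 60),
]

# Placeholder thresholds, NOT the values used in the study.
MAF_THRESHOLDS = (0.05, 0.10)
DEPTH_THRESHOLDS = (10, 20, 30)


def minor_allele_frequency(alt_freq):
    """Return the frequency of the less common of two alleles."""
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(records, min_maf, min_depth):
    """Keep the ids of SNPs whose MAF and depth both meet the thresholds."""
    return [
        snp_id
        for snp_id, alt_freq, depth in records
        if minor_allele_frequency(alt_freq) >= min_maf and depth >= min_depth
    ]


def run_filter_grid(records, mafs=MAF_THRESHOLDS, depths=DEPTH_THRESHOLDS):
    """Return {(min_maf, min_depth): [snp ids]} for every threshold pair."""
    return {
        (maf, depth): filter_snps(records, maf, depth)
        for maf, depth in product(mafs, depths)
    }


grid = run_filter_grid(snps)
for (maf, depth), kept in grid.items():
    print(f"MAF>={maf:.2f} depth>={depth}: {len(kept)} SNPs retained")
```

Each key of `grid` corresponds to one filtered SNP matrix, so comparing the list lengths shows how many SNPs survive under each MAF and depth combination.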
