4.7 Article

Comparative analysis of genome sequences of the two cultivated tetraploid cottons, Gossypium hirsutum (L.) and G. barbadense (L.)

Journal

INDUSTRIAL CROPS AND PRODUCTS
Volume 196, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.indcrop.2023.116471

Keywords

Cotton genome; Genome assembly quality; Comparative genomics; Gene index; Centromeric region

With advancements in sequencing technology and high-performance computing systems, sequencing and assembling complex genomes has become easier. However, the assembly quality of cotton genomes can vary, posing challenges when selecting appropriate genomes for genetic analysis and comparing results. This study comprehensively evaluates and compares the assembly accuracy, completeness, and contiguity of multiple versions of two cultivated cotton species. Structural errors introduced during genome assembly are identified, and gene relationships between annotations from multiple genomes are defined. The results and resources provided by this study contribute to the field of cotton genomics.
With innovations in sequencing technology and the progress of high-performance computing systems, it is now relatively straightforward to sequence and assemble complex genomes. Many genomes from multiple cotton species have been released in recent years, with the highly homozygous standard genetic lines of two cultivated allotetraploid cottons, i.e., Gossypium hirsutum TM-1 and G. barbadense 3-79, assembled multiple times by different research groups using diverse sequencing technologies. The assembly quality among these genomes is variable, even between multiple accessions or versions of the same species, which can generate both confusion in choosing the appropriate genome for genetic analysis and obstacles when comparing results among the different reference genomes. Accordingly, an assessment of the many cotton genome sequences is necessary to facilitate both choice of genome sequence and comparisons between different versions or species. Here we comprehensively assess and compare genome assembly accuracy, completeness, and contiguity for nine G. hirsutum assemblies and four G. barbadense assemblies using multiple analysis strategies with the same criteria. We identify centromeric regions and several large-scale inversions among genomes from the same accession, indicating structural errors introduced during sequence ordering and orientation in G. hirsutum and G. barbadense genome assembly. Gene relationships between annotations from multiple genomes are defined within and across species, and the results are available at the Cotton Paralogs Groups Search website (https://ihope.shinyapps.io/cotton-Paralogs/), a convenient resource for converting gene IDs and comparing annotations between different genome versions.
This study comprehensively assesses and compares assembly quality among multiple versions of the two cultivated tetraploid cotton species with different assembly strategies, illustrating the challenges of sequencing and assembling complex genomes and providing a resource for cotton genomics.
