Article

Comparative analysis of genome sequences of the two cultivated tetraploid cottons, Gossypium hirsutum (L.) and G. barbadense (L.)

Journal

INDUSTRIAL CROPS AND PRODUCTS
Volume 196

Publisher

ELSEVIER
DOI: 10.1016/j.indcrop.2023.116471

Keywords

Cotton genome; Genome assembly quality; Comparative genomic; Gene index; Centromeric region

With advancements in sequencing technology and high-performance computing systems, sequencing and assembling complex genomes has become easier. However, the assembly quality of cotton genomes can vary, posing challenges when selecting appropriate genomes for genetic analysis and comparing results. This study comprehensively evaluates and compares the assembly accuracy, completeness, and contiguity of multiple versions of two cultivated cotton species. Structural errors introduced during genome assembly are identified, and gene relationships between annotations from multiple genomes are defined. The results and resources provided by this study contribute to the field of cotton genomics.
With innovations in sequencing technology and the progress of high-performance computing systems, it is now relatively straightforward to sequence and assemble complex genomes. Many genomes from multiple cotton species have been released in recent years, with the highly homozygous standard genetic lines of two cultivated allotetraploid cottons, i.e., Gossypium hirsutum TM-1 and G. barbadense 3-79, assembled multiple times by different research groups using diverse sequencing technologies. The assembly quality among these genomes is variable, even between multiple accessions or versions of the same species, which can generate both confusion in choosing the appropriate genome for genetic analysis and obstacles when comparing results among the different reference genomes. Accordingly, an assessment of the many cotton genome sequences is necessary to facilitate both choice of genome sequence and comparisons between different versions or species. Here we comprehensively assess and compare genome assembly accuracy, completeness, and contiguity for nine G. hirsutum assemblies and four G. barbadense assemblies using multiple analysis strategies with the same criteria. We identify centromeric regions and several large-scale inversions among genomes from the same accession, indicating structural errors introduced during sequence ordering and orientation in G. hirsutum and G. barbadense genome assembly. Gene relationships between annotations from multiple genomes are defined within and across species, and the results are available at the Cotton Paralogs Groups Search website (https://ihope.shinyapps.io/cotton-Paralogs/), a convenient resource for converting gene IDs and comparing annotations between different genome versions.
This study comprehensively assesses and compares assembly quality among multiple versions of the two cultivated tetraploid cotton species with different assembly strategies, illustrating the challenges of sequencing and assembling complex genomes and providing a resource for cotton genomics.
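
The abstract does not say which statistics were used to measure contiguity. As a rough illustration only, N50 is one widely reported contiguity measure: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The sketch below computes it for a list of sequence lengths; the function name `n50` and the toy lengths are assumptions made for illustration and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper only; not the authors' evaluation pipeline.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length until
    # at least half of the total assembly size is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly: seven sequences totalling 300 units.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Comparing such a value across assemblies of the same accession gives a quick, if coarse, view of contiguity; it says nothing about accuracy or completeness, which the study assesses separately.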
