4.7 Review

Homoeologous gene expression and co-expression network analyses and evolutionary inference in allopolyploids

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 2, Pages 1819-1835

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbaa035

Keywords

allopolyploid; co-expression gene network; differential expression; homoeolog-specific read partitioning; RNA-seq

Funding

  1. National Science Foundation Plant Genome Research Program
  2. Cotton Incorporated


This study presents an analytical workflow, using the cotton genus (Gossypium) as an example, to evaluate bioinformatic methods at different stages of RNA-seq analysis for understanding polyploid expression evolution. The findings suggest that the EAGLE-RC and GSNAP-PolyCat quantification pipelines outperform the others tested, and that their derived expression data best represent the expected homoeolog expression and co-expression divergence. The work highlights the importance of examining homoeolog read ambiguity to avoid artifacts that can distort inferences about duplicate gene expression in polyploids.
Polyploidy is a widespread phenomenon throughout eukaryotes. Due to the coexistence of duplicated genomes, polyploids offer unique challenges for estimating gene expression levels, which is essential for understanding the massive and various forms of transcriptomic responses accompanying polyploidy. Although previous studies have explored the bioinformatics of polyploid transcriptomic profiling, the causes and consequences of inaccurate quantification of transcripts from duplicated gene copies have not been addressed. Using transcriptomic data from the cotton genus (Gossypium) as an example, we present an analytical workflow to evaluate a variety of bioinformatic method choices at different stages of RNA-seq analysis, from homoeolog expression quantification to downstream analysis used to infer key phenomena of polyploid expression evolution. In general, EAGLE-RC and GSNAP-PolyCat outperform other quantification pipelines tested, and their derived expression dataset best represents the expected homoeolog expression and co-expression divergence. The performance of co-expression network analysis was less affected by homoeolog quantification than by network construction methods, where weighted networks outperformed binary networks. By examining the extent and consequences of homoeolog read ambiguity, we illuminate the potential artifacts that may affect our understanding of duplicate gene expression, including an overestimation of homoeolog co-regulation and the incorrect inference of subgenome asymmetry in network topology. Taken together, our work points to a set of reasonable practices that we hope are broadly applicable to the evolutionary exploration of polyploids.
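
The contrast between weighted and binary co-expression networks mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the soft-threshold power `beta`, the hard `cutoff`, the use of absolute Pearson correlation, and the toy gene names and expression values are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def gene_pairs(expr):
    """All unordered gene pairs, in insertion order."""
    genes = list(expr)
    return [(g, h) for i, g in enumerate(genes) for h in genes[i + 1:]]

def weighted_adjacency(expr, beta=6):
    """Weighted network: edge weight |r|**beta (beta is an assumed value)."""
    return {(g, h): abs(pearson(expr[g], expr[h])) ** beta
            for g, h in gene_pairs(expr)}

def binary_adjacency(expr, cutoff=0.8):
    """Binary network: edge present (1) when |r| >= cutoff (assumed value)."""
    return {(g, h): int(abs(pearson(expr[g], expr[h])) >= cutoff)
            for g, h in gene_pairs(expr)}

# Hypothetical toy profiles across four samples; the _At/_Dt suffixes
# stand for two homoeologous copies of the same gene.
expr = {
    "geneA_At": [1.0, 2.0, 3.0, 4.0],
    "geneA_Dt": [2.0, 4.0, 6.0, 8.5],
    "geneB_At": [4.0, 1.0, 3.0, 2.0],
}

weighted = weighted_adjacency(expr)
binary = binary_adjacency(expr)

for pair in gene_pairs(expr):
    print(pair, round(weighted[pair], 4), binary[pair])
```

In this sketch the binary network discards every correlation below the cutoff, whereas the weighted network keeps a graded value for each pair, so weak but nonzero co-expression between homoeologs remains visible to downstream analysis.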
