Article

Establishment of bioinformatics pipeline for deciphering the biological complexities of fragmented sperm transcriptome

Journal

ANALYTICAL BIOCHEMISTRY
Volume 620

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.ab.2021.114141

Keywords

Bioinformatics pipeline; Fragmented transcripts; Transcriptomics; Differential gene expression; Bovine spermatozoa

Funding

  1. Indian Council of Agricultural Research, Government of India
  2. Department of Biotechnology, Government of India [PR3587]
  3. ICAR-National Fellow Project, ICAR, Ministry of Agriculture, Government of India

The study established a bioinformatics pipeline for analyzing fragmented sperm RNA, with TopHat2 identified as the most effective aligner. In the differential gene expression analysis, edgeR and limma identified the largest number of significantly differentially expressed genes with biological relevance.
Although several tools exist for analyzing transcriptome data, there is no standard pipeline for low-quality, fragmented mRNA samples, which poses a major challenge for computational molecular biologists trying to interpret such data. The present study therefore aimed to establish a bioinformatics pipeline for analyzing biologically fragmented sperm RNA. Sperm transcriptome data (2 × 75 paired-end sequencing) were generated from bulls (n = 8) classified as high-fertile (n = 4) or low-fertile (n = 4) based on fertility rate (41.52 ± 1.07 vs 36.04 ± 1.04%), and were analyzed with different bioinformatics tools for alignment, quantitation, and differential gene expression. TopHat2 was more effective than HISAT2 and STAR for sperm mRNA because of its higher exonic mapping percentage (6% vs 2%) and its ability to quantitate low-expressed genes. TopHat2 also correlated strongly and significantly with STAR (0.871, p = 0.05) and HISAT2 (0.933, p = 0.01). The combination of TopHat2 and Cufflinks quantitated more genes than the other combinations. Among the tools used for differential gene expression analysis (Cuffdiff, DESeq, DESeq2, edgeR, and limma), edgeR and limma identified the largest number of significantly differentially expressed genes (p < 0.05) with biological relevance. The concordance analysis indicated that edgeR had an edge over the other tools. edgeR also identified a higher number (9.5%) of fertility-related genes as differentially expressed between the two groups. The present study established TopHat2, Cufflinks, and edgeR as a suitable pipeline for the analysis of fragmented mRNA from bovine spermatozoa.
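
The abstract does not say how the concordance analysis across differential expression tools was carried out. As a minimal sketch of one common approach, and not the authors' method, the Python below compares the sets of genes each tool calls significant using pairwise Jaccard overlap and a per-gene count of supporting tools. The function names (`pairwise_jaccard`, `support_counts`) and the gene labels are hypothetical placeholders, not data or results from the study.

```python
from itertools import combinations


def pairwise_jaccard(de_sets):
    """Return {(tool_a, tool_b): Jaccard index} for every pair of tools."""
    scores = {}
    for a, b in combinations(sorted(de_sets), 2):
        union = de_sets[a] | de_sets[b]
        shared = de_sets[a] & de_sets[b]
        # Jaccard index: shared calls divided by all calls made by either tool.
        scores[(a, b)] = len(shared) / len(union) if union else 0.0
    return scores


def support_counts(de_sets):
    """Return {gene: number of tools that called it differentially expressed}."""
    counts = {}
    for genes in de_sets.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return counts


# Placeholder gene labels for illustration only (assumption, not study data).
de_sets = {
    "edgeR": {"geneA", "geneB", "geneC", "geneD"},
    "limma": {"geneA", "geneB", "geneC"},
    "DESeq2": {"geneA", "geneB"},
}

jaccard = pairwise_jaccard(de_sets)
support = support_counts(de_sets)

# Genes called by every tool form the strictest consensus set.
consensus = sorted(g for g, n in support.items() if n == len(de_sets))

for pair, score in jaccard.items():
    print(pair, round(score, 3))
print("consensus:", consensus)
```

In practice each set would be filled from a tool's result table after applying the same significance threshold (for example p < 0.05, as in the abstract), so that the overlaps are comparable across tools.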

