Article

PrismEXP: gene annotation prediction from stratified gene-gene co-expression matrices

Journal

PEERJ
Volume 11

Publisher

PEERJ INC
DOI: 10.7717/peerj.14927

Keywords

Transcriptomics; Gene expression; Gene function predictions; RNA-seq; Unsupervised learning; Druggable genome

This study introduces a method called PrismEXP for improved gene annotation predictions based on RNA-seq gene-gene co-expression data. Using ARCHS4 data, PrismEXP is able to predict various gene annotations including pathway membership, Gene Ontology terms, and human and mouse phenotypes. The predictions made by PrismEXP outperform the global cross-tissue co-expression correlation matrix approach in all tested domains, and training using one annotation domain can be used to predict annotations in other domains.
Background: Gene-gene co-expression correlations measured by mRNA-sequencing (RNA-seq) can be used to predict gene annotations based on the co-variance structure within these data. In our prior work, we showed that uniformly aligned RNA-seq co-expression data from thousands of diverse studies is highly predictive of both gene annotations and protein-protein interactions. However, the performance of the predictions varies depending on whether the gene annotations and interactions are cell type and tissue specific or agnostic. Tissue and cell type-specific gene-gene co-expression data can be useful for making more accurate predictions because many genes perform their functions in unique ways in different cellular contexts. However, identifying the optimal tissues and cell types to partition the global gene-gene co-expression matrix is challenging.

Results: Here we introduce and validate an approach called PRediction of gene Insights from Stratified Mammalian gene co-EXPression (PrismEXP) for improved gene annotation predictions based on RNA-seq gene-gene co-expression data. Using uniformly aligned data from ARCHS4, we apply PrismEXP to predict a wide variety of gene annotations including pathway membership, Gene Ontology terms, as well as human and mouse phenotypes. Predictions made with PrismEXP outperform predictions made with the global cross-tissue co-expression correlation matrix approach on all tested domains, and training using one annotation domain can be used to predict annotations in other domains.

Conclusions: By demonstrating the utility of PrismEXP predictions in multiple use cases, we show how PrismEXP can be used to enhance unsupervised machine learning methods to better understand the roles of understudied genes and proteins. To make PrismEXP accessible, it is provided via a user-friendly web interface, a Python package, and an Appyter.

Availability: The PrismEXP web-based application, with pre-computed PrismEXP predictions, is available from https://maayanlab.cloud/prismexp; PrismEXP is also available as an Appyter at https://appyters.maayanlab.cloud/PrismEXP/ and as a Python package at https://github.com/maayanlab/prismexp.
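
The abstract does not spell out how the stratified matrices are computed or combined, so the following is only a toy sketch of the general idea it describes: scoring a gene against a gene set by its average co-expression with the set's members, once across all samples and once within each sample stratum. It is not the PrismEXP package API. The function names, the expression values, and the "liver"/"brain" strata are all hypothetical and chosen for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def set_affinity(expr, gene, gene_set, samples):
    """Mean correlation between `gene` and the other members of `gene_set`,
    computed only over the sample indices in `samples`."""
    others = [g for g in gene_set if g != gene]
    if not others:
        return 0.0
    x = [expr[gene][s] for s in samples]
    total = sum(pearson(x, [expr[g][s] for s in samples]) for g in others)
    return total / len(others)

def stratified_features(expr, gene, gene_set, strata):
    """One affinity score per stratum (hypothetical feature layout)."""
    return {name: set_affinity(expr, gene, gene_set, samples)
            for name, samples in strata.items()}

# Hypothetical toy data: 3 genes measured in 6 samples.
expr = {
    "A": [1, 2, 3, 1, 2, 3],
    "B": [2, 4, 6, 6, 4, 2],
    "C": [1, 2, 3, 3, 2, 1],
}
# Hypothetical strata: sample indices per context, plus the unstratified case.
strata = {
    "global": [0, 1, 2, 3, 4, 5],
    "liver": [0, 1, 2],
    "brain": [3, 4, 5],
}

features = stratified_features(expr, "A", {"B", "C"}, strata)
for name, score in features.items():
    print(name, round(score, 3))
```

In this toy case gene A tracks the set perfectly in one stratum and inversely in the other, so the global score averages out to zero while the per-stratum scores stay informative. Per-stratum scores like these could serve as input features to a trained model; that step is not shown here.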
