Journal
BIOSTATISTICS
Volume 15, Issue 3, Pages 413-426
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/biostatistics/kxt053
Keywords
Bioinformatics; Differential expression; False discovery rate; Genomics; RNA sequencing
Funding
- NIH [R01 HG005220, R01 GM105705-01]
- Silvio O. Conte Center [NIH P50 MH-094268]
- Jeffrey Leek's discretionary fund
RNA-sequencing (RNA-seq) is a flexible technology for measuring genome-wide expression that is rapidly replacing microarrays as costs become comparable. Current differential expression analysis methods for RNA-seq data fall into two broad classes: (1) methods that quantify expression within the boundaries of genes previously published in databases and (2) methods that attempt to reconstruct full-length RNA transcripts. The first class cannot discover differential expression outside of previously known genes. While the second approach does possess discovery capabilities, statistical analysis of differential expression is complicated by the ambiguity and variability incurred while assembling transcripts and estimating their abundances. Here, we propose a novel method that first identifies differentially expressed regions (DERs) of interest by assessing differential expression at each base of the genome. The method then segments the genome into regions composed of bases showing similar differential expression signal and assigns a measure of statistical significance to each region. Optionally, DERs can be annotated using a reference database of genomic features. We compare our approach with leading competitors from both current classes of differential expression methods and highlight the strengths and weaknesses of each. A software implementation of our method is available on GitHub.
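The three steps named in the abstract (a per-base statistic, segmentation into regions, and region-level significance) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the statistic, the segmentation procedure, or the significance measure, so the sketch assumes a Welch-style t-statistic per base, a simple cutoff on that statistic to define states, and a sample-label permutation null. All function names, the `cutoff` and `offset` parameters, and the toy coverage values are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance


def base_statistic(group_a, group_b, offset=1e-6):
    """Welch-style t-statistic for one base (assumed statistic, not from the paper).

    Each argument is a list of per-sample coverage values at that base.
    `offset` keeps the denominator nonzero when both groups have zero variance.
    """
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_b) - mean(group_a)) / (se + offset)


def call_state(stat, cutoff):
    """Label a base as over-expressed (1), under-expressed (-1), or unchanged (0)."""
    if stat > cutoff:
        return 1
    if stat < -cutoff:
        return -1
    return 0


def segment(states):
    """Collapse runs of identical states into (start, end, state); end is exclusive."""
    regions = []
    start = 0
    for i in range(1, len(states) + 1):
        if i == len(states) or states[i] != states[start]:
            regions.append((start, i, states[start]))
            start = i
    return regions


def find_ders(coverage_a, coverage_b, cutoff=2.0):
    """Return candidate DERs as (start, end, state, mean statistic).

    coverage_a and coverage_b are lists of samples; each sample is a list of
    per-base coverage values over the same stretch of genome.
    """
    n_bases = len(coverage_a[0])
    stats = [
        base_statistic([s[i] for s in coverage_a], [s[i] for s in coverage_b])
        for i in range(n_bases)
    ]
    states = [call_state(t, cutoff) for t in stats]
    return [
        (start, end, state, mean(stats[start:end]))
        for start, end, state in segment(states)
        if state != 0
    ]


def region_pvalues(coverage_a, coverage_b, cutoff=2.0, n_perm=200, seed=0):
    """Permutation p-value per region (assumed significance measure).

    Sample labels are shuffled, regions are re-called, and each observed
    region's mean statistic is compared with the pooled null region statistics.
    """
    rng = random.Random(seed)
    observed = find_ders(coverage_a, coverage_b, cutoff)
    samples = coverage_a + coverage_b
    n_a = len(coverage_a)
    null = []
    for _ in range(n_perm):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        null.extend(abs(r[3]) for r in find_ders(shuffled[:n_a], shuffled[n_a:], cutoff))
    results = []
    for start, end, state, avg in observed:
        exceed = sum(1 for v in null if v >= abs(avg))
        results.append((start, end, state, (exceed + 1) / (len(null) + 1)))
    return results


# Toy data: three samples per group over ten bases; group B has higher
# coverage at bases 3 to 6 only.
toy_a = [
    [10, 11, 10, 10, 11, 10, 10, 11, 10, 10],
    [11, 10, 11, 11, 10, 11, 11, 10, 11, 11],
    [10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
]
toy_b = [
    [10, 11, 10, 30, 31, 30, 31, 11, 10, 10],
    [11, 10, 11, 31, 30, 31, 30, 10, 11, 11],
    [10, 10, 10, 30, 30, 30, 30, 10, 10, 10],
]
toy_ders = find_ders(toy_a, toy_b)
toy_pvalues = region_pvalues(toy_a, toy_b)
```

On the toy data, `find_ders` reports a single over-expressed region spanning bases 3 to 6, found without reference to any gene annotation, which is the discovery property the abstract emphasizes. With only three samples per group the permutation null is coarse, so the p-values here are illustrative only.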