Article

Family of skewed distributions associated with the gene expression and proteome evolution

Journal

SIGNAL PROCESSING
Volume 83, Issue 4, Pages 889-910

Publisher

ELSEVIER
DOI: 10.1016/S0165-1684(02)00481-4

Keywords

gene expression; protein domains; evolution; birth-death stochastic processes; Pareto distribution; Waring distribution

We study statistical distributions that appear in various genome-related phenomena, including the distribution of transcript copy numbers in the transcriptome of eukaryotic cells and the distribution of the number of proteins containing a given protein domain in the proteomes of species. We find that the empirical distributions for all studied data sets are well fitted by a family of Pareto-like distribution functions whose shape depends in a predictable manner on the sample size. Such distributions arise as limiting distributions of a Markov random process in which the birth and death intensities are linear functions of events. We also propose a novel model of the progressive evolution of a population, expressed in terms of the growth in the number of distinct components and of their links in the system, and we study the evolution of the probability distribution of these links. Estimating two unknown parameters of this model allows us to describe the progressive evolution of the number of distinct protein domain sets and of the number of proteins containing a given protein domain in the proteomes of 70 organisms with fully sequenced genomes. The model also predicts trends in the evolution of proteome complexity. Published by Elsevier Science B.V.
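To illustrate how a birth-death process with linear intensities can yield a Pareto-like limiting distribution, here is a minimal sketch. It is not the paper's model: the abstract gives no parameterization, so the rates used here (birth rate `theta * (m + a)`, death rate `m + b`, states `m = 1, 2, ...`), the parameter names `a`, `b`, `theta`, and the truncation at `max_state` are all assumptions made for illustration. With `theta = 1`, detailed balance gives stationary probabilities proportional to Γ(m + a) / Γ(m + 1 + b), which decay roughly as a power law with exponent `a - b - 1`.

```python
import math


def stationary_distribution(a, b, theta=1.0, max_state=10000):
    """Stationary probabilities p_1..p_max_state of a truncated birth-death chain.

    Assumed (hypothetical) rates, both linear in the current state m:
      birth  m -> m + 1 at rate theta * (m + a)
      death  m -> m - 1 at rate m + b   (no death out of state 1)
    Detailed balance: p_{m+1} * (m + 1 + b) = p_m * theta * (m + a).
    """
    # Accumulate log-weights to avoid underflow for large states.
    log_w = [0.0]
    for m in range(1, max_state):
        step = math.log(theta * (m + a)) - math.log(m + 1 + b)
        log_w.append(log_w[-1] + step)
    peak = max(log_w)
    weights = [math.exp(v - peak) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def tail_slope(p, m_lo, m_hi):
    """Slope of log p_m against log m between two states (1-indexed)."""
    rise = math.log(p[m_hi - 1]) - math.log(p[m_lo - 1])
    run = math.log(m_hi) - math.log(m_lo)
    return rise / run


# Illustrative parameter values only; they are not taken from the paper.
p = stationary_distribution(a=0.5, b=1.5)
slope = tail_slope(p, 1000, 5000)
print(f"log-log tail slope: {slope:.3f} (power-law exponent a - b - 1 = -2)")
```

On a log-log plot the tail of `p` is close to a straight line, which is the sense in which such a distribution is Pareto-like. Choosing `theta < 1` in this sketch would add an exponential cutoff to the tail.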