Parameter Identifiability for a Profile Mixture Model of Protein Evolution

Journal

JOURNAL OF COMPUTATIONAL BIOLOGY
Volume 28, Issue 6, Pages 570-586

Publisher

MARY ANN LIEBERT, INC
DOI: 10.1089/cmb.2020.0315

Keywords

parameter identifiability; phylogenetic trees; profile mixture model

Funding

  1. National Institutes of Health under the Joint DMS/NIGMS Initiative [R01 GM117590]

The profile mixture (PM) model of protein evolution describes sequence data in which sites follow multiple related substitution processes that depend on different amino acid distributions. Using algebraic methods, the authors show that the parameters of the PM model are generically identifiable under conditions typical of empirical analyses, in particular when the tree relates 9 or more taxa and the number of profiles is less than 74.
A profile mixture (PM) model is a model of protein evolution, describing sequence data in which sites are assumed to follow many related substitution processes on a single evolutionary tree. The processes depend, in part, on different amino acid distributions, or profiles, varying over sites in aligned sequences. A fundamental question for any stochastic model, which must be answered positively to justify model-based inference, is whether the parameters are identifiable from the probability distribution they determine. Here, using algebraic methods, we show that a PM model has identifiable parameters under circumstances in which it is likely to be used for empirical analyses. In particular, for a tree relating 9 or more taxa, both the tree topology and all numerical parameters are generically identifiable when the number of profiles is less than 74.
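
To make the model structure described above concrete, the following is a minimal sketch of the site-pattern distribution a profile mixture produces. It is not the paper's code. It rests on assumptions the abstract does not state: each class uses a reversible rate matrix of the form Q[i][j] = s[i][j] * pi[j] with exchangeabilities s shared across classes, the alphabet has 4 toy states rather than 20 amino acids, and the tree is a single branch joining 2 taxa (far fewer than the 9 taxa in the identifiability result). All function names and numerical values are hypothetical.

```python
def rate_matrix(exch, profile):
    """Assumed form: Q[i][j] = exch[i][j] * profile[j] for i != j, rows sum to 0."""
    n = len(profile)
    q = [[exch[i][j] * profile[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])
    # Rescale so the expected substitution rate at stationarity is 1.
    rate = -sum(profile[i] * q[i][i] for i in range(n))
    return [[v / rate for v in row] for row in q]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(q, t, squarings=10, terms=12):
    """exp(Q t) via scaling and squaring with a truncated Taylor series."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[v * scale for v in row] for row in q]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def pm_joint(exch, profiles, weights, t):
    """Joint state distribution at two taxa separated by branch length t:
    a weighted sum, over profile classes, of pi_k[x] * exp(Q_k t)[x][y]."""
    n = len(profiles[0])
    joint = [[0.0] * n for _ in range(n)]
    for w, pi in zip(weights, profiles):
        p = expm(rate_matrix(exch, pi), t)
        for x in range(n):
            for y in range(n):
                joint[x][y] += w * pi[x] * p[x][y]
    return joint

# Hypothetical toy inputs: shared symmetric exchangeabilities, two profiles.
exch = [[0.0, 1.0, 2.0, 1.0],
        [1.0, 0.0, 1.0, 3.0],
        [2.0, 1.0, 0.0, 1.0],
        [1.0, 3.0, 1.0, 0.0]]
profiles = [[0.7, 0.1, 0.1, 0.1],
            [0.1, 0.2, 0.3, 0.4]]
weights = [0.6, 0.4]
joint = pm_joint(exch, profiles, weights, t=0.5)
```

In this sketch the identifiability question becomes whether the tree, the exchangeabilities, the profiles, and the weights can be recovered from distributions such as `joint`. The two-taxon case only illustrates how the distribution is assembled; it says nothing about when recovery is possible.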
