Journal
JOURNAL OF COMPUTATIONAL BIOLOGY
Volume 28, Issue 6, Pages 570-586
Publisher
MARY ANN LIEBERT, INC
DOI: 10.1089/cmb.2020.0315
Keywords
parameter identifiability; phylogenetic trees; profile mixture model
Funding
- National Institutes of Health under the Joint DMS/NIGMS Initiative [R01 GM117590]
The profile mixture (PM) model of protein evolution describes sequence data in which sites follow multiple related substitution processes that depend on different amino acid distributions. Using algebraic methods, the authors show that the model's parameters are identifiable in settings typical of empirical analyses: for a tree relating 9 or more taxa, the tree topology and all numerical parameters are generically identifiable when the number of profiles is less than 74.
A profile mixture (PM) model is a model of protein evolution, describing sequence data in which sites are assumed to follow many related substitution processes on a single evolutionary tree. The processes depend, in part, on different amino acid distributions, or profiles, varying over sites in aligned sequences. A fundamental question for any stochastic model, which must be answered positively to justify model-based inference, is whether the parameters are identifiable from the probability distribution they determine. Here, using algebraic methods, we show that a PM model has identifiable parameters under circumstances in which it is likely to be used for empirical analyses. In particular, for a tree relating 9 or more taxa, both the tree topology and all numerical parameters are generically identifiable when the number of profiles is less than 74.
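To make the mixture structure concrete, the following minimal sketch computes the joint distribution of states at two taxa under a toy profile mixture. It rests on several assumptions that are not taken from the paper: a 4-state alphabet instead of 20 amino acids, all exchangeabilities set equal (an F81-style process, chosen because its transition matrix has a closed form), a single branch of length `t`, and the hypothetical function names `f81_transition` and `pair_distribution`. It illustrates only how per-profile processes combine into one site-pattern distribution; it does not demonstrate the identifiability result.

```python
import math

def f81_transition(profile, t):
    """Transition matrix exp(Qt) for Q = 1*pi^T - I (assumed equal exchangeabilities)."""
    decay = math.exp(-t)
    n = len(profile)
    return [[decay * (i == j) + (1.0 - decay) * profile[j] for j in range(n)]
            for i in range(n)]

def pair_distribution(weights, profiles, t):
    """Joint state distribution at two taxa, mixed over profile classes."""
    n = len(profiles[0])
    joint = [[0.0] * n for _ in range(n)]
    for w, pi in zip(weights, profiles):
        p = f81_transition(pi, t)
        for a in range(n):
            for b in range(n):
                # class weight * stationary probability * transition probability
                joint[a][b] += w * pi[a] * p[a][b]
    return joint

# Toy example (assumed values): two profiles on a 4-state alphabet.
weights = [0.6, 0.4]
profiles = [[0.7, 0.1, 0.1, 0.1],
            [0.1, 0.1, 0.1, 0.7]]
joint = pair_distribution(weights, profiles, t=0.5)
total = sum(sum(row) for row in joint)
```

The identifiability question asks whether the weights, profiles, branch lengths, and tree can be recovered from distributions of this kind; in this sketch the observed distribution `joint` is a weighted sum over classes, which is what makes the question nontrivial.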