dorsal/arxiv
Identifying evolutionary trees and substitution parameters for the general Markov model with invariable sites
| Authors | Elizabeth S. Allman, John A. Rhodes |
|---|---|
| Categories | q-bio.PE, math.AG, math.ST, stat.TH |
| ArXiv ID | q-bio/0702050 |
| URL | https://arxiv.org/abs/q-bio/0702050 |
Abstract
The general Markov plus invariable sites (GM+I) model of biological sequence evolution is a two-class model in which an unknown proportion of sites are not allowed to change, while the remainder undergo substitutions according to a Markov process on a tree. For statistical use it is important to know if the model is identifiable; can both the tree topology and the numerical parameters be determined from a joint distribution describing sequences only at the leaves of the tree? We establish that for generic parameters both the tree and all numerical parameter values can be recovered, up to clearly understood issues of "label swapping." The method of analysis is algebraic, using phylogenetic invariants to study the variety defined by the model. Simple rational formulas, expressed in terms of determinantal ratios, are found for recovering numerical parameters describing the invariable sites.
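The two-class structure described in the abstract can be illustrated with a small sketch. It is not the authors' method and does not reproduce their determinantal formulas; it only builds a joint leaf distribution as a mixture of an invariable class and a general Markov class. The star-shaped tree, the two-state alphabet, the names `delta` (proportion of invariable sites) and `inv_dist` (state distribution of the invariable sites), and all numerical values are assumptions chosen for illustration.

```python
from itertools import product


def gm_joint(root, mats):
    """Joint leaf distribution for a general Markov model on a star tree.

    root: distribution of states at the central (root) node.
    mats: one row-stochastic transition matrix per leaf edge.
    (The star tree is an illustrative assumption, not taken from the paper.)
    """
    k = len(root)
    joint = {}
    for pattern in product(range(k), repeat=len(mats)):
        total = 0.0
        for r in range(k):  # sum over the unobserved root state
            term = root[r]
            for matrix, state in zip(mats, pattern):
                term *= matrix[r][state]
            total += term
        joint[pattern] = total
    return joint


def gmi_joint(delta, inv_dist, root, mats):
    """Mix an invariable class (weight delta) with a GM class (weight 1 - delta)."""
    joint = {}
    for pattern, p_gm in gm_joint(root, mats).items():
        # Invariable sites show the same state at every leaf.
        constant = len(set(pattern)) == 1
        p_inv = inv_dist[pattern[0]] if constant else 0.0
        joint[pattern] = delta * p_inv + (1.0 - delta) * p_gm
    return joint


# Hypothetical toy parameters (two states, three leaves).
root = [0.6, 0.4]
M = [[0.9, 0.1], [0.2, 0.8]]
joint = gmi_joint(delta=0.25, inv_dist=[0.5, 0.5], root=root, mats=[M, M, M])
```

In this sketch only the constant site patterns, such as `(0, 0, 0)`, receive mass from the invariable class; every other pattern is the GM probability scaled by `1 - delta`. The identifiability question in the abstract is whether `delta`, `inv_dist`, the tree, and the Markov parameters can be recovered from a table like `joint` alone.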
{
"annotation_id": "1108f9cc-4f75-4b14-b0e0-e9a7819faabd",
"date_created": "2026-03-02T18:01:35.570000Z",
"date_modified": "2026-03-02T18:01:35.570000Z",
"file_hash": "756d7ec018afdd1c272141ade062ed66cfbd5f48cf72c43c6c40488c2b9eda2b",
"private": false,
"record": {
"abstract": "The general Markov plus invariable sites (GM+I) model of biological sequence\nevolution is a two-class model in which an unknown proportion of sites are not\nallowed to change, while the remainder undergo substitutions according to a\nMarkov process on a tree. For statistical use it is important to know if the\nmodel is identifiable; can both the tree topology and the numerical parameters\nbe determined from a joint distribution describing sequences only at the leaves\nof the tree? We establish that for generic parameters both the tree and all\nnumerical parameter values can be recovered, up to clearly understood issues of\n`label swapping.\u0027 The method of analysis is algebraic, using phylogenetic\ninvariants to study the variety defined by the model. Simple rational formulas,\nexpressed in terms of determinantal ratios, are found for recovering numerical\nparameters describing the invariable sites.",
"arxiv_id": "q-bio/0702050",
"authors": [
"Elizabeth S. Allman",
"John A. Rhodes"
],
"categories": [
"q-bio.PE",
"math.AG",
"math.ST",
"stat.TH"
],
"title": "Identifying evolutionary trees and substitution parameters for the general Markov model with invariable sites",
"url": "https://arxiv.org/abs/q-bio/0702050"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "b4ca1bdd-cbe6-4c53-abe7-df34978eaaec",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}