dorsal/arxiv
Unidentifiable divergence times in rates-across-sites models
| Authors | Steven N. Evans, Tandy Warnow |
|---|---|
| Categories | q-bio.PE, q-bio.GN |
| ArXiv ID | q-bio/0408011 |
| URL | https://arxiv.org/abs/q-bio/0408011 |
Abstract
The rates-across-sites assumption in phylogenetic inference posits that the rate matrix governing the Markovian evolution of a character on an edge of the putative phylogenetic tree is the product of a character-specific scale factor and a rate matrix that is particular to that edge. Thus, evolution follows basically the same process for all characters, except that it occurs faster for some characters than others. To allow estimation of tree topologies and edge lengths for such models, it is commonly assumed that the scale factors are not arbitrary unknown constants, but rather unobserved, independent, identically distributed draws from a member of some parametric family of distributions. A popular choice is the gamma family. We consider an example of a clock-like tree with three taxa, one unknown edge length, and a parametric family of scale factor distributions that contains the gamma family. This model has the property that, for a generic choice of unknown edge length and scale factor distribution, there is another edge length and scale factor distribution which generates data with exactly the same distribution, so that even with infinitely many data it will typically be impossible to make correct inferences about the unknown edge length.
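To make the rates-across-sites setup concrete, the following is a minimal sketch, not the authors' construction. It assumes a two-state symmetric substitution model with unit base rate (the abstract does not fix a particular rate matrix) and a gamma scale-factor distribution normalized to mean 1. All function names are hypothetical. It computes the probability that the two ends of a single edge of length `t` differ, first for a fixed scale factor `r`, then averaged over gamma-distributed scale factors using the gamma Laplace transform, with a Monte Carlo check of the closed form.

```python
import math
import random


def mismatch_prob_fixed_rate(t: float, r: float = 1.0) -> float:
    """P(edge endpoints differ) for an assumed two-state symmetric model.

    The edge rate matrix [[-1, 1], [1, -1]] is multiplied by the
    character-specific scale factor r, so the nonzero eigenvalue is -2r.
    """
    return 0.5 * (1.0 - math.exp(-2.0 * r * t))


def mismatch_prob_gamma(t: float, alpha: float) -> float:
    """Same probability averaged over r ~ Gamma(shape=alpha, scale=1/alpha).

    Uses E[exp(-s r)] = (1 + s / alpha) ** (-alpha) with s = 2t.
    The mean-1 normalization is an assumption made for this sketch.
    """
    return 0.5 * (1.0 - (1.0 + 2.0 * t / alpha) ** (-alpha))


def mismatch_prob_monte_carlo(t: float, alpha: float,
                              n: int = 100_000, seed: int = 0) -> float:
    """Estimate the gamma-averaged probability by sampling scale factors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        # random.gammavariate takes (shape, scale); scale 1/alpha gives mean 1.
        r = rng.gammavariate(alpha, 1.0 / alpha)
        total += mismatch_prob_fixed_rate(t, r)
    return total / n


if __name__ == "__main__":
    t, alpha = 0.3, 0.5
    print("fixed rate   :", mismatch_prob_fixed_rate(t))
    print("gamma, exact :", mismatch_prob_gamma(t, alpha))
    print("gamma, sample:", mismatch_prob_monte_carlo(t, alpha))
```

In this toy setting the observable quantity depends on the edge length and the scale-factor distribution only through their combined effect, which gives some intuition for why the two can be hard to separate. The paper's result concerns a three-taxon clock-like tree and a larger parametric family, which this sketch does not reproduce.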
{
"annotation_id": "5fe08384-eb38-4c86-b918-3be18b1b8011",
"date_created": "2026-03-02T18:01:31.485000Z",
"date_modified": "2026-03-02T18:01:31.485000Z",
"file_hash": "068b99cec711c95454bb42cc8ce4674d463764efca8e4a81a0b38add0aaca390",
"private": false,
"record": {
"abstract": "The rates-across-sites assumption in phylogenetic inference posits that the\nrate matrix governing the Markovian evolution of a character on an edge of the\nputative phylogenetic tree is the product of a character-specific scale factor\nand a rate matrix that is particular to that edge. Thus, evolution follows\nbasically the same process for all characters, except that it occurs faster for\nsome characters than others. To allow estimation of tree topologies and edge\nlengths for such models, it is commonly assumed that the scale factors are not\narbitrary unknown constants, but rather unobserved, independent, identically\ndistributed draws from a member of some parametric family of distributions. A\npopular choice is the gamma family. We consider an example of a clock-like tree\nwith three taxa, one unknown edge length, and a parametric family of scale\nfactor distributions that contain the gamma family. This model has the property\nthat, for a generic choice of unknown edge length and scale factor\ndistribution, there is another edge length and scale factor distribution which\ngenerates data with exactly the same distribution, so that even with infinitely\nmany data it will be typically impossible to make correct inferences about the\nunknown edge length.",
"arxiv_id": "q-bio/0408011",
"authors": [
"Steven N. Evans",
"Tandy Warnow"
],
"categories": [
"q-bio.PE",
"q-bio.GN"
],
"title": "Unidentifiable divergence times in rates-across-sites models",
"url": "https://arxiv.org/abs/q-bio/0408011"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "f6a599a1-00aa-45b8-a694-b68a8db56983",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}