dorsal/arxiv
DNA Sequence Evolution with Neighbor-Dependent Mutation
| Authors | Peter F. Arndt, Christopher B. Burge, Terence Hwa |
|---|---|
| Categories | physics.bio-ph, cond-mat.stat-mech, physics.comp-ph, q-bio.GN |
| ArXiv ID | physics/0112029 |
| URL | https://arxiv.org/abs/physics/0112029 |
| Journal | RECOMB 2002, Proceedings of the 6th Annual International Conference on Computational Biology (2002) p. 32-38 |
Abstract
We introduce a model of DNA sequence evolution which can account for biases in mutation rates that depend on the identity of the neighboring bases. An analytic solution for this class of non-equilibrium models is developed by adopting well-known methods of nonlinear dynamics. Results are presented for the CpG-methylation-deamination process which dominates point substitutions in vertebrates. The dinucleotide frequencies generated by the model (using empirically obtained mutation rates) match the overall pattern observed in non-coding DNA. A web-based tool has been constructed to compute single- and dinucleotide frequencies for arbitrary neighbor-dependent mutation rates. Also provided is the backward procedure to infer the mutation rates using maximum likelihood analysis given the observed single- and dinucleotide frequencies. Reasonable estimates of the mutation rates can be obtained very efficiently, using generic non-coding DNA sequences as input, after masking out long homonucleotide subsequences. Our method is much more convenient and versatile to use than the traditional method of deducing mutation rates by counting mutation events in carefully chosen sequences. More generally, our approach provides a more realistic but still tractable description of non-coding genomic DNA, and may be used as a null model for various sequence analysis applications.
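The inference procedure described in the abstract takes observed single- and dinucleotide frequencies as input, computed after masking long homonucleotide runs. The following is a minimal Python sketch of that preprocessing step only. It is not the authors' web tool or their maximum likelihood code. The function names, the run-length threshold `min_run=10`, and the choice to mask with `N` are all assumptions made for illustration.

```python
import re
from collections import Counter

BASES = "ACGT"

def mask_homonucleotide_runs(seq, min_run=10):
    """Replace runs of a single base that are at least min_run long with 'N'.

    The threshold of 10 is a hypothetical default, not a value from the paper.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_run - 1))
    return pattern.sub(lambda m: "N" * len(m.group(0)), seq.upper())

def frequencies(seq):
    """Return (single-base, dinucleotide) relative frequencies.

    Positions containing anything other than A, C, G, T are skipped,
    so masked regions contribute to neither count.
    """
    single = Counter(b for b in seq if b in BASES)
    pairs = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n1, n2 = sum(single.values()), sum(pairs.values())
    f1 = {b: single[b] / n1 for b in BASES} if n1 else {}
    f2 = {a + b: pairs[a + b] / n2 for a in BASES for b in BASES} if n2 else {}
    return f1, f2

def cpg_ratio(f1, f2):
    """Observed CpG frequency divided by the product f(C) * f(G)."""
    return f2["CG"] / (f1["C"] * f1["G"])

# Toy input: a short repeat, a 12-base poly-A run, and a CpG-rich tail.
example = "ACGT" * 3 + "A" * 12 + "CGCG"
masked = mask_homonucleotide_runs(example)
f1, f2 = frequencies(masked)
```

A `cpg_ratio` well below 1 on real vertebrate non-coding DNA would be consistent with the CpG depletion that the methylation-deamination process in the abstract is meant to explain. The toy sequence above is far too short to show that effect.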
{
"annotation_id": "5a946883-802c-41cb-921a-30ce07dc572f",
"date_created": "2026-03-02T18:00:39.219000Z",
"date_modified": "2026-03-02T18:00:39.219000Z",
"file_hash": "56e742e851e6c800c2e1220a8e8b677c1403acdc3bdca3f2745819d71e332500",
"private": false,
"record": {
"abstract": "We introduce a model of DNA sequence evolution which can account for biases\nin mutation rates that depend on the identity of the neighboring bases. An\nanalytic solution for this class of non-equilibrium models is developed by\nadopting well-known methods of nonlinear dynamics. Results are presented for\nthe CpG-methylation-deamination process which dominates point substitutions in\nvertebrates. The dinucleotide frequencies generated by the model (using\nempirically obtained mutation rates) match the overall pattern observed in\nnon-coding DNA. A web-based tool has been constructed to compute single- and\ndinucleotide frequencies for arbitrary neighbor-dependent mutation rates.\nAlsoprovided is the backward procedure to infer the mutation rates using\nmaximum likelihood analysis given the observed single- and dinucleotide\nfrequencies. Reasonable estimates of the mutation rates can be obtained very\nefficiently, using generic non-coding DNA sequences as input, after masking\noutlong homonucleotide subsequences. Our method is much more convenient and\nversatile to use than the traditional method of deducing mutation rates by\ncounting mutation events in carefully chosen sequences. More generally, our\napproach provides a more realistic but still tractable description of\nnon-coding genomic DNA, and may be used as a null model for various sequence\nanalysis applications.",
"arxiv_id": "physics/0112029",
"authors": [
"Peter F. Arndt",
"Christopher B. Burge",
"Terence Hwa"
],
"categories": [
"physics.bio-ph",
"cond-mat.stat-mech",
"physics.comp-ph",
"q-bio.GN"
],
"journal_ref": "RECOMB 2002, Proceedings of the 6th Annual International\n Conference on Computational Biology (2002) p. 32-38",
"title": "DNA Sequence Evolution with Neighbor-Dependent Mutation",
"url": "https://arxiv.org/abs/physics/0112029"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "c8c4026c-254f-42c1-9f03-a156d96d6b76",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}