dorsal/arxiv
New stopping criteria for segmenting DNA sequences
| Authors | Wentian Li |
|---|---|
| Categories | physics.bio-ph, physics.data-an, q-bio.GN |
| ArXiv ID | physics/0104026 |
| URL | https://arxiv.org/abs/physics/0104026 |
| DOI | 10.1103/PhysRevLett.86.5815 |
| Journal | Phys. Rev. Lett. 86(25):5815-5818 (2001) |
Abstract
We propose a solution on the stopping criterion in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on Bayesian Information Criterion (BIC) in the model selection framework. When this stopping criterion is applied to a left telomere sequence of yeast Saccharomyces cerevisiae and the complete genome sequence of bacterium Escherichia coli, borders of biologically meaningful units were identified (e.g. subtelomeric units, replication origin, and replication terminus), and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.
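The abstract does not give formulas, so the following is only a minimal sketch of how a BIC-based stopping rule for recursive segmentation might look. It assumes the split point is chosen by maximizing the Jensen-Shannon divergence D_JS (in nats) between the base compositions of the two halves, that a split is accepted when 2·N·D_JS exceeds a BIC penalty of K·ln N, and that the segmentation strength is the relative excess over that penalty. The value K = 4 (three extra base frequencies plus one cut position), the function names, and the `threshold` parameter are illustrative assumptions, not taken from the record.

```python
import math
from collections import Counter

# Assumption: splitting one segment into two adds 3 free base frequencies
# plus 1 cut position, so the BIC penalty counts 4 extra parameters.
EXTRA_PARAMS = 4


def entropy(counts, n):
    """Shannon entropy (in nats) of the composition given by counts over n symbols."""
    return -sum((c / n) * math.log(c / n) for c in counts.values() if c > 0)


def best_cut(seq):
    """Return (cut index, Jensen-Shannon divergence) of the strongest two-way split."""
    n = len(seq)
    total = Counter(seq)
    h_total = entropy(total, n)
    left = Counter()
    best_i, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left  # Counter subtraction keeps only positive counts
        d = (h_total
             - (i / n) * entropy(left, i)
             - ((n - i) / n) * entropy(right, n - i))
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


def segmentation_strength(n, d_js):
    """Relative excess of 2*N*D_JS over the assumed BIC penalty K*ln(N)."""
    penalty = EXTRA_PARAMS * math.log(n)
    return (2 * n * d_js - penalty) / penalty


def segment(seq, start=0, threshold=0.0):
    """Recursively split seq; return cut positions in original coordinates.

    threshold=0 corresponds to the plain BIC test; larger values keep only
    stronger borders and therefore yield larger domains.
    """
    n = len(seq)
    if n < 2:
        return []
    cut, d_js = best_cut(seq)
    if cut is None or segmentation_strength(n, d_js) <= threshold:
        return []
    return (segment(seq[:cut], start, threshold)
            + [start + cut]
            + segment(seq[cut:], start + cut, threshold))
```

For example, `segment("A" * 200 + "G" * 200)` returns `[200]`, while raising `threshold` far enough suppresses that border. The scan in `best_cut` is linear per segment, which is adequate for illustration but would need a more careful implementation for genome-scale input.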
{
"annotation_id": "9e1867dd-ae36-44b2-ac20-881546027d20",
"date_created": "2026-03-02T18:00:35.938000Z",
"date_modified": "2026-03-02T18:00:35.938000Z",
"file_hash": "b4e2ac89d177afd7d14020a8ee0c576785e44981c814f9d24277da8b50997781",
"private": false,
"record": {
"abstract": "We propose a solution on the stopping criterion in segmenting inhomogeneous\nDNA sequences with complex statistical patterns. This new stopping criterion is\nbased on Bayesian Information Criterion (BIC) in the model selection framework.\nWhen this stopping criterion is applied to a left telomere sequence of yeast\nSaccharomyces cerevisiae and the complete genome sequence of bacterium\nEscherichia coli, borders of biologically meaningful units were identified\n(e.g. subtelomeric units, replication origin, and replication terminus), and a\nmore reasonable number of domains was obtained. We also introduce a measure\ncalled segmentation strength which can be used to control the delineation of\nlarge domains. The relationship between the average domain size and the\nthreshold of segmentation strength is determined for several genome sequences.",
"arxiv_id": "physics/0104026",
"authors": [
"Wentian Li"
],
"categories": [
"physics.bio-ph",
"physics.data-an",
"q-bio.GN"
],
"doi": "10.1103/PhysRevLett.86.5815",
"journal_ref": "Phys. Rev. Lett. 86(25):5815-5818 (2001)",
"title": "New stopping criteria for segmenting DNA sequences",
"url": "https://arxiv.org/abs/physics/0104026"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "538de55b-6e76-4b03-9571-295740c574e9",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}