dorsal/arxiv
Clustering of SNPs along a chromosome: can the neutral model be rejected?
| Authors | A. Eriksson, B. Haubold, B. Mehlig |
|---|---|
| Categories | physics.bio-ph, q-bio |
| ArXiv ID | physics/0207024 |
| URL | https://arxiv.org/abs/physics/0207024 |
Abstract
Single nucleotide polymorphisms (SNPs) often appear in clusters along the length of a chromosome. This is due to variation in local coalescent times caused by, for example, selection or recombination. Here we investigate whether recombination alone (within a neutral model) can cause statistically significant SNP clustering. We measure the extent of SNP clustering as the ratio between the variance of the number of SNPs found in bins of length $l$ and the mean number of SNPs in such bins, $\sigma^2_l/\mu_l$. For a uniform SNP distribution, $\sigma^2_l/\mu_l=1$; for clustered SNPs, $\sigma^2_l/\mu_l > 1$. Apart from the bin length, three length scales are important when accounting for SNP clustering: the mean distance between neighboring SNPs, $\Delta$; the mean length of chromosome segments with constant time to the most recent common ancestor, $\ell$; and the total length of the chromosome, $L$. We show that SNP clustering is observed if $\Delta < \ell \ll L$. Moreover, if $l\ll \ell \ll L$, clustering becomes independent of the rate of recombination. We apply our results to the analysis of SNP data sets from mice and from human chromosomes 6 and X. Of the three data sets investigated, the human X chromosome displays the most significant deviation from neutrality.
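The clustering statistic described in the abstract, $\sigma^2_l/\mu_l$, can be illustrated with a minimal Python sketch. The function name `dispersion_ratio`, its parameters, the use of the unbiased sample variance, and the choice to discard a trailing partial bin are all assumptions made for this illustration; the paper's own estimator may differ in these details.

```python
def dispersion_ratio(positions, chrom_length, bin_length):
    """Variance-to-mean ratio of SNP counts in consecutive bins.

    Hypothetical helper: `positions` are SNP coordinates in
    [0, chrom_length); the chromosome is cut into full bins of
    `bin_length`, and SNPs outside the full bins are ignored.
    """
    n_bins = int(chrom_length // bin_length)
    if n_bins < 2:
        raise ValueError("need at least two full bins")

    # Count the SNPs that fall into each bin.
    counts = [0] * n_bins
    for x in positions:
        i = int(x // bin_length)
        if 0 <= i < n_bins:
            counts[i] += 1

    mean = sum(counts) / n_bins
    if mean == 0:
        raise ValueError("no SNPs fall inside the bins")

    # Unbiased sample variance (an assumption of this sketch).
    var = sum((c - mean) ** 2 for c in counts) / (n_bins - 1)
    return var / mean


# Four SNPs packed into the first of four bins: strongly clustered.
print(dispersion_ratio([0.1, 0.2, 0.3, 0.4], 4.0, 1.0))  # 4.0
# One SNP in every bin: perfectly regular, so the variance vanishes.
print(dispersion_ratio([0.5, 1.5, 2.5, 3.5], 4.0, 1.0))  # 0.0
```

For SNP positions drawn independently and uniformly along the chromosome, the ratio should be close to 1; values well above 1 indicate clustering in the sense used in the abstract.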
{
"annotation_id": "b78c0b52-6aef-474e-aec3-a2a620646fb9",
"date_created": "2026-03-02T18:00:39.872000Z",
"date_modified": "2026-03-02T18:00:39.872000Z",
"file_hash": "2871d0c4e07d9dc2828d2ec3d03cd0d4295f30ca712307558f2cb62a96a2d040",
"private": false,
"record": {
"abstract": "Single nucleotide polymorphisms (SNPs) often appear in clusters along the\nlength of a chromosome. This is due to variation in local coalescent times\ncaused by,for example, selection or recombination. Here we investigate whether\nrecombination alone (within a neutral model) can cause statistically\nsignificant SNP clustering. We measure the extent of SNP clustering as the\nratio between the variance of SNPs found in bins of length $l$, and the mean\nnumber of SNPs in such bins, $\\sigma^2_l/\\mu_l$. For a uniform SNP distribution\n$\\sigma^2_l/\\mu_l=1$, for clustered SNPs $\\sigma^2_l/\\mu_l \u003e 1$. Apart from the\nbin length, three length scales are important when accounting for SNP\nclustering: The mean distance between neighboring SNPs, $\\Delta$, the mean\nlength of chromosome segments with constant time to the most recent common\nancestor, $\\el$, and the total length of the chromosome, $L$. We show that SNP\nclustering is observed if $\\Delta \u003c \\el \\ll L$. Moreover, if $l\\ll \\el \\ll L$,\nclustering becomes independent of the rate of recombination. We apply our\nresults to the analysis of SNP data sets from mice, and human chromosomes 6 and\nX. Of the three data sets investigated, the human X chromosome displays the\nmost significant deviation from neutrality.",
"arxiv_id": "physics/0207024",
"authors": [
"A. Eriksson",
"B. Haubold",
"B. Mehlig"
],
"categories": [
"physics.bio-ph",
"q-bio"
],
"title": "Clustering of SNPs along a chromosome: can the neutral model be rejected?",
"url": "https://arxiv.org/abs/physics/0207024"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "0fbd871b-6537-4eb2-8ac0-c449a65eff79",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}