dorsal/arxiv
Subtree power analysis finds optimal species for comparative genomics
| Authors | Jon D. McAuliffe, Michael I. Jordan, Lior Pachter |
|---|---|
| Categories | q-bio.GN, q-bio.QM |
| ArXiv ID | q-bio/0412012 |
| URL | https://arxiv.org/abs/q-bio/0412012 |
Abstract
Sequence comparison across multiple organisms aids in the detection of regions under selection. However, resource limitations require a prioritization of genomes to be sequenced. This prioritization should be grounded in two considerations: the lineal scope encompassing the biological phenomena of interest, and the optimal species within that scope for detecting functional elements. We introduce a statistical framework for optimal species subset selection, based on maximizing power to detect conserved sites. In a study of vertebrate species, we show that the optimal species subset is not in general the most evolutionarily diverged subset. Our results suggest that marsupials are prime sequencing candidates.
{
"annotation_id": "4015a464-f82f-481e-9c2d-0affc7aae66d",
"date_created": "2026-03-02T18:01:31.754000Z",
"date_modified": "2026-03-02T18:01:31.754000Z",
"file_hash": "5dd7a7baab716f80280507dbee2dc04d60b9a3d7aa50a39eedbff95625342744",
"private": false,
"record": {
"abstract": "Sequence comparison across multiple organisms aids in the detection of\nregions under selection. However, resource limitations require a prioritization\nof genomes to be sequenced. This prioritization should be grounded in two\nconsiderations: the lineal scope encompassing the biological phenomena of\ninterest, and the optimal species within that scope for detecting functional\nelements. We introduce a statistical framework for optimal species subset\nselection, based on maximizing power to detect conserved sites. In a study of\nvertebrate species, we show that the optimal species subset is not in general\nthe most evolutionarily diverged subset. Our results suggest that marsupials\nare prime sequencing candidates.",
"arxiv_id": "q-bio/0412012",
"authors": [
"Jon D. McAuliffe",
"Michael I. Jordan",
"Lior Pachter"
],
"categories": [
"q-bio.GN",
"q-bio.QM"
],
"title": "Subtree power analysis finds optimal species for comparative genomics",
"url": "https://arxiv.org/abs/q-bio/0412012"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "6d75065b-538a-454e-bbc5-ab08b821fe40",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}