dorsal/arxiv
View SchemaA New Simulated Annealing Algorithm for the Multiple Sequence Alignment Problem: The approach of Polymers in a Random Media
| Authors | M. Hernández-Guía, R. Mulet, S. Rodríguez-Pérez |
|---|---|
| Categories | |
| ArXiv ID | q-bio/0501012 |
| URL | https://arxiv.org/abs/q-bio/0501012 |
| DOI | 10.1103/PhysRevE.72.031915 |
Abstract
We proposed a probabilistic algorithm to solve the Multiple Sequence Alignment problem. The algorithm is a Simulated Annealing (SA) that exploits the representation of the Multiple Alignment between $D$ sequences as a directed polymer in $D$ dimensions. Within this representation we can easily track the evolution in the configuration space of the alignment through local moves of low computational cost. At variance with other probabilistic algorithms proposed to solve this problem, our approach allows for the creation and deletion of gaps without extra computational cost. The algorithm was tested aligning proteins from the kinases family. When D=3 the results are consistent with those obtained using a complete algorithm. For $D>3$ where the complete algorithm fails, we show that our algorithm still converges to reasonable alignments. Moreover, we study the space of solutions obtained and show that depending on the number of sequences aligned the solutions are organized in different ways, suggesting a possible source of errors for progressive algorithms.
{
"annotation_id": "51ce7cef-5f5c-4fda-b975-7b56126574f4",
"date_created": "2026-03-02T18:01:31.538000Z",
"date_modified": "2026-03-02T18:01:31.538000Z",
"file_hash": "8201d0b1fa1fe676c4ffee6302a028d8bc5e78789c7d998bec8c4dc80d7c5979",
"private": false,
"record": {
"abstract": "We proposed a probabilistic algorithm to solve the Multiple Sequence\nAlignment problem. The algorithm is a Simulated Annealing (SA) that exploits\nthe representation of the Multiple Alignment between $D$ sequences as a\ndirected polymer in $D$ dimensions. Within this representation we can easily\ntrack the evolution in the configuration space of the alignment through local\nmoves of low computational cost. At variance with other probabilistic\nalgorithms proposed to solve this problem, our approach allows for the creation\nand deletion of gaps without extra computational cost. The algorithm was tested\naligning proteins from the kinases family. When D=3 the results are consistent\nwith those obtained using a complete algorithm. For $D\u003e3$ where the complete\nalgorithm fails, we show that our algorithm still converges to reasonable\nalignments. Moreover, we study the space of solutions obtained and show that\ndepending on the number of sequences aligned the solutions are organized in\ndifferent ways, suggesting a possible source of errors for progressive\nalgorithms.",
"arxiv_id": "q-bio/0501012",
"authors": [
"M. Hern\u00e1ndez-Gu\u00eda",
"R. Mulet",
"S. Rodr\u00edguez-P\u00e9rez"
],
"categories": [
"q-bio.GN"
],
"doi": "10.1103/PhysRevE.72.031915",
"title": "A New Simulated Annealing Algorithm for the Multiple Sequence Alignment Problem: The approach of Polymers in a Random Media",
"url": "https://arxiv.org/abs/q-bio/0501012"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "457b43f7-b0cf-49fe-874f-5766204fb31d",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}