dorsal/arxiv
Hamming distance geometry of a protein conformational space. Application to the clustering of a 4 ns molecular dynamics trajectory of the HIV-1 integrase catalytic core
| Authors | Cyril Laboulais, Mohammed Ouali, Marc Le Bret, Jacques Gabarro-Arpa |
|---|---|
| Categories | physics.bio-ph, physics.chem-ph, q-bio |
| ArXiv ID | physics/0110067 |
| URL | https://arxiv.org/abs/physics/0110067 |
| Journal | PROTEINS: Structure, Function, and Genetics 47, 169-179 (2002) |
Abstract
Protein structures can be encoded into binary sequences, which are used to define a Hamming distance in conformational space: the distance between two molecular conformations is the number of bits that differ between their sequences. Each bit in the sequence arises from a partition of conformational space into two halves. Thus, the information encoded in the binary sequences also characterizes the regions of conformational space visited by the system. We apply this distance and its associated geometric structures to the clustering and analysis of conformations sampled during a 4 ns molecular dynamics simulation of the HIV-1 integrase catalytic core. The cluster analysis divides the trajectory into two qualitatively different segments of 2.6 ns and 1.4 ns; the data indicate that equilibration is reached only at the end of the first segment. Part of the paper is devoted to comparing the Hamming distance with the r.m.s. deviation measure. The analysis of the cases studied so far shows that, under the same conditions, the two measures behave quite differently, and that the Hamming distance appears to be more robust than the r.m.s. deviation.
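The distance defined in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the partitions of conformational space are constructed, so the `encode` function below is purely hypothetical: it assumes each bit records which side of a hyperplane a coordinate vector falls on. The names `encode`, `hamming_distance`, and `pairwise_distances` are illustrative and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical partition: a hyperplane given by (normal vector, offset).
# The paper's actual partitions are not described in the abstract.
Partition = Tuple[Sequence[float], float]


def encode(coords: Sequence[float], partitions: Sequence[Partition]) -> List[int]:
    """Return one bit per partition: 1 if coords lie on the positive side."""
    bits = []
    for normal, offset in partitions:
        side = sum(n * c for n, c in zip(normal, coords)) + offset
        bits.append(1 if side > 0 else 0)
    return bits


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("bit sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(frames: Sequence[Sequence[int]]) -> List[List[int]]:
    """Symmetric matrix of Hamming distances between encoded conformations."""
    n = len(frames)
    return [[hamming_distance(frames[i], frames[j]) for j in range(n)]
            for i in range(n)]


# Two toy bit sequences that differ at positions 1 and 2.
a = [1, 0, 1, 1, 0]
b = [1, 1, 0, 1, 0]
print(hamming_distance(a, b))  # 2
```

A matrix such as the one returned by `pairwise_distances` is the usual input to a clustering routine. Which clustering method the authors applied to the trajectory is not stated in the abstract.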
{
"annotation_id": "02508604-f076-412c-985d-ebb59051c9ea",
"date_created": "2026-03-02T18:00:36.443000Z",
"date_modified": "2026-03-02T18:00:36.443000Z",
"file_hash": "135606551dd14b2efe51fc82eadf54a54022bc8a6a73f14e7f267a3cb88d4671",
"private": false,
"record": {
"abstract": "Protein structures can be encoded into binary sequences, these are used to\ndefine a Hamming distance in conformational space: the distance between two\ndifferent molecular conformations is the number of different bits in their\nsequences. Each bit in the sequence arises from a partition of conformational\nspace in two halves. Thus, the information encoded in the binary sequences is\nalso used to characterize the regions of conformational space visited by the\nsystem. We apply this distance and their associated geometric structures, to\nthe clustering and analysis of conformations sampled during a 4 ns molecular\ndynamics simulation of the HIV-1 integrase catalytic core. The cluster analysis\nof the simulation shows a division of the trajectory into two segments of 2.6\nand 1.4 ns length, which are qualitatively different: the data points to the\nfact that equilibration is only reached at the end of the first segment. Some\nlength of the paper is devoted to compare the Hamming distance to the r.m.s.\ndeviation measure. The analysis of the cases studied so far, shows that under\nthe same conditions the two measures behave quite differently, and that the\nHamming distance appears to be more robust than the r.m.s. deviation.",
"arxiv_id": "physics/0110067",
"authors": [
"Cyril Laboulais",
"Mohammed Ouali",
"Marc Le Bret",
"Jacques Gabarro-Arpa"
],
"categories": [
"physics.bio-ph",
"physics.chem-ph",
"q-bio"
],
"journal_ref": "PROTEINS: Structure, Function, and Genetics 47, 169-179 (2002)",
"title": "Hamming distance geometry of a protein conformational space. Application to the clustering of a 4 ns molecular dynamics trajectory of the HIV-1 integrase catalytic core",
"url": "https://arxiv.org/abs/physics/0110067"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "63f9eb4c-3215-46f8-9ce6-d114af13bb79",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}