dorsal/arxiv
Representation of protein structure based on frequency distributions of oriented cycles in contact graphs
| Authors | A. M. Lisewski, O. Lichtarge |
|---|---|
| Categories | q-bio.BM, nlin.AO |
| ArXiv ID | q-bio/0409029 |
| URL | https://arxiv.org/abs/q-bio/0409029 |
Abstract
We present a statistical approach to protein structure by introducing a representation of protein folds based on simple observables defined as frequencies of oriented cycles in contact graphs. Motivated by the idea that these cycles may form a code of the entire protein structure we investigate its Shannon entropy. The latter shows a characteristic transitory behavior when traced over different values of the geometric threshold $t$ which defines protein residues in contact. To account for this observation, we propose a non-linear mechanical model-- formulated as a Hamiltonian system in which $t$ is regarded as a continuous parameter-- that allows to identify the behavior of the Shannon entropy as a first-order phase transition between a disordered and an ordered phase of the proposed mechanical system. The transition itself reflects the formation of protein structure when represented in terms of contact graph cycles, and it is identified as an example of a fragmentation transition known from several other statistical systems without a proper thermodynamic limit. Some interesting implications follow from our model including chirality, broken $t$-symmetry and one-dimensionality. Although defined from purely structural considerations, we further show that for native protein structures these observables follow some of the quantitative rules that are typically valid for amino acid types in protein sequences. Moreover, this relation between polypeptide structures and their amino acid sequences suggests a specific protein design alphabet.
{
  "annotation_id": "4a755b03-70a7-4de8-b59a-2a0e67859955",
  "date_created": "2026-03-02T18:01:31.086000Z",
  "date_modified": "2026-03-02T18:01:31.086000Z",
  "file_hash": "2f6056f6930a054a799609f761076aa5c3ed65750438e89023a9388b4299b888",
  "private": false,
  "record": {
    "abstract": "We present a statistical approach to protein structure by introducing a\nrepresentation of protein folds based on simple observables defined as\nfrequencies of oriented cycles in contact graphs. Motivated by the idea that\nthese cycles may form a code of the entire protein structure we investigate its\nShannon entropy. The latter shows a characteristic transitory behavior when\ntraced over different values of the geometric threshold $t$ which defines\nprotein residues in contact. To account for this observation, we propose a\nnon-linear mechanical model-- formulated as a Hamiltonian system in which $t$\nis regarded as a continuous parameter-- that allows to identify the behavior of\nthe Shannon entropy as a first-order phase transition between a disordered and\nan ordered phase of the proposed mechanical system. The transition itself\nreflects the formation of protein structure when represented in terms of\ncontact graph cycles, and it is identified as an example of a fragmentation\ntransition known from several other statistical systems without a proper\nthermodynamic limit. Some interesting implications follow from our model\nincluding chirality, broken $t$-symmetry and one-dimensionality. Although\ndefined from purely structural considerations, we further show that for native\nprotein structures these observables follow some of the quantitative rules that\nare typically valid for amino acid types in protein sequences. Moreover, this\nrelation between polypeptide structures and their amino acid sequences suggests\na specific protein design alphabet.",
    "arxiv_id": "q-bio/0409029",
    "authors": [
      "A. M. Lisewski",
      "O. Lichtarge"
    ],
    "categories": [
      "q-bio.BM",
      "nlin.AO"
    ],
    "title": "Representation of protein structure based on frequency distributions of oriented cycles in contact graphs",
    "url": "https://arxiv.org/abs/q-bio/0409029"
  },
  "schema_id": "dorsal/arxiv",
  "source": {
    "execution_id": "6a2c5d8d-2c87-4555-8a59-562f280f63d5",
    "id": "arXiv Dataset IDs",
    "type": "Model",
    "variant": "snapshot-2026-03-01",
    "version": "0.1.0"
  },
  "user_id": 1000002
}