dorsal/arxiv
Looking at structure, stability, and evolution of proteins through the principal eigenvector of contact matrices and hydrophobicity profiles
| Authors | Ugo Bastolla, Markus Porto, H. Eduardo Roman, Michele Vendruscolo |
|---|---|
| Categories | q-bio.BM, q-bio.PE |
| ArXiv ID | q-bio/0412004 |
| URL | https://arxiv.org/abs/q-bio/0412004 |
| Journal | Gene 347, 219 (2005) |
Abstract
We review and further develop an analytical model that describes how thermodynamic constraints on the stability of the native state influence protein evolution in a site-specific manner. To this end, we represent both protein sequences and protein structures as vectors: structures are represented by the principal eigenvector (PE) of the protein contact matrix, a quantity that closely resembles the effective connectivity of each site; sequences are represented through the "interactivity" of each amino acid type, using novel parameters that are correlated with hydropathy scales. Compared with the other hydropathy scales that we examine, these interactivity parameters are more strongly correlated with: (1) the change in the unfolding free energy upon mutation for proteins with two-state thermodynamics; (2) genomic properties such as genome size and genome-wide GC content; (3) the main eigenvectors of the substitution matrices. The evolutionary average of the interactivity vector correlates very strongly with the PE of a protein structure. Using this result, we derive an analytic expression for site-specific distributions of amino acids across protein families in the form of Boltzmann distributions whose "inverse temperature" is a function of the PE component. We show that our predictions are in agreement with site-specific amino acid distributions obtained from the Protein Data Bank, and we determine the mutational model that best fits the observed site-specific amino acid distributions. Interestingly, the optimal model almost minimizes the rate at which deleterious mutations are eliminated by natural selection.
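As a rough illustration of the structural representation mentioned in the abstract, the sketch below computes the principal eigenvector of a small symmetric contact matrix by power iteration. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `contact_matrix` and `principal_eigenvector`, the toy five-site contact list, and the 0/1 definition of a contact are all hypothetical choices made for the example.

```python
import math


def contact_matrix(n_sites, contacts):
    """Build a symmetric 0/1 contact matrix from (i, j) site pairs.

    Assumption: a contact is simply a listed pair of sites; real
    structures would need a distance cutoff, which is not modeled here.
    """
    c = [[0.0] * n_sites for _ in range(n_sites)]
    for i, j in contacts:
        c[i][j] = 1.0
        c[j][i] = 1.0
    return c


def principal_eigenvector(matrix, n_iter=1000, tol=1e-12):
    """Return (largest eigenvalue, unit-norm principal eigenvector)."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n  # positive starting vector
    for _ in range(n_iter):
        # Iterate with (C + I) instead of C: the eigenvectors are the same,
        # but the shift avoids oscillation on bipartite contact graphs.
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        delta = max(abs(a - b) for a, b in zip(v, w))
        v = w
        if delta < tol:
            break
    # Rayleigh quotient with the unshifted matrix gives the eigenvalue of C.
    cv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))
    return eigenvalue, v


# Toy example (hypothetical): site 0 contacts every other site.
toy = contact_matrix(5, [(0, 1), (0, 2), (0, 3), (0, 4)])
lam, pe = principal_eigenvector(toy)
print(lam, pe)
```

In this toy case the most connected site receives the largest component, which matches the abstract's description of the PE as resembling the effective connectivity of each site. For real contact matrices a library eigensolver such as `numpy.linalg.eigh` would likely be a better choice than hand-written power iteration.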
{
"annotation_id": "c840477a-88db-4389-a83c-6af05f3a0a7d",
"date_created": "2026-03-02T18:01:31.930000Z",
"date_modified": "2026-03-02T18:01:31.930000Z",
"file_hash": "c8d10ecc459a70499bea63a39d86d85a66482e436715b2e2255b8ace439b7083",
"private": false,
"record": {
"abstract": "We review and further develop an analytical model that describes how\nthermodynamic constraints on the stability of the native state influence\nprotein evolution in a site-specific manner. To this end, we represent both\nprotein sequences and protein structures as vectors: Structures are represented\nby the principal eigenvector (PE) of the protein contact matrix, a quantity\nthat resembles closely the effective connectivity of each site; Sequences are\nrepresented through the ``interactivity\u0027\u0027 of each amino acid type, using novel\nparameters that are correlated with hydropathy scales. These interactivity\nparameters are more strongly correlated than the other hydropathy scales that\nwe examine with: (1) The change upon mutations of the unfolding free energy of\nproteins with two-states thermodynamics; (2) Genomic properties as the\ngenome-size and the genome-wide GC content; (3) The main eigenvectors of the\nsubstitution matrices. The evolutionary average of the interactivity vector\ncorrelates very strongly with the PE of a protein structure. Using this result,\nwe derive an analytic expression for site-specific distributions of amino acids\nacross protein families in the form of Boltzmann distributions whose ``inverse\ntemperature\u0027\u0027 is a function of the PE component. We show that our predictions\nare in agreement with site-specific amino acid distributions obtained from the\nProtein Data Bank, and we determine the mutational model that best fits the\nobserved site-specific amino acid distributions. Interestingly, the optimal\nmodel almost minimizes the rate at which deleterious mutations are eliminated\nby natural selection.",
"arxiv_id": "q-bio/0412004",
"authors": [
"Ugo Bastolla",
"Markus Porto",
"H. Eduardo Roman",
"Michele Vendruscolo"
],
"categories": [
"q-bio.BM",
"q-bio.PE"
],
"journal_ref": "Gene 347, 219 (2005)",
"title": "Looking at structure, stability, and evolution of proteins through the principal eigenvector of contact matrices and hydrophobicity profiles",
"url": "https://arxiv.org/abs/q-bio/0412004"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "d0bf8986-267d-4ee9-91f2-9ce5a9633b3a",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}