dorsal/arxiv
Structure Space of Model Proteins --A Principle Component Analysis
| Authors | Mehdi Yahyanejad, Mehran Kardar, Chao Tang |
|---|---|
| Categories | physics.bio-ph, cond-mat.soft, q-bio.BM |
| ArXiv ID | physics/0207039 |
| URL | https://arxiv.org/abs/physics/0207039 |
| DOI | 10.1063/1.1541611 |
| Journal | J. Chem. Phys., 118, (2002) 4277-4284 |
Abstract
We study the space of all compact structures on a two-dimensional square lattice of size $N=6\times6$. Each structure is mapped onto a vector in $N$-dimensions according to a hydrophobic model. Previous work has shown that the designabilities of structures are closely related to the distribution of the structure vectors in the $N$-dimensional space, with highly designable structures predominantly found in low density regions. We use principal component analysis to probe and characterize the distribution of structure vectors, and find a non-uniform density with a single peak. Interestingly, the principal axes of this peak are almost aligned with Fourier eigenvectors, and the corresponding Fourier eigenvalues go to zero continuously at the wave-number for alternating patterns ($q=\pi$). These observations provide a stepping stone for an analytic description of the distribution of structural points, and open the possibility of estimating designabilities of realistic structures by simply Fourier transforming the hydrophobicities of the corresponding sequences.
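The link between principal axes and Fourier modes mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' computation: the random binary vectors below are hypothetical stand-ins for the enumerated compact structures, and including every cyclic shift of each vector is an assumption made here so that the ensemble is exactly translation invariant. Under that assumption the sample covariance matrix is circulant, so its eigenvalues coincide with the discrete Fourier transform of its first row. Real chains have two ends, which is one reason the abstract only says "almost aligned".

```python
import numpy as np


def pca(vectors):
    """Eigen-decompose the sample covariance of the rows of `vectors`.

    Returns eigenvalues in descending order, the matching eigenvectors
    (as columns), and the covariance matrix itself.
    """
    x = np.asarray(vectors, dtype=float)
    centered = x - x.mean(axis=0)
    cov = centered.T @ centered / len(x)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order], cov


N = 36                                # vector length, as for the 6x6 lattice
rng = np.random.default_rng(0)

# Hypothetical stand-in data: random 0/1 vectors, NOT real structure vectors.
base = rng.integers(0, 2, size=(50, N))

# Assumption: add all cyclic shifts so the ensemble is translation invariant.
samples = np.array([np.roll(v, s) for v in base for s in range(N)])

eigvals, eigvecs, cov = pca(samples)

# For a symmetric circulant matrix, the eigenvalues are the DFT of its first row.
fourier_spectrum = np.sort(np.fft.fft(cov[0]).real)[::-1]

# Wave number carrying most of the leading principal axis.
dominant_k = int(np.argmax(np.abs(np.fft.rfft(eigvecs[:, 0]))))
print("PCA and Fourier spectra agree:", np.allclose(eigvals, fourier_spectrum))
print("dominant wave number of leading axis:", dominant_k)
```

With actual structure vectors in place of `samples`, comparing `eigvals` against `fourier_spectrum` would show how far the real covariance departs from the idealized translation-invariant case.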
{
"annotation_id": "6bb26bfb-0417-45c8-841e-6f48d427afad",
"date_created": "2026-03-02T18:00:39.477000Z",
"date_modified": "2026-03-02T18:00:39.477000Z",
"file_hash": "1bff9ef75dcf3d763a7f02eaad93caa3b2833fa6090703f5b05b504a47ba4b6e",
"private": false,
"record": {
"abstract": "We study the space of all compact structures on a two-dimensional square\nlattice of size $N=6\\times6$. Each structure is mapped onto a vector in\n$N$-dimensions according to a hydrophobic model. Previous work has shown that\nthe designabilities of structures are closely related to the distribution of\nthe structure vectors in the $N$-dimensional space, with highly designable\nstructures predominantly found in low density regions. We use principal\ncomponent analysis to probe and characterize the distribution of structure\nvectors, and find a non-uniform density with a single peak. Interestingly, the\nprincipal axes of this peak are almost aligned with Fourier eigenvectors, and\nthe corresponding Fourier eigenvalues go to zero continuously at the\nwave-number for alternating patterns ($q=\\pi$). These observations provide a\nstepping stone for an analytic description of the distribution of structural\npoints, and open the possibility of estimating designabilities of realistic\nstructures by simply Fourier transforming the hydrophobicities of the\ncorresponding sequences.",
"arxiv_id": "physics/0207039",
"authors": [
"Mehdi Yahyanejad",
"Mehran Kardar",
"Chao Tang"
],
"categories": [
"physics.bio-ph",
"cond-mat.soft",
"q-bio.BM"
],
"doi": "10.1063/1.1541611",
"journal_ref": "J. Chem. Phys., 118, (2002) 4277-4284",
"title": "Structure Space of Model Proteins --A Principle Component Analysis",
"url": "https://arxiv.org/abs/physics/0207039"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "1d3830ef-6c92-42e7-a693-7f7287340ebb",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}