dorsal/arxiv
Simplicial edge representation of protein structures and alpha contact potential with confidence measure
| Authors | Xiang Li, Changyu Hu, Jie Liang |
|---|---|
| Categories | physics.bio-ph, q-bio |
| ArXiv ID | physics/0302082 |
| URL | https://arxiv.org/abs/physics/0302082 |
Abstract
Protein representation and potential function are essential ingredients for studying protein folding and protein prediction. We introduce a novel geometric representation of contact interactions using the edge simplices from the alpha shape of a protein structure. This representation can eliminate implausible neighbors that are not in physical contact, and can avoid spurious contacts between two residues when a third residue lies between them. We develop a statistical alpha contact potential. A studentized bootstrap method is then introduced for assessing the 95% confidence intervals of each of the 210 parameters. We found with confidence that there is a significant long-range propensity (>30 residues apart) for hydrophobic interactions. We test the alpha contact potential for native structure discrimination using several decoy sets, and found that its performance is often comparable to that of atom-based potentials requiring more parameters. We also show that the alpha contact potential performs better than a potential defined by a cut-off distance between the geometric centers of side chains. Clustering of alpha contact potentials reveals a natural grouping of residues. To explore the relationship between shape representation and physicochemical representation, we test the minimum alphabet size for structure discrimination. We found that there is no significant difference in discrimination when the alphabet size varies from 7 to 20, if geometry is represented accurately by alpha simplicial edges. This result suggests that the geometry of packing plays an important role, but the specific residue types are often interchangeable.
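The two quantitative ideas in the abstract can be illustrated with a small sketch. The count of 210 parameters matches the number of unordered pairs (self-pairs included) that can be drawn from 20 residue types, and a studentized (bootstrap-t) interval is a standard way to attach a 95% confidence interval to an estimate. The code below is not the authors' implementation: the function names `residue_pairs` and `studentized_bootstrap_ci` are hypothetical, and the sample mean is used as a stand-in statistic because the abstract does not give the potential's formula.

```python
import math
import random
import statistics
from itertools import combinations_with_replacement

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_pairs(alphabet=AMINO_ACIDS):
    """Return all unordered residue-type pairs, including self-pairs."""
    return list(combinations_with_replacement(alphabet, 2))


def studentized_bootstrap_ci(sample, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap-t confidence interval for the mean of `sample`.

    Assumption: the mean stands in for whatever per-pair statistic
    is actually being estimated; the abstract does not specify it.
    """
    rng = random.Random(seed)
    n = len(sample)
    theta = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)

    t_stats = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=n)  # draw with replacement
        se_b = statistics.stdev(resample) / math.sqrt(n)
        if se_b == 0.0:
            continue  # skip degenerate resamples with zero spread
        t_stats.append((statistics.fmean(resample) - theta) / se_b)

    t_stats.sort()
    m = len(t_stats)
    t_lo = t_stats[int((alpha / 2) * m)]
    t_hi = t_stats[min(m - 1, int((1 - alpha / 2) * m))]
    # The upper t-quantile sets the lower bound, and vice versa.
    return theta - t_hi * se, theta - t_lo * se
```

Calling `len(residue_pairs())` gives 210, and `studentized_bootstrap_ci(values)` returns a `(lower, upper)` pair for a list of numeric observations. Dividing each resampled deviation by its own standard error is what distinguishes the studentized interval from a plain percentile bootstrap.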
{
"annotation_id": "b64f8632-2b7c-4bef-b307-1b33b0db7ab1",
"date_created": "2026-03-02T18:00:43.302000Z",
"date_modified": "2026-03-02T18:00:43.302000Z",
"file_hash": "710e8a77d23e57b23b5f046e7e036ed3b6d2aa20e77ec744dae6770007e96479",
"private": false,
"record": {
"abstract": "Protein representation and potential function are essential ingredients for\nstudying proteins folding and protein prediction. We introduce a novel\ngeometric representation of contact interactions using the edge simplices from\nalpha shape of protein structure. This representation can eliminate implausible\nneighbors not in physical contact, and can avoid spurious contact between two\nresidues when a third residue is between them. We develop statistical alpha\ncontact potential. A studentized bootstrap method is then introduced for\nassessing the 95% confidence intervals for each of the 210 parameters. We found\nwith confidence that there is significant long range propensity (\u003e30 residues\napart) for hydrophobic interactions. We test alpha contact potential for native\nstructure discrimination using several decoy sets, and found it often has\ncomparable performance with atom-based potentials requiring more parameters. We\nalso show that alpha contact potential has better performance than potential\ndefined by cut-off distance between geometric centers of side chains.\nClustering of alpha contact potentials reveals natural grouping of residues. To\nexplore the relationship between shape representation and physicochemical\nrepresentation, we test the minimum alphabet size for structure discrimination.\nWe found that there is no significant difference in discrimination when\nalphabet size varies from 7 to 20, if geometry is represented accurately by\nalpha simplicial edges. This result suggests that the geometry of packing plays\nan important role, but the specific residue types are often interchangeable.",
"arxiv_id": "physics/0302082",
"authors": [
"Xiang Li",
"Changyu Hu",
"Jie Liang"
],
"categories": [
"physics.bio-ph",
"q-bio"
],
"title": "Simplicial edge representation of protein structures and alpha contact potential with confidence measure",
"url": "https://arxiv.org/abs/physics/0302082"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "ce4f0cb9-45c3-4c7d-95d6-9ca2865b74cc",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}