dorsal/arxiv
Comparison of amino acid occurrence and composition for predicting protein folds
| Authors | Y-h. Taguchi, M. Michael Gromiha |
|---|---|
| Categories | q-bio.BM, cond-mat.soft, nlin.AO, q-bio.QM |
| ArXiv ID | q-bio/0609037 |
| URL | https://arxiv.org/abs/q-bio/0609037 |
Abstract
Background: Prediction of protein three-dimensional structures from amino acid sequences is a long-standing goal in computational/molecular biology. The successful discrimination of protein folds would help to improve the accuracy of protein 3D structure prediction. Results: In this work, we propose a method based on linear discriminant analysis (LDA) for recognizing proteins belonging to 30 different folds using the occurrence of amino acid residues in a set of 1612 proteins. The present method could discriminate the globular proteins from 30 major folding types with a sensitivity of 37%, which is comparable to or better than other methods in the literature. A web server has been developed for predicting the folding type of a protein from its amino acid sequence; it is available at http://granular.com/PROLDA/. Conclusions: Linear discriminant analysis based on amino acid occurrence could successfully recognize protein folds. The present method has several advantages: (i) it directly predicts the folding type of a protein without performing pair-wise comparisons, (ii) it can discriminate folds among a large number of proteins, and (iii) it obtains results very quickly. This is a simple method that can be easily incorporated into other structure prediction algorithms.
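The two feature types named in the title can be illustrated with a minimal sketch. It assumes that "occurrence" means the raw count of each of the 20 standard residues and that "composition" means those counts divided by sequence length; the abstract does not spell out either definition, so both are assumptions here. The function names and the toy sequence are hypothetical, and the LDA classifier itself is not shown. The resulting 20-dimensional vectors would be the input to such a classifier.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; this fixes the feature order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def occurrence(sequence: str) -> list[int]:
    """Raw count of each standard residue (assumed meaning of 'occurrence')."""
    counts = Counter(sequence.upper())
    return [counts[aa] for aa in AMINO_ACIDS]


def composition(sequence: str) -> list[float]:
    """Counts normalized by the number of standard residues
    (assumed meaning of 'composition'); the fractions sum to 1."""
    occ = occurrence(sequence)
    total = sum(occ)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [n / total for n in occ]


# Hypothetical toy sequence, used only to show the two feature vectors.
seq = "MKTAYIAKQR"
occ = occurrence(seq)
comp = composition(seq)
```

Under these assumptions, occurrence retains sequence length implicitly (the counts sum to the length), whereas composition discards it.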
{
"annotation_id": "e64d12e0-b83e-44d1-876c-3c8fe481c258",
"date_created": "2026-03-02T18:01:34.681000Z",
"date_modified": "2026-03-02T18:01:34.681000Z",
"file_hash": "fcf85b7784371c7b7c1d7a68f07cf44c47a74783aeed5c8550750c5ef5df9ac9",
"private": false,
"record": {
"abstract": "Background:Prediction of protein three-dimensional structures from amino acid\nsequences is a long-standing goal in computational/molecular biology. The\nsuccessful discrimination of protein folds would help to improve the accuracy\nof protein 3D structure prediction. Results: In this work, we propose a method\nbased on linear discriminant analysis (LDA) for recognizing proteins belonging\nto 30 different folds using the occurrence of amino acid residues in a set of\n1612 proteins. The present method could discriminate the globular proteins from\n30 major folding types with the sensitivity of 37%, which is comparable to or\nbetter than other methods in the literature. A web server has been developed\nfor predicting the folding type of the protein from amino acid sequence and it\nis available at http://granular.com/PROLDA/. Conclusions:Linear discriminant\nanalysis based on amino acid occurrence could successfully recognize protein\nfolds. The present method has several advantages such as, (i) it directly\npredicts the folding type of a protein without performing pair-wise\ncomparisons, (ii) it can discriminate folds among large number of proteins and\n(iii) it is very fast to obtain the results. This is a simple method, which can\nbe easily incorporated in any other structure prediction algorithms.",
"arxiv_id": "q-bio/0609037",
"authors": [
"Y-h. Taguchi",
"M. Michael Gromiha"
],
"categories": [
"q-bio.BM",
"cond-mat.soft",
"nlin.AO",
"q-bio.QM"
],
"title": "Comparison of amino acid occurrence and composition for predicting protein folds",
"url": "https://arxiv.org/abs/q-bio/0609037"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "1c5af8a9-1d0e-44b8-995a-f421a2ad5058",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}