Optimality of the genetic code with respect to protein stability and amino acid frequencies
| Authors | Dimitri Gilis, Serge Massar, Nicolas Cerf, Marianne Rooman |
|---|---|
| Categories | physics.bio-ph, q-bio |
| ArXiv ID | physics/0102044 |
| URL | https://arxiv.org/abs/physics/0102044 |
| Journal | Genome Biology 2 (2001) research0049 |
Abstract
How robust is the natural genetic code with respect to mistranslation errors? It has long been known that the genetic code is very efficient in limiting the effect of point mutations. A misread codon will commonly code either for the same amino acid or for a similar one in terms of its biochemical properties, so the structure and function of the coded protein remain relatively unaltered. Previous studies have attempted to address this question more quantitatively, namely by statistically estimating the fraction of randomly generated codes that do better than the genetic code regarding its overall robustness. In this paper, we extend these results by investigating the role of amino acid frequencies in the optimality of the genetic code. When measuring the relative fitness of the natural code with respect to a random code, it is indeed natural to assume that a translation error affecting a frequent amino acid is less favorable than one affecting a rare amino acid, at equal mutation cost. We find that taking the amino acid frequency into account accordingly decreases the fraction of random codes that beat the natural code, making the latter comparatively even more robust. This effect is particularly pronounced when measures of the amino acid substitution cost more refined than hydrophobicity are used. To show this, we devise a new cost function by evaluating with computer experiments the change in folding free energy caused by all possible single-site mutations in a set of known protein structures. With this cost function, we estimate that on the order of one random code out of 100 million is fitter than the natural code when taking amino acid frequencies into account. The genetic code therefore seems structured so as to minimize the consequences of translation errors on the 3D structure and stability of proteins.
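The comparison described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. Random codes are generated by shuffling the 20 amino acids among the synonymous-codon blocks of the standard code, with stop codons kept fixed. Each amino acid's frequency is spread evenly over its codons. The cost is summed over all single-base misreadings with equal weight. The function names, the cost form, the uniform frequencies and the placeholder property scale are all hypothetical. In particular, the placeholder scale is not a real hydrophobicity or folding free energy measure, so the printed fraction carries no biological meaning.

```python
import itertools
import random

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks stop codons.
AA_STRING = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in itertools.product(BASES, repeat=3)]
NATURAL = dict(zip(CODONS, AA_STRING))
AMINO_ACIDS = sorted(set(AA_STRING) - {"*"})


def neighbors(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos, base in itertools.product(range(3), BASES):
        if base != codon[pos]:
            yield codon[:pos] + base + codon[pos + 1:]


def code_cost(code, cost, freq):
    """Frequency-weighted sum of substitution costs over single-base misreadings.

    Assumed form (hypothetical): each sense codon carries weight p(a) / n(a),
    where n(a) is the number of codons assigned to amino acid a in `code`.
    Misreadings to or from stop codons are ignored.
    """
    n_codons = {a: sum(1 for v in code.values() if v == a) for a in AMINO_ACIDS}
    total = 0.0
    for codon, aa in code.items():
        if aa == "*":
            continue
        weight = freq[aa] / n_codons[aa]
        for other in neighbors(codon):
            target = code[other]
            if target != "*":
                total += weight * cost(aa, target)
    return total


def random_code(rng):
    """Shuffle the 20 amino acids among the natural synonymous-codon blocks."""
    shuffled = AMINO_ACIDS[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(AMINO_ACIDS, shuffled))
    relabel["*"] = "*"  # stop codons stay where they are
    return {codon: relabel[aa] for codon, aa in NATURAL.items()}


def fraction_better(cost, freq, n_samples, seed=0):
    """Estimate the fraction of random codes with lower cost than the natural code."""
    rng = random.Random(seed)
    natural = code_cost(NATURAL, cost, freq)
    better = sum(
        code_cost(random_code(rng), cost, freq) < natural for _ in range(n_samples)
    )
    return better / n_samples


# Placeholder inputs only: NOT real amino acid properties or frequencies.
placeholder = {a: float(i) for i, a in enumerate(AMINO_ACIDS)}
uniform = {a: 1 / len(AMINO_ACIDS) for a in AMINO_ACIDS}


def squared_difference(a, b):
    """Toy substitution cost: squared difference of the placeholder property."""
    return (placeholder[a] - placeholder[b]) ** 2


print(fraction_better(squared_difference, uniform, n_samples=200))
```

To approximate the setting of the abstract, `uniform` would be replaced by measured amino acid frequencies and `squared_difference` by a substitution cost such as one derived from folding free energy changes. A fraction near one in 100 million cannot be resolved with a few hundred samples, so a serious estimate needs far more sampling than this sketch performs.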
{
"annotation_id": "a8a5cc03-63b8-4049-ae12-8d7853f361ea",
"date_created": "2026-03-02T18:00:36.244000Z",
"date_modified": "2026-03-02T18:00:36.244000Z",
"file_hash": "95c0684f0996ff417b7339a66428a477e564b84b9de5a7a12943a32aa2754d6e",
"private": false,
"record": {
"abstract": "How robust is the natural genetic code with respect to mistranslation errors?\nIt has long been known that the genetic code is very efficient in limiting the\neffect of point mutation. A misread codon will commonly code either for the\nsame amino acid or for a similar one in terms of its biochemical properties, so\nthe structure and function of the coded protein remain relatively unaltered.\nPrevious studies have attempted to address this question more quantitatively,\nnamely by statistically estimating the fraction of randomly generated codes\nthat do better than the genetic code regarding its overall robustness. In this\npaper, we extend these results by investigating the role of amino acid\nfrequencies in the optimality of the genetic code. When measuring the relative\nfitness of the natural code with respect to a random code, it is indeed natural\nto assume that a translation error affecting a frequent amino acid is less\nfavorable than that of a rare one, at equal mutation cost. We find that taking\nthe amino acid frequency into account accordingly decreases the fraction of\nrandom codes that beat the natural code, making the latter comparatively even\nmore robust. This effect is particularly pronounced when more refined measures\nof the amino acid substitution cost are used than hydrophobicity. To show this,\nwe devise a new cost function by evaluating with computer experiments the\nchange in folding free energy caused by all possible single-site mutations in a\nset of known protein structures. With this cost function, we estimate that of\nthe order of one random code out of 100 millions is more fit than the natural\ncode when taking amino acid frequencies into account. The genetic code seems\ntherefore structured so as to minimize the consequences of translation errors\non the 3D structure and stability of proteins.",
"arxiv_id": "physics/0102044",
"authors": [
"Dimitri Gilis",
"Serge Massar",
"Nicolas Cerf",
"Marianne Rooman"
],
"categories": [
"physics.bio-ph",
"q-bio"
],
"journal_ref": "Genome Biology 2 (2001) research0049",
"title": "Optimality of the genetic code with respect to protein stability and amino acid frequencies",
"url": "https://arxiv.org/abs/physics/0102044"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "01c7d728-01b8-4f2d-9480-a03e138d1843",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}