Parametrized Stochastic Grammars for RNA Secondary Structure Prediction
| Field | Value |
|---|---|
| Authors | Robert S. Maier |
| Categories | q-bio.BM, math.PR |
| ArXiv ID | q-bio/0701036 |
| URL | https://arxiv.org/abs/q-bio/0701036 |
| DOI | 10.1109/ITA.2007.4357589 |
Abstract
We propose a two-level stochastic context-free grammar (SCFG) architecture for parametrized stochastic modeling of a family of RNA sequences, including their secondary structure. A stochastic model of this type can be used for maximum a posteriori estimation of the secondary structure of any new sequence in the family. The proposed SCFG architecture models RNA subsequences comprising paired bases as stochastically weighted Dyck-language words, i.e., as weighted balanced-parenthesis expressions. The length of each run of unpaired bases, forming a loop or a bulge, is taken to have a phase-type distribution: that of the hitting time in a finite-state Markov chain. Without loss of generality, each such Markov chain can be taken to have a bounded complexity. The scheme yields an overall family SCFG with a manageable number of parameters.
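The phase-type idea mentioned in the abstract can be illustrated with a small sketch. The code below is not taken from the paper: it assumes the standard discrete phase-type setup, in which a chain starts in one of a few transient states and the run length is the number of steps until absorption. The function name `phase_type_pmf`, the example matrix, and the convention that lengths start at 1 are all illustrative assumptions; the paper's own parametrization may differ.

```python
def phase_type_pmf(alpha, T, n_max):
    """Return [P(N=1), ..., P(N=n_max)] for a discrete phase-type distribution.

    alpha : initial probabilities over the transient states (assumed to sum to 1)
    T     : sub-stochastic transition matrix among the transient states
    N     : number of steps until the chain is absorbed (the hitting time)
    """
    m = len(alpha)
    # Probability of being absorbed in one step from each transient state.
    exit_probs = [1.0 - sum(T[i]) for i in range(m)]
    pmf = []
    state = list(alpha)  # distribution over transient states before each step
    for _ in range(n_max):
        # Absorb on this step: weight each state's exit probability.
        pmf.append(sum(state[i] * exit_probs[i] for i in range(m)))
        # Otherwise move among the transient states.
        state = [sum(state[i] * T[i][j] for i in range(m)) for j in range(m)]
    return pmf


# Hypothetical two-state chain, chosen only for illustration.
alpha = [1.0, 0.0]
T = [[0.5, 0.3],
     [0.0, 0.6]]
pmf = phase_type_pmf(alpha, T, 200)
```

With a single transient state this reduces to a geometric length distribution; adding states lets the run-length distribution take more flexible shapes while keeping the number of parameters small.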
{
"annotation_id": "fbea4ec2-0ab8-4397-909b-b59f22a491f7",
"date_created": "2026-03-02T18:01:35.763000Z",
"date_modified": "2026-03-02T18:01:35.763000Z",
"file_hash": "f0663242eca3c428da4b77b8a738033dd5954abb303fa6948d23154b27cf599a",
"private": false,
"record": {
"abstract": "We propose a two-level stochastic context-free grammar (SCFG) architecture\nfor parametrized stochastic modeling of a family of RNA sequences, including\ntheir secondary structure. A stochastic model of this type can be used for\nmaximum a posteriori estimation of the secondary structure of any new sequence\nin the family. The proposed SCFG architecture models RNA subsequences\ncomprising paired bases as stochastically weighted Dyck-language words, i.e.,\nas weighted balanced-parenthesis expressions. The length of each run of\nunpaired bases, forming a loop or a bulge, is taken to have a phase-type\ndistribution: that of the hitting time in a finite-state Markov chain. Without\nloss of generality, each such Markov chain can be taken to have a bounded\ncomplexity. The scheme yields an overall family SCFG with a manageable number\nof parameters.",
"arxiv_id": "q-bio/0701036",
"authors": [
"Robert S. Maier"
],
"categories": [
"q-bio.BM",
"math.PR"
],
"doi": "10.1109/ITA.2007.4357589",
"title": "Parametrized Stochastic Grammars for RNA Secondary Structure Prediction",
"url": "https://arxiv.org/abs/q-bio/0701036"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "c99d55ea-c073-40d4-8f51-889a660d0a2e",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}