dorsal/arxiv
Predicting Genetic Regulatory Response using Classification: Yeast Stress Response
| Authors | Manuel Middendorf, Anshul Kundaje, Chris Wiggins, Yoav Freund, Christina Leslie |
|---|---|
| Categories | q-bio.QM |
| ArXiv ID | q-bio/0406016 |
| URL | https://arxiv.org/abs/q-bio/0406016 |
| Journal | Proceedings of the First Annual RECOMB Regulation Workshop 2004 |
Abstract
We present a novel classification-based algorithm called GeneClass for learning to predict gene regulatory response. Our approach is motivated by the hypothesis that in simple organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular experiment based on (1) the presence of binding site subsequences ("motifs") in the gene's regulatory region and (2) the expression levels of regulators such as transcription factors in the experiment ("parents"). Thus our learning task integrates two qualitatively different data sources: genome-wide cDNA microarray data across multiple perturbation and mutant experiments along with motif profile data from regulatory sequences. Rather than focusing on the regression task of predicting real-valued gene expression measurements, GeneClass performs the classification task of predicting +1 and -1 labels, corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. GeneClass uses the Adaboost learning algorithm with a margin-based generalization of decision trees called alternating decision trees. In computational experiments based on the Gasch S. cerevisiae dataset, we show that the GeneClass method predicts up- and down-regulation on held-out experiments with high accuracy. We explore a range of experimental setups related to environmental stress response, and we retrieve important regulators, binding site motifs, and relationships between regulators and binding sites that are known to be associated to specific stress response pathways. Our method thus provides predictive hypotheses, suggests biological experiments, and provides interpretable insight into the structure of genetic regulatory networks.
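The classification setup described in the abstract can be illustrated with a minimal sketch. It is not the GeneClass implementation: it substitutes plain decision stumps for alternating decision trees, and the noise threshold, the gene and experiment names, the toy log ratios, and all function names (`discretize`, `pair_features`, `train_adaboost`, `predict`) are assumptions made up for illustration. It shows only the general idea of labeling (gene, experiment) pairs as +1 or -1 and boosting over "motif present AND parent in a given state" features.

```python
import math

def discretize(log_ratio, threshold=1.0):
    """Map a log expression ratio to +1 (up), -1 (down) or 0 (within noise).
    The threshold is an illustrative assumption, not a value from the paper."""
    if log_ratio > threshold:
        return 1
    if log_ratio < -threshold:
        return -1
    return 0

def pair_features(motifs, parent_states):
    """One binary feature per (motif, parent, state) triple:
    1 if the motif is present AND the parent is in that state."""
    return [int(m == 1 and p == s)
            for m in motifs
            for p in parent_states
            for s in (1, -1)]

def stump(x, j, sign):
    """Weak rule: predict `sign` if feature j is on, otherwise `-sign`."""
    return sign if x[j] == 1 else -sign

def train_adaboost(X, y, rounds=3):
    """Discrete AdaBoost over single-feature stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        for j in range(len(X[0])):
            for sign in (1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump(xi, j, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, sign)
        err, j, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, sign))
        # Up-weight misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump(xi, j, sign))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * stump(x, j, sign) for alpha, j, sign in model)
    return 1 if score >= 0 else -1

# Toy data (entirely hypothetical): two motifs, one parent regulator.
gene_motifs = {"geneA": (1, 0), "geneB": (0, 1), "geneC": (0, 0)}
parent_states = {"exp1": [1], "exp2": [-1]}
log_ratios = {
    ("geneA", "exp1"): 2.1, ("geneA", "exp2"): -1.8,
    ("geneB", "exp1"): -2.3, ("geneB", "exp2"): 1.9,
    ("geneC", "exp1"): 0.2, ("geneC", "exp2"): -0.1,
}

X, y = [], []
for (gene, exp), ratio in log_ratios.items():
    label = discretize(ratio)
    if label != 0:  # measurements within the noise band are not used
        X.append(pair_features(gene_motifs[gene], parent_states[exp]))
        y.append(label)

model = train_adaboost(X, y, rounds=3)
predictions = [predict(model, x) for x in X]
```

Each entry of `model` is an `(alpha, feature_index, sign)` triple, so the learned rules can be read back as weighted statements about a motif and a parent state, which is the kind of interpretability the abstract attributes to the tree-based approach.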
{
"annotation_id": "4b2427cc-d2ff-43e0-b18b-433c14b0a29d",
"date_created": "2026-03-02T18:01:31.675000Z",
"date_modified": "2026-03-02T18:01:31.675000Z",
"file_hash": "9a1071616bbbdcde848f24e244e2ac92e76c95ffc4c32b4b718b99ddb2d9f2fb",
"private": false,
"record": {
"abstract": "We present a novel classification-based algorithm called GeneClass for\nlearning to predict gene regulatory response. Our approach is motivated by the\nhypothesis that in simple organisms such as Saccharomyces cerevisiae, we can\nlearn a decision rule for predicting whether a gene is up- or down-regulated in\na particular experiment based on (1) the presence of binding site subsequences\n(``motifs\u0027\u0027) in the gene\u0027s regulatory region and (2) the expression levels of\nregulators such as transcription factors in the experiment (``parents\u0027\u0027). Thus\nour learning task integrates two qualitatively different data sources:\ngenome-wide cDNA microarray data across multiple perturbation and mutant\nexperiments along with motif profile data from regulatory sequences. Rather\nthan focusing on the regression task of predicting real-valued gene expression\nmeasurements, GeneClass performs the classification task of predicting +1 and\n-1 labels, corresponding to up- and down-regulation beyond the levels of\nbiological and measurement noise in microarray measurements. GeneClass uses the\nAdaboost learning algorithm with a margin-based generalization of decision\ntrees called alternating decision trees. In computational experiments based on\nthe Gasch S. cerevisiae dataset, we show that the GeneClass method predicts up-\nand down-regulation on held-out experiments with high accuracy. We explore a\nrange of experimental setups related to environmental stress response, and we\nretrieve important regulators, binding site motifs, and relationships between\nregulators and binding sites that are known to be associated to specific stress\nresponse pathways. Our method thus provides predictive hypotheses, suggests\nbiological experiments, and provides interpretable insight into the structure\nof genetic regulatory networks.",
"arxiv_id": "q-bio/0406016",
"authors": [
"Manuel Middendorf",
"Anshul Kundaje",
"Chris Wiggins",
"Yoav Freund",
"Christina Leslie"
],
"categories": [
"q-bio.QM"
],
"journal_ref": "Proceedings of the First Annual RECOMB Regulation Workshop 2004",
"title": "Predicting Genetic Regulatory Response using Classification: Yeast Stress Response",
"url": "https://arxiv.org/abs/q-bio/0406016"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "8ea6f2cd-1ceb-49b7-8036-3b16f42f0496",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}