dorsal/arxiv
The Iterative Signature Algorithm for the analysis of large scale gene expression data
| Authors | Sven Bergmann, Jan Ihmels, Naama Barkai |
|---|---|
| Categories | physics.bio-ph, physics.data-an, q-bio.GN |
| ArXiv ID | physics/0210038 |
| URL | https://arxiv.org/abs/physics/0210038 |
| DOI | 10.1103/PhysRevE.67.031902 |
Abstract
We present a new approach for the analysis of genome-wide expression data. Our method is designed to overcome the limitations of traditional techniques when applied to large-scale data. Rather than allotting each gene to a single cluster, we assign both genes and conditions to context-dependent and potentially overlapping transcription modules. We provide a rigorous definition of a transcription module as the object to be retrieved from the expression data. We establish an efficient algorithm that searches for the modules encoded in the data by iteratively refining sets of genes and conditions until they match this definition. Each iteration involves a linear map, induced by the normalized expression matrix, followed by the application of a threshold function. We argue that our method is in fact a generalization of Singular Value Decomposition, which corresponds to the special case where no threshold is applied. We show analytically that for noisy expression data our approach leads to better classification due to the implementation of the threshold. This result is confirmed by numerical analyses based on in silico expression data. We briefly discuss results obtained by applying our algorithm to expression data from the yeast S. cerevisiae.
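The iteration described in the abstract (a linear map induced by the normalized expression matrix, followed by a threshold) can be illustrated with a minimal sketch. This is a simplified reading of the abstract, not the authors' implementation: the function names (`zscore_rows`, `threshold`, `isa`), the choice of z-score normalization, the default thresholds, and the noise-free toy matrix are all assumptions made for illustration.

```python
import statistics


def zscore_rows(matrix):
    """Scale each row to zero mean and unit population standard deviation."""
    out = []
    for row in matrix:
        mu = statistics.fmean(row)
        sd = statistics.pstdev(row) or 1.0  # constant rows stay at zero
        out.append([(x - mu) / sd for x in row])
    return out


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def threshold(scores, t):
    """Keep the z-score of entries more than t standard deviations above the mean."""
    mu = statistics.fmean(scores)
    sd = statistics.pstdev(scores) or 1.0
    return [z if z > t else 0.0 for z in ((s - mu) / sd for s in scores)]


def isa(expr, seed_genes, t_gene=1.0, t_cond=1.0, max_iter=50):
    """Refine a seed gene set into a (genes, conditions) module.

    expr is a genes x conditions matrix given as a list of lists.
    Thresholds are assumed to be non-negative.
    """
    n_genes, n_conds = len(expr), len(expr[0])
    e_c = zscore_rows(expr)                        # each gene scaled across conditions
    e_g = transpose(zscore_rows(transpose(expr)))  # each condition scaled across genes
    g = [1.0 if i in seed_genes else 0.0 for i in range(n_genes)]
    c = [0.0] * n_conds
    genes = set(seed_genes)
    for _ in range(max_iter):
        # Linear map from gene weights to condition scores, then threshold.
        cond_scores = [sum(g[i] * e_g[i][j] for i in range(n_genes))
                       for j in range(n_conds)]
        c = threshold(cond_scores, t_cond)
        # Linear map from condition weights back to gene scores, then threshold.
        gene_scores = [sum(c[j] * e_c[i][j] for j in range(n_conds))
                       for i in range(n_genes)]
        g = threshold(gene_scores, t_gene)
        new_genes = {i for i, w in enumerate(g) if w != 0.0}
        if new_genes == genes:  # gene set is a fixed point of the iteration
            break
        genes = new_genes
    conds = {j for j, w in enumerate(c) if w != 0.0}
    return genes, conds


# Toy data (hypothetical): genes 0-4 are co-expressed under conditions 0-3.
expr = [[1.0 if (i < 5 and j < 4) else 0.0 for j in range(12)] for i in range(20)]
genes, conds = isa(expr, seed_genes={0, 1, 7})
print(sorted(genes), sorted(conds))
```

On this toy matrix, the partly wrong seed `{0, 1, 7}` is refined to genes 0 to 4 and conditions 0 to 3. Real expression data is noisy, so the thresholds and the number of seeds would need tuning there.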
{
"annotation_id": "e73e90aa-a51e-4584-92a6-f71cbe627966",
"date_created": "2026-03-02T18:00:39.374000Z",
"date_modified": "2026-03-02T18:00:39.374000Z",
"file_hash": "50564ae391c778b257a84825758b741be6144e3ae4e7ceb9bf5425500674c956",
"private": false,
"record": {
"abstract": "We present a new approach for the analysis of genome-wide expression data.\nOur method is designed to overcome the limitations of traditional techniques,\nwhen applied to large-scale data. Rather than alloting each gene to a single\ncluster, we assign both genes and conditions to context-dependent and\npotentially overlapping transcription modules. We provide a rigorous definition\nof a transcription module as the object to be retrieved from the expression\ndata. An efficient algorithm, that searches for the modules encoded in the data\nby iteratively refining sets of genes and conditions until they match this\ndefinition, is established. Each iteration involves a linear map, induced by\nthe normalized expression matrix, followed by the application of a threshold\nfunction. We argue that our method is in fact a generalization of Singular\nValue Decomposition, which corresponds to the special case where no threshold\nis applied. We show analytically that for noisy expression data our approach\nleads to better classification due to the implementation of the threshold. This\nresult is confirmed by numerical analyses based on in-silico expression data.\nWe discuss briefly results obtained by applying our algorithm to expression\ndata from the yeast S. cerevisiae.",
"arxiv_id": "physics/0210038",
"authors": [
"Sven Bergmann",
"Jan Ihmels",
"Naama Barkai"
],
"categories": [
"physics.bio-ph",
"physics.data-an",
"q-bio.GN"
],
"doi": "10.1103/PhysRevE.67.031902",
"title": "The Iterative Signature Algorithm for the analysis of large scale gene expression data",
"url": "https://arxiv.org/abs/physics/0210038"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "760e4bd6-165f-4716-a134-9366b8e0ec08",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}