dorsal/arxiv
Coupled Two-Way Clustering Analysis of Gene Microarray Data
| Authors | G. Getz, E. Levine, E. Domany |
|---|---|
| Categories | physics.bio-ph, physics.comp-ph, physics.data-an, q-bio.QM |
| ArXiv ID | physics/0004009 |
| URL | https://arxiv.org/abs/physics/0004009 |
| DOI | 10.1073/pnas.210134797 |
Abstract
We present a novel coupled two-way clustering approach to gene microarray data analysis. The main idea is to identify subsets of the genes and samples, such that when one of these is used to cluster the other, stable and significant partitions emerge. The search for such subsets is a computationally complex task: we present an algorithm, based on iterative clustering, which performs such a search. This analysis is especially suitable for gene microarray data, where the contributions of a variety of biological mechanisms to the gene expression levels are entangled in a large body of experimental data. The method was applied to two gene microarray data sets, on colon cancer and leukemia. By identifying relevant subsets of the data and focusing on them we were able to discover partitions and correlations that were masked and hidden when the full dataset was used in the analysis. Some of these partitions have clear biological interpretation; others can serve to identify possible directions for future research.
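The iterative search described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the stability test, so everything below is an assumption made for illustration: the function names (`two_means`, `candidate_clusters`, `coupled_two_way`) are hypothetical, a tiny deterministic 2-means split stands in for the clustering step, and a minimum cluster size (`min_size`) is a crude stand-in for the "stable and significant" criterion. It is not the authors' implementation.

```python
from itertools import product


def _dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(vectors, n_iter=20):
    """Split vectors into two groups (labels 0/1) with a small 2-means.

    Stand-in for the clustering step; seeded with the farthest pair so the
    result is deterministic. Requires at least two vectors.
    """
    n = len(vectors)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    i0, j0 = max(pairs, key=lambda p: _dist2(vectors[p[0]], vectors[p[1]]))
    centers = [list(vectors[i0]), list(vectors[j0])]
    labels = [0] * n
    for _ in range(n_iter):
        labels = [0 if _dist2(v, centers[0]) <= _dist2(v, centers[1]) else 1
                  for v in vectors]
        for c in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def candidate_clusters(items, vectors, min_size):
    """Return the sub-clusters of `items` that pass the size threshold.

    Assumption: cluster size replaces a real stability/significance test.
    """
    if len(items) < max(2, 2 * min_size):
        return []
    labels = two_means(vectors)
    groups = [frozenset(i for i, lab in zip(items, labels) if lab == c)
              for c in (0, 1)]
    return [grp for grp in groups if len(grp) >= min_size]


def coupled_two_way(data, min_size=2, max_rounds=5):
    """data[g][s] is the expression level of gene g in sample s."""
    gene_sets = [frozenset(range(len(data)))]
    sample_sets = [frozenset(range(len(data[0])))]
    done = set()
    for _ in range(max_rounds):
        found_new = False
        # Snapshot the current pairs; clusters found now are used next round.
        for g_set, s_set in list(product(gene_sets, sample_sets)):
            if (g_set, s_set) in done:
                continue
            done.add((g_set, s_set))
            genes, samples = sorted(g_set), sorted(s_set)
            # Cluster the genes, using the chosen samples as features.
            gene_vecs = [[data[g][s] for s in samples] for g in genes]
            for c in candidate_clusters(genes, gene_vecs, min_size):
                if c not in gene_sets:
                    gene_sets.append(c)
                    found_new = True
            # Cluster the samples, using the chosen genes as features.
            sample_vecs = [[data[g][s] for g in genes] for s in samples]
            for c in candidate_clusters(samples, sample_vecs, min_size):
                if c not in sample_sets:
                    sample_sets.append(c)
                    found_new = True
        if not found_new:
            break
    return gene_sets, sample_sets


# Toy 4-gene x 4-sample matrix (made-up values, not data from the paper).
toy = [[5, 5, 0, 0], [5, 5, 0, 0], [0, 0, 5, 5], [0, 0, 5, 5]]
genes, samples = coupled_two_way(toy)
print([sorted(g) for g in genes])
print([sorted(s) for s in samples])
```

Each round pairs every known gene subset with every known sample subset, clusters one using the other as features, and feeds any newly accepted clusters back into the pool; the loop stops when a round yields nothing new.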
{
  "annotation_id": "acb65873-a291-4d97-9291-821c57ba054e",
  "date_created": "2026-03-02T18:00:29.403000Z",
  "date_modified": "2026-03-02T18:00:29.403000Z",
  "file_hash": "514f813838d3947c0625fc59a8d9896c87816a1d53dc4ed416971aee882adcce",
  "private": false,
  "record": {
    "abstract": "We present a novel coupled two-way clustering approach to gene microarray\ndata analysis. The main idea is to identify subsets of the genes and samples,\nsuch that when one of these is used to cluster the other, stable and\nsignificant partitions emerge. The search for such subsets is a computationally\ncomplex task: we present an algorithm, based on iterative clustering, which\nperforms such a search. This analysis is especially suitable for gene\nmicroarray data, where the contributions of a variety of biological mechanisms\nto the gene expression levels are entangled in a large body of experimental\ndata. The method was applied to two gene microarray data sets, on colon cancer\nand leukemia. By identifying relevant subsets of the data and focusing on them\nwe were able to discover partitions and correlations that were masked and\nhidden when the full dataset was used in the analysis. Some of these partitions\nhave clear biological interpretation; others can serve to identify possible\ndirections for future research.",
    "arxiv_id": "physics/0004009",
    "authors": [
      "G. Getz",
      "E. Levine",
      "E. Domany"
    ],
    "categories": [
      "physics.bio-ph",
      "physics.comp-ph",
      "physics.data-an",
      "q-bio.QM"
    ],
    "doi": "10.1073/pnas.210134797",
    "title": "Coupled Two-Way Clustering Analysis of Gene Microarray Data",
    "url": "https://arxiv.org/abs/physics/0004009"
  },
  "schema_id": "dorsal/arxiv",
  "source": {
    "execution_id": "10828c8b-e5c9-41f0-b9ee-3c7ccff29b0b",
    "id": "arXiv Dataset IDs",
    "type": "Model",
    "variant": "snapshot-2026-03-01",
    "version": "0.1.0"
  },
  "user_id": 1000002
}