Cluster Analysis of Gene Expression Data
| Authors | Eytan Domany |
|---|---|
| Categories | physics.bio-ph, q-bio |
| ArXiv ID | physics/0206056 |
| URL | https://arxiv.org/abs/physics/0206056 |
Abstract
The expression levels of many thousands of genes can be measured simultaneously by DNA microarrays (chips). This novel experimental tool has revolutionized research in molecular biology and generated considerable excitement. A typical experiment uses a few tens of such chips, each dedicated to a single sample - such as tissue extracted from a particular tumor. The results of such an experiment contain several hundred thousand numbers, that come in the form of a table, of several thousand rows (one for each gene) and 50 - 100 columns (one for each sample). We developed a clustering methodology to mine such data. In this review I provide a very basic introduction to the subject, aimed at a physics audience with no prior knowledge of either gene expression or clustering methods. I explain what genes are, what is gene expression and how it is measured by DNA chips. Next I explain what is meant by "clustering" and how we analyze the massive amounts of data from such experiments, and present results obtained from analysis of data obtained from colon cancer, brain tumors and breast cancer.
{
"annotation_id": "4293cfe6-3bde-47fb-b684-e57349f8420e",
"date_created": "2026-03-02T18:00:39.860000Z",
"date_modified": "2026-03-02T18:00:39.860000Z",
"file_hash": "68bc9854bf0fb2954e378f3c1d293161a552de7c81c93aa8731f9eec3b743e22",
"private": false,
"record": {
"abstract": "The expression levels of many thousands of genes can be measured\nsimultaneously by DNA microarrays (chips). This novel experimental tool has\nrevolutionized research in molecular biology and generated considerable\nexcitement. A typical experiment uses a few tens of such chips, each dedicated\nto a single sample - such as tissue extracted from a particular tumor. The\nresults of such an experiment contain several hundred thousand numbers, that\ncome in the form of a table, of several thousand rows (one for each gene) and\n50 - 100 columns (one for each sample). We developed a clustering methodology\nto mine such data. In this review I provide a very basic introduction to the\nsubject, aimed at a physics audience with no prior knowledge of either gene\nexpression or clustering methods. I explain what genes are, what is gene\nexpression and how it is measured by DNA chips. Next I explain what is meant by\n\"clustering\" and how we analyze the massive amounts of data from such\nexperiments, and present results obtained from analysis of data obtained from\ncolon cancer, brain tumors and breast cancer.",
"arxiv_id": "physics/0206056",
"authors": [
"Eytan Domany"
],
"categories": [
"physics.bio-ph",
"q-bio"
],
"title": "Cluster Analysis of Gene Expression Data",
"url": "https://arxiv.org/abs/physics/0206056"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "7cfd1479-a966-44bc-ac59-38e2240a8635",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}