Effect of pooling samples on the efficiency of comparative studies using microarrays
| Authors | Shu-Dong Zhang, Timothy W. Gant |
|---|---|
| Categories | q-bio.QM, q-bio.GN |
| ArXiv ID | q-bio/0510024 |
| URL | https://arxiv.org/abs/q-bio/0510024 |
| DOI | 10.1093/bioinformatics/bti717 |
| Journal | Bioinformatics 2005 21(24):4378-4383. |
Abstract
Many biomedical experiments are carried out by pooling individual biological samples. However, pooling samples can potentially hide biological variance and give false confidence concerning the data significance. In the context of microarray experiments for detecting differentially expressed genes, recent publications have addressed the problem of the efficiency of sample-pooling, and some approximate formulas were provided for the power and sample size calculations. It is desirable to have exact formulas for these calculations and have the approximate results checked against the exact ones. We show that the difference between the approximate and exact results can be large. In this study, we have characterized quantitatively the effect of pooling samples on the efficiency of microarray experiments for the detection of differential gene expression between two classes. We present exact formulas for calculating the power of microarray experimental designs involving sample pooling and technical replications. The formulas can be used to determine the total numbers of arrays and biological subjects required in an experiment to achieve the desired power at a given significance level. The conditions under which pooled design becomes preferable to non-pooled design can then be derived given the unit cost associated with a microarray and that with a biological subject. This paper thus serves to provide guidance on sample pooling and cost effectiveness. The formulation in this paper is outlined in the context of performing microarray comparative studies, but its applicability is not limited to microarray experiments. It is also applicable to a wide range of biomedical comparative studies where sample pooling may be involved.
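The trade-off the abstract describes, power versus the unit costs of arrays and subjects, can be illustrated with a rough sketch. This is not the paper's exact formulation, which is not reproduced in this record. It assumes a simple model in which each array measures one pool of `pool_size` subjects, pooling averages expression on the measured scale, and the per-array variance is `sigma_b**2 / pool_size + sigma_t**2`. Power is computed with a normal approximation to a two-sided two-class test. All function and parameter names are hypothetical.

```python
from statistics import NormalDist


def approx_power(delta, sigma_b, sigma_t, pool_size, arrays_per_class, alpha=0.05):
    """Normal-approximation power for a two-class comparison (illustrative only).

    delta: true difference in mean expression between the two classes
    sigma_b: biological standard deviation between subjects (assumed known)
    sigma_t: technical standard deviation per array (assumed known)
    pool_size: number of subjects pooled onto each array
    arrays_per_class: number of arrays (pools) in each class
    """
    # Assumed model: pooling averages out biological variance, not technical variance.
    var_per_array = sigma_b ** 2 / pool_size + sigma_t ** 2
    # Standard error of the difference between the two class means.
    se = (2.0 * var_per_array / arrays_per_class) ** 0.5
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = delta / se
    # Probability of rejecting in either tail of the two-sided test.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


def design_cost(pool_size, arrays_per_class, cost_array, cost_subject):
    """Total cost of a two-class design given unit costs per array and per subject."""
    arrays = 2 * arrays_per_class
    subjects = arrays * pool_size
    return arrays * cost_array + subjects * cost_subject


if __name__ == "__main__":
    # Compare a non-pooled and a pooled design using the same number of arrays.
    for r in (1, 3):
        p = approx_power(delta=1.0, sigma_b=1.0, sigma_t=0.5,
                         pool_size=r, arrays_per_class=6)
        c = design_cost(pool_size=r, arrays_per_class=6,
                        cost_array=100.0, cost_subject=10.0)
        print(f"pool_size={r}: approx power={p:.3f}, cost={c:.0f}")
```

Because the abstract reports that approximate and exact results can differ substantially, a sketch like this is useful only for building intuition about the cost trade-off; the exact formulas in the paper should be used for real design decisions.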
{
"annotation_id": "133c5c2b-97e7-4fb4-8e0b-e595fcf4ef5b",
"date_created": "2026-03-02T18:01:32.285000Z",
"date_modified": "2026-03-02T18:01:32.285000Z",
"file_hash": "b85a0db3d708725613b619e8628581cffd0f3ab918ad40434bf01e63eeddefa3",
"private": false,
"record": {
"abstract": "Many biomedical experiments are carried out by pooling individual biological\nsamples. However, pooling samples can potentially hide biological variance and\ngive false confidence concerning the data significance. In the context of\nmicroarray experiments for detecting differentially expressed genes, recent\npublications have addressed the problem of the efficiency of sample-pooling,\nand some approximate formulas were provided for the power and sample size\ncalculations. It is desirable to have exact formulas for these calculations and\nhave the approximate results checked against the exact ones. We show that the\ndifference between the approximate and exact results can be large. In this\nstudy, we have characterized quantitatively the effect of pooling samples on\nthe efficiency of microarray experiments for the detection of differential gene\nexpression between two classes. We present exact formulas for calculating the\npower of microarray experimental designs involving sample pooling and technical\nreplications. The formulas can be used to determine the total numbers of arrays\nand biological subjects required in an experiment to achieve the desired power\nat a given significance level. The conditions under which pooled design becomes\npreferable to non-pooled design can then be derived given the unit cost\nassociated with a microarray and that with a biological subject. This paper\nthus serves to provide guidance on sample pooling and cost effectiveness. The\nformulation in this paper is outlined in the context of performing microarray\ncomparative studies, but its applicability is not limited to microarray\nexperiments. It is also applicable to a wide range of biomedical comparative\nstudies where sample pooling may be involved.",
"arxiv_id": "q-bio/0510024",
"authors": [
"Shu-Dong Zhang",
"Timothy W. Gant"
],
"categories": [
"q-bio.QM",
"q-bio.GN"
],
"doi": "10.1093/bioinformatics/bti717",
"journal_ref": "Bioinformatics 2005 21(24):4378-4383.",
"title": "Effect of pooling samples on the efficiency of comparative studies using microarrays",
"url": "https://arxiv.org/abs/q-bio/0510024"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "68894f5e-b3e1-48c2-9ae0-1bf9ad914ace",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}