Data processing model for the CDF experiment
| Field | Value |
|---|---|
| Authors | J. Antos, M. Babik, D. Benjamin, S. Cabrera, A. W. Chan, Y. C. Chen, M. Coca, B. Cooper, S. Farrington, K. Genser, K. Hatakeyama, S. Hou, T. L. Hsieh, B. Jayatilaka, S. Y. Jun, A. V. Kotwal, A. C. Kraan, R. Lysak, I. V. Mandrichenko, P. Murat, A. Robson, P. Savard, M. Siket, B. Stelzer, J. Syu, P. K. Teng, S. C. Timm, T. Tomura, E. Vataga, S. A. Wolbers |
| Categories | physics.ins-det, physics.data-an |
| ArXiv ID | physics/0606042 |
| URL | https://arxiv.org/abs/physics/0606042 |
| DOI | 10.1109/TNS.2006.881908 |
| Journal | IEEE Trans.Nucl.Sci.53:2897-2906,2006 |
Abstract
The data processing model for the CDF experiment is described. Data processing reconstructs events from parallel data streams taken with different combinations of physics event triggers and further splits the events into datasets of specialized physics datasets. The design of the processing control system faces strict requirements on bookkeeping records, which trace the status of data files and event contents during processing and storage. The computing architecture was updated to meet the mass data flow of the Run II data collection, recently upgraded to a maximum rate of 40 MByte/sec. The data processing facility consists of a large cluster of Linux computers with data movement managed by the CDF data handling system to a multi-petaByte Enstore tape library. The latest processing cycle has achieved a stable speed of 35 MByte/sec (3 TByte/day). It can be readily scaled by increasing CPU and data-handling capacity as required.
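The abstract quotes the achieved speed both as 35 MByte/sec and as 3 TByte/day. As a quick consistency check, the conversion can be sketched as below. The function name `daily_volume_tbyte` is hypothetical, and decimal units (1 TByte = 10^6 MByte) are an assumption, since the abstract does not state which convention it uses.

```python
# Unit check for the throughput figures quoted in the abstract.
# Assumption: decimal units (1 TByte = 1,000,000 MByte); the abstract does not say.
SECONDS_PER_DAY = 24 * 60 * 60
MBYTE_PER_TBYTE = 1_000_000


def daily_volume_tbyte(rate_mbyte_per_s: float) -> float:
    """Convert a sustained rate in MByte/sec to a daily volume in TByte/day."""
    return rate_mbyte_per_s * SECONDS_PER_DAY / MBYTE_PER_TBYTE


achieved = daily_volume_tbyte(35)    # stable speed of the latest processing cycle
peak_input = daily_volume_tbyte(40)  # maximum Run II data-collection rate

print(f"35 MByte/sec -> {achieved:.3f} TByte/day")
print(f"40 MByte/sec -> {peak_input:.3f} TByte/day")
```

Under that assumption, 35 MByte/sec sustained over a full day comes to about 3.02 TByte, which agrees with the rounded figure of 3 TByte/day in the abstract.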
{
  "annotation_id": "494ef85f-2f8d-43a1-877c-af7333961947",
  "date_created": "2026-03-02T18:01:10.855000Z",
  "date_modified": "2026-03-02T18:01:10.855000Z",
  "file_hash": "94fd997c4a8c54733ffdab3d569b88e08e5c4b7097d517185554694f1f05080e",
  "private": false,
  "record": {
    "abstract": "The data processing model for the CDF experiment is described. Data\nprocessing reconstructs events from parallel data streams taken with different\ncombinations of physics event triggers and further splits the events into\ndatasets of specialized physics datasets. The design of the processing control\nsystem faces strict requirements on bookkeeping records, which trace the status\nof data files and event contents during processing and storage. The computing\narchitecture was updated to meet the mass data flow of the Run II data\ncollection, recently upgraded to a maximum rate of 40 MByte/sec. The data\nprocessing facility consists of a large cluster of Linux computers with data\nmovement managed by the CDF data handling system to a multi-petaByte Enstore\ntape library. The latest processing cycle has achieved a stable speed of 35\nMByte/sec (3 TByte/day). It can be readily scaled by increasing CPU and\ndata-handling capacity as required.",
    "arxiv_id": "physics/0606042",
    "authors": [
      "J. Antos",
      "M. Babik",
      "D. Benjamin",
      "S. Cabrera",
      "A. W. Chan",
      "Y. C. Chen",
      "M. Coca",
      "B. Cooper",
      "S. Farrington",
      "K. Genser",
      "K. Hatakeyama",
      "S. Hou",
      "T. L. Hsieh",
      "B. Jayatilaka",
      "S. Y. Jun",
      "A. V. Kotwal",
      "A. C. Kraan",
      "R. Lysak",
      "I. V. Mandrichenko",
      "P. Murat",
      "A. Robson",
      "P. Savard",
      "M. Siket",
      "B. Stelzer",
      "J. Syu",
      "P. K. Teng",
      "S. C. Timm",
      "T. Tomura",
      "E. Vataga",
      "S. A. Wolbers"
    ],
    "categories": [
      "physics.ins-det",
      "physics.data-an"
    ],
    "doi": "10.1109/TNS.2006.881908",
    "journal_ref": "IEEE Trans.Nucl.Sci.53:2897-2906,2006",
    "title": "Data processing model for the CDF experiment",
    "url": "https://arxiv.org/abs/physics/0606042"
  },
  "schema_id": "dorsal/arxiv",
  "source": {
    "execution_id": "d71b7912-8cb3-446f-babb-152b6fe269a2",
    "id": "arXiv Dataset IDs",
    "type": "Model",
    "variant": "snapshot-2026-03-01",
    "version": "0.1.0"
  },
  "user_id": 1000002
}
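The `journal_ref` field in the record packs the journal, volume, page range, and year into one string. The following is a minimal sketch of how a consumer might split it apart. The helper `parse_journal_ref` and its regular expression are hypothetical: the pattern is inferred from this single record and is not part of the dorsal/arxiv schema, so other records may use layouts it does not match.

```python
import re

# Hypothetical pattern, inferred only from the journal_ref in this record:
#   <journal abbreviation><volume>:<first page>-<last page>,<year>
JOURNAL_REF = re.compile(
    r"^(?P<journal>.+?)(?P<volume>\d+):(?P<first>\d+)-(?P<last>\d+),(?P<year>\d{4})$"
)


def parse_journal_ref(ref: str):
    """Split a journal reference into its parts, or return None if it does not match."""
    match = JOURNAL_REF.match(ref)
    if match is None:
        return None
    return {
        "journal": match.group("journal"),
        "volume": int(match.group("volume")),
        "first_page": int(match.group("first")),
        "last_page": int(match.group("last")),
        "year": int(match.group("year")),
    }


parsed = parse_journal_ref("IEEE Trans.Nucl.Sci.53:2897-2906,2006")
print(parsed)
```

Returning `None` for an unrecognized layout, rather than raising, lets a caller fall back to the raw string when a record does not follow this form.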