dorsal/arxiv
A Parallel Tree code for large Nbody simulation: dynamic load balance and data distribution on CRAY T3D system
| Authors | U. Becciani, R. Ansaloni, V. Antonuccio-Delogu, G. Erbacci, M. Gambera, A. Pagliaro |
|---|---|
| Categories | physics.comp-ph, astro-ph |
| ArXiv ID | physics/9709003 |
| URL | https://arxiv.org/abs/physics/9709003 |
| DOI | 10.1016/S0010-4655(97)00102-1 |
Abstract
N-body algorithms for long-range unscreened interactions such as gravity belong to a class of highly irregular problems whose optimal solution is a challenging task for present-day massively parallel computers. In this paper we describe a strategy for optimal memory and work distribution, which we have applied to our parallel implementation of the Barnes & Hut (1986) recursive tree scheme on a Cray T3D using the CRAFT programming environment. We performed a series of tests to find an "optimal data distribution" in the T3D memory and to identify a strategy for "Dynamic Load Balance", in order to obtain good performance when running large simulations (more than 10 million particles). The tests show that the step duration depends on two main factors: data locality and T3D network contention. By increasing data locality, we minimize the step duration when the closest bodies (direct interaction) tend to be located in the same PE local memory (contiguous block subdivision, high granularity), whereas the tree properties have a fine-grain distribution. In a very large simulation, network contention gives rise to an unbalanced load. To remedy this we have devised an automatic work redistribution mechanism, which provided a good Dynamic Load Balance at the price of an insignificant overhead.
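The contrast the abstract draws between a contiguous block subdivision (coarse grain) for bodies and a fine-grain distribution for tree properties can be illustrated with a minimal sketch. It is not the authors' CRAFT code: the function names `block_owner`, `cyclic_owner`, and `local_pair_fraction` are hypothetical, a cyclic (round-robin) layout stands in for "fine grain", and bodies are assumed to be ordered so that neighbouring indices are spatially close.

```python
def block_owner(i, n_items, n_pes):
    """PE that owns item i under a contiguous block distribution (coarse grain)."""
    block = -(-n_items // n_pes)  # ceiling division: items per PE
    return i // block


def cyclic_owner(i, n_pes):
    """PE that owns item i under a cyclic distribution (assumed stand-in for fine grain)."""
    return i % n_pes


def local_pair_fraction(owners):
    """Fraction of index-adjacent pairs (i, i+1) stored on the same PE.

    Assumption: items are sorted so that adjacent indices are spatially close,
    which makes this a crude proxy for data locality.
    """
    same = sum(1 for a, b in zip(owners, owners[1:]) if a == b)
    return same / (len(owners) - 1)


n_bodies, n_pes = 16, 4  # toy sizes, chosen only for illustration
block = [block_owner(i, n_bodies, n_pes) for i in range(n_bodies)]
cyclic = [cyclic_owner(i, n_pes) for i in range(n_bodies)]

print(block)   # contiguous runs of bodies per PE
print(cyclic)  # bodies dealt out round-robin
print(local_pair_fraction(block), local_pair_fraction(cyclic))
```

Under these assumptions the block layout keeps most neighbouring bodies in the same PE memory, while the cyclic layout spreads every neighbouring pair across PEs. That matches the trade-off the abstract describes only qualitatively; it says nothing about actual T3D timings.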
{
"annotation_id": "e54e2e2e-53f5-40e2-955f-8ac6fa10db68",
"date_created": "2026-03-02T18:01:21.880000Z",
"date_modified": "2026-03-02T18:01:21.880000Z",
"file_hash": "f02b431ceb3c421e347f0874e07ca6232dbb368d1bd223082b1a79a95682b099",
"private": false,
"record": {
"abstract": "N-body algorithms for long-range unscreened interactions like gravity belong\nto a class of highly irregular problems whose optimal solution is a challenging\ntask for present-day massively parallel computers. In this paper we describe a\nstrategy for optimal memory and work distribution which we have applied to our\nparallel implementation of the Barnes \u0026 Hut (1986) recursive tree scheme on a\nCray T3D using the CRAFT programming environment. We have performed a series of\ntests to find an \" optimal data distribution \" in the T3D memory, and to\nidentify a strategy for the \" Dynamic Load Balance \" in order to obtain good\nperformances when running large simulations (more than 10 million particles).\nThe results of tests show that the step duration depends on two main factors:\nthe data locality and the T3D network contention. Increasing data locality we\nare able to minimize the step duration if the closest bodies (direct\ninteraction) tend to be located in the same PE local memory (contiguous block\nsubdivison, high granularity), whereas the tree properties have a fine grain\ndistribution. In a very large simulation, due to network contention, an\nunbalanced load arises. To remedy this we have devised an automatic work\nredistribution mechanism which provided a good Dynamic Load Balance at the\nprice of an insignificant overhead.",
"arxiv_id": "physics/9709003",
"authors": [
"U. Becciani",
"R. Ansaloni",
"V. Antonuccio-Delogu",
"G. Erbacci",
"M. Gambera",
"A. Pagliaro",
"-"
],
"categories": [
"physics.comp-ph",
"astro-ph"
],
"doi": "10.1016/S0010-4655(97)00102-1",
"title": "A Parallel Tree code for large Nbody simulation: dynamic load balance and data distribution on CRAY T3D system",
"url": "https://arxiv.org/abs/physics/9709003"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "5d77fc7d-05b9-4ce6-9ddd-40a6177fe89f",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}