Preliminary results
We performed a set of experiments and presented them at IAU Symposium 312, “Star Clusters and Black Holes in Galaxies Across Cosmic Time”, under the title “GraviDy, a modular, GPU-based, direct-summation N-body integrator” (C. Maureira-Fredes & P. Amaro-Seoane).
Experimental setup
The computational environment used for the following experiments was:
- CPU, Intel(R) Xeon(R) CPU X5650 @ 2.67GHz (24 cores)
- GPU, Tesla M2050 @ 575 MHz (448 cores)
- RAM, 24 GB
- OS, Scientific Linux release 6.4
Results
Globular cluster evolution
Lagrange radii of an \(N\)-body system with 1024 particles. The lines in the plot show the radii that enclose 5%, 10%, 15%, ..., 65% of the total mass. Core collapse is reached at \(T_{\rm cc}\,\approx\,15\,T_{{\rm rh},\,t=0}\), that is, after roughly 300 NBU, given an initial half-mass relaxation time of \(T_{{\rm rh},\,t=0}\,=\,20.24\) N-body units (NBU).
[Figure: Lagrange radii of an \(N\)-body system with 1024 particles.]
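As an illustration of how Lagrange radii like the ones plotted above can be obtained from a snapshot, here is a minimal Python sketch. It is not taken from the GraviDy source: the function name `lagrange_radii` is hypothetical, and the cluster centre is assumed to be the centre of mass (a density centre is another common choice).

```python
import math


def lagrange_radii(masses, positions, fractions):
    """Return, for each mass fraction, the radius that encloses it.

    Assumption: radii are measured from the centre of mass of the system.
    """
    total = sum(masses)
    # Centre of mass of the snapshot (assumed cluster centre).
    com = [
        sum(m * p[k] for m, p in zip(masses, positions)) / total
        for k in range(3)
    ]
    # Particles sorted by distance from the centre.
    shells = sorted((math.dist(p, com), m) for m, p in zip(masses, positions))

    radii = []
    for frac in fractions:
        target = frac * total
        enclosed = 0.0
        # Walk outward until the requested mass fraction is enclosed.
        for r, m in shells:
            enclosed += m
            if enclosed >= target - 1e-12 * total:
                radii.append(r)
                break
    return radii


# Mass fractions used in the plot: 5%, 10%, ..., 65%.
PLOT_FRACTIONS = [0.05 * k for k in range(1, 14)]
```

Calling `lagrange_radii(masses, positions, PLOT_FRACTIONS)` on successive snapshots gives one curve per fraction. The inner radii shrinking over time is the signature of core collapse.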
Cumulative energy error up to \(t=1\) NBU as a function of \(\eta\). All the plots correspond to Plummer spheres with different numbers of particles (\(N\)).
[Figure: Energy conservation using different values of \(\eta\).]
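The energy check behind a plot like this can be sketched as follows. This is an illustrative Python version, not GraviDy code: the function names are hypothetical, the error is assumed to be the relative change \(|E(t) - E(0)| / |E(0)|\), and \(\epsilon\) is assumed to enter the potential as a Plummer-type softening length.

```python
import math


def total_energy(masses, positions, velocities, eps=0.0, G=1.0):
    """Kinetic plus (softened) potential energy of an N-body snapshot."""
    kinetic = 0.5 * sum(
        m * sum(c * c for c in v) for m, v in zip(masses, velocities)
    )
    potential = 0.0
    n = len(masses)
    # Sum over unique pairs only, so each interaction is counted once.
    for i in range(n):
        for j in range(i + 1, n):
            r2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            potential -= G * masses[i] * masses[j] / math.sqrt(r2 + eps * eps)
    return kinetic + potential


def relative_energy_error(e_now, e_initial):
    """Relative energy drift with respect to the initial energy (assumed definition)."""
    return abs((e_now - e_initial) / e_initial)
```

Evaluating `total_energy` at \(t=0\) and at \(t=1\) NBU for each value of \(\eta\), and passing both values to `relative_energy_error`, yields one point per run.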
Clock time up to \(t=1\) NBU as a function of \(\eta\). All the plots correspond to Plummer spheres with different numbers of particles (\(N\)).
[Figure: Clock time as a function of \(\eta\).]
Performance
Clock time of integration from \(t=1\) to \(t = 2\) NBU with \(\eta = 0.01\) and \(\epsilon = 0.0001\), for different numbers of particles (\(N\)).
[Figure: Clock time as a function of \(N\).]
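The growth of clock time with \(N\) follows from the direct-summation approach, in which every particle interacts with every other one, so a full force evaluation costs \(O(N^2)\) operations. The Python sketch below shows that double loop. It is an illustration only and not the GraviDy kernel: the function name is hypothetical, and \(\epsilon\) is assumed to be a Plummer-type softening length added in quadrature to the separation.

```python
import math


def accelerations(masses, positions, eps=1e-4, G=1.0):
    """Softened gravitational acceleration on every particle by direct summation."""
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-interaction
            # Separation vector pointing from particle i to particle j.
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc
```

Doubling \(N\) quadruples the number of pair evaluations in this loop, which is why the independent inner iterations are a natural target for GPU parallelisation.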
The following plot shows the speed-up of five different implementations that use parallel computing techniques, relative to the single-thread base run.
- OpenMP, ...
- CPU + GPU, ...
- MPI-1, ...
- MPI-2, ...
- GPU, ...
[Figure: Speed-up.]
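For reference, the quantity plotted here is assumed to be the usual ratio of the single-thread clock time to the clock time of each parallel implementation. A minimal Python sketch, with hypothetical function and variable names:

```python
def speed_up(t_base, t_parallel):
    """Speed-up of a parallel run relative to the single-thread base run."""
    if t_parallel <= 0.0:
        raise ValueError("parallel clock time must be positive")
    return t_base / t_parallel


def speed_up_table(t_base, timings):
    """Map each implementation name to its speed-up.

    `timings` is a dict such as {"OpenMP": seconds, "GPU": seconds}, filled
    with measured clock times; no measured values are assumed here.
    """
    return {name: speed_up(t_base, t) for name, t in timings.items()}
```

A value above 1 means the implementation is faster than the base run. Values for different \(N\) are comparable only if each uses the base time measured at the same \(N\).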
GPU performance of the gravitational interactions, in GFLOPS, for different numbers of particles. The blue line at the top corresponds to the theoretical peak double-precision floating-point performance.
[Figure: GFLOPS.]
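Both quantities in this plot can be estimated with simple arithmetic, sketched below in Python. Several inputs are assumptions rather than values from the experiments: the shader clock is taken as 1.15 GHz (twice the 575 MHz core clock listed in the setup), one fused multiply-add is counted as 2 floating-point operations per core per cycle, double precision is assumed to run at half the single-precision rate, and the number of floating-point operations per pairwise interaction is a convention that the caller must supply.

```python
def peak_gflops(cores, shader_clock_ghz, flops_per_cycle=2, dp_ratio=0.5):
    """Theoretical double-precision peak in GFLOPS (assumed model, see text)."""
    return cores * shader_clock_ghz * flops_per_cycle * dp_ratio


def sustained_gflops(interactions, flops_per_interaction, elapsed_s):
    """Achieved GFLOPS from a count of pairwise force evaluations.

    `interactions` is the total number of particle-particle evaluations
    performed during `elapsed_s` seconds; `flops_per_interaction` is an
    assumed operation count for one evaluation.
    """
    return interactions * flops_per_interaction / elapsed_s / 1e9


# Tesla M2050 under the assumptions above: 448 cores at 1.15 GHz.
M2050_PEAK_DP = peak_gflops(448, 1.15)
```

Under these assumptions `M2050_PEAK_DP` comes out at about 515 GFLOPS. The measured curve approaches this ceiling only when \(N\) is large enough to keep all cores busy.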