Preliminary results
We ran a set of experiments and presented them at the IAU Symposium #312, “Star Clusters and Black Holes in Galaxies Across Cosmic Time”, under the title “GraviDy, a modular, GPU-based, direct-summation N-BODY integrator” (C. Maureira-Fredes & P. Amaro-Seoane).
Experimental setup
The computational environment used for the following experiments was:
- CPU, Intel(R) Xeon(R) CPU X5650 @ 2.67GHz (24 cores)
- GPU, Tesla M2050 @ 575 MHz (448 cores)
- RAM, 24 GB
- OS, Scientific Linux release 6.4
Results
Globular cluster evolution
Lagrange radii of an N-body system with 1024 particles. The lines in the plot show the radii enclosing 5%, 10%, 15%, ..., 65% of the total mass. Core collapse is reached at \(T_{\rm cc} \approx 15\,T_{{\rm rh},\,t=0}\), with an initial half-mass relaxation time of \(T_{{\rm rh},\,t=0} = 20.24\) N-body units (NBU).
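
As an illustration of how Lagrange radii like the ones in this figure can be obtained from a snapshot, here is a minimal Python sketch. It is not GraviDy's own code: the function name `lagrange_radii`, the choice of the centre of mass as the reference point, and the input layout (a list of 3D positions plus a list of masses) are assumptions made for this example.

```python
import math

def lagrange_radii(positions, masses, fractions):
    """Radius enclosing each requested mass fraction.

    Assumption: radii are measured from the centre of mass; an
    integrator may use a density centre instead.
    """
    total = sum(masses)
    # Centre of mass of the snapshot.
    com = [sum(m * p[k] for p, m in zip(positions, masses)) / total
           for k in range(3)]
    # Particles sorted by distance from the centre of mass.
    shells = sorted((math.dist(p, com), m) for p, m in zip(positions, masses))
    radii = []
    for f in fractions:
        enclosed = 0.0
        for r, m in shells:
            enclosed += m
            # Small tolerance guards against floating-point round-off.
            if enclosed >= f * total - 1e-12:
                radii.append(r)
                break
    return radii

# Mass fractions matching the figure: 5%, 10%, ..., 65%.
FIGURE_FRACTIONS = [0.05 * k for k in range(1, 14)]
```

Calling `lagrange_radii(positions, masses, FIGURE_FRACTIONS)` on successive snapshots and plotting each fraction against time would give curves of the kind described in the caption.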

Cumulative energy error up to \(t=1\) NBU as a function of \(\eta\). The plots show Plummer spheres with different numbers of particles (N).
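
For readers who want to reproduce an energy-error measurement, the sketch below computes the total energy of a snapshot and the relative error with respect to the initial energy. It rests on assumptions that this page does not confirm: that \(\epsilon\) acts as a Plummer-type softening length, that \(G = 1\) in N-body units, and that the error is defined as \(|(E - E_0)/E_0|\). The function names are illustrative only.

```python
import math

def total_energy(positions, velocities, masses, eps=1e-4, G=1.0):
    """Kinetic plus softened potential energy of a snapshot.

    Assumptions: eps is a Plummer-type softening length and G = 1
    (N-body units). The O(N^2) pair loop mirrors direct summation.
    """
    kinetic = sum(0.5 * m * sum(c * c for c in v)
                  for v, m in zip(velocities, masses))
    potential = 0.0
    n = len(masses)
    for i in range(n):
        for j in range(i + 1, n):
            r2 = sum((a - b) ** 2
                     for a, b in zip(positions[i], positions[j]))
            potential -= G * masses[i] * masses[j] / math.sqrt(r2 + eps * eps)
    return kinetic + potential

def relative_energy_error(e_now, e_start):
    """Relative deviation of the current energy from the initial one."""
    return abs((e_now - e_start) / e_start)
```

Evaluating `relative_energy_error` at the end of a run for several values of \(\eta\) would give one point per run on a plot of this kind.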

Wall-clock time up to \(t=1\) NBU as a function of \(\eta\). The plots show Plummer spheres with different numbers of particles (N).

Performance
Wall-clock time of the integration from \(t=1\) to \(t=2\) NBU, using \(\eta = 0.01\) and \(\epsilon = 0.0001\), for different numbers of particles (N).

The following plot shows the speedup of five different parallel implementations relative to the single-thread base run.
- OpenMP, ...
- CPU + GPU, ...
- MPI-1, ...
- MPI-2, ...
- GPU, ...
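
A comparison against a single-thread base run is conventionally expressed as the ratio of wall-clock times. The following sketch shows that ratio and the derived parallel efficiency. The function names are hypothetical, and no timings from the experiments above are reproduced here.

```python
def speedup(serial_seconds, parallel_seconds):
    """Ratio of the single-thread wall-clock time to the parallel one."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return serial_seconds / parallel_seconds

def parallel_efficiency(serial_seconds, parallel_seconds, workers):
    """Speedup divided by the number of workers (threads, ranks or devices)."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup(serial_seconds, parallel_seconds) / workers
```

A value above 1 from `speedup` means the parallel run finished faster than the base run; an efficiency close to 1 means the extra workers were used almost fully.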

GPU performance of the gravitational interaction calculation, in GFLOPS, for different numbers of particles. The blue line at the top corresponds to the theoretical peak double-precision floating-point performance.
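
A GFLOPS figure for a direct-summation code is usually derived by counting pairwise interactions and multiplying by an assumed number of floating-point operations per interaction. The sketch below shows that bookkeeping. The operation count per interaction is a convention that this page does not state, so it is left as a required parameter, and the function names are hypothetical.

```python
def pairwise_interactions(n_active, n_total):
    """Interactions in one force evaluation.

    Assumption: every active particle interacts with all n_total
    particles, as in a direct-summation kernel.
    """
    return n_active * n_total

def achieved_gflops(n_interactions, flops_per_interaction, seconds):
    """Sustained performance in GFLOPS.

    flops_per_interaction is convention-dependent and must be
    supplied by the caller; no value is assumed here.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return n_interactions * flops_per_interaction / seconds / 1e9
```

Summing `pairwise_interactions` over all force evaluations in a run and passing the total to `achieved_gflops`, together with the measured kernel time, gives a number that can be compared with the hardware's theoretical peak.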
