[chibi@centos7 nbody]$ cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
[chibi@centos7 nbody]$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
[chibi@centos7 nbody]$ sudo hddtemp /dev/sda
[sudo] chibi password:
/dev/sda: ST2000LX001-1RG174: 18°C
[chibi@centos7 nbody]$ ./nbody --benchmark --numbodies=256000 -numdevices=2
Run "nbody -benchmark [-numbodies=]" to measure performance.
    -fullscreen   (run n-body simulation in fullscreen mode)
    -fp64         (use double precision floating point values for simulation)
    -hostmem      (stores simulation data in host memory)
    -benchmark    (run benchmark to measure performance)
    -numbodies=   (number of bodies (>= 1) to run in simulation)
    -device=      (where d=0,1,2.... for the CUDA device to use)
    -numdevices=  (where i=(number of CUDA devices > 0) to use for simulation)
    -compare      (compares simulation results running once on the default GPU and once on the CPU)
    -cpu          (run n-body simulation on the CPU)
    -tipsy=       (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

number of CUDA devices = 2
> Windowed mode
> Simulation data stored in system memory
> Single precision floating point simulation
> 2 Devices used for simulation
GPU Device 0: "TITAN V" with compute capability 7.0

> Compute 7.0 CUDA device: [TITAN V]
> Compute 7.0 CUDA device: [TITAN V]
number of bodies = 256000
256000 bodies, total time for 10 iterations: 744.869 ms
= 879.833 billion interactions per second
= 17596.655 single-precision GFLOP/s at 20 flops per interaction
[chibi@centos7 nbody]$ ./nbody --benchmark -fp64 --numbodies=256000 -numdevices= 2
Run "nbody -benchmark [-numbodies=]" to measure performance.
    -fullscreen   (run n-body simulation in fullscreen mode)
    -fp64         (use double precision floating point values for simulation)
    -hostmem      (stores simulation data in host memory)
    -benchmark    (run benchmark to measure performance)
    -numbodies=   (number of bodies (>= 1) to run in simulation)
    -device=      (where d=0,1,2.... for the CUDA device to use)
    -numdevices=  (where i=(number of CUDA devices > 0) to use for simulation)
    -compare      (compares simulation results running once on the default GPU and once on the CPU)
    -cpu          (run n-body simulation on the CPU)
    -tipsy=       (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

number of CUDA devices = 2
> Windowed mode
> Simulation data stored in system memory
> Double precision floating point simulation
> 2 Devices used for simulation
GPU Device 0: "TITAN V" with compute capability 7.0

> Compute 7.0 CUDA device: [TITAN V]
> Compute 7.0 CUDA device: [TITAN V]
number of bodies = 256000
256000 bodies, total time for 10 iterations: 2992.719 ms
= 218.985 billion interactions per second
= 6569.545 double-precision GFLOP/s at 30 flops per interaction
[chibi@centos7 nbody]$ cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
[chibi@centos7 nbody]$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
[chibi@centos7 nbody]$ sudo hddtemp /dev/sda
[sudo] chibi password:
/dev/sda: ST2000LX001-1RG174: 18°C
[chibi@centos7 nbody]$
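
The throughput figures in the benchmark output can be re-derived from the reported body count, iteration count, and elapsed time. The short Python sketch below does that arithmetic. It assumes every iteration evaluates all N x N body pairs, an assumption that does reproduce the logged numbers, and takes the flops-per-interaction factors (20 for single precision, 30 for double) straight from the log. The function name nbody_throughput is hypothetical and is not part of the CUDA sample.

```python
def nbody_throughput(num_bodies, iterations, total_ms, flops_per_interaction):
    """Return (billion interactions per second, GFLOP/s) for a benchmark run.

    Assumes an all-pairs scheme: num_bodies * num_bodies interactions
    per iteration (an assumption that matches the logged figures).
    """
    interactions = num_bodies * num_bodies * iterations
    seconds = total_ms / 1000.0
    giga_interactions = interactions / seconds / 1e9
    gflops = giga_interactions * flops_per_interaction
    return giga_interactions, gflops


# Values copied from the two benchmark runs in the log.
sp = nbody_throughput(256000, 10, 744.869, 20)   # single precision
dp = nbody_throughput(256000, 10, 2992.719, 30)  # double precision (-fp64)

print("single: %.3f billion interactions/s, %.3f GFLOP/s" % sp)
print("double: %.3f billion interactions/s, %.3f GFLOP/s" % dp)
print("fp64 run took %.2fx as long as the fp32 run" % (2992.719 / 744.869))
```

The printed values should land close to the figures in the log; small differences in the last digit can come from the elapsed time being rounded to three decimals in the program output.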