user0034@10-111-34-13:~/NVIDIA_CUDA-10.1_Samples/5_Simulations/nbody$ ./nbody -benchmark -numbodies=256000 -device=0
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
gpuDeviceInit() CUDA Device [0]: "Tesla V100-SXM2-16GB
> Compute 7.0 CUDA device: [Tesla V100-SXM2-16GB]
number of bodies = 256000
256000 bodies, total time for 10 iterations: 1450.994 ms
= 451.663 billion interactions per second
= 9033.258 single-precision GFLOP/s at 20 flops per interaction
user0034@10-111-34-13:~/NVIDIA_CUDA-10.1_Samples/5_Simulations/nbody$ ./nbody -fp64 -benchmark -numbodies=256000 -device=0
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

> Windowed mode
> Simulation data stored in video memory
> Double precision floating point simulation
> 1 Devices used for simulation
gpuDeviceInit() CUDA Device [0]: "Tesla V100-SXM2-16GB
> Compute 7.0 CUDA device: [Tesla V100-SXM2-16GB]
number of bodies = 256000
256000 bodies, total time for 10 iterations: 3783.535 ms
= 173.214 billion interactions per second
= 5196.411 double-precision GFLOP/s at 30 flops per interaction
user0034@10-111-34-13:~/NVIDIA_CUDA-10.1_Samples/5_Simulations/nbody$ cd
user0034@10-111-34-13:~$ cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.6 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.6 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
user0034@10-111-34-13:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
user0034@10-111-34-13:~$ nvidia-smi nvlink -c
GPU 0: Tesla V100-SXM2-16GB (UUID: GPU-645fd6fb-5884-66a2-d47c-238743a797c3)
user0034@10-111-34-13:~$
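
The throughput figures in the two benchmark runs can be cross-checked by hand. The sketch below assumes an all-pairs simulation, so each iteration counts N x N body-body interactions. The transcript does not state this, but the assumption reproduces the reported numbers to within rounding. The function name `nbody_throughput` is hypothetical. The flops-per-interaction values (20 for single precision, 30 for double) are taken from the output itself.

```python
def nbody_throughput(num_bodies, iterations, total_ms, flops_per_interaction):
    """Return (billions of interactions per second, GFLOP/s).

    Assumes an all-pairs run: num_bodies * num_bodies interactions per iteration.
    """
    seconds = total_ms / 1000.0
    interactions = num_bodies * num_bodies * iterations
    interactions_per_second = interactions / seconds
    gflops = interactions_per_second * flops_per_interaction / 1e9
    return interactions_per_second / 1e9, gflops


# Inputs copied from the two runs in the transcript.
fp32 = nbody_throughput(256000, 10, 1450.994, 20)
fp64 = nbody_throughput(256000, 10, 3783.535, 30)

print("single precision: %.3f billion interactions/s, %.3f GFLOP/s" % fp32)
print("double precision: %.3f billion interactions/s, %.3f GFLOP/s" % fp64)
```

Because the benchmark credits 30 flops per interaction in double precision against 20 in single, the ratio of reported GFLOP/s is smaller than the ratio of wall-clock times (3783.535 ms versus 1450.994 ms).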