CentOS 7.3 with one GTX 1050 Ti and three GTX 1070s: running the CUDA Samples nbody benchmark (11773.113 GFLOP/s)


[chibi@centos7 nbody]$ ./nbody -benchmark -numbodies=256000 -numdevices=4
Run "nbody -benchmark [-numbodies=]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies= (number of bodies (>= 1) to run in simulation)
-device= (where d=0,1,2…. for the CUDA device to use)
-numdevices= (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy= (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

number of CUDA devices = 4
> Windowed mode
> Simulation data stored in system memory
> Single precision floating point simulation
> 4 Devices used for simulation
GPU Device 0: "GeForce GTX 1070" with compute capability 6.1

> Compute 6.1 CUDA device: [GeForce GTX 1070]
> Compute 6.1 CUDA device: [GeForce GTX 1050 Ti]
> Compute 6.1 CUDA device: [GeForce GTX 1070]
> Compute 6.1 CUDA device: [GeForce GTX 1070]
number of bodies = 256000
256000 bodies, total time for 10 iterations: 1113.316 ms
= 588.656 billion interactions per second
= 11773.113 single-precision GFLOP/s at 20 flops per interaction
[chibi@centos7 nbody]$ nvidia-smi
[Image: CentOS 7.3, GTX 1050 Ti, GTX 1070 x3, CUDA Samples nbody benchmark, 11773.113 GFLOP/s]
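
The throughput figures printed by the benchmark can be re-derived from the body count, the iteration count, and the total time. The sketch below assumes the sample counts every body-to-body pair (N x N interactions per iteration), which is consistent with the numbers in the log above. The function name `nbody_throughput` is a hypothetical helper for illustration, not part of the CUDA Samples.

```python
def nbody_throughput(num_bodies, iterations, total_ms, flops_per_interaction=20):
    """Return (billion interactions per second, GFLOP/s) for an all-pairs n-body run.

    Assumption: each iteration evaluates num_bodies * num_bodies interactions.
    """
    interactions = num_bodies * num_bodies * iterations
    seconds = total_ms / 1000.0
    interactions_per_sec = interactions / seconds
    gflops = interactions_per_sec * flops_per_interaction / 1e9
    return interactions_per_sec / 1e9, gflops


# Values taken from the benchmark log above.
billions, gflops = nbody_throughput(256000, 10, 1113.316)
print(f"{billions:.3f} billion interactions per second")
print(f"{gflops:.3f} GFLOP/s at 20 flops per interaction")
```

Because the log prints the elapsed time rounded to three decimals, the recomputed GFLOP/s value can differ from the reported 11773.113 in the last digits.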
