[chibi@tiger deviceQuery]$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: "TITAN V"
  CUDA Driver Version / Runtime Version          10.0 / 10.0
  CUDA Capability Major/Minor version number:    7.0
  Total amount of global memory:                 12036 MBytes (12620988416 bytes)
  (80) Multiprocessors, ( 64) CUDA Cores/MP:     5120 CUDA Cores
  GPU Max Clock rate:                            1455 MHz (1.46 GHz)
  Memory Clock rate:                             850 Mhz
  Memory Bus Width:                              3072-bit
  L2 Cache Size:                                 4718592 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 7 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 130 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: "TITAN V"
  CUDA Driver Version / Runtime Version          10.0 / 10.0
  CUDA Capability Major/Minor version number:    7.0
  Total amount of global memory:                 12037 MBytes (12621381632 bytes)
  (80) Multiprocessors, ( 64) CUDA Cores/MP:     5120 CUDA Cores
  GPU Max Clock rate:                            1455 MHz (1.46 GHz)
  Memory Clock rate:                             850 Mhz
  Memory Bus Width:                              3072-bit
  L2 Cache Size:                                 4718592 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 7 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 131 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> Peer access from TITAN V (GPU0) -> TITAN V (GPU1) : Yes
> Peer access from TITAN V (GPU1) -> TITAN V (GPU0) : Yes

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 2
Result = PASS
[chibi@tiger deviceQuery]$ cd
[chibi@tiger ~]$ cd NVIDIA_CUDA-10.0_Samples/1_Utilities/bandwidthTest
[chibi@tiger bandwidthTest]$ ./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: TITAN V
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     11783.4

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     12898.5

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     547871.9

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
[chibi@tiger bandwidthTest]$ nvidia-smi
Wed Sep 26 11:03:39 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN V             Off  | 00000000:82:00.0  On |                  N/A |
| 36%   50C    P0    37W / 250W |    213MiB / 12036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN V             Off  | 00000000:83:00.0 Off |                  N/A |
| 28%   39C    P8    26W / 250W |     12MiB / 12036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2094      G   /usr/bin/X                                   114MiB |
|    0      3192      G   /usr/bin/gnome-shell                          85MiB |
+-----------------------------------------------------------------------------+
[chibi@tiger bandwidthTest]$ nvidia-smi
Wed Sep 26 11:04:03 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN V             Off  | 00000000:82:00.0  On |                  N/A |
| 35%   50C    P0    37W / 250W |    213MiB / 12036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN V             Off  | 00000000:83:00.0 Off |                  N/A |
| 28%   40C    P8    26W / 250W |     12MiB / 12036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2094      G   /usr/bin/X                                   114MiB |
|    0      3192      G   /usr/bin/gnome-shell                          85MiB |
+-----------------------------------------------------------------------------+
[chibi@tiger bandwidthTest]$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
[chibi@tiger bandwidthTest]$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
[chibi@tiger bandwidthTest]$ sensors
i350bb-pci-8100
Adapter: PCI adapter
loc1:           +51.0°C  (high = +120.0°C, crit = +110.0°C)

power_meter-acpi-0
Adapter: ACPI interface
power1:         4.29 MW  (interval = 1.00 s)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +29.0°C  (high = +85.0°C, crit = +95.0°C)
Core 0:         +23.0°C  (high = +85.0°C, crit = +95.0°C)
Core 1:         +24.0°C  (high = +85.0°C, crit = +95.0°C)
Core 2:         +23.0°C  (high = +85.0°C, crit = +95.0°C)
Core 3:         +25.0°C  (high = +85.0°C, crit = +95.0°C)
Core 4:         +25.0°C  (high = +85.0°C, crit = +95.0°C)
Core 5:         +23.0°C  (high = +85.0°C, crit = +95.0°C)
Core 8:         +24.0°C  (high = +85.0°C, crit = +95.0°C)
Core 9:         +23.0°C  (high = +85.0°C, crit = +95.0°C)
Core 10:        +24.0°C  (high = +85.0°C, crit = +95.0°C)
Core 11:        +24.0°C  (high = +85.0°C, crit = +95.0°C)
Core 12:        +24.0°C  (high = +85.0°C, crit = +95.0°C)
Core 13:        +23.0°C  (high = +85.0°C, crit = +95.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Physical id 1:  +32.0°C  (high = +85.0°C, crit = +95.0°C)
Core 0:         +26.0°C  (high = +85.0°C, crit = +95.0°C)
Core 1:         +25.0°C  (high = +85.0°C, crit = +95.0°C)
Core 2:         +26.0°C  (high = +85.0°C, crit = +95.0°C)
Core 3:         +25.0°C  (high = +85.0°C, crit = +95.0°C)
Core 4:         +26.0°C  (high = +85.0°C, crit = +95.0°C)
Core 5:         +26.0°C  (high = +85.0°C, crit = +95.0°C)
Core 8:         +25.0°C  (high = +85.0°C, crit = +95.0°C)
Core 9:         +25.0°C  (high = +85.0°C, crit = +95.0°C)
Core 10:        +25.0°C  (high = +85.0°C, crit = +95.0°C)
Core 11:        +26.0°C  (high = +85.0°C, crit = +95.0°C)
Core 12:        +25.0°C  (high = +85.0°C, crit = +95.0°C)
Core 13:        +25.0°C  (high = +85.0°C, crit = +95.0°C)

[chibi@tiger bandwidthTest]$