C:\Windows\system32>cd C:\Program Files\NVIDIA Corporation\NVSMI

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi
Sat Apr 17 17:29:57 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 466.11       Driver Version: 466.11       CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA TITAN RTX   WDDM  | 00000000:41:00.0  On |                  N/A |
| 41%   33C    P8    22W / 280W |   1095MiB / 24576MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA TITAN RTX   WDDM  | 00000000:61:00.0 Off |                  N/A |
| 41%   30C    P8    33W / 280W |   1095MiB / 24576MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2420    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      5900    C+G   ...underbird\thunderbird.exe    N/A      |
|    0   N/A  N/A      7152    C+G   ...artMenuExperienceHost.exe    N/A      |
|    0   N/A  N/A      7684    C+G   ...nputApp\TextInputHost.exe    N/A      |
|    0   N/A  N/A     11572    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    1   N/A  N/A      2420    C+G   Insufficient Permissions        N/A      |
|    1   N/A  N/A      5900    C+G   ...underbird\thunderbird.exe    N/A      |
|    1   N/A  N/A      7152    C+G   ...artMenuExperienceHost.exe    N/A      |
|    1   N/A  N/A      7684    C+G   ...nputApp\TextInputHost.exe    N/A      |
|    1   N/A  N/A     11572    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
+-----------------------------------------------------------------------------+

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi nvlink -c
GPU 0: NVIDIA TITAN RTX (UUID: GPU-7fb51c1d-c1e7-35cc-aad7-66971f05ddb7)
         Link 0, P2P is supported: true
         Link 0, Access to system memory supported: true
         Link 0, P2P atomics supported: true
         Link 0, System memory atomics supported: true
         Link 0, SLI is supported: true
         Link 0, Link is supported: false
         Link 1, P2P is supported: true
         Link 1, Access to system memory supported: true
         Link 1, P2P atomics supported: true
         Link 1, System memory atomics supported: true
         Link 1, SLI is supported: true
         Link 1, Link is supported: false
GPU 1: NVIDIA TITAN RTX (UUID: GPU-5a71d61e-f130-637a-b33d-4df555b0ed88)
         Link 0, P2P is supported: true
         Link 0, Access to system memory supported: true
         Link 0, P2P atomics supported: true
         Link 0, System memory atomics supported: true
         Link 0, SLI is supported: true
         Link 0, Link is supported: false
         Link 1, P2P is supported: true
         Link 1, Access to system memory supported: true
         Link 1, P2P atomics supported: true
         Link 1, System memory atomics supported: true
         Link 1, SLI is supported: true
         Link 1, Link is supported: false

C:\Program Files\NVIDIA Corporation\NVSMI>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:24:09_Pacific_Daylight_Time_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0

C:\Program Files\NVIDIA Corporation\NVSMI>

C:\Windows\system32>cd C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.3\bin\win64\Debug

C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.3\bin\win64\Debug>p2pBandwidthLatencyTest
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, NVIDIA TITAN RTX, pciBusID: 41, pciDeviceID: 0, pciDomainID:0
Device: 1, NVIDIA TITAN RTX, pciBusID: 61, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=1 CAN Access Peer Device=0

***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

P2P Connectivity Matrix
   D\D      0      1
     0      1      1
     1      1      1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D      0      1
     0 560.46   9.67
     1  10.64 559.84
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
   D\D      0      1
     0 547.68  46.05
     1  47.09 560.44
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D      0      1
     0 548.89  10.83
     1  11.39 552.37
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D      0      1
     0 549.73  93.04
     1  92.40 550.95
P2P=Disabled Latency Matrix (us)
   GPU      0      1
     0   4.55 164.51
     1 154.15   3.38

   CPU      0      1
     0   2.36  50.61
     1  47.51   2.19
P2P=Enabled Latency (P2P Writes) Matrix (us)
   GPU      0      1
     0   3.42   2.49
     1   2.48   3.38

   CPU      0      1
     0   2.69   1.27
     1   1.38   2.14

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.3\bin\win64\Debug>
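
The off-diagonal entries of the matrices above are the cross-GPU figures, and comparing them with P2P disabled and enabled may be easier as ratios. The following is a minimal Python sketch of that arithmetic. The numbers are copied from the p2pBandwidthLatencyTest output above; the variable and function names (`UNI_OFF`, `ratio`, `summarize`, and so on) are illustrative assumptions and not part of any NVIDIA tool. Keys are (row, column) positions in the printed matrices.

```python
# Cross-GPU (off-diagonal) figures copied from the transcript above.
# Keys are (row, column) positions in the printed matrices.
UNI_OFF = {(0, 1): 9.67, (1, 0): 10.64}     # GB/s, unidirectional, P2P disabled
UNI_ON = {(0, 1): 46.05, (1, 0): 47.09}     # GB/s, unidirectional, P2P enabled
BI_OFF = {(0, 1): 10.83, (1, 0): 11.39}     # GB/s, bidirectional, P2P disabled
BI_ON = {(0, 1): 93.04, (1, 0): 92.40}      # GB/s, bidirectional, P2P enabled
LAT_OFF = {(0, 1): 164.51, (1, 0): 154.15}  # us, GPU latency matrix, P2P disabled
LAT_ON = {(0, 1): 2.49, (1, 0): 2.48}       # us, GPU latency matrix, P2P enabled


def ratio(numerator, denominator):
    """Divide matching entries of two matrices given as {(row, col): value}."""
    return {pair: numerator[pair] / denominator[pair] for pair in numerator}


def summarize():
    """Return the enabled-versus-disabled ratios for each measurement."""
    return {
        "unidirectional bandwidth gain": ratio(UNI_ON, UNI_OFF),
        "bidirectional bandwidth gain": ratio(BI_ON, BI_OFF),
        "GPU latency reduction": ratio(LAT_OFF, LAT_ON),
    }


if __name__ == "__main__":
    for label, ratios in summarize().items():
        for (row, col), value in sorted(ratios.items()):
            print(f"{label}, [{row}][{col}]: {value:.1f}x")
```

As the sample's own note says, these figures are not meant for performance measurement, so the ratios are only a rough indication.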