chibi@1804:~$ cat /etc/os-release NAME="Ubuntu" VERSION="18.04.2 LTS (Bionic Beaver)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 18.04.2 LTS" VERSION_ID="18.04" HOME_URL="https://www.ubuntu.com/" SUPPORT_URL="https://help.ubuntu.com/" BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" VERSION_CODENAME=bionic UBUNTU_CODENAME=bionic chibi@1804:~$ nvcc -V nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2019 NVIDIA Corporation Built on Fri_Feb__8_19:08:17_PST_2019 Cuda compilation tools, release 10.1, V10.1.105 chibi@1804:~/NVIDIA_CUDA-10.1_Samples/1_Utilities/p2pBandwidthLatencyTest$ ./p2pBandwidthLatencyTest [P2P (Peer-to-Peer) GPU Bandwidth Latency Test] Device: 0, GeForce RTX 2080 Ti, pciBusID: 42, pciDeviceID: 0, pciDomainID:0 Device: 1, GeForce RTX 2080 Ti, pciBusID: 43, pciDeviceID: 0, pciDomainID:0 Device=0 CAN Access Peer Device=1 Device=1 CAN Access Peer Device=0 ***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure. So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases. P2P Connectivity Matrix D\D 0 1 0 1 1 1 1 1 Unidirectional P2P=Disabled Bandwidth Matrix (GB/s) D\D 0 1 0 528.52 4.18 1 4.17 527.72 Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s) D\D 0 1 0 530.67 47.00 1 46.94 507.31 Bidirectional P2P=Disabled Bandwidth Matrix (GB/s) D\D 0 1 0 534.45 7.72 1 7.78 532.24 Bidirectional P2P=Enabled Bandwidth Matrix (GB/s) D\D 0 1 0 524.11 93.71 1 93.79 528.95 P2P=Disabled Latency Matrix (us) GPU 0 1 0 1.31 16.16 1 21.55 1.66 CPU 0 1 0 3.77 9.87 1 9.87 3.73 P2P=Enabled Latency (P2P Writes) Matrix (us) GPU 0 1 0 1.32 1.59 1 1.87 1.66 CPU 0 1 0 3.72 2.89 1 2.89 3.70 NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled. chibi@1804:~/NVIDIA_CUDA-10.1_Samples/1_Utilities/p2pBandwidthLatencyTest$ cd ~/NVIDIA_CUDA-10.1_Samples/1_Utilities/deviceQuery chibi@1804:~/NVIDIA_CUDA-10.1_Samples/1_Utilities/deviceQuery$ ./deviceQuery ./deviceQuery Starting... CUDA Device Query (Runtime API) version (CUDART static linking) Detected 2 CUDA Capable device(s) Device 0: "GeForce RTX 2080 Ti" CUDA Driver Version / Runtime Version 10.1 / 10.1 CUDA Capability Major/Minor version number: 7.5 Total amount of global memory: 10989 MBytes (11523260416 bytes) (68) Multiprocessors, ( 64) CUDA Cores/MP: 4352 CUDA Cores GPU Max Clock rate: 1635 MHz (1.63 GHz) Memory Clock rate: 7000 Mhz Memory Bus Width: 352-bit L2 Cache Size: 5767168 bytes Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384) Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 65536 Warp size: 32 Maximum number of threads per multiprocessor: 1024 Maximum number of threads per block: 1024 Max dimension size of a thread block (x,y,z): (1024, 1024, 64) Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535) Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Concurrent copy and kernel execution: Yes with 3 copy engine(s) Run time limit on kernels: No Integrated GPU sharing Host Memory: No Support host page-locked memory mapping: Yes Alignment requirement for Surfaces: Yes Device has ECC support: Disabled Device supports Unified Addressing (UVA): Yes Device supports Compute Preemption: Yes Supports Cooperative Kernel Launch: Yes Supports MultiDevice Co-op Kernel Launch: Yes Device PCI Domain ID / Bus ID / location ID: 0 / 66 / 0 Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) > Device 1: "GeForce RTX 2080 Ti" CUDA Driver Version / Runtime Version 10.1 / 10.1 CUDA Capability Major/Minor version number: 7.5 Total amount of global memory: 10988 MBytes (11521884160 bytes) (68) Multiprocessors, ( 64) CUDA Cores/MP: 4352 CUDA Cores GPU Max Clock rate: 1635 MHz (1.63 GHz) Memory Clock rate: 7000 Mhz Memory Bus Width: 352-bit L2 Cache Size: 5767168 bytes Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384) Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 65536 Warp size: 32 Maximum number of threads per multiprocessor: 1024 Maximum number of threads per block: 1024 Max dimension size of a thread block (x,y,z): (1024, 1024, 64) Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535) Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Concurrent copy and kernel execution: Yes with 3 copy engine(s) Run time limit on kernels: Yes Integrated GPU sharing Host Memory: No Support host page-locked memory mapping: Yes Alignment requirement for Surfaces: Yes Device has ECC support: Disabled Device supports Unified Addressing (UVA): Yes Device supports Compute Preemption: Yes Supports Cooperative Kernel Launch: Yes Supports MultiDevice Co-op Kernel Launch: Yes Device PCI Domain ID / Bus ID / location ID: 0 / 67 / 0 Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) > > Peer access from GeForce RTX 2080 Ti (GPU0) -> GeForce RTX 2080 Ti (GPU1) : Yes > Peer access from GeForce RTX 2080 Ti (GPU1) -> GeForce RTX 2080 Ti (GPU0) : Yes deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 2 Result = PASS chibi@1804:~/NVIDIA_CUDA-10.1_Samples/1_Utilities/deviceQuery$ cd ~/NVIDIA_CUDA-10.1_Samples/1_Utilities/bandwidthTest chibi@1804:~/NVIDIA_CUDA-10.1_Samples/1_Utilities/bandwidthTest$ ./bandwidthTest [CUDA Bandwidth Test] - Starting... Running on... Device 0: GeForce RTX 2080 Ti Quick Mode Host to Device Bandwidth, 1 Device(s) PINNED Memory Transfers Transfer Size (Bytes) Bandwidth(GB/s) 32000000 6.5 Device to Host Bandwidth, 1 Device(s) PINNED Memory Transfers Transfer Size (Bytes) Bandwidth(GB/s) 32000000 6.6 Device to Device Bandwidth, 1 Device(s) PINNED Memory Transfers Transfer Size (Bytes) Bandwidth(GB/s) 32000000 518.1 Result = PASS NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled. chibi@1804:~/NVIDIA_CUDA-10.1_Samples/1_Utilities/bandwidthTest$ cd chibi@1804:~$ nvidia-smi Sat Mar 9 18:25:25 2019 +-----------------------------------------------------------------------------+ | NVIDIA-SMI 418.39 Driver Version: 418.39 CUDA Version: 10.1 | |-------------------------------+----------------------+----------------------+ | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | |===============================+======================+======================| | 0 GeForce RTX 208... On | 00000000:42:00.0 Off | N/A | | 41% 28C P8 20W / 260W | 1MiB / 10989MiB | 0% Default | +-------------------------------+----------------------+----------------------+ | 1 GeForce RTX 208... On | 00000000:43:00.0 On | N/A | | 41% 35C P8 18W / 260W | 398MiB / 10988MiB | 0% Default | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: GPU Memory | | GPU PID Type Process name Usage | |=============================================================================| | 1 1675 G /usr/lib/xorg/Xorg 208MiB | | 1 1808 G /usr/bin/gnome-shell 188MiB | +-----------------------------------------------------------------------------+ chibi@1804:~$ nvidia-smi Sat Mar 9 18:25:34 2019 +-----------------------------------------------------------------------------+ | NVIDIA-SMI 418.39 Driver Version: 418.39 CUDA Version: 10.1 | |-------------------------------+----------------------+----------------------+ | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | |===============================+======================+======================| | 0 GeForce RTX 208... On | 00000000:42:00.0 Off | N/A | | 40% 28C P8 20W / 260W | 1MiB / 10989MiB | 0% Default | +-------------------------------+----------------------+----------------------+ | 1 GeForce RTX 208... On | 00000000:43:00.0 On | N/A | | 41% 35C P8 18W / 260W | 398MiB / 10988MiB | 0% Default | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: GPU Memory | | GPU PID Type Process name Usage | |=============================================================================| | 1 1675 G /usr/lib/xorg/Xorg 208MiB | | 1 1808 G /usr/bin/gnome-shell 188MiB | +-----------------------------------------------------------------------------+ chibi@1804:~$ sensors iwlwifi-virtual-0 Adapter: Virtual device temp1: +28.0C k10temp-pci-00d3 Adapter: PCI adapter Tdie: +31.5C (high = +70.0C) Tctl: +58.5C k10temp-pci-00c3 Adapter: PCI adapter Tdie: +31.9C (high = +70.0C) Tctl: +58.9C k10temp-pci-00db Adapter: PCI adapter Tdie: +30.8C (high = +70.0C) Tctl: +57.8C k10temp-pci-00cb Adapter: PCI adapter Tdie: +31.9C (high = +70.0C) Tctl: +58.9C chibi@1804:~$ sudo nvme list [sudo] chibi ̃pX[h: Node SN Model Namespace Usage Format FW Rev ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- -------- /dev/nvme0n1 P02811102935 PLEXTOR PX-256M9PeG 1 256.06 GB / 256.06 GB 512 B + 0 B 1.06 chibi@1804:~$ sudo nvme smart-log /dev/nvme0n1 Smart Log for NVME device:nvme0n1 namespace-id:ffffffff critical_warning : 0 temperature : 25 C available_spare : 100% available_spare_threshold : 0% percentage_used : 0% data_units_read : 350,017 data_units_written : 1,026,322 host_read_commands : 6,356,163 host_write_commands : 6,208,030 controller_busy_time : 781 power_cycles : 231 power_on_hours : 67 unsafe_shutdowns : 107 media_errors : 0 num_err_log_entries : 0 Warning Temperature Time : 0 Critical Composite Temperature Time : 0 Temperature Sensor 1 : 25 C Thermal Management T1 Trans Count : 0 Thermal Management T2 Trans Count : 0 Thermal Management T1 Total Time : 0 Thermal Management T2 Total Time : 0 chibi@1804:~$ cat /etc/os-release NAME="Ubuntu" VERSION="18.04.2 LTS (Bionic Beaver)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 18.04.2 LTS" VERSION_ID="18.04" HOME_URL="https://www.ubuntu.com/" SUPPORT_URL="https://help.ubuntu.com/" BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" VERSION_CODENAME=bionic UBUNTU_CODENAME=bionic chibi@1804:~$ nvcc -V nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2019 NVIDIA Corporation Built on Fri_Feb__8_19:08:17_PST_2019 Cuda compilation tools, release 10.1, V10.1.105 chibi@1804:~$ nvidia-smi nvlink -c GPU 0: GeForce RTX 2080 Ti (UUID: GPU-1ac935c2-557f-282e-14e5-3f749ffd63ac) Link 0, P2P is supported: true Link 0, Access to system memory supported: true Link 0, P2P atomics supported: true Link 0, System memory atomics supported: true Link 0, SLI is supported: true Link 0, Link is supported: false Link 1, P2P is supported: true Link 1, Access to system memory supported: true Link 1, P2P atomics supported: true Link 1, System memory atomics supported: true Link 1, SLI is supported: true Link 1, Link is supported: false GPU 1: GeForce RTX 2080 Ti (UUID: GPU-13277ce5-e1e9-0cb1-8cee-6c9e6618e774) Link 0, P2P is supported: true Link 0, Access to system memory supported: true Link 0, P2P atomics supported: true Link 0, System memory atomics supported: true Link 0, SLI is supported: true Link 0, Link is supported: false Link 1, P2P is supported: true Link 1, Access to system memory supported: true Link 1, P2P atomics supported: true Link 1, System memory atomics supported: true Link 1, SLI is supported: true Link 1, Link is supported: false chibi@1804:~$