{"id":212,"date":"2020-02-02T05:03:44","date_gmt":"2020-02-01T20:03:44","guid":{"rendered":"https:\/\/wp.study3.biz\/?p=212"},"modified":"2020-02-02T05:04:56","modified_gmt":"2020-02-01T20:04:56","slug":"ubuntu16-04-5-tesla-v100-sxm2-16gb-x4-cuda-9-0-samples-simplep2p-p2pbandwidthlatencytest-%e4%bb%96%e3%82%92%e5%8b%95%e4%bd%9c%e3%81%95%e3%81%9b%e3%81%a6%e3%81%bf%e3%81%9f","status":"publish","type":"post","link":"https:\/\/wp.study3.biz\/?p=212","title":{"rendered":"Trying out CUDA 9.0 Samples simpleP2P, p2pBandwidthLatencyTest, and others on Ubuntu16.04.5 with TESLA V100-SXM2 16GB x4"},"content":{"rendered":"<p>user0018@10-111-146-12:~\/cuda-samples-9.0\/samples\/0_Simple\/simpleP2P$ .\/simpleP2P<br \/>\n[.\/simpleP2P] &#8211; Starting&#8230;<br \/>\nChecking for multiple GPUs&#8230;<br \/>\nCUDA-capable device count: 4<br \/>\n&gt; GPU0 = &#8220;Tesla V100-SXM2-16GB&#8221; IS capable of Peer-to-Peer (P2P)<br \/>\n&gt; GPU1 = &#8220;Tesla V100-SXM2-16GB&#8221; IS capable of Peer-to-Peer (P2P)<br \/>\n&gt; GPU2 = &#8220;Tesla V100-SXM2-16GB&#8221; IS capable of Peer-to-Peer (P2P)<br \/>\n&gt; GPU3 = &#8220;Tesla V100-SXM2-16GB&#8221; IS capable of Peer-to-Peer (P2P)<\/p>\n<p>Checking GPU(s) for support of peer to peer memory access&#8230;<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU0) -&gt; Tesla V100-SXM2-16GB (GPU1) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU0) -&gt; Tesla V100-SXM2-16GB (GPU2) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU0) -&gt; Tesla V100-SXM2-16GB (GPU3) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU1) -&gt; Tesla V100-SXM2-16GB (GPU0) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU1) -&gt; Tesla V100-SXM2-16GB (GPU2) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU1) -&gt; Tesla V100-SXM2-16GB (GPU3) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU2) -&gt; Tesla V100-SXM2-16GB (GPU0) : Yes<br \/>\n&gt; Peer access from 
Tesla V100-SXM2-16GB (GPU2) -&gt; Tesla V100-SXM2-16GB (GPU1) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU2) -&gt; Tesla V100-SXM2-16GB (GPU3) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU3) -&gt; Tesla V100-SXM2-16GB (GPU0) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU3) -&gt; Tesla V100-SXM2-16GB (GPU1) : Yes<br \/>\n&gt; Peer access from Tesla V100-SXM2-16GB (GPU3) -&gt; Tesla V100-SXM2-16GB (GPU2) : Yes<br \/>\nEnabling peer access between GPU0 and GPU1&#8230;<br \/>\nChecking GPU0 and GPU1 for UVA capabilities&#8230;<br \/>\n&gt; Tesla V100-SXM2-16GB (GPU0) supports UVA: Yes<br \/>\n&gt; Tesla V100-SXM2-16GB (GPU1) supports UVA: Yes<br \/>\nBoth GPUs can support UVA, enabling&#8230;<br \/>\nAllocating buffers (64MB on GPU0, GPU1 and CPU Host)&#8230;<br \/>\nCreating event handles&#8230;<br \/>\ncudaMemcpyPeer \/ cudaMemcpy between GPU0 and GPU1: <strong>44.84GB\/s<\/strong><br \/>\nPreparing host buffer and memcpy to GPU0&#8230;<br \/>\nRun kernel on GPU1, taking source data from GPU0 and writing to GPU1&#8230;<br \/>\nRun kernel on GPU0, taking source data from GPU1 and writing to GPU0&#8230;<br \/>\nCopy data back to host from GPU0 and verify results&#8230;<br \/>\nDisabling peer access&#8230;<br \/>\nShutting down&#8230;<br \/>\nTest passed<br \/>\nuser0018@10-111-146-12:~\/cuda-samples-9.0\/samples\/0_Simple\/simpleP2P$ cd ~\/cuda-samples-9.0\/samples\/1_Utilities<br \/>\nuser0018@10-111-146-12:~\/cuda-samples-9.0\/samples\/1_Utilities$ ls<br \/>\nbandwidthTest deviceQueryDrv topologyQuery<br \/>\ndeviceQuery p2pBandwidthLatencyTest<br \/>\nuser0018@10-111-146-12:~\/cuda-samples-9.0\/samples\/1_Utilities$ cd p2pBandwidthLatencyTest<br \/>\nuser0018@10-111-146-12:~\/cuda-samples-9.0\/samples\/1_Utilities\/p2pBandwidthLatencyTest$ .\/p2pBandwidthLatencyTest<br \/>\n[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]<br \/>\nDevice: 0, Tesla V100-SXM2-16GB, pciBusID: 61, pciDeviceID: 0, pciDomainID:0<br 
\/>\nDevice: 1, Tesla V100-SXM2-16GB, pciBusID: 62, pciDeviceID: 0, pciDomainID:0<br \/>\nDevice: 2, Tesla V100-SXM2-16GB, pciBusID: 89, pciDeviceID: 0, pciDomainID:0<br \/>\nDevice: 3, Tesla V100-SXM2-16GB, pciBusID: 8a, pciDeviceID: 0, pciDomainID:0<br \/>\nDevice=0 CAN Access Peer Device=1<br \/>\nDevice=0 CAN Access Peer Device=2<br \/>\nDevice=0 CAN Access Peer Device=3<br \/>\nDevice=1 CAN Access Peer Device=0<br \/>\nDevice=1 CAN Access Peer Device=2<br \/>\nDevice=1 CAN Access Peer Device=3<br \/>\nDevice=2 CAN Access Peer Device=0<br \/>\nDevice=2 CAN Access Peer Device=1<br \/>\nDevice=2 CAN Access Peer Device=3<br \/>\nDevice=3 CAN Access Peer Device=0<br \/>\nDevice=3 CAN Access Peer Device=1<br \/>\nDevice=3 CAN Access Peer Device=2<\/p>\n<p>***NOTE: In case a device doesn&#8217;t have P2P access to other one, it falls back to normal memcopy procedure.<br \/>\nSo you can see lesser Bandwidth (GB\/s) in those cases.<\/p>\n<p>P2P Connectivity Matrix<br \/>\nD\\D 0 1 2 3<br \/>\n0 1 1 1 1<br \/>\n1 1 1 1 1<br \/>\n2 1 1 1 1<br \/>\n3 1 1 1 1<br \/>\nUnidirectional P2P=Disabled Bandwidth Matrix (GB\/s)<br \/>\nD\\D 0 1 2 3<br \/>\n0 748.32 9.92 11.02 11.03<br \/>\n1 9.96 742.63 11.02 11.07<br \/>\n2 11.05 11.03 744.05 9.94<br \/>\n3 11.04 11.01 9.92 745.47<br \/>\nUnidirectional P2P=Enabled Bandwidth Matrix (GB\/s)<br \/>\nD\\D 0 1 2 3<br \/>\n0 744.05 47.91 47.95 47.89<br \/>\n1 47.90 745.47 47.96 47.96<br \/>\n2 47.91 47.97 746.89 48.00<br \/>\n3 47.97 47.90 47.97 745.47<br \/>\nBidirectional P2P=Disabled Bandwidth Matrix (GB\/s)<br \/>\nD\\D 0 1 2 3<br \/>\n0 764.43 10.38 18.92 17.99<br \/>\n1 10.42 765.93 18.08 17.56<br \/>\n2 18.79 18.23 763.69 10.41<br \/>\n3 17.86 17.49 10.43 762.94<br \/>\nBidirectional P2P=Enabled Bandwidth Matrix (GB\/s)<br \/>\nD\\D 0 1 2 3<br \/>\n0 762.20 95.58 95.67 95.58<br \/>\n1 95.37 765.93 95.67 95.65<br \/>\n2 95.37 95.39 765.93 95.44<br \/>\n3 95.48 95.46 95.51 769.70<br \/>\nP2P=Disabled Latency Matrix (us)<br 
\/>\nD\\D 0 1 2 3<br \/>\n0 4.21 20.29 19.84 19.75<br \/>\n1 20.36 4.27 20.36 20.39<br \/>\n2 20.30 20.31 3.92 18.73<br \/>\n3 19.48 19.49 18.69 3.26<br \/>\nP2P=Enabled Latency Matrix (us)<br \/>\nD\\D 0 1 2 3<br \/>\n0 4.14 7.38 7.32 7.02<br \/>\n1 8.32 4.37 7.45 7.52<br \/>\n2 6.80 6.84 4.04 6.79<br \/>\n3 6.81 6.75 6.88 3.39<\/p>\n<p>NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.<br \/>\nuser0018@10-111-146-12:~\/cuda-samples-9.0\/samples\/1_Utilities\/p2pBandwidthLatencyTest$<\/p>\n<p><a href=\"https:\/\/wp.study3.biz\/wp-content\/uploads\/2020\/01\/Ubuntu16.04.5-TESLA-V100-SXM2-16GB-x4-CUDA-9.0-Samples-simpleP2P-p2pBandwidthLatencyTest\u3000\u4ed6.txt\">Ubuntu16.04.5 TESLA V100-SXM2 16GB x4 CUDA 9.0 Samples simpleP2P p2pBandwidthLatencyTest and others<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>user0018@10-111-146-12:~\/cuda-samples-9.0\/samples\/0_Simple\/simpleP2P$ .\/simpleP2P [.\/simpleP2P] &#8211; Starti &hellip; <a href=\"https:\/\/wp.study3.biz\/?p=212\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[18,17],"tags":[],"class_list":["post-212","post","type-post","status-publish","format-standard","hentry","category-nvidia","category-ubuntu"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/wp.study3.biz\/index.php?rest_route=\/wp\/v2\/posts\/212","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.study3.biz\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wp.study3.biz\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wp.study3.biz\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.study3.biz\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=212"}],"version-history":[{"count":4,"href":"https:\/\/wp.study3.biz\/index.php?rest_route=\/wp\/v2\/posts\/212\/revisions"}],"predecessor-version":[{"id":219,"href":"https:\/\/wp.study3.biz\/index.php?rest_route=\/wp\/v2\/posts\/212\/revisions\/219"}],"wp:attachment":[{"href":"https:\/\/wp.study3.biz\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=212"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wp.study3.biz\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=212"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wp.study3.biz\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=212"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}