chibi@1804:~$ sudo nvidia-docker run -it --rm nvcr.io/hpc/namd:2.12-171025 /opt/namd/namd-multicore-memopt +p40 +setcpuaffinity +idlepoll /workspace/examples/stmv/stmv_pmecuda.namd
Unable to find image 'nvcr.io/hpc/namd:2.12-171025' locally
2.12-171025: Pulling from hpc/namd
f6fa9a861b90: Pull complete
2d93875543ec: Pull complete
407421ef3e7e: Pull complete
ea9ffec33008: Pull complete
c695ce24f66e: Pull complete
cb6e6f26f62f: Pull complete
4ca5cacd5888: Pull complete
127359e380ae: Pull complete
09f52fb90f32: Pull complete
c8b4fccff7c3: Pull complete
a898f5b12168: Pull complete
0c9fa151e12b: Pull complete
e9f9e8f970e4: Pull complete
6cb19c6e7375: Pull complete
76db9e80d16b: Pull complete
Digest: sha256:c9184f9b071f2197f20a0064ed47f9dc8deac9f007e03d384ecbaad28a754124
Status: Downloaded newer image for nvcr.io/hpc/namd:2.12-171025
Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode: 40 threads
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID: v6.8.2
Warning> Randomization of virtual memory (ASLR) is turned on in the kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try running with '+isomalloc_sync'.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> cpu affinity enabled.
Charm++> Running on 1 unique compute nodes (64-way SMP).
Charm++> cpu topology info is gathered in 0.033 seconds.
Info: Built with CUDA version 9000
Did not find +devices i,j,k,... argument, using all
Pe 22 physical rank 22 will use CUDA device of pe 32
Pe 23 physical rank 23 will use CUDA device of pe 32
Pe 7 physical rank 7 will use CUDA device of pe 16
Pe 36 physical rank 36 will use CUDA device of pe 32
Pe 13 physical rank 13 will use CUDA device of pe 16
Pe 39 physical rank 39 will use CUDA device of pe 32
Pe 14 physical rank 14 will use CUDA device of pe 16
Pe 27 physical rank 27 will use CUDA device of pe 32
Pe 4 physical rank 4 will use CUDA device of pe 16
Pe 30 physical rank 30 will use CUDA device of pe 32
Pe 1 physical rank 1 will use CUDA device of pe 16
Pe 33 physical rank 33 will use CUDA device of pe 32
Pe 37 physical rank 37 will use CUDA device of pe 32
Pe 5 physical rank 5 will use CUDA device of pe 16
Pe 34 physical rank 34 will use CUDA device of pe 32
Pe 2 physical rank 2 will use CUDA device of pe 16
Pe 38 physical rank 38 will use CUDA device of pe 32
Pe 3 physical rank 3 will use CUDA device of pe 16
Pe 9 physical rank 9 will use CUDA device of pe 16
Pe 15 physical rank 15 will use CUDA device of pe 16
Pe 11 physical rank 11 will use CUDA device of pe 16
Pe 35 physical rank 35 will use CUDA device of pe 32
Pe 12 physical rank 12 will use CUDA device of pe 16
Pe 6 physical rank 6 will use CUDA device of pe 16
Pe 25 physical rank 25 will use CUDA device of pe 32
Pe 29 physical rank 29 will use CUDA device of pe 32
Pe 24 physical rank 24 will use CUDA device of pe 32
Pe 10 physical rank 10 will use CUDA device of pe 16
Pe 21 physical rank 21 will use CUDA device of pe 32
Pe 28 physical rank 28 will use CUDA device of pe 32
Pe 8 physical rank 8 will use CUDA device of pe 16
Pe 0 physical rank 0 will use CUDA device of pe 16
Pe 26 physical rank 26 will use CUDA device of pe 32
Pe 31 physical rank 31 will use CUDA device of pe 32
Pe 20 physical rank 20 will use CUDA device of pe 32
Pe 17 physical rank 17 will use CUDA device of pe 16
Pe 19 physical rank 19 will use CUDA device of pe 16
Pe 18 physical rank 18 will use CUDA device of pe 16
Pe 32 physical rank 32 binding to CUDA device 1 on 344ba69a45c4: 'TITAN RTX' Mem: 24219MB Rev: 7.5
Pe 16 physical rank 16 binding to CUDA device 0 on 344ba69a45c4: 'TITAN RTX' Mem: 24220MB Rev: 7.5
Info: Benchmark time: 40 CPUs 0.0407352 s/step 0.471472 days/ns
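For reference, the days/ns figure in the benchmark line follows directly from the reported s/step. The sketch below is a minimal calculation, assuming the STMV config uses a 1 fs timestep (an assumption, but it reproduces the logged 0.471472 days/ns exactly).

```python
# Sketch: convert NAMD's "s/step" benchmark figure into days/ns and ns/day.
# Assumes a 1 fs timestep (assumption; consistent with the value logged above).

SECONDS_PER_DAY = 86400.0

def days_per_ns(sec_per_step: float, timestep_fs: float = 1.0) -> float:
    """Wall-clock days needed to simulate 1 ns of trajectory."""
    steps_per_ns = 1e6 / timestep_fs   # 1 ns = 1,000,000 fs
    return sec_per_step * steps_per_ns / SECONDS_PER_DAY

if __name__ == "__main__":
    spd = days_per_ns(0.0407352)       # s/step from the benchmark line above
    print(f"{spd:.6f} days/ns")        # ~0.471472, matching the log
    print(f"{1.0 / spd:.3f} ns/day")   # ~2.121 ns/day
```

Equivalently, about 2.1 ns of simulated trajectory per day of wall-clock time on this 40-thread, 2-GPU run.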
References