[chibi@centos8 ~]$ sudo nvidia-docker run --rm -ti nvcr.io/nvidia/tensorflow:19.04-py3
Unable to find image 'nvcr.io/nvidia/tensorflow:19.04-py3' locally
19.04-py3: Pulling from nvidia/tensorflow
34667c7e4631: Pulling fs layer
d18d76a881a4: Pulling fs layer
119c7358fbfc: Pulling fs layer
[... remaining image layers pulled ...]
3264ea403ca9: Pull complete
Digest: sha256:aaebc136d5d50937362675c77afd908bd96cded68846f39163050a023c8a9851
Status: Downloaded newer image for nvcr.io/nvidia/tensorflow:19.04-py3

================
== TensorFlow ==
================

NVIDIA Release 19.04 (build 6132408)
TensorFlow Version 1.13.1

Container image Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
Copyright 2017-2019 The TensorFlow Authors. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.

NOTE: MOFED driver for multi-node communication was not detected.
      Multi-node communication performance may be reduced.

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
      insufficient for TensorFlow. NVIDIA recommends the use of the following flags:
      nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

root@84792c578a3b:/workspace# ls
README.md  docker-examples  nvidia-examples
root@84792c578a3b:/workspace# cd nvidia-examples
root@84792c578a3b:/workspace/nvidia-examples# ls
NCF              bert                 cnn           ssdv1.2
OpenSeq2Seq      big_lstm             gnmt_v2       tensorrt
UNet_Industrial  build_imagenet_data  resnet50v1.5
root@84792c578a3b:/workspace/nvidia-examples# cd big_lstm
root@84792c578a3b:/workspace/nvidia-examples/big_lstm# ls
1b_word_vocab.txt  data_utils_test.py         language_model_test.py
README.md          download_1b_words_data.sh  model_utils.py
__init__.py        hparams.py                 run_utils.py
common.py          hparams_test.py            single_lm_train.py
data_utils.py      language_model.py          testdata
root@84792c578a3b:/workspace/nvidia-examples/big_lstm# ./download_1b_words_data.sh
Please specify root of dataset directory: data
Success: dataset root dir validated
--2020-06-05 07:40:32--  http://www.statmt.org/lm-benchmark/1-billion-word-language-modeling-benchmark-r13output.tar.gz
Resolving www.statmt.org (www.statmt.org)... 129.215.197.184
Connecting to www.statmt.org (www.statmt.org)|129.215.197.184|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1792209805 (1.7G) [application/x-gzip]
Saving to: ‘1-billion-word-language-modeling-benchmark-r13output.tar.gz’

1-billion-word-lang 100%[===================>]   1.67G  32.6KB/s    in 5h 48m
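The container banner above warns that the default 64 MB SHMEM limit may be too small for TensorFlow. A minimal sketch of the recommended invocation, using the flags suggested in the banner and the image tag from this session:

```shell
# Re-run the container with NVIDIA's recommended resource limits:
#   --shm-size=1g            larger /dev/shm for shared buffers
#   --ulimit memlock=-1      unlimited locked (pinned) host memory
#   --ulimit stack=67108864  64 MB thread stacks
sudo nvidia-docker run --rm -ti \
    --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    nvcr.io/nvidia/tensorflow:19.04-py3
```

The session shown here ran without these flags; the banner is a warning, not an error, but multi-GPU input pipelines are more likely to hit the default /dev/shm limit.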
2020-06-05 13:28:43 (83.8 KB/s) - ‘1-billion-word-language-modeling-benchmark-r13output.tar.gz’ saved [1792209805/1792209805]

1-billion-word-language-modeling-benchmark-r13output/
1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/
1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/news.en-00024-of-00100
[... remaining training shards news.en-00001-of-00100 through news.en-00099-of-00100, extracted in shuffled order ...]
1-billion-word-language-modeling-benchmark-r13output/.svn/
[... Subversion metadata: .svn/tmp/, .svn/pristine/*.svn-base, entries, format, wc.db ...]
1-billion-word-language-modeling-benchmark-r13output/heldout-monolingual.tokenized.shuffled/
1-billion-word-language-modeling-benchmark-r13output/heldout-monolingual.tokenized.shuffled/news.en.heldout-00015-of-00050
[... remaining heldout shards news.en.heldout-00000-of-00050 through news.en.heldout-00049-of-00050, plus news.en-00000-of-00100 ...]
1-billion-word-language-modeling-benchmark-r13output/README

Success! One billion words dataset ready at:
data/1-billion-word-language-modeling-benchmark-r13output/
Please pass this dir to single_lm_train.py via the --datadir option.
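The extracted files follow a fixed-width shard-naming convention: `news.en-%05d-of-00100` for the 99 training shards (shard 00000 lives in the heldout directory) and `news.en.heldout-%05d-of-00050` for the 50 heldout shards. A quick sketch for enumerating the expected names, e.g. to sanity-check that the download is complete (the helper name is mine, not part of the example code):

```python
# Enumerate the expected shard file names of the 1B-word benchmark.
def shard_names(prefix, num_shards, start=0):
    """Return names like 'news.en-00001-of-00100'."""
    return [f"{prefix}-{i:05d}-of-{num_shards:05d}" for i in range(start, num_shards)]

# Training shards 00001..00099 under training-monolingual.tokenized.shuffled/
train = shard_names("news.en", 100, start=1)
# Heldout shards 00000..00049 under heldout-monolingual.tokenized.shuffled/
heldout = shard_names("news.en.heldout", 50)

print(train[0], train[-1], len(train))     # news.en-00001-of-00100 news.en-00099-of-00100 99
print(heldout[0], heldout[-1], len(heldout))
```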
root@84792c578a3b:/workspace/nvidia-examples/big_lstm# .time python single_lm_train.py --mode=train --logdir=./logs --num_gpus=4 --datadir=./data/1-billion-word-language-modeling-benchmark-r13output
bash: .time: command not found
root@84792c578a3b:/workspace/nvidia-examples/big_lstm# time python single_lm_train.py --mode=train --logdir=./logs --num_gpus=4 --datadir=./data/1-billion-word-language-modeling-benchmark-r13output
WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

*****HYPER PARAMETERS*****
{'num_gpus': 4, 'learning_rate': 0.2, 'batch_size': 128, 'max_grad_norm': 10.0, 'num_delayed_steps': 150, 'projected_size': 512, 'num_shards': 8, 'vocab_size': 793470, 'state_size': 2048, 'average_params': True, 'keep_prob': 0.9, 'max_time': 180, 'emb_size': 512, 'num_steps': 20, 'do_summaries': False, 'num_layers': 1, 'run_profiler': False, 'optimizer': 0, 'num_sampled': 8192}
**************************
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/model_utils.py:33: UniformUnitScaling.__init__ (from tensorflow.python.ops.init_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.initializers.variance_scaling instead with distribution=uniform to get equivalent behavior.
WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/language_model.py:75: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/language_model.py:107: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_impl.py:1444: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_grad.py:425: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Current time: 1591372795.3405795
ALL VARIABLES
WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/run_utils.py:18: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
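The hyperparameters printed above determine how much data each optimizer step consumes: every GPU unrolls the LSTM over `num_steps` tokens for `batch_size` sequences, and the four replicas run in parallel. A small sketch of that arithmetic:

```python
# Effective tokens processed per training step, from the printed hyperparameters.
hparams = {"num_gpus": 4, "batch_size": 128, "num_steps": 20}

# Each GPU unrolls over num_steps tokens for batch_size sequences.
tokens_per_gpu_step = hparams["batch_size"] * hparams["num_steps"]   # 2560
tokens_per_global_step = tokens_per_gpu_step * hparams["num_gpus"]   # 10240

print(tokens_per_global_step)  # 10240
```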
model/emb_0:0 (99184, 512) /gpu:0
model/emb_1:0 (99184, 512) /gpu:0
model/emb_2:0 (99184, 512) /gpu:0
model/emb_3:0 (99184, 512) /gpu:0
model/emb_4:0 (99184, 512) /gpu:0
model/emb_5:0 (99184, 512) /gpu:0
model/emb_6:0 (99184, 512) /gpu:0
model/emb_7:0 (99184, 512) /gpu:0
model/lstm_0/LSTMCell/W_0:0 (1024, 8192) /gpu:0
model/lstm_0/LSTMCell/B:0 (8192,) /gpu:0
model/lstm_0/LSTMCell/W_P_0:0 (2048, 512) /gpu:0
model/softmax_w_0:0 (99184, 512) /gpu:0
model/softmax_w_1:0 (99184, 512) /gpu:0
model/softmax_w_2:0 (99184, 512) /gpu:0
model/softmax_w_3:0 (99184, 512) /gpu:0
model/softmax_w_4:0 (99184, 512) /gpu:0
model/softmax_w_5:0 (99184, 512) /gpu:0
model/softmax_w_6:0 (99184, 512) /gpu:0
model/softmax_w_7:0 (99184, 512) /gpu:0
model/softmax_b:0 (793470,) /gpu:0
model/global_step:0 ()
model/model/emb_0/Adagrad:0 (99184, 512) /gpu:0
model/model/emb_1/Adagrad:0 (99184, 512) /gpu:0
model/model/emb_2/Adagrad:0 (99184, 512) /gpu:0
model/model/emb_3/Adagrad:0 (99184, 512) /gpu:0
model/model/emb_4/Adagrad:0 (99184, 512) /gpu:0
model/model/emb_5/Adagrad:0 (99184, 512) /gpu:0
model/model/emb_6/Adagrad:0 (99184, 512) /gpu:0
model/model/emb_7/Adagrad:0 (99184, 512) /gpu:0
model/model/lstm_0/LSTMCell/W_0/Adagrad:0 (1024, 8192) /gpu:0
model/model/lstm_0/LSTMCell/B/Adagrad:0 (8192,) /gpu:0
model/model/lstm_0/LSTMCell/W_P_0/Adagrad:0 (2048, 512) /gpu:0
model/model/softmax_w_0/Adagrad:0 (99184, 512) /gpu:0
model/model/softmax_w_1/Adagrad:0 (99184, 512) /gpu:0
model/model/softmax_w_2/Adagrad:0 (99184, 512) /gpu:0
model/model/softmax_w_3/Adagrad:0 (99184, 512) /gpu:0
model/model/softmax_w_4/Adagrad:0 (99184, 512) /gpu:0
model/model/softmax_w_5/Adagrad:0 (99184, 512) /gpu:0
model/model/softmax_w_6/Adagrad:0 (99184, 512) /gpu:0
model/model/softmax_w_7/Adagrad:0 (99184, 512) /gpu:0
model/model/softmax_b/Adagrad:0 (793470,) /gpu:0
model/model/lstm_0/LSTMCell/W_0/ExponentialMovingAverage:0 (1024, 8192) /gpu:0
model/model/lstm_0/LSTMCell/B/ExponentialMovingAverage:0 (8192,) /gpu:0
model/model/lstm_0/LSTMCell/W_P_0/ExponentialMovingAverage:0 (2048, 512) /gpu:0
TRAINABLE VARIABLES
model/emb_0:0 (99184, 512) /gpu:0
model/emb_1:0 (99184, 512) /gpu:0
model/emb_2:0 (99184, 512) /gpu:0
model/emb_3:0 (99184, 512) /gpu:0
model/emb_4:0 (99184, 512) /gpu:0
model/emb_5:0 (99184, 512) /gpu:0
model/emb_6:0 (99184, 512) /gpu:0
model/emb_7:0 (99184, 512) /gpu:0
model/lstm_0/LSTMCell/W_0:0 (1024, 8192) /gpu:0
model/lstm_0/LSTMCell/B:0 (8192,) /gpu:0
model/lstm_0/LSTMCell/W_P_0:0 (2048, 512) /gpu:0
model/softmax_w_0:0 (99184, 512) /gpu:0
model/softmax_w_1:0 (99184, 512) /gpu:0
model/softmax_w_2:0 (99184, 512) /gpu:0
model/softmax_w_3:0 (99184, 512) /gpu:0
model/softmax_w_4:0 (99184, 512) /gpu:0
model/softmax_w_5:0 (99184, 512) /gpu:0
model/softmax_w_6:0 (99184, 512) /gpu:0
model/softmax_w_7:0 (99184, 512) /gpu:0
model/softmax_b:0 (793470,) /gpu:0
LOCAL VARIABLES
model/model/state_0_0:0 (128, 2560) /gpu:0
model/model_1/state_1_0:0 (128, 2560) /gpu:1
model/model_2/state_2_0:0 (128, 2560) /gpu:2
model/model_3/state_3_0:0 (128, 2560) /gpu:3
WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/run_utils.py:32: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2020-06-05 15:59:56.014467: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2900080000 Hz
2020-06-05 15:59:56.022230: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0xbe7b9d0 executing computations on platform Host.
Devices: 2020-06-05 15:59:56.022278: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): , 2020-06-05 15:59:56.432326: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 15:59:56.469365: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 15:59:56.481492: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 15:59:56.482651: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0xbe7b3f0 executing computations on platform CUDA. Devices: 2020-06-05 15:59:56.482670: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): TITAN RTX, Compute Capability 7.5 2020-06-05 15:59:56.482677: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (1): TITAN RTX, Compute Capability 7.5 2020-06-05 15:59:56.482682: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (2): GeForce RTX 2080 Ti, Compute Capability 7.5 2020-06-05 15:59:56.482690: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (3): GeForce RTX 2080 Ti, Compute Capability 7.5 2020-06-05 15:59:56.483718: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:01:00.0 totalMemory: 23.65GiB freeMemory: 23.22GiB 2020-06-05 15:59:56.483748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:21:00.0 totalMemory: 23.65GiB freeMemory: 23.48GiB 
2020-06-05 15:59:56.483770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 2 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635 pciBusID: 0000:4a:00.0 totalMemory: 10.76GiB freeMemory: 10.60GiB 2020-06-05 15:59:56.483794: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 3 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635 pciBusID: 0000:4b:00.0 totalMemory: 10.76GiB freeMemory: 10.60GiB 2020-06-05 15:59:56.483932: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1, 2, 3 2020-06-05 15:59:57.279477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix: 2020-06-05 15:59:57.279514: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1 2 3 2020-06-05 15:59:57.279519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N N N N 2020-06-05 15:59:57.279523: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: N N N N 2020-06-05 15:59:57.279528: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 2: N N N N 2020-06-05 15:59:57.279533: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 3: N N N N 2020-06-05 15:59:57.279666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22500 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:01:00.0, compute capability: 7.5) 2020-06-05 15:59:57.280315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 22757 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:21:00.0, compute capability: 7.5) 2020-06-05 15:59:57.280488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10224 MB memory) -> physical 
GPU (device: 2, name: GeForce RTX 2080 Ti, pci bus id: 0000:4a:00.0, compute capability: 7.5) 2020-06-05 15:59:57.280682: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 10224 MB memory) -> physical GPU (device: 3, name: GeForce RTX 2080 Ti, pci bus id: 0000:4b:00.0, compute capability: 7.5) Processing file: ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/news.en-00091-of-00100 Finished processing! 2020-06-05 16:00:16.942284: I tensorflow/stream_executor/dso_loader.cc:153] successfully opened CUDA library libcublas.so.10 locally Iteration 1, time = 10.13s, wps = 1011, train loss = 13.0261 Iteration 2, time = 7.96s, wps = 1287, train loss = 12.9453 Iteration 3, time = 0.12s, wps = 87100, train loss = 12.8457 Iteration 4, time = 0.10s, wps = 100150, train loss = 11.4972 Iteration 5, time = 0.10s, wps = 104007, train loss = 12.7629 Iteration 6, time = 0.10s, wps = 104707, train loss = 11.8297 Iteration 7, time = 0.10s, wps = 105679, train loss = 70.5868 Iteration 8, time = 0.09s, wps = 108187, train loss = 31.3149 Iteration 9, time = 0.10s, wps = 106176, train loss = 14.8355 Iteration 20, time = 1.05s, wps = 107768, train loss = 10.9780 Iteration 40, time = 1.88s, wps = 108922, train loss = 13.3108 Iteration 60, time = 1.89s, wps = 108382, train loss = 9.4566 Iteration 80, time = 1.87s, wps = 109356, train loss = 8.0756 Iteration 100, time = 1.88s, wps = 108886, train loss = 8.0821 Iteration 120, time = 1.89s, wps = 108350, train loss = 8.2249 Iteration 140, time = 1.89s, wps = 108586, train loss = 7.5883 Iteration 160, time = 1.88s, wps = 108934, train loss = 7.0517 Iteration 180, time = 1.90s, wps = 107528, train loss = 6.6798 Iteration 200, time = 1.87s, wps = 109323, train loss = 6.5299 Iteration 220, time = 1.89s, wps = 108518, train loss = 6.3988 Iteration 240, time = 1.89s, wps = 108598, train loss = 6.2627 Iteration 260, 
time = 1.89s, wps = 108429, train loss = 6.2382 Iteration 280, time = 1.90s, wps = 107708, train loss = 6.1207 Iteration 300, time = 1.88s, wps = 108835, train loss = 6.0924 Iteration 320, time = 1.88s, wps = 108860, train loss = 6.0301 Iteration 340, time = 1.90s, wps = 107878, train loss = 5.9870 Iteration 360, time = 1.88s, wps = 108841, train loss = 6.0093 Iteration 380, time = 1.88s, wps = 108748, train loss = 5.9170 Iteration 400, time = 1.90s, wps = 107850, train loss = 5.8868 Iteration 420, time = 1.90s, wps = 107853, train loss = 5.8923 Iteration 440, time = 1.88s, wps = 109180, train loss = 5.8416 Iteration 460, time = 1.88s, wps = 108754, train loss = 5.9032 Iteration 480, time = 1.90s, wps = 107898, train loss = 5.7667 Iteration 500, time = 1.89s, wps = 108579, train loss = 5.7084 Iteration 520, time = 1.88s, wps = 108971, train loss = 5.6963 Iteration 540, time = 1.88s, wps = 108711, train loss = 5.6646 Iteration 560, time = 1.89s, wps = 108212, train loss = 5.6641 Iteration 580, time = 1.89s, wps = 108642, train loss = 5.5762 Iteration 600, time = 1.89s, wps = 108420, train loss = 5.6005 Iteration 620, time = 1.90s, wps = 108047, train loss = 5.5164 Iteration 640, time = 1.89s, wps = 108251, train loss = 5.5491 Iteration 660, time = 1.89s, wps = 108263, train loss = 5.5376 Iteration 680, time = 1.92s, wps = 106841, train loss = 5.5372 Iteration 700, time = 1.90s, wps = 107516, train loss = 5.5495 Iteration 720, time = 1.89s, wps = 108454, train loss = 5.3899 Iteration 740, time = 1.90s, wps = 107655, train loss = 5.4728 Iteration 760, time = 1.90s, wps = 107984, train loss = 5.4817 Iteration 780, time = 1.89s, wps = 108225, train loss = 5.3296 Processing file: ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/news.en-00055-of-00100 Finished processing! 
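The `Iteration N, time = T, wps = W, train loss = L` lines above and below follow a fixed format, so steady-state throughput can be summarized by parsing them. A minimal sketch (the sample lines are copied from this log; the regex is my assumption about the format, not something the example ships):

```python
import re

# Parse "Iteration N, time = Ts, wps = W, train loss = L" records from a
# training log. The sample lines below are taken verbatim from the log above.
LOG = """\
Iteration 200, time = 1.87s, wps = 109323, train loss = 6.5299
Iteration 220, time = 1.89s, wps = 108518, train loss = 6.3988
Iteration 240, time = 1.89s, wps = 108598, train loss = 6.2627
"""

PATTERN = re.compile(
    r"Iteration (\d+), time = ([\d.]+)s, wps = (\d+), train loss = ([\d.]+)"
)

def parse_iterations(text):
    """Yield (iteration, seconds, wps, loss) tuples from log text."""
    for m in PATTERN.finditer(text):
        yield int(m.group(1)), float(m.group(2)), int(m.group(3)), float(m.group(4))

records = list(parse_iterations(LOG))
mean_wps = sum(r[2] for r in records) / len(records)
print(f"parsed {len(records)} records, mean wps = {mean_wps:.0f}")
```

Feeding the full console capture through `parse_iterations` gives the per-run averages quoted later.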
Iteration 800, time = 3.51s, wps = 58362, train loss = 5.4279 Iteration 820, time = 1.89s, wps = 108254, train loss = 5.3740 Iteration 840, time = 1.88s, wps = 108849, train loss = 5.3310 Iteration 860, time = 1.89s, wps = 108142, train loss = 5.3180 Iteration 880, time = 1.91s, wps = 107242, train loss = 5.3198 Iteration 900, time = 1.91s, wps = 107441, train loss = 5.2913 Iteration 920, time = 1.88s, wps = 108916, train loss = 5.3182 Iteration 940, time = 1.88s, wps = 108873, train loss = 5.2879 Iteration 960, time = 1.90s, wps = 108073, train loss = 5.2520 Iteration 980, time = 1.89s, wps = 108253, train loss = 5.2595 Iteration 1000, time = 1.89s, wps = 108226, train loss = 5.2372 Iteration 1020, time = 1.92s, wps = 106816, train loss = 5.2344 Iteration 1040, time = 1.89s, wps = 108427, train loss = 5.1966 Iteration 1060, time = 1.91s, wps = 107503, train loss = 5.2576 Iteration 1080, time = 1.90s, wps = 107816, train loss = 5.2299 Iteration 1100, time = 1.89s, wps = 108307, train loss = 5.1964 Iteration 1120, time = 1.89s, wps = 108230, train loss = 5.1467 Iteration 1140, time = 1.89s, wps = 108092, train loss = 5.2071 Iteration 1160, time = 1.89s, wps = 108124, train loss = 5.1907 Iteration 1180, time = 1.88s, wps = 108736, train loss = 5.2074 Iteration 1200, time = 1.89s, wps = 108604, train loss = 5.1304 Iteration 1220, time = 1.91s, wps = 107491, train loss = 5.0821 Iteration 1240, time = 1.91s, wps = 107406, train loss = 5.0945 Iteration 1260, time = 1.89s, wps = 108176, train loss = 5.0748 Iteration 1280, time = 1.89s, wps = 108560, train loss = 5.1071 Iteration 1300, time = 1.89s, wps = 108105, train loss = 5.0596 Iteration 1320, time = 1.90s, wps = 107768, train loss = 5.0051 Iteration 1340, time = 1.91s, wps = 107381, train loss = 4.9719 Iteration 1360, time = 1.90s, wps = 107793, train loss = 5.0852 Iteration 1380, time = 1.89s, wps = 108444, train loss = 5.0161 Iteration 1400, time = 1.90s, wps = 107812, train loss = 5.0230 Iteration 1420, time = 
1.90s, wps = 107938, train loss = 5.0099 Iteration 1440, time = 1.90s, wps = 107763, train loss = 4.9443 Iteration 1460, time = 1.90s, wps = 107851, train loss = 4.9851 Iteration 1480, time = 1.91s, wps = 107124, train loss = 4.9288 Iteration 1500, time = 1.92s, wps = 106776, train loss = 4.9733 Iteration 1520, time = 1.89s, wps = 108161, train loss = 4.9342 Iteration 1540, time = 1.89s, wps = 108192, train loss = 4.9511 Iteration 1560, time = 1.91s, wps = 107393, train loss = 4.9846
/usr/local/lib/python3.5/dist-packages/tensorflow/python/summary/writer/writer.py:386: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened. warnings.warn("Attempting to use a closed FileWriter. "
real 3m13.759s
user 23m49.256s
sys 4m54.227s
root@84792c578a3b:/workspace/nvidia-examples/big_lstm# time python single_lm_train.py --mode=train --logdir=./logs --num_gpus=3 --datadir=./data/1-billion-word-language-modeling-benchmark-r13output
WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see: * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md * https://github.com/tensorflow/addons If you depend on functionality not listed there, please file an issue.
*****HYPER PARAMETERS***** {'num_steps': 20, 'num_delayed_steps': 150, 'num_sampled': 8192, 'projected_size': 512, 'num_gpus': 3, 'keep_prob': 0.9, 'vocab_size': 793470, 'average_params': True, 'learning_rate': 0.2, 'do_summaries': False, 'max_time': 180, 'max_grad_norm': 10.0, 'run_profiler': False, 'num_shards': 8, 'num_layers': 1, 'optimizer': 0, 'batch_size': 128, 'emb_size': 512, 'state_size': 2048} **************************
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating: Colocations handled automatically by placer. WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/model_utils.py:33: UniformUnitScaling.__init__ (from tensorflow.python.ops.init_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.initializers.variance_scaling instead with distribution=uniform to get equivalent behavior. WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/language_model.py:75: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version. Instructions for updating: Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`. WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/language_model.py:107: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_impl.py:1444: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version. Instructions for updating: Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead. WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_grad.py:425: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. Current time: 1591375181.1564753 ALL VARIABLES WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/run_utils.py:18: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02. Instructions for updating: Please use tf.global_variables instead. 
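The wps figures in these logs are roughly consistent with the hyperparameters printed above, assuming wps counts `batch_size × num_steps` tokens per GPU per iteration (my reading of the example's logging, not stated anywhere in the log). A quick check for the `num_gpus=3` run:

```python
# Rough throughput check from the HYPER PARAMETERS dump above (num_gpus=3 run).
# Assumption: wps = tokens processed per wall-clock second across all GPUs.
batch_size = 128
num_steps = 20
num_gpus = 3

tokens_per_iteration = batch_size * num_steps * num_gpus  # 7680

# The log reports roughly 1.52 s per 20-iteration interval in steady state.
seconds_per_iteration = 1.52 / 20

estimated_wps = tokens_per_iteration / seconds_per_iteration
print(f"estimated wps = {estimated_wps:.0f}")
```

The estimate lands near the ~100k-101k wps the 3-GPU iteration lines actually report, which supports the assumed token-accounting.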
model/emb_0:0 (99184, 512) /gpu:0 model/emb_1:0 (99184, 512) /gpu:0 model/emb_2:0 (99184, 512) /gpu:0 model/emb_3:0 (99184, 512) /gpu:0 model/emb_4:0 (99184, 512) /gpu:0 model/emb_5:0 (99184, 512) /gpu:0 model/emb_6:0 (99184, 512) /gpu:0 model/emb_7:0 (99184, 512) /gpu:0 model/lstm_0/LSTMCell/W_0:0 (1024, 8192) /gpu:0 model/lstm_0/LSTMCell/B:0 (8192,) /gpu:0 model/lstm_0/LSTMCell/W_P_0:0 (2048, 512) /gpu:0 model/softmax_w_0:0 (99184, 512) /gpu:0 model/softmax_w_1:0 (99184, 512) /gpu:0 model/softmax_w_2:0 (99184, 512) /gpu:0 model/softmax_w_3:0 (99184, 512) /gpu:0 model/softmax_w_4:0 (99184, 512) /gpu:0 model/softmax_w_5:0 (99184, 512) /gpu:0 model/softmax_w_6:0 (99184, 512) /gpu:0 model/softmax_w_7:0 (99184, 512) /gpu:0 model/softmax_b:0 (793470,) /gpu:0 model/global_step:0 () model/model/emb_0/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_1/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_2/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_3/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_4/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_5/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_6/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_7/Adagrad:0 (99184, 512) /gpu:0 model/model/lstm_0/LSTMCell/W_0/Adagrad:0 (1024, 8192) /gpu:0 model/model/lstm_0/LSTMCell/B/Adagrad:0 (8192,) /gpu:0 model/model/lstm_0/LSTMCell/W_P_0/Adagrad:0 (2048, 512) /gpu:0 model/model/softmax_w_0/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_1/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_2/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_3/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_4/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_5/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_6/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_7/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_b/Adagrad:0 (793470,) /gpu:0 model/model/lstm_0/LSTMCell/W_0/ExponentialMovingAverage:0 (1024, 8192) /gpu:0 model/model/lstm_0/LSTMCell/B/ExponentialMovingAverage:0 (8192,) /gpu:0 
model/model/lstm_0/LSTMCell/W_P_0/ExponentialMovingAverage:0 (2048, 512) /gpu:0 TRAINABLE VARIABLES model/emb_0:0 (99184, 512) /gpu:0 model/emb_1:0 (99184, 512) /gpu:0 model/emb_2:0 (99184, 512) /gpu:0 model/emb_3:0 (99184, 512) /gpu:0 model/emb_4:0 (99184, 512) /gpu:0 model/emb_5:0 (99184, 512) /gpu:0 model/emb_6:0 (99184, 512) /gpu:0 model/emb_7:0 (99184, 512) /gpu:0 model/lstm_0/LSTMCell/W_0:0 (1024, 8192) /gpu:0 model/lstm_0/LSTMCell/B:0 (8192,) /gpu:0 model/lstm_0/LSTMCell/W_P_0:0 (2048, 512) /gpu:0 model/softmax_w_0:0 (99184, 512) /gpu:0 model/softmax_w_1:0 (99184, 512) /gpu:0 model/softmax_w_2:0 (99184, 512) /gpu:0 model/softmax_w_3:0 (99184, 512) /gpu:0 model/softmax_w_4:0 (99184, 512) /gpu:0 model/softmax_w_5:0 (99184, 512) /gpu:0 model/softmax_w_6:0 (99184, 512) /gpu:0 model/softmax_w_7:0 (99184, 512) /gpu:0 model/softmax_b:0 (793470,) /gpu:0 LOCAL VARIABLES model/model/state_0_0:0 (128, 2560) /gpu:0 model/model_1/state_1_0:0 (128, 2560) /gpu:1 model/model_2/state_2_0:0 (128, 2560) /gpu:2 WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/run_utils.py:32: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version. Instructions for updating: Please switch to tf.train.MonitoredTrainingSession 2020-06-05 16:39:41.689476: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2900080000 Hz 2020-06-05 16:39:41.697223: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0xa53d9f0 executing computations on platform Host. 
Devices: 2020-06-05 16:39:41.697267: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): , 2020-06-05 16:39:42.170389: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 16:39:42.216198: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 16:39:42.223481: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 16:39:42.224380: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0xa53d410 executing computations on platform CUDA. Devices: 2020-06-05 16:39:42.224396: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): TITAN RTX, Compute Capability 7.5 2020-06-05 16:39:42.224400: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (1): TITAN RTX, Compute Capability 7.5 2020-06-05 16:39:42.224405: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (2): GeForce RTX 2080 Ti, Compute Capability 7.5 2020-06-05 16:39:42.224412: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (3): GeForce RTX 2080 Ti, Compute Capability 7.5 2020-06-05 16:39:42.225370: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:01:00.0 totalMemory: 23.65GiB freeMemory: 23.22GiB 2020-06-05 16:39:42.225398: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:21:00.0 totalMemory: 23.65GiB freeMemory: 23.48GiB 
2020-06-05 16:39:42.225437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 2 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635 pciBusID: 0000:4a:00.0 totalMemory: 10.76GiB freeMemory: 10.60GiB 2020-06-05 16:39:42.225462: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 3 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635 pciBusID: 0000:4b:00.0 totalMemory: 10.76GiB freeMemory: 10.60GiB 2020-06-05 16:39:42.225604: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1, 2, 3 2020-06-05 16:39:43.004069: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix: 2020-06-05 16:39:43.004106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1 2 3 2020-06-05 16:39:43.004111: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N N N N 2020-06-05 16:39:43.004115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: N N N N 2020-06-05 16:39:43.004120: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 2: N N N N 2020-06-05 16:39:43.004125: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 3: N N N N 2020-06-05 16:39:43.004245: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22500 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:01:00.0, compute capability: 7.5) 2020-06-05 16:39:43.004618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 22757 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:21:00.0, compute capability: 7.5) 2020-06-05 16:39:43.004899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10224 MB memory) -> physical 
GPU (device: 2, name: GeForce RTX 2080 Ti, pci bus id: 0000:4a:00.0, compute capability: 7.5) 2020-06-05 16:39:43.005106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 10224 MB memory) -> physical GPU (device: 3, name: GeForce RTX 2080 Ti, pci bus id: 0000:4b:00.0, compute capability: 7.5) WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file APIs to check for files with this prefix. WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file utilities to get mtimes. Processing file: ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/news.en-00063-of-00100 Finished processing! 
2020-06-05 16:39:55.664013: I tensorflow/stream_executor/dso_loader.cc:153] successfully opened CUDA library libcublas.so.10 locally Iteration 1565, time = 7.97s, wps = 963, train loss = 5.3077 Iteration 1566, time = 5.91s, wps = 1299, train loss = 5.0201 Iteration 1567, time = 0.08s, wps = 96034, train loss = 5.1090 Iteration 1568, time = 0.08s, wps = 92438, train loss = 4.9818 Iteration 1569, time = 0.08s, wps = 98757, train loss = 4.9726 Iteration 1570, time = 0.08s, wps = 102001, train loss = 4.9602 Iteration 1571, time = 0.07s, wps = 102654, train loss = 4.9898 Iteration 1572, time = 0.08s, wps = 99452, train loss = 4.9471 Iteration 1573, time = 0.08s, wps = 100000, train loss = 4.9239 Iteration 1584, time = 0.83s, wps = 101399, train loss = 4.9684 Iteration 1604, time = 1.52s, wps = 100858, train loss = 4.9880 Iteration 1624, time = 1.51s, wps = 101715, train loss = 4.9801 Iteration 1644, time = 1.51s, wps = 101519, train loss = 4.9577 Iteration 1664, time = 1.52s, wps = 101026, train loss = 4.9207 Iteration 1684, time = 1.52s, wps = 101007, train loss = 4.9540 Iteration 1704, time = 1.52s, wps = 101129, train loss = 4.8382 Iteration 1724, time = 1.53s, wps = 100351, train loss = 4.8272 Iteration 1744, time = 1.51s, wps = 101860, train loss = 4.8951 Iteration 1764, time = 1.53s, wps = 100467, train loss = 4.8796 Iteration 1784, time = 1.52s, wps = 101021, train loss = 4.9188 Iteration 1804, time = 1.52s, wps = 101375, train loss = 4.8301 Iteration 1824, time = 1.53s, wps = 100690, train loss = 4.8422 Iteration 1844, time = 1.52s, wps = 100919, train loss = 4.8151 Iteration 1864, time = 1.52s, wps = 100871, train loss = 4.8794 Iteration 1884, time = 1.52s, wps = 100813, train loss = 4.9041 Iteration 1904, time = 1.52s, wps = 101056, train loss = 4.8676 Iteration 1924, time = 1.52s, wps = 101131, train loss = 4.8051 Iteration 1944, time = 1.52s, wps = 101325, train loss = 4.8906 Iteration 1964, time = 1.51s, wps = 101482, train loss = 4.7368 Iteration 1984, 
time = 1.53s, wps = 100351, train loss = 4.8789 Iteration 2004, time = 1.53s, wps = 100440, train loss = 4.7503 Iteration 2024, time = 1.54s, wps = 99678, train loss = 4.8795 Iteration 2044, time = 1.52s, wps = 100930, train loss = 4.7845 Iteration 2064, time = 1.52s, wps = 101126, train loss = 4.7564 Iteration 2084, time = 1.52s, wps = 100793, train loss = 4.7861 Iteration 2104, time = 1.52s, wps = 100792, train loss = 4.8106 Iteration 2124, time = 1.52s, wps = 100739, train loss = 4.7396 Iteration 2144, time = 1.52s, wps = 100855, train loss = 4.7555 Iteration 2164, time = 1.52s, wps = 100742, train loss = 4.7910 Iteration 2184, time = 1.53s, wps = 100359, train loss = 4.7509 Iteration 2204, time = 1.52s, wps = 100938, train loss = 4.7113 Iteration 2224, time = 1.54s, wps = 100040, train loss = 4.7188 Iteration 2244, time = 1.52s, wps = 101014, train loss = 4.7292 Iteration 2264, time = 1.52s, wps = 101041, train loss = 4.8142 Iteration 2284, time = 1.52s, wps = 101110, train loss = 4.6939 Iteration 2304, time = 1.52s, wps = 100753, train loss = 4.7228 Iteration 2324, time = 1.53s, wps = 100506, train loss = 4.7184 Iteration 2344, time = 1.53s, wps = 100087, train loss = 4.7162 Iteration 2364, time = 1.53s, wps = 100392, train loss = 4.7408 Iteration 2384, time = 1.52s, wps = 101224, train loss = 4.6466 Iteration 2404, time = 1.53s, wps = 100617, train loss = 4.7383 Iteration 2424, time = 1.53s, wps = 100641, train loss = 4.6949 Iteration 2444, time = 1.53s, wps = 100345, train loss = 4.7459 Iteration 2464, time = 1.53s, wps = 100533, train loss = 4.6758 Iteration 2484, time = 1.53s, wps = 100609, train loss = 4.6879 Iteration 2504, time = 1.52s, wps = 101081, train loss = 4.7634 Iteration 2524, time = 1.54s, wps = 99805, train loss = 4.6684 Iteration 2544, time = 1.54s, wps = 99806, train loss = 4.6473 Iteration 2564, time = 1.53s, wps = 100684, train loss = 4.7262 Iteration 2584, time = 1.53s, wps = 100438, train loss = 4.6528 Iteration 2604, time = 1.53s, wps 
= 100211, train loss = 4.6423 Processing file: ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/news.en-00082-of-00100 Finished processing! Iteration 2624, time = 3.12s, wps = 49198, train loss = 4.6994 Iteration 2644, time = 1.53s, wps = 100456, train loss = 4.7074 Iteration 2664, time = 1.53s, wps = 100608, train loss = 4.6693 Iteration 2684, time = 1.52s, wps = 100962, train loss = 4.5933 Iteration 2704, time = 1.54s, wps = 99971, train loss = 4.6359 Iteration 2724, time = 1.53s, wps = 100499, train loss = 4.6055 Iteration 2744, time = 1.53s, wps = 100661, train loss = 4.5642 Iteration 2764, time = 1.54s, wps = 99838, train loss = 4.6715 Iteration 2784, time = 1.53s, wps = 100576, train loss = 4.6754 Iteration 2804, time = 1.52s, wps = 100861, train loss = 4.6031 Iteration 2824, time = 1.53s, wps = 100103, train loss = 4.5900 Iteration 2844, time = 1.52s, wps = 100792, train loss = 4.6994 Iteration 2864, time = 1.52s, wps = 100985, train loss = 4.5902 Iteration 2884, time = 1.53s, wps = 100256, train loss = 4.5989 Iteration 2904, time = 1.53s, wps = 100328, train loss = 4.5632 Iteration 2924, time = 1.54s, wps = 99845, train loss = 4.5489 Iteration 2944, time = 1.52s, wps = 101167, train loss = 4.6305 Iteration 2964, time = 1.52s, wps = 100888, train loss = 4.6078 Iteration 2984, time = 1.54s, wps = 100011, train loss = 4.6886 Iteration 3004, time = 1.53s, wps = 100366, train loss = 4.5708 Iteration 3024, time = 1.53s, wps = 100441, train loss = 4.6014 Iteration 3044, time = 1.52s, wps = 100916, train loss = 4.6029 Iteration 3064, time = 1.54s, wps = 99999, train loss = 4.6145 Iteration 3084, time = 1.53s, wps = 100581, train loss = 4.6258 Iteration 3104, time = 1.54s, wps = 99501, train loss = 4.5920 Iteration 3124, time = 1.56s, wps = 98346, train loss = 4.6124 Iteration 3144, time = 1.54s, wps = 99956, train loss = 4.6159 Iteration 3164, time = 1.52s, wps = 100734, train loss = 4.5657 Iteration 3184, time = 
1.55s, wps = 99079, train loss = 4.6364 Iteration 3204, time = 1.54s, wps = 99785, train loss = 4.5126 Iteration 3224, time = 1.53s, wps = 100272, train loss = 4.5089 Iteration 3244, time = 1.53s, wps = 100491, train loss = 4.5832 Iteration 3264, time = 1.54s, wps = 100007, train loss = 4.5721 Iteration 3284, time = 1.53s, wps = 100602, train loss = 4.6088 Iteration 3304, time = 1.53s, wps = 100435, train loss = 4.5931 Iteration 3324, time = 1.52s, wps = 100922, train loss = 4.5719 Iteration 3344, time = 1.53s, wps = 100578, train loss = 4.5366 Iteration 3364, time = 1.54s, wps = 99985, train loss = 4.5832 Iteration 3384, time = 1.53s, wps = 100233, train loss = 4.5192 Iteration 3404, time = 1.54s, wps = 99915, train loss = 4.6047 Iteration 3424, time = 1.54s, wps = 99581, train loss = 4.5225 Iteration 3444, time = 1.55s, wps = 99294, train loss = 4.4780 Iteration 3464, time = 1.53s, wps = 100716, train loss = 4.5221 Iteration 3484, time = 1.53s, wps = 100637, train loss = 4.5805 Iteration 3504, time = 1.54s, wps = 99641, train loss = 4.5132 Iteration 3524, time = 1.53s, wps = 100631, train loss = 4.4837 Iteration 3544, time = 1.52s, wps = 100751, train loss = 4.5682 Iteration 3564, time = 1.52s, wps = 100910, train loss = 4.4643 Iteration 3584, time = 1.53s, wps = 100131, train loss = 4.5748 Iteration 3604, time = 1.54s, wps = 99924, train loss = 4.5915 Iteration 3624, time = 1.54s, wps = 99942, train loss = 4.5325 /usr/local/lib/python3.5/dist-packages/tensorflow/python/summary/writer/writer.py:386: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened. warnings.warn("Attempting to use a closed FileWriter. 
"
real 3m11.882s
user 20m47.862s
sys 4m45.178s
root@84792c578a3b:/workspace/nvidia-examples/big_lstm# time python single_lm_train.py --mode=train --logdir=./logs --num_gpus=2 --datadir=./data/1-billion-word-language-modeling-benchmark-r13output
WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see: * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md * https://github.com/tensorflow/addons If you depend on functionality not listed there, please file an issue.
*****HYPER PARAMETERS***** {'max_grad_norm': 10.0, 'do_summaries': False, 'average_params': True, 'projected_size': 512, 'num_steps': 20, 'batch_size': 128, 'num_delayed_steps': 150, 'vocab_size': 793470, 'num_shards': 8, 'max_time': 180, 'num_layers': 1, 'run_profiler': False, 'num_gpus': 2, 'optimizer': 0, 'learning_rate': 0.2, 'state_size': 2048, 'num_sampled': 8192, 'emb_size': 512, 'keep_prob': 0.9} **************************
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer.
WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/model_utils.py:33: UniformUnitScaling.__init__ (from tensorflow.python.ops.init_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.initializers.variance_scaling instead with distribution=uniform to get equivalent behavior.
WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/language_model.py:75: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version. Instructions for updating: Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/language_model.py:107: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_impl.py:1444: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version. Instructions for updating: Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead. WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_grad.py:425: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. Current time: 1591376921.8346786 ALL VARIABLES WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/run_utils.py:18: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02. Instructions for updating: Please use tf.global_variables instead. 
model/emb_0:0 (99184, 512) /gpu:0 model/emb_1:0 (99184, 512) /gpu:0 model/emb_2:0 (99184, 512) /gpu:0 model/emb_3:0 (99184, 512) /gpu:0 model/emb_4:0 (99184, 512) /gpu:0 model/emb_5:0 (99184, 512) /gpu:0 model/emb_6:0 (99184, 512) /gpu:0 model/emb_7:0 (99184, 512) /gpu:0 model/lstm_0/LSTMCell/W_0:0 (1024, 8192) /gpu:0 model/lstm_0/LSTMCell/B:0 (8192,) /gpu:0 model/lstm_0/LSTMCell/W_P_0:0 (2048, 512) /gpu:0 model/softmax_w_0:0 (99184, 512) /gpu:0 model/softmax_w_1:0 (99184, 512) /gpu:0 model/softmax_w_2:0 (99184, 512) /gpu:0 model/softmax_w_3:0 (99184, 512) /gpu:0 model/softmax_w_4:0 (99184, 512) /gpu:0 model/softmax_w_5:0 (99184, 512) /gpu:0 model/softmax_w_6:0 (99184, 512) /gpu:0 model/softmax_w_7:0 (99184, 512) /gpu:0 model/softmax_b:0 (793470,) /gpu:0 model/global_step:0 () model/model/emb_0/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_1/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_2/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_3/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_4/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_5/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_6/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_7/Adagrad:0 (99184, 512) /gpu:0 model/model/lstm_0/LSTMCell/W_0/Adagrad:0 (1024, 8192) /gpu:0 model/model/lstm_0/LSTMCell/B/Adagrad:0 (8192,) /gpu:0 model/model/lstm_0/LSTMCell/W_P_0/Adagrad:0 (2048, 512) /gpu:0 model/model/softmax_w_0/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_1/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_2/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_3/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_4/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_5/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_6/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_7/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_b/Adagrad:0 (793470,) /gpu:0 model/model/lstm_0/LSTMCell/W_0/ExponentialMovingAverage:0 (1024, 8192) /gpu:0 model/model/lstm_0/LSTMCell/B/ExponentialMovingAverage:0 (8192,) /gpu:0 
model/model/lstm_0/LSTMCell/W_P_0/ExponentialMovingAverage:0 (2048, 512) /gpu:0 TRAINABLE VARIABLES model/emb_0:0 (99184, 512) /gpu:0 model/emb_1:0 (99184, 512) /gpu:0 model/emb_2:0 (99184, 512) /gpu:0 model/emb_3:0 (99184, 512) /gpu:0 model/emb_4:0 (99184, 512) /gpu:0 model/emb_5:0 (99184, 512) /gpu:0 model/emb_6:0 (99184, 512) /gpu:0 model/emb_7:0 (99184, 512) /gpu:0 model/lstm_0/LSTMCell/W_0:0 (1024, 8192) /gpu:0 model/lstm_0/LSTMCell/B:0 (8192,) /gpu:0 model/lstm_0/LSTMCell/W_P_0:0 (2048, 512) /gpu:0 model/softmax_w_0:0 (99184, 512) /gpu:0 model/softmax_w_1:0 (99184, 512) /gpu:0 model/softmax_w_2:0 (99184, 512) /gpu:0 model/softmax_w_3:0 (99184, 512) /gpu:0 model/softmax_w_4:0 (99184, 512) /gpu:0 model/softmax_w_5:0 (99184, 512) /gpu:0 model/softmax_w_6:0 (99184, 512) /gpu:0 model/softmax_w_7:0 (99184, 512) /gpu:0 model/softmax_b:0 (793470,) /gpu:0 LOCAL VARIABLES model/model/state_0_0:0 (128, 2560) /gpu:0 model/model_1/state_1_0:0 (128, 2560) /gpu:1 WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/run_utils.py:32: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version. Instructions for updating: Please switch to tf.train.MonitoredTrainingSession 2020-06-05 17:08:42.243480: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2900080000 Hz 2020-06-05 17:08:42.251194: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x7bdfc50 executing computations on platform Host. 
Devices: 2020-06-05 17:08:42.251238: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): , 2020-06-05 17:08:42.676615: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 17:08:42.690960: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 17:08:42.724775: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 17:08:42.725678: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x7bdf670 executing computations on platform CUDA. Devices: 2020-06-05 17:08:42.725696: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): TITAN RTX, Compute Capability 7.5 2020-06-05 17:08:42.725701: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (1): TITAN RTX, Compute Capability 7.5 2020-06-05 17:08:42.725705: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (2): GeForce RTX 2080 Ti, Compute Capability 7.5 2020-06-05 17:08:42.725714: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (3): GeForce RTX 2080 Ti, Compute Capability 7.5 2020-06-05 17:08:42.726708: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:01:00.0 totalMemory: 23.65GiB freeMemory: 23.22GiB 2020-06-05 17:08:42.726736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:21:00.0 totalMemory: 23.65GiB freeMemory: 23.48GiB 
2020-06-05 17:08:42.726759: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 2 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635 pciBusID: 0000:4a:00.0 totalMemory: 10.76GiB freeMemory: 10.60GiB 2020-06-05 17:08:42.726781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 3 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635 pciBusID: 0000:4b:00.0 totalMemory: 10.76GiB freeMemory: 10.60GiB 2020-06-05 17:08:42.726914: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1, 2, 3 2020-06-05 17:08:43.495480: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix: 2020-06-05 17:08:43.495515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1 2 3 2020-06-05 17:08:43.495520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N N N N 2020-06-05 17:08:43.495524: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: N N N N 2020-06-05 17:08:43.495529: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 2: N N N N 2020-06-05 17:08:43.495533: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 3: N N N N 2020-06-05 17:08:43.495673: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22500 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:01:00.0, compute capability: 7.5) 2020-06-05 17:08:43.495974: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 22757 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:21:00.0, compute capability: 7.5) 2020-06-05 17:08:43.496155: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10224 MB memory) -> physical 
GPU (device: 2, name: GeForce RTX 2080 Ti, pci bus id: 0000:4a:00.0, compute capability: 7.5) 2020-06-05 17:08:43.496445: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 10224 MB memory) -> physical GPU (device: 3, name: GeForce RTX 2080 Ti, pci bus id: 0000:4b:00.0, compute capability: 7.5) WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file APIs to check for files with this prefix. WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file utilities to get mtimes. Processing file: ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/news.en-00049-of-00100 Finished processing! 
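Before the iteration log below, it is worth noting that the `wps` (words per second) figures it prints can be sanity-checked from the hyperparameters shown above: each iteration processes `batch_size × num_steps` tokens per GPU. A minimal check for the 2-GPU run (the formula is inferred from the printed hyperparameters and the logged interval times, not taken from the example's source):

```python
# Rough words/sec estimate for the 2-GPU run, assuming
# wps ≈ num_gpus * batch_size * num_steps / seconds_per_iteration.
batch_size = 128   # from *****HYPER PARAMETERS*****
num_steps = 20
num_gpus = 2

tokens_per_iteration = num_gpus * batch_size * num_steps   # 5120 tokens
seconds_per_iteration = 1.16 / 20  # the log prints time per 20-iteration interval

wps = tokens_per_iteration / seconds_per_iteration
print(round(wps))  # roughly 88,000, matching the logged wps values
```

The steady-state intervals below hover around 1.15-1.18 s per 20 iterations, so the logged ~87,000-89,000 wps is consistent with this estimate.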
2020-06-05 17:08:52.944977: I tensorflow/stream_executor/dso_loader.cc:153] successfully opened CUDA library libcublas.so.10 locally Iteration 3627, time = 5.44s, wps = 942, train loss = 4.7566 Iteration 3628, time = 3.51s, wps = 1458, train loss = 4.5165 Iteration 3629, time = 0.06s, wps = 80162, train loss = 4.4922 Iteration 3630, time = 0.06s, wps = 78827, train loss = 4.4974 Iteration 3631, time = 0.07s, wps = 75329, train loss = 4.5499 Iteration 3632, time = 0.06s, wps = 83470, train loss = 4.5005 Iteration 3633, time = 0.06s, wps = 89711, train loss = 4.5208 Iteration 3634, time = 0.06s, wps = 89861, train loss = 4.5119 Iteration 3635, time = 0.06s, wps = 88752, train loss = 4.4521 Iteration 3646, time = 0.64s, wps = 88432, train loss = 4.4266 Iteration 3666, time = 1.17s, wps = 87248, train loss = 4.5206 Iteration 3686, time = 1.17s, wps = 87161, train loss = 4.5742 Iteration 3706, time = 1.15s, wps = 89018, train loss = 4.4816 Iteration 3726, time = 1.16s, wps = 88060, train loss = 4.5057 Iteration 3746, time = 1.17s, wps = 87244, train loss = 4.4762 Iteration 3766, time = 1.16s, wps = 88586, train loss = 4.4988 Iteration 3786, time = 1.16s, wps = 88397, train loss = 4.5656 Iteration 3806, time = 1.16s, wps = 88568, train loss = 4.5083 Iteration 3826, time = 1.16s, wps = 87951, train loss = 4.5052 Iteration 3846, time = 1.16s, wps = 88024, train loss = 4.3658 Iteration 3866, time = 1.16s, wps = 87909, train loss = 4.5220 Iteration 3886, time = 1.16s, wps = 88129, train loss = 4.4991 Iteration 3906, time = 1.16s, wps = 88293, train loss = 4.4235 Iteration 3926, time = 1.17s, wps = 87232, train loss = 4.4760 Iteration 3946, time = 1.16s, wps = 87927, train loss = 4.5122 Iteration 3966, time = 1.15s, wps = 88860, train loss = 4.4662 Iteration 3986, time = 1.17s, wps = 87815, train loss = 4.5097 Iteration 4006, time = 1.16s, wps = 88303, train loss = 4.5156 Iteration 4026, time = 1.17s, wps = 87504, train loss = 4.4528 Iteration 4046, time = 1.15s, wps = 88756, 
train loss = 4.4759 Iteration 4066, time = 1.17s, wps = 87537, train loss = 4.4339 Iteration 4086, time = 1.17s, wps = 87829, train loss = 4.4780 Iteration 4106, time = 1.17s, wps = 87440, train loss = 4.4685 Iteration 4126, time = 1.16s, wps = 88602, train loss = 4.5041 Iteration 4146, time = 1.17s, wps = 87606, train loss = 4.4539 Iteration 4166, time = 1.17s, wps = 87653, train loss = 4.4632 Iteration 4186, time = 1.15s, wps = 88852, train loss = 4.3951 Iteration 4206, time = 1.16s, wps = 88193, train loss = 4.5227 Iteration 4226, time = 1.17s, wps = 87865, train loss = 4.5056 Iteration 4246, time = 1.16s, wps = 88106, train loss = 4.4377 Iteration 4266, time = 1.16s, wps = 88550, train loss = 4.5702 Iteration 4286, time = 1.16s, wps = 88213, train loss = 4.5764 Iteration 4306, time = 1.17s, wps = 87834, train loss = 4.4816 Iteration 4326, time = 1.16s, wps = 88195, train loss = 4.4706 Iteration 4346, time = 1.17s, wps = 87494, train loss = 4.4866 Iteration 4366, time = 1.16s, wps = 88535, train loss = 4.4419 Iteration 4386, time = 1.17s, wps = 87850, train loss = 4.4617 Iteration 4406, time = 1.16s, wps = 88479, train loss = 4.5236 Iteration 4426, time = 1.16s, wps = 88502, train loss = 4.5171 Iteration 4446, time = 1.17s, wps = 87269, train loss = 4.4331 Iteration 4466, time = 1.16s, wps = 87954, train loss = 4.4099 Iteration 4486, time = 1.16s, wps = 88542, train loss = 4.4250 Iteration 4506, time = 1.17s, wps = 87551, train loss = 4.4414 Iteration 4526, time = 1.17s, wps = 87769, train loss = 4.4486 Iteration 4546, time = 1.18s, wps = 86513, train loss = 4.3860 Iteration 4566, time = 1.16s, wps = 88199, train loss = 4.4600 Iteration 4586, time = 1.16s, wps = 88398, train loss = 4.4497 Iteration 4606, time = 1.16s, wps = 88395, train loss = 4.4586 Iteration 4626, time = 1.16s, wps = 88394, train loss = 4.5178 Iteration 4646, time = 1.16s, wps = 88247, train loss = 4.3999 Iteration 4666, time = 1.14s, wps = 89437, train loss = 4.4442 Iteration 4686, time = 
1.16s, wps = 88306, train loss = 4.3629 Iteration 4706, time = 1.17s, wps = 87567, train loss = 4.4156 Iteration 4726, time = 1.16s, wps = 88332, train loss = 4.4432 Iteration 4746, time = 1.17s, wps = 87788, train loss = 4.5745 Iteration 4766, time = 1.17s, wps = 87682, train loss = 4.4192 Iteration 4786, time = 1.16s, wps = 88107, train loss = 4.4124 Iteration 4806, time = 1.17s, wps = 87674, train loss = 4.4430 Iteration 4826, time = 1.18s, wps = 86828, train loss = 4.4586 Iteration 4846, time = 1.16s, wps = 88145, train loss = 4.4484 Iteration 4866, time = 1.17s, wps = 87455, train loss = 4.3422 Iteration 4886, time = 1.18s, wps = 86909, train loss = 4.4523 Iteration 4906, time = 1.16s, wps = 88000, train loss = 4.4101 Iteration 4926, time = 1.17s, wps = 87702, train loss = 4.3597 Iteration 4946, time = 1.17s, wps = 87632, train loss = 4.3591 Iteration 4966, time = 1.17s, wps = 87752, train loss = 4.4223 Iteration 4986, time = 1.16s, wps = 88129, train loss = 4.4771 Iteration 5006, time = 1.17s, wps = 87673, train loss = 4.4191 Iteration 5026, time = 1.16s, wps = 88096, train loss = 4.3752 Iteration 5046, time = 1.17s, wps = 87372, train loss = 4.3998 Iteration 5066, time = 1.17s, wps = 87193, train loss = 4.3275 Iteration 5086, time = 1.17s, wps = 87608, train loss = 4.4637 Iteration 5106, time = 1.17s, wps = 87742, train loss = 4.3791 Iteration 5126, time = 1.18s, wps = 86798, train loss = 4.3843 Iteration 5146, time = 1.17s, wps = 87469, train loss = 4.3257 Iteration 5166, time = 1.16s, wps = 88364, train loss = 4.4247 Iteration 5186, time = 1.18s, wps = 86941, train loss = 4.3494 Processing file: ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/news.en-00080-of-00100 Finished processing! 
Iteration 5206, time = 2.81s, wps = 36494, train loss = 4.4983 Iteration 5226, time = 1.18s, wps = 86971, train loss = 4.4332 Iteration 5246, time = 1.18s, wps = 86441, train loss = 4.4162 Iteration 5266, time = 1.18s, wps = 86554, train loss = 4.4147 Iteration 5286, time = 1.17s, wps = 87177, train loss = 4.3586 Iteration 5306, time = 1.18s, wps = 87136, train loss = 4.4887 Iteration 5326, time = 1.18s, wps = 87053, train loss = 4.4770 Iteration 5346, time = 1.18s, wps = 86882, train loss = 4.3563 Iteration 5366, time = 1.18s, wps = 87008, train loss = 4.3860 Iteration 5386, time = 1.17s, wps = 87531, train loss = 4.4812 Iteration 5406, time = 1.17s, wps = 87193, train loss = 4.3741 Iteration 5426, time = 1.18s, wps = 87142, train loss = 4.4175 Iteration 5446, time = 1.18s, wps = 86432, train loss = 4.4173 Iteration 5466, time = 1.16s, wps = 88169, train loss = 4.3273 Iteration 5486, time = 1.18s, wps = 86824, train loss = 4.3961 Iteration 5506, time = 1.17s, wps = 87444, train loss = 4.3292 Iteration 5526, time = 1.17s, wps = 87349, train loss = 4.3975 Iteration 5546, time = 1.17s, wps = 87599, train loss = 4.4503 Iteration 5566, time = 1.19s, wps = 86391, train loss = 4.3092 Iteration 5586, time = 1.17s, wps = 87894, train loss = 4.3081 Iteration 5606, time = 1.16s, wps = 87936, train loss = 4.3715 Iteration 5626, time = 1.17s, wps = 87421, train loss = 4.4178 Iteration 5646, time = 1.16s, wps = 88188, train loss = 4.2964 Iteration 5666, time = 1.17s, wps = 87258, train loss = 4.3470 Iteration 5686, time = 1.18s, wps = 87113, train loss = 4.3484 Iteration 5706, time = 1.16s, wps = 88140, train loss = 4.4047 Iteration 5726, time = 1.17s, wps = 87295, train loss = 4.4588 Iteration 5746, time = 1.17s, wps = 87446, train loss = 4.3386 Iteration 5766, time = 1.17s, wps = 87449, train loss = 4.3397 Iteration 5786, time = 1.17s, wps = 87461, train loss = 4.5041 Iteration 5806, time = 1.18s, wps = 86791, train loss = 4.3241 Iteration 5826, time = 1.18s, wps = 86951, 
train loss = 4.3269 Iteration 5846, time = 1.17s, wps = 87240, train loss = 4.3537 Iteration 5866, time = 1.17s, wps = 87682, train loss = 4.3232 Iteration 5886, time = 1.18s, wps = 86540, train loss = 4.3861 Iteration 5906, time = 1.19s, wps = 86122, train loss = 4.3125 Iteration 5926, time = 1.16s, wps = 88203, train loss = 4.3865 Iteration 5946, time = 1.18s, wps = 86434, train loss = 4.3400 Iteration 5966, time = 1.17s, wps = 87407, train loss = 4.3240 Iteration 5986, time = 1.17s, wps = 87168, train loss = 4.3697 Iteration 6006, time = 1.19s, wps = 85975, train loss = 4.2806 Iteration 6026, time = 1.19s, wps = 86093, train loss = 4.3032 Iteration 6046, time = 1.18s, wps = 86683, train loss = 4.3329 Iteration 6066, time = 1.18s, wps = 86746, train loss = 4.2662 Iteration 6086, time = 1.19s, wps = 86322, train loss = 4.2806 Iteration 6106, time = 1.18s, wps = 86851, train loss = 4.4027 Iteration 6126, time = 1.18s, wps = 86853, train loss = 4.3096 Iteration 6146, time = 1.21s, wps = 84952, train loss = 4.3264 Iteration 6166, time = 1.18s, wps = 86644, train loss = 4.3743 Iteration 6186, time = 1.17s, wps = 87455, train loss = 4.3790 Iteration 6206, time = 1.19s, wps = 85877, train loss = 4.3216 Iteration 6226, time = 1.19s, wps = 85831, train loss = 4.4005 Iteration 6246, time = 1.19s, wps = 85932, train loss = 4.2875 Iteration 6266, time = 1.18s, wps = 87109, train loss = 4.3146 Iteration 6286, time = 1.19s, wps = 85952, train loss = 4.2914 Iteration 6306, time = 1.19s, wps = 86152, train loss = 4.2635 Iteration 6326, time = 1.18s, wps = 86463, train loss = 4.3885 Iteration 6346, time = 1.19s, wps = 86169, train loss = 4.3032 Iteration 6366, time = 1.18s, wps = 86484, train loss = 4.3572 Iteration 6386, time = 1.19s, wps = 86240, train loss = 4.2832 Iteration 6406, time = 1.20s, wps = 85632, train loss = 4.3699 /usr/local/lib/python3.5/dist-packages/tensorflow/python/summary/writer/writer.py:386: UserWarning: Attempting to use a closed FileWriter. 
The operation will be a noop unless the FileWriter is explicitly reopened. warnings.warn("Attempting to use a closed FileWriter. " real 3m10.363s user 16m34.085s sys 4m17.693s root@84792c578a3b:/workspace/nvidia-examples/big_lstm# time python single_lm_train.py --mode=train --logdir=./logs --num_gpus=1 --datadir=./data/1-billion-word-language-modeling-benchmark-r13output WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see: * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md * https://github.com/tensorflow/addons If you depend on functionality not listed there, please file an issue. *****HYPER PARAMETERS***** {'num_sampled': 8192, 'num_steps': 20, 'num_shards': 8, 'max_grad_norm': 10.0, 'emb_size': 512, 'num_gpus': 1, 'vocab_size': 793470, 'keep_prob': 0.9, 'do_summaries': False, 'batch_size': 128, 'state_size': 2048, 'num_layers': 1, 'max_time': 180, 'average_params': True, 'optimizer': 0, 'num_delayed_steps': 150, 'projected_size': 512, 'learning_rate': 0.2, 'run_profiler': False} ************************** WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/model_utils.py:33: UniformUnitScaling.__init__ (from tensorflow.python.ops.init_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.initializers.variance_scaling instead with distribution=uniform to get equivalent behavior. WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/language_model.py:75: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version. 
Instructions for updating: Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`. WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/language_model.py:107: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_impl.py:1444: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version. Instructions for updating: Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead. WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_grad.py:425: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. Current time: 1591377942.4566767 ALL VARIABLES WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/run_utils.py:18: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02. Instructions for updating: Please use tf.global_variables instead. 
model/emb_0:0 (99184, 512) /gpu:0 model/emb_1:0 (99184, 512) /gpu:0 model/emb_2:0 (99184, 512) /gpu:0 model/emb_3:0 (99184, 512) /gpu:0 model/emb_4:0 (99184, 512) /gpu:0 model/emb_5:0 (99184, 512) /gpu:0 model/emb_6:0 (99184, 512) /gpu:0 model/emb_7:0 (99184, 512) /gpu:0 model/lstm_0/LSTMCell/W_0:0 (1024, 8192) /gpu:0 model/lstm_0/LSTMCell/B:0 (8192,) /gpu:0 model/lstm_0/LSTMCell/W_P_0:0 (2048, 512) /gpu:0 model/softmax_w_0:0 (99184, 512) /gpu:0 model/softmax_w_1:0 (99184, 512) /gpu:0 model/softmax_w_2:0 (99184, 512) /gpu:0 model/softmax_w_3:0 (99184, 512) /gpu:0 model/softmax_w_4:0 (99184, 512) /gpu:0 model/softmax_w_5:0 (99184, 512) /gpu:0 model/softmax_w_6:0 (99184, 512) /gpu:0 model/softmax_w_7:0 (99184, 512) /gpu:0 model/softmax_b:0 (793470,) /gpu:0 model/global_step:0 () model/model/emb_0/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_1/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_2/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_3/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_4/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_5/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_6/Adagrad:0 (99184, 512) /gpu:0 model/model/emb_7/Adagrad:0 (99184, 512) /gpu:0 model/model/lstm_0/LSTMCell/W_0/Adagrad:0 (1024, 8192) /gpu:0 model/model/lstm_0/LSTMCell/B/Adagrad:0 (8192,) /gpu:0 model/model/lstm_0/LSTMCell/W_P_0/Adagrad:0 (2048, 512) /gpu:0 model/model/softmax_w_0/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_1/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_2/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_3/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_4/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_5/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_6/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_w_7/Adagrad:0 (99184, 512) /gpu:0 model/model/softmax_b/Adagrad:0 (793470,) /gpu:0 model/model/lstm_0/LSTMCell/W_0/ExponentialMovingAverage:0 (1024, 8192) /gpu:0 model/model/lstm_0/LSTMCell/B/ExponentialMovingAverage:0 (8192,) /gpu:0 
model/model/lstm_0/LSTMCell/W_P_0/ExponentialMovingAverage:0 (2048, 512) /gpu:0 TRAINABLE VARIABLES model/emb_0:0 (99184, 512) /gpu:0 model/emb_1:0 (99184, 512) /gpu:0 model/emb_2:0 (99184, 512) /gpu:0 model/emb_3:0 (99184, 512) /gpu:0 model/emb_4:0 (99184, 512) /gpu:0 model/emb_5:0 (99184, 512) /gpu:0 model/emb_6:0 (99184, 512) /gpu:0 model/emb_7:0 (99184, 512) /gpu:0 model/lstm_0/LSTMCell/W_0:0 (1024, 8192) /gpu:0 model/lstm_0/LSTMCell/B:0 (8192,) /gpu:0 model/lstm_0/LSTMCell/W_P_0:0 (2048, 512) /gpu:0 model/softmax_w_0:0 (99184, 512) /gpu:0 model/softmax_w_1:0 (99184, 512) /gpu:0 model/softmax_w_2:0 (99184, 512) /gpu:0 model/softmax_w_3:0 (99184, 512) /gpu:0 model/softmax_w_4:0 (99184, 512) /gpu:0 model/softmax_w_5:0 (99184, 512) /gpu:0 model/softmax_w_6:0 (99184, 512) /gpu:0 model/softmax_w_7:0 (99184, 512) /gpu:0 model/softmax_b:0 (793470,) /gpu:0 LOCAL VARIABLES model/model/state_0_0:0 (128, 2560) /gpu:0 WARNING:tensorflow:From /opt/tensorflow/nvidia-examples/big_lstm/run_utils.py:32: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version. Instructions for updating: Please switch to tf.train.MonitoredTrainingSession 2020-06-05 17:25:42.659484: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2900080000 Hz 2020-06-05 17:25:42.666778: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x6753d80 executing computations on platform Host. 
Devices: 2020-06-05 17:25:42.666821: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): , 2020-06-05 17:25:43.105761: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 17:25:43.143977: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 17:25:43.144711: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2020-06-05 17:25:43.146285: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x67537a0 executing computations on platform CUDA. Devices: 2020-06-05 17:25:43.146332: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): TITAN RTX, Compute Capability 7.5 2020-06-05 17:25:43.146341: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (1): TITAN RTX, Compute Capability 7.5 2020-06-05 17:25:43.146355: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (2): GeForce RTX 2080 Ti, Compute Capability 7.5 2020-06-05 17:25:43.146363: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (3): GeForce RTX 2080 Ti, Compute Capability 7.5 2020-06-05 17:25:43.147547: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:01:00.0 totalMemory: 23.65GiB freeMemory: 23.22GiB 2020-06-05 17:25:43.147576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties: name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77 pciBusID: 0000:21:00.0 totalMemory: 23.65GiB freeMemory: 23.48GiB 
2020-06-05 17:25:43.147601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 2 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635 pciBusID: 0000:4a:00.0 totalMemory: 10.76GiB freeMemory: 10.60GiB 2020-06-05 17:25:43.147623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 3 with properties: name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635 pciBusID: 0000:4b:00.0 totalMemory: 10.76GiB freeMemory: 10.60GiB 2020-06-05 17:25:43.147759: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1, 2, 3 2020-06-05 17:25:43.917888: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix: 2020-06-05 17:25:43.917935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1 2 3 2020-06-05 17:25:43.917940: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N N N N 2020-06-05 17:25:43.917945: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: N N N N 2020-06-05 17:25:43.917950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 2: N N N N 2020-06-05 17:25:43.917955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 3: N N N N 2020-06-05 17:25:43.918087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22500 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:01:00.0, compute capability: 7.5) 2020-06-05 17:25:43.918379: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 22757 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:21:00.0, compute capability: 7.5) 2020-06-05 17:25:43.918682: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10224 MB memory) -> physical 
GPU (device: 2, name: GeForce RTX 2080 Ti, pci bus id: 0000:4a:00.0, compute capability: 7.5) 2020-06-05 17:25:43.918884: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 10224 MB memory) -> physical GPU (device: 3, name: GeForce RTX 2080 Ti, pci bus id: 0000:4b:00.0, compute capability: 7.5) WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file APIs to check for files with this prefix. WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file utilities to get mtimes. Processing file: ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/news.en-00014-of-00100 Finished processing! 
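The same back-of-the-envelope check applies to the single-GPU run that follows: with `num_gpus=1` the per-iteration token count halves, and the logged interval time drops to roughly 0.93 s per 20 iterations, which predicts the ~55,000 wps seen in the log (again assuming the same inferred wps formula):

```python
# Rough words/sec estimate for the 1-GPU run, assuming
# wps ≈ num_gpus * batch_size * num_steps / seconds_per_iteration.
batch_size = 128
num_steps = 20
num_gpus = 1

tokens_per_iteration = num_gpus * batch_size * num_steps   # 2560 tokens
seconds_per_iteration = 0.93 / 20  # typical 20-iteration interval in the log

wps = tokens_per_iteration / seconds_per_iteration
print(round(wps))  # roughly 55,000, in line with the logged values
```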
2020-06-05 17:25:50.875973: I tensorflow/stream_executor/dso_loader.cc:153] successfully opened CUDA library libcublas.so.10 locally
Iteration 6419, time = 3.55s, wps = 721, train loss = 4.5737
Iteration 6420, time = 1.76s, wps = 1452, train loss = 4.2541
Iteration 6421, time = 0.06s, wps = 45995, train loss = 4.2498
Iteration 6422, time = 0.05s, wps = 48710, train loss = 4.2730
Iteration 6423, time = 0.06s, wps = 46493, train loss = 4.3932
Iteration 6424, time = 0.04s, wps = 57480, train loss = 4.3524
Iteration 6425, time = 0.04s, wps = 59311, train loss = 4.3539
Iteration 6426, time = 0.04s, wps = 57592, train loss = 4.2571
Iteration 6427, time = 0.04s, wps = 57940, train loss = 4.2560
Iteration 6438, time = 0.50s, wps = 56168, train loss = 4.3645
Iteration 6458, time = 0.93s, wps = 55135, train loss = 4.4258
Iteration 6478, time = 0.93s, wps = 55077, train loss = 4.3142
Iteration 6498, time = 0.92s, wps = 55786, train loss = 4.3551
Iteration 6518, time = 0.91s, wps = 56016, train loss = 4.3420
Iteration 6538, time = 0.94s, wps = 54501, train loss = 4.3460
Iteration 6558, time = 0.93s, wps = 55116, train loss = 4.3421
Iteration 6578, time = 0.94s, wps = 54632, train loss = 4.3213
Iteration 6598, time = 0.93s, wps = 54850, train loss = 4.2725
Iteration 6618, time = 0.92s, wps = 55449, train loss = 4.3330
Iteration 6638, time = 0.92s, wps = 55712, train loss = 4.3283
Iteration 6658, time = 0.94s, wps = 54554, train loss = 4.3845
Iteration 6678, time = 0.93s, wps = 54927, train loss = 4.4351
Iteration 6698, time = 0.93s, wps = 55118, train loss = 4.2955
Iteration 6718, time = 0.94s, wps = 54683, train loss = 4.3536
Iteration 6738, time = 0.93s, wps = 55322, train loss = 4.2996
Iteration 6758, time = 0.93s, wps = 55201, train loss = 4.3596
Iteration 6778, time = 0.94s, wps = 54626, train loss = 4.2140
Iteration 6798, time = 0.93s, wps = 54837, train loss = 4.4175
Iteration 6818, time = 0.95s, wps = 53634, train loss = 4.2772
Iteration 6838, time = 0.93s, wps = 55243, train loss = 4.2362
Iteration 6858, time = 0.93s, wps = 54818, train loss = 4.3666
Iteration 6878, time = 0.92s, wps = 55740, train loss = 4.3032
Iteration 6898, time = 0.94s, wps = 54305, train loss = 4.4890
Iteration 6918, time = 0.93s, wps = 55152, train loss = 4.3839
Iteration 6938, time = 0.95s, wps = 54018, train loss = 4.3174
Iteration 6958, time = 0.94s, wps = 54579, train loss = 4.1433
Iteration 6978, time = 0.93s, wps = 55054, train loss = 4.3502
Iteration 6998, time = 0.93s, wps = 54940, train loss = 4.5000
Iteration 7018, time = 0.93s, wps = 54765, train loss = 4.2247
Iteration 7038, time = 0.92s, wps = 55585, train loss = 4.1796
Iteration 7058, time = 0.93s, wps = 55263, train loss = 4.3909
Iteration 7078, time = 0.95s, wps = 53914, train loss = 4.2892
Iteration 7098, time = 0.94s, wps = 54593, train loss = 4.3692
Iteration 7118, time = 0.94s, wps = 54524, train loss = 4.4102
Iteration 7138, time = 0.94s, wps = 54634, train loss = 4.4526
Iteration 7158, time = 0.94s, wps = 54242, train loss = 4.3425
Iteration 7178, time = 0.92s, wps = 55824, train loss = 4.3004
Iteration 7198, time = 0.95s, wps = 53874, train loss = 4.3381
Iteration 7218, time = 0.94s, wps = 54616, train loss = 4.3737
Iteration 7238, time = 0.94s, wps = 54241, train loss = 4.3850
Iteration 7258, time = 0.93s, wps = 55296, train loss = 4.2881
Iteration 7278, time = 0.95s, wps = 54114, train loss = 4.3289
Iteration 7298, time = 0.93s, wps = 54983, train loss = 4.3515
Iteration 7318, time = 0.93s, wps = 54937, train loss = 4.3257
Iteration 7338, time = 0.94s, wps = 54704, train loss = 4.2851
Iteration 7358, time = 0.92s, wps = 55581, train loss = 4.4100
Iteration 7378, time = 0.94s, wps = 54752, train loss = 4.2999
Iteration 7398, time = 0.95s, wps = 54149, train loss = 4.3766
Iteration 7418, time = 0.94s, wps = 54479, train loss = 4.3932
Iteration 7438, time = 0.94s, wps = 54240, train loss = 4.2097
Iteration 7458, time = 0.95s, wps = 53978, train loss = 4.2254
Iteration 7478, time = 0.93s, wps = 55065, train loss = 4.2931
Iteration 7498, time = 0.94s, wps = 54431, train loss = 4.4113
Iteration 7518, time = 0.95s, wps = 53888, train loss = 4.2929
Iteration 7538, time = 0.94s, wps = 54299, train loss = 4.3587
Iteration 7558, time = 0.92s, wps = 55783, train loss = 4.3654
Iteration 7578, time = 0.94s, wps = 54670, train loss = 4.2976
Iteration 7598, time = 0.95s, wps = 54121, train loss = 4.3389
Iteration 7618, time = 0.95s, wps = 53891, train loss = 4.3505
Iteration 7638, time = 0.95s, wps = 53853, train loss = 4.2338
Iteration 7658, time = 0.93s, wps = 55103, train loss = 4.3190
Iteration 7678, time = 0.94s, wps = 54745, train loss = 4.3988
Iteration 7698, time = 0.94s, wps = 54615, train loss = 4.1612
Iteration 7718, time = 0.94s, wps = 54193, train loss = 4.3092
Iteration 7738, time = 0.93s, wps = 54939, train loss = 4.2097
Iteration 7758, time = 0.95s, wps = 53733, train loss = 4.2809
Iteration 7778, time = 0.94s, wps = 54461, train loss = 4.4522
Iteration 7798, time = 0.94s, wps = 54400, train loss = 4.2306
Iteration 7818, time = 0.94s, wps = 54737, train loss = 4.3637
Iteration 7838, time = 0.94s, wps = 54306, train loss = 4.4005
Iteration 7858, time = 0.94s, wps = 54662, train loss = 4.4109
Iteration 7878, time = 0.95s, wps = 53630, train loss = 4.4060
Iteration 7898, time = 0.94s, wps = 54404, train loss = 4.3670
Iteration 7918, time = 0.92s, wps = 55370, train loss = 4.3189
Iteration 7938, time = 0.94s, wps = 54203, train loss = 4.2912
Iteration 7958, time = 0.93s, wps = 54776, train loss = 4.2855
Iteration 7978, time = 0.95s, wps = 54001, train loss = 4.4126
Iteration 7998, time = 0.95s, wps = 53645, train loss = 4.3625
Iteration 8018, time = 0.95s, wps = 53783, train loss = 4.3451
Iteration 8038, time = 0.95s, wps = 54158, train loss = 4.2961
Iteration 8058, time = 0.95s, wps = 54125, train loss = 4.3314
Iteration 8078, time = 0.94s, wps = 54573, train loss = 4.3438
Iteration 8098, time = 0.95s, wps = 54000, train loss = 4.3830
Iteration 8118, time = 0.94s, wps = 54411, train loss = 4.3120
Iteration 8138, time = 0.93s, wps = 55144, train loss = 4.4421
Iteration 8158, time = 0.93s, wps = 54845, train loss = 4.3822
Iteration 8178, time = 0.95s, wps = 53991, train loss = 4.2584
Iteration 8198, time = 0.95s, wps = 53842, train loss = 4.3359
Iteration 8218, time = 0.94s, wps = 54516, train loss = 4.3337
Iteration 8238, time = 0.95s, wps = 54113, train loss = 4.2164
Iteration 8258, time = 0.94s, wps = 54539, train loss = 4.3587
Iteration 8278, time = 0.95s, wps = 53653, train loss = 4.3637
Iteration 8298, time = 0.94s, wps = 54384, train loss = 4.4171
Iteration 8318, time = 0.94s, wps = 54382, train loss = 4.2392
Iteration 8338, time = 0.95s, wps = 54134, train loss = 4.1423
Iteration 8358, time = 0.95s, wps = 53946, train loss = 4.3779
Iteration 8378, time = 0.95s, wps = 53682, train loss = 4.2650
Iteration 8398, time = 0.93s, wps = 55037, train loss = 4.3131
Iteration 8418, time = 0.95s, wps = 53727, train loss = 4.3791
Iteration 8438, time = 0.94s, wps = 54575, train loss = 4.2353
Iteration 8458, time = 0.95s, wps = 53733, train loss = 4.3368
Iteration 8478, time = 0.94s, wps = 54233, train loss = 4.4048
Iteration 8498, time = 0.95s, wps = 53988, train loss = 4.2859
Iteration 8518, time = 0.95s, wps = 53960, train loss = 4.4402
Iteration 8538, time = 0.93s, wps = 54842, train loss = 4.2788
Iteration 8558, time = 0.96s, wps = 53269, train loss = 4.2604
Iteration 8578, time = 0.94s, wps = 54549, train loss = 4.2312
Iteration 8598, time = 0.95s, wps = 53776, train loss = 4.2411
Iteration 8618, time = 0.95s, wps = 54122, train loss = 4.2975
Iteration 8638, time = 0.96s, wps = 53607, train loss = 4.2905
Iteration 8658, time = 0.95s, wps = 54091, train loss = 4.2738
Iteration 8678, time = 0.95s, wps = 53685, train loss = 4.3317
Iteration 8698, time = 0.95s, wps = 53879, train loss = 4.2418
Iteration 8718, time = 0.95s, wps = 53813, train loss = 4.2495
Iteration 8738, time = 0.98s, wps = 52458, train loss = 4.2941
Iteration 8758, time = 0.94s, wps = 54252, train loss = 4.4690
Iteration 8778, time = 0.93s, wps = 55090, train loss = 4.3093
Iteration 8798, time = 0.96s, wps = 53304, train loss = 4.3083
Iteration 8818, time = 0.94s, wps = 54219, train loss = 4.1360
Iteration 8838, time = 0.98s, wps = 52298, train loss = 4.1915
Iteration 8858, time = 0.95s, wps = 53754, train loss = 4.3999
Iteration 8878, time = 0.95s, wps = 53949, train loss = 4.2549
Iteration 8898, time = 0.94s, wps = 54191, train loss = 4.1737
Iteration 8918, time = 0.96s, wps = 53476, train loss = 4.2690
Iteration 8938, time = 0.95s, wps = 53919, train loss = 4.2815
Iteration 8958, time = 0.95s, wps = 53892, train loss = 4.3671
Iteration 8978, time = 0.96s, wps = 53591, train loss = 4.2383
Iteration 8998, time = 0.95s, wps = 53688, train loss = 4.2959
Iteration 9018, time = 0.97s, wps = 52920, train loss = 4.2123
Iteration 9038, time = 0.97s, wps = 52572, train loss = 4.3679
Iteration 9058, time = 0.97s, wps = 52851, train loss = 4.3595
Iteration 9078, time = 0.96s, wps = 53140, train loss = 4.3042
Iteration 9098, time = 0.96s, wps = 53089, train loss = 4.2379
Iteration 9118, time = 0.97s, wps = 53001, train loss = 4.2935
Iteration 9138, time = 0.97s, wps = 52934, train loss = 4.4307
Iteration 9158, time = 0.97s, wps = 52714, train loss = 4.3589
Iteration 9178, time = 0.98s, wps = 52468, train loss = 4.2330
Iteration 9198, time = 0.97s, wps = 52733, train loss = 4.4096
Iteration 9218, time = 0.96s, wps = 53188, train loss = 4.2183
Iteration 9238, time = 0.97s, wps = 53039, train loss = 4.2910
Iteration 9258, time = 0.98s, wps = 52120, train loss = 4.3355
Iteration 9278, time = 0.97s, wps = 52747, train loss = 4.2735
Iteration 9298, time = 0.97s, wps = 52811, train loss = 4.2288
Iteration 9318, time = 0.98s, wps = 52012, train loss = 4.1356
Iteration 9338, time = 0.98s, wps = 52258, train loss = 4.2330
Iteration 9358, time = 0.98s, wps = 52306, train loss = 4.2568
Iteration 9378, time = 0.99s, wps = 51915, train loss = 4.2361
Iteration 9398, time = 0.99s, wps = 51698, train loss = 4.3483
Iteration 9418, time = 0.98s, wps = 52193, train loss = 4.3033
Iteration 9438, time = 0.97s, wps = 52683, train loss = 4.2658
Iteration 9458, time = 0.99s, wps = 51697, train loss = 4.3563
Iteration 9478, time = 0.99s, wps = 51941, train loss = 4.3200
Iteration 9498, time = 1.00s, wps = 51400, train loss = 4.3022
Iteration 9518, time = 0.98s, wps = 52073, train loss = 4.2145
Iteration 9538, time = 0.99s, wps = 51814, train loss = 4.3472
Iteration 9558, time = 0.98s, wps = 52015, train loss = 4.2731
Processing file: ./data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled/news.en-00070-of-00100
Finished processing!
Iteration 9578, time = 2.57s, wps = 19924, train loss = 4.2973
Iteration 9598, time = 0.99s, wps = 51882, train loss = 4.1916
Iteration 9618, time = 1.01s, wps = 50665, train loss = 4.3555
Iteration 9638, time = 1.01s, wps = 50896, train loss = 4.2850
Iteration 9658, time = 1.00s, wps = 51388, train loss = 4.2418
Iteration 9678, time = 1.00s, wps = 51046, train loss = 4.2735
Iteration 9698, time = 0.99s, wps = 51773, train loss = 4.1666
Iteration 9718, time = 1.01s, wps = 50732, train loss = 4.2460
Iteration 9738, time = 1.02s, wps = 50364, train loss = 4.1576
Iteration 9758, time = 1.01s, wps = 50781, train loss = 4.1570
Iteration 9778, time = 0.99s, wps = 51495, train loss = 4.2571
Iteration 9798, time = 1.01s, wps = 50795, train loss = 4.1531
Iteration 9818, time = 1.01s, wps = 50875, train loss = 4.3380
Iteration 9838, time = 0.98s, wps = 52114, train loss = 4.2539
Iteration 9858, time = 1.02s, wps = 50099, train loss = 4.2766
Iteration 9878, time = 1.02s, wps = 50070, train loss = 4.2875
Iteration 9898, time = 1.02s, wps = 49984, train loss = 4.2890
Iteration 9918, time = 1.02s, wps = 50350, train loss = 4.2801
Iteration 9938, time = 1.01s, wps = 50663, train loss = 4.2661
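The per-iteration log above is the raw material for a throughput figure: the first couple of iterations are warm-up (wps in the hundreds to low thousands) and should be excluded from any average. A minimal sketch of how one might extract the steady-state words/sec from this log format (the `mean_wps` helper is illustrative, not part of the big_lstm example):

```python
import re
import statistics

# Matches lines of the form:
#   Iteration 6438, time = 0.50s, wps = 56168, train loss = 4.3645
LOG_LINE = re.compile(
    r"Iteration (\d+), time = ([\d.]+)s, wps = (\d+), train loss = ([\d.]+)"
)

def mean_wps(log_text, skip=2):
    """Average words/sec over the log, skipping `skip` warm-up iterations."""
    wps = [int(m.group(3)) for m in LOG_LINE.finditer(log_text)]
    return statistics.mean(wps[skip:]) if len(wps) > skip else None

sample = """Iteration 6419, time = 3.55s, wps = 721, train loss = 4.5737
Iteration 6420, time = 1.76s, wps = 1452, train loss = 4.2541
Iteration 6438, time = 0.50s, wps = 56168, train loss = 4.3645
Iteration 6458, time = 0.93s, wps = 55135, train loss = 4.4258"""

print(mean_wps(sample))  # averages only the post-warm-up iterations
```

Fed the full log, the same helper shows the gradual wps decline visible above (from ~55k toward ~50k words/sec) rather than a single optimistic number.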
/usr/local/lib/python3.5/dist-packages/tensorflow/python/summary/writer/writer.py:386: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
  warnings.warn("Attempting to use a closed FileWriter. "

real    3m8.753s
user    9m25.117s
sys     3m0.831s
root@84792c578a3b:/workspace/nvidia-examples/big_lstm# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.6 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.6 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
root@84792c578a3b:/workspace/nvidia-examples/big_lstm# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105
root@84792c578a3b:/workspace/nvidia-examples/big_lstm# cd data
root@84792c578a3b:/workspace/nvidia-examples/big_lstm/data# ls
1-billion-word-language-modeling-benchmark-r13output
root@84792c578a3b:/workspace/nvidia-examples/big_lstm/data# cd 1-billion-word-language-modeling-benchmark-r13output
root@84792c578a3b:/workspace/nvidia-examples/big_lstm/data/1-billion-word-language-modeling-benchmark-r13output# ls
1b_word_vocab.txt  heldout-monolingual.tokenized.shuffled  README  training-monolingual.tokenized.shuffled
root@84792c578a3b:/workspace/nvidia-examples/big_lstm/data/1-billion-word-language-modeling-benchmark-r13output# cd training-monolingual.tokenized.shuffled
root@84792c578a3b:/workspace/nvidia-examples/big_lstm/data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled# ls
news.en-00001-of-00100  news.en-00034-of-00100  news.en-00067-of-00100
news.en-00002-of-00100  news.en-00035-of-00100  news.en-00068-of-00100
news.en-00003-of-00100  news.en-00036-of-00100  news.en-00069-of-00100
news.en-00004-of-00100  news.en-00037-of-00100  news.en-00070-of-00100
news.en-00005-of-00100  news.en-00038-of-00100  news.en-00071-of-00100
news.en-00006-of-00100  news.en-00039-of-00100  news.en-00072-of-00100
news.en-00007-of-00100  news.en-00040-of-00100  news.en-00073-of-00100
news.en-00008-of-00100  news.en-00041-of-00100  news.en-00074-of-00100
news.en-00009-of-00100  news.en-00042-of-00100  news.en-00075-of-00100
news.en-00010-of-00100  news.en-00043-of-00100  news.en-00076-of-00100
news.en-00011-of-00100  news.en-00044-of-00100  news.en-00077-of-00100
news.en-00012-of-00100  news.en-00045-of-00100  news.en-00078-of-00100
news.en-00013-of-00100  news.en-00046-of-00100  news.en-00079-of-00100
news.en-00014-of-00100  news.en-00047-of-00100  news.en-00080-of-00100
news.en-00015-of-00100  news.en-00048-of-00100  news.en-00081-of-00100
news.en-00016-of-00100  news.en-00049-of-00100  news.en-00082-of-00100
news.en-00017-of-00100  news.en-00050-of-00100  news.en-00083-of-00100
news.en-00018-of-00100  news.en-00051-of-00100  news.en-00084-of-00100
news.en-00019-of-00100  news.en-00052-of-00100  news.en-00085-of-00100
news.en-00020-of-00100  news.en-00053-of-00100  news.en-00086-of-00100
news.en-00021-of-00100  news.en-00054-of-00100  news.en-00087-of-00100
news.en-00022-of-00100  news.en-00055-of-00100  news.en-00088-of-00100
news.en-00023-of-00100  news.en-00056-of-00100  news.en-00089-of-00100
news.en-00024-of-00100  news.en-00057-of-00100  news.en-00090-of-00100
news.en-00025-of-00100  news.en-00058-of-00100  news.en-00091-of-00100
news.en-00026-of-00100  news.en-00059-of-00100  news.en-00092-of-00100
news.en-00027-of-00100  news.en-00060-of-00100  news.en-00093-of-00100
news.en-00028-of-00100  news.en-00061-of-00100  news.en-00094-of-00100
news.en-00029-of-00100  news.en-00062-of-00100  news.en-00095-of-00100
news.en-00030-of-00100  news.en-00063-of-00100  news.en-00096-of-00100
news.en-00031-of-00100  news.en-00064-of-00100  news.en-00097-of-00100
news.en-00032-of-00100  news.en-00065-of-00100  news.en-00098-of-00100
news.en-00033-of-00100  news.en-00066-of-00100  news.en-00099-of-00100
root@84792c578a3b:/workspace/nvidia-examples/big_lstm/data/1-billion-word-language-modeling-benchmark-r13output/training-monolingual.tokenized.shuffled# exit
exit
[chibi@centos8 ~]$ cat /etc/redhat-release
CentOS Linux release 8.1.1911 (Core)
[chibi@centos8 ~]$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
[chibi@centos8 ~]$ sudo hddtemp /dev/sda
[sudo] password for chibi:
/dev/sda: Samsung SSD 840 PRO Series: 30°C
[chibi@centos8 ~]$ sensors
eth0-pci-4400
Adapter: PCI adapter
PHY Temperature: +49.1°C

k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +36.5°C (high = +70.0°C)
Tctl: +36.5°C

iwlwifi-virtual-0
Adapter: Virtual device
temp1: +37.0°C

[chibi@centos8 ~]$ nvidia-smi nvlink -c
GPU 0: TITAN RTX (UUID: GPU-7fb51c1d-c1e7-35cc-aad7-66971f05ddb7)
GPU 1: TITAN RTX (UUID: GPU-5a71d61e-f130-637a-b33d-4df555b0ed88)
GPU 2: GeForce RTX 2080 Ti (UUID: GPU-1ac935c2-557f-282e-14e5-3f749ffd63ac)
GPU 3: GeForce RTX 2080 Ti (UUID: GPU-13277ce5-e1e9-0cb1-8cee-6c9e6618e774)
[chibi@centos8 ~]$ cat /proc/meminfo
MemTotal:       65640612 kB
MemFree:        55586096 kB
MemAvailable:   62926708 kB
Buffers:            1060 kB
Cached:          7672560 kB
SwapCached:            0 kB
Active:          1164548 kB
Inactive:        7004016 kB
Active(anon):     467756 kB
Inactive(anon):     9572 kB
Active(file):     696792 kB
Inactive(file):  6994444 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                24 kB
Writeback:             0 kB
AnonPages:        485576 kB
Mapped:           294988 kB
Shmem:             11112 kB
KReclaimable:     383988 kB
Slab:            1359328 kB
SReclaimable:     383988 kB
SUnreclaim:       975340 kB
KernelStack:       26848 kB
PageTables:        25764 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    32820304 kB
Committed_AS:    3657288 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:    208896 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:     1948100 kB
DirectMap2M:    29370368 kB
DirectMap1G:    35651584 kB
[chibi@centos8 ~]$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 49
model name      : AMD Ryzen Threadripper 3990X 64-Core Processor
stepping        : 0
microcode       : 0x8301025
cpu MHz         : 3601.172
cache size      : 512 KB
physical id     : 0
siblings        : 128
core id         : 0
cpu cores       : 64
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 16
wp              : yes
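The `/proc/meminfo` dump above follows a simple `Name: value [kB]` line format, so it can be checked programmatically before a run, e.g. to confirm enough RAM is available for the dataset cache. A minimal sketch, assuming meminfo-style text as input (the `parse_meminfo` helper is illustrative, not an existing tool):

```python
import re

def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Name: value [kB]' lines into a dict of ints."""
    fields = {}
    for m in re.finditer(r"^(\w+(?:\(\w+\))?):\s+(\d+)", text, re.M):
        fields[m.group(1)] = int(m.group(2))
    return fields

# Sample values taken from the transcript above (in kB).
sample = """MemTotal:       65640612 kB
MemFree:        55586096 kB
MemAvailable:   62926708 kB"""

info = parse_meminfo(sample)
pct = 100 * info["MemAvailable"] / info["MemTotal"]
print(f"{pct:.1f}% of RAM available")
```

On a live system the same function can be applied to `open("/proc/meminfo").read()`; the regex also accepts parenthesized field names such as `Active(anon)`.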