
Questions tagged with Amazon EC2


1 answer · 0 votes · 10 views · asked a day ago

CuDNN Library Not Working Out-of-the-Box

I am using the g4dn.xlarge instance type with the Deep Learning AMI GPU TensorFlow 2.10.0 (Amazon Linux 2) 20220927. Upon logging in for the first time, I test the installation and get:

```
[ec2-user@ip-10-0-0-133 ~]$ /usr/local/bin/python3.9 -c "import tensorflow"
2022-10-06 07:09:39.691571: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-06 07:09:39.815546: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2022-10-06 07:09:39.848413: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-10-06 07:09:40.583444: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/amazon/efa/lib64:/opt/amazon/openmpi/lib64:/usr/local/cuda/efa/lib:/usr/local/cuda/lib:/usr/local/cuda:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/targets/x86_64-linux/lib:/usr/local/lib:/usr/lib:/lib:
2022-10-06 07:09:40.583574: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/amazon/efa/lib64:/opt/amazon/openmpi/lib64:/usr/local/cuda/efa/lib:/usr/local/cuda/lib:/usr/local/cuda:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/targets/x86_64-linux/lib:/usr/local/lib:/usr/lib:/lib:
2022-10-06 07:09:40.583594: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
```

The errors were worrying, so I decided to check the status of CUDA and cuDNN:

```
[ec2-user@ip-10-0-0-133 ~]$ whereis nvcc
nvcc: /usr/local/cuda-11.2/bin/nvcc.profile /usr/local/cuda-11.2/bin/nvcc
[ec2-user@ip-10-0-0-133 ~]$ whereis cudnn.h
cudnn:
```

The missing path for cuDNN is, of course, a problem. To confirm, I ran the scripts from this answer: https://stackoverflow.com/a/47436840 and got the outputs:

```
libcudart.so.11.0 -> libcudart.so.11.2.152
libcuda.so.1 -> libcuda.so.510.47.03
libcuda.so.1 -> libcuda.so.510.47.03
libcuda is installed
libcudart.so.11.0 -> libcudart.so.11.2.152
libcudart is installed
```

and

```
ERROR: libcudnn is NOT installed
```

However, when navigating to the CUDA folder, I see that the cuDNN files are actually present, so I'm unsure what the problem is.
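One way to narrow this down is to ask the dynamic linker itself what it can resolve, rather than relying on `whereis` (which only searches a fixed set of directories and can miss libraries under `/usr/local/cuda`). A minimal diagnostic sketch, assuming the standard `libcudnn`/`libnvinfer` library names and the CUDA paths from the error message above:

```shell
# Query the dynamic linker cache for cuDNN and TensorRT libraries.
# `ldconfig -p` reflects what dlopen() will actually find at runtime,
# which is what matters to TensorFlow.
ldconfig -p | grep -E 'libcudnn|libnvinfer' || echo "cuDNN/TensorRT not in linker cache"

# Separately list any cuDNN files sitting in the CUDA library directories
# (these directories are taken from the LD_LIBRARY_PATH in the warning above):
for d in /usr/local/cuda/lib64 /usr/local/cuda/lib; do
  ls "$d"/libcudnn* 2>/dev/null || echo "no libcudnn files in $d"
done
```

If the files exist on disk but `ldconfig -p` shows nothing, the gap is between the filesystem and the linker cache or `LD_LIBRARY_PATH`, not a missing installation.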