Error Executing Lambda Function from Container Image: libcufft.so.10 Shared Object Mapping Failure

I am deploying my PyTorch model on AWS Lambda as a container image, following the steps in the Developer Guide. I get reasonable output from the container when I run it locally, so I uploaded the image to ECR and created the Lambda function. However, invoking the function fails. The detailed error message is:

{
  "errorMessage": "/var/lang/lib/python3.8/site-packages/nvidia/cufft/lib/libcufft.so.10: failed to map segment from shared object",
  "errorType": "OSError",
  "stackTrace": [
    "File \"/var/lang/lib/python3.8/imp.py\", line 234, in load_module\n    return load_source(name, filename, file)\n",
    "File \"/var/lang/lib/python3.8/imp.py\", line 171, in load_source\n    module = _load(spec)\n",
    "File \"<frozen importlib._bootstrap>\", line 702, in _load\n",
    "File \"<frozen importlib._bootstrap>\", line 671, in _load_unlocked\n",
    "File \"<frozen importlib._bootstrap_external>\", line 843, in exec_module\n",
    "File \"<frozen importlib._bootstrap>\", line 219, in _call_with_frames_removed\n",
    "File \"/var/task/app.py\", line 2, in <module>\n    from transformers import AutoTokenizer, AutoModelForQuestionAnswering\n",
    "File \"/var/lang/lib/python3.8/site-packages/transformers/__init__.py\", line 34, in <module>\n    from . import dependency_versions_check\n",
    "File \"/var/lang/lib/python3.8/site-packages/transformers/dependency_versions_check.py\", line 34, in <module>\n    from .file_utils import is_tokenizers_available\n",
    "File \"/var/lang/lib/python3.8/site-packages/transformers/file_utils.py\", line 59, in <module>\n    import torch\n",
    "File \"/var/lang/lib/python3.8/site-packages/torch/__init__.py\", line 228, in <module>\n    _load_global_deps()\n",
    "File \"/var/lang/lib/python3.8/site-packages/torch/__init__.py\", line 189, in _load_global_deps\n    _preload_cuda_deps(lib_folder, lib_name)\n",
    "File \"/var/lang/lib/python3.8/site-packages/torch/__init__.py\", line 155, in _preload_cuda_deps\n    ctypes.CDLL(lib_path)\n",
    "File \"/var/lang/lib/python3.8/ctypes/__init__.py\", line 373, in __init__\n    self._handle = _dlopen(self._name, mode)\n"
  ]
}

I noticed that the error comes from the "nvidia" package, but my function does not need a GPU, as I have tested locally. What is causing this error on deployment, and how can I fix it?
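For context, the last frame of the stack trace shows that the failure is not in my handler code: torch's `_preload_cuda_deps` simply calls `ctypes.CDLL` on the bundled CUDA library, i.e. a `dlopen()`. A minimal sketch of that step in isolation (assuming a machine where `libcufft.so.10` is not installed or cannot be mapped, the same `OSError` type appears):

```python
import ctypes

# Reproduce the last stack frame in isolation: torch's _preload_cuda_deps
# boils down to ctypes.CDLL(lib_path), i.e. a dlopen() of the CUDA library.
# If the library is missing or the segment cannot be mapped, CDLL raises
# OSError, which is the error type reported by Lambda.
try:
    ctypes.CDLL("libcufft.so.10")
    print("libcufft loaded")
except OSError as exc:
    print("dlopen failed:", exc)
```

So whatever is going wrong happens at the moment the CUDA shared object is mapped into memory during `import torch`, before any of my model code runs.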

