Error Executing Lambda Function from Container Image: libcufft.so.10 Shared Object Mapping Failure


I am deploying my PyTorch model on AWS Lambda as a container image, following the steps in the Developer Guide. The container produces reasonable output when I run it locally, so I uploaded the image to ECR and created the Lambda function. However, invoking the function fails. The detailed error message is:

errorMessage": "/var/lang/lib/python3.8/site-packages/nvidia/cufft/lib/libcufft.so.10: failed to map segment from shared object",

errorType": "OSError",

stackTrace": [

File "/var/lang/lib/python3.8/imp.py", line 234, in load_module\n return load_source(name, filename, file)\n",

File "/var/lang/lib/python3.8/imp.py", line 171, in load_source\n module = _load(spec)\n",

File "<frozen importlib._bootstrap>", line 702, in _load\n",

File "<frozen importlib._bootstrap>", line 671, in _load_unlocked\n",

File "<frozen importlib._bootstrap_external>", line 843, in exec_module\n",

File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed\n",

File "/var/task/app.py", line 2, in <module>\n from transformers import AutoTokenizer, AutoModelForQuestionAnswering\n",

File "/var/lang/lib/python3.8/site-packages/transformers/init.py", line 34, in <module>\n from . import dependency_versions_check\n",

File "/var/lang/lib/python3.8/site-packages/transformers/dependency_versions_check.py", line 34, in <module>\n from .file_utils import is_tokenizers_available\n",

File "/var/lang/lib/python3.8/site-packages/transformers/file_utils.py", line 59, in <module>\n import torch\n",

File "/var/lang/lib/python3.8/site-packages/torch/init.py", line 228, in <module>\n _load_global_deps()\n",

File "/var/lang/lib/python3.8/site-packages/torch/init.py", line 189, in _load_global_deps\n _preload_cuda_deps(lib_folder, lib_name)\n",

File "/var/lang/lib/python3.8/site-packages/torch/init.py", line 155, in _preload_cuda_deps\n ctypes.CDLL(lib_path)\n",

File "/var/lang/lib/python3.8/ctypes/init.py", line 373, in init\n self._handle = _dlopen(self._name, mode)\n"

I noticed that the error comes from the "nvidia" package, but my function does not need a GPU, as I verified when testing locally. What is causing this error on deployment, and how can I fix it?
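
For reference, app.py is plain CPU-only inference code. A simplified sketch of what it does is below; the model directory and handler body are approximations, and only the import on line 2 is taken verbatim from the trace:

# app.py -- simplified sketch; the model directory, event keys, and handler
# logic are approximations, but no GPU is requested anywhere.
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch

MODEL_DIR = "/var/task/model"  # model files baked into the image (assumption)

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL_DIR)
model.eval()

def handler(event, context):
    # "question" and "context" keys in the event payload are assumptions
    question, passage = event["question"], event["context"]
    inputs = tokenizer(question, passage, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    start = int(torch.argmax(outputs.start_logits))
    end = int(torch.argmax(outputs.end_logits)) + 1
    return {"answer": tokenizer.decode(inputs["input_ids"][0][start:end])}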
