Including a transformer model in the container for AWS Lambda


To speed up cold starts of my AWS Lambda function, I tried to include the HuggingFace transformer model that the function uses in the container image by adding the following to the Dockerfile:

ENV HF_HOME=${LAMBDA_TASK_ROOT}/huggingface
ENV TRANSFORMERS_CACHE=${LAMBDA_TASK_ROOT}
ENV XDG_CACHE_HOME=${LAMBDA_TASK_ROOT}
COPY .cache/torch ${LAMBDA_TASK_ROOT}/torch
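An alternative to copying a host-side cache is to download the model during the image build, so the cache layout inside the image is exactly what the library expects. A sketch, assuming sentence-transformers is installed in the image and the model is all-MiniLM-L6-v2 (mentioned below):

```dockerfile
# Bake the model into the image at build time.
# HF_HOME is set first so the downloaded cache lands under the task root.
ENV HF_HOME=${LAMBDA_TASK_ROOT}/huggingface
RUN python -c "from sentence_transformers import SentenceTransformer; \
    SentenceTransformer('all-MiniLM-L6-v2')"
```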

The function works and CloudWatch logs indicate that the model is not being downloaded. However, the logs contain the error:

There was a problem when trying to write in your cache folder (/var/task). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.

Even though the function works, I cannot ignore the error, because the initialization stage during a cold start is still slow and I am billed for it. That suggests a lot of work is being done to initialize the model, which is precisely what I am trying to avoid.

When I set the environment variables to point to /tmp and copy the model there, the model is downloaded during the cold start. It looks like /tmp is cleaned before the cold start.

So, what is the correct way to include the model in the container to make the billed initialization phase during the cold start fast?

P.S. The question at SO: https://stackoverflow.com/q/77075765/2725810

2 Answers

My current solution is to copy the model from ${LAMBDA_TASK_ROOT} to /tmp in the initialization part of the Lambda function itself. I wonder whether there is a more performant solution.
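A minimal sketch of that init-time copy, assuming the model directory was baked into the image under ${LAMBDA_TASK_ROOT}/model (the path and helper name are illustrative):

```python
import os
import shutil


def ensure_local_copy(src: str, dst: str) -> str:
    """Copy the baked-in model into a writable directory once per sandbox.

    Warm invocations reuse the same /tmp, so the copy only happens
    during the first (cold) invocation.
    """
    if not os.path.isdir(dst):
        shutil.copytree(src, dst)
    return dst


# Run this at module import time so the copy is part of the init phase,
# not of each request. Paths below are illustrative:
# MODEL_SRC = os.path.join(os.environ["LAMBDA_TASK_ROOT"], "model")
# model_dir = ensure_local_copy(MODEL_SRC, "/tmp/model")
# model = SentenceTransformer(model_dir)
```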

Answered 9 months ago

Try copying the file to original location and setting the environment variable to the /tmp folder.

AWS
EXPERT
Uri
Answered 9 months ago
  • If the environment variable points to /tmp, how will SentenceTransformers find the model in ${LAMBDA_TASK_ROOT}? Trying it...

  • It downloads the model into /tmp (I print out the results of find / | grep all-MiniLM-L6-v2 before and after model initialization, so I know exactly what was there before model initialization and what got downloaded).

  • I understand. Try copying the file to the root, set the env variable to /tmp, and copy the file in the init from the root to /tmp. This will probably be faster than downloading the model.

  • @Uri No, this is terribly slow. At least in the initial invoke, it times out three times -- 90 sec in total, all of which are billed! Quite unexpected for copying 300 MB.

  • I understand. You can try downloading it from S3. It will probably be quicker than downloading it from the internet somewhere.
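The S3 download suggested above could be sketched like this, assuming the model archive has been uploaded to a bucket beforehand (bucket, key, and function names are illustrative; the client is passed in explicitly):

```python
import os


def fetch_model(s3_client, bucket: str, key: str, dest: str) -> str:
    """Download the model from S3 into a writable path unless it is already there."""
    if not os.path.exists(dest):
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        s3_client.download_file(bucket, key, dest)
    return dest


# During init, e.g.:
# import boto3
# fetch_model(boto3.client("s3"), "my-models-bucket",
#             "all-MiniLM-L6-v2.tar.gz", "/tmp/model/all-MiniLM-L6-v2.tar.gz")
```

Passing the client in keeps the helper testable and avoids creating it per call; whether this beats the in-image copy depends on the model size and the sandbox's network bandwidth to S3.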
