Including a transformer model in the container for AWS Lambda


To speed up cold starts of my AWS Lambda function, I tried to include the HuggingFace transformer model that the function uses in the container image by adding the following to the Dockerfile:

ENV HF_HOME=${LAMBDA_TASK_ROOT}/huggingface
ENV TRANSFORMERS_CACHE=${LAMBDA_TASK_ROOT}
ENV XDG_CACHE_HOME=${LAMBDA_TASK_ROOT}
COPY .cache/torch ${LAMBDA_TASK_ROOT}/torch
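For reference, an alternative I considered is downloading the model at image build time instead of copying a local cache. A minimal sketch, assuming a sentence-transformers model; the base image tag, app.py, and handler name are placeholders:

```dockerfile
# Sketch only: base image tag and handler name are assumptions
FROM public.ecr.aws/lambda/python:3.11

RUN pip install sentence-transformers

# Point the HF cache at a location inside the image...
ENV HF_HOME=${LAMBDA_TASK_ROOT}/huggingface
# ...and populate it at build time, so the weights ship in the image
RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"

# From here on, forbid Hub access so cold starts never hit the network
ENV HF_HUB_OFFLINE=1
ENV TRANSFORMERS_OFFLINE=1

COPY app.py ${LAMBDA_TASK_ROOT}
CMD ["app.handler"]
```

The trailing offline variables should stop the libraries from reaching the Hub at runtime; whether they also silence the cache-write warning depends on whether anything still tries to write into the read-only cache directory.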

The function works and CloudWatch logs indicate that the model is not being downloaded. However, the logs contain the error:

There was a problem when trying to write in your cache folder (/var/task). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.

Even though the function works, I cannot ignore the error, because the initialization stage during a cold start is still slow and I am billed for it. That suggests a lot of work is still being done to initialize the model, which is precisely what I am trying to avoid.

When I set the environment variables to point to /tmp and copy the model there in the Dockerfile, the model is downloaded during the cold start anyway. Apparently each cold start gets a fresh execution environment with empty ephemeral storage mounted at /tmp, so files baked into the image under /tmp are not visible at runtime.

So, what is the correct way to include the model in the container to make the billed initialization phase during the cold start fast?

P.S. The question at SO: https://stackoverflow.com/q/77075765/2725810

2 Answers

My current solution is to copy the model from ${LAMBDA_TASK_ROOT} to /tmp in the initialization part of the Lambda function itself. I wonder whether there is a more performant solution.
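Roughly, the init-time copy looks like this (the paths and the HF_HOME wiring reflect my setup and may need adjusting):

```python
import os
import shutil

def ensure_writable_cache(src: str, dst: str) -> str:
    """Copy a read-only model cache to a writable location once.

    Skips the copy if the destination already exists (warm start
    within the same execution environment).
    """
    if not os.path.isdir(dst):
        shutil.copytree(src, dst)
    return dst

# Module scope runs during the (billed) init phase of a cold start, e.g.:
# cache_src = os.path.join(os.environ["LAMBDA_TASK_ROOT"], "huggingface")
# os.environ["HF_HOME"] = ensure_writable_cache(cache_src, "/tmp/huggingface")
```

The existence check makes the copy a no-op on warm invocations, so only the first invocation in an execution environment pays for it.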

answered 8 months ago

Try copying the file to the original location and setting the environment variable to the /tmp folder.

Uri
answered 8 months ago
  • If the environment variable points to /tmp, how will SentenceTransformers find the model in ${LAMBDA_TASK_ROOT}? Trying it...

  • It downloads the model into /tmp (I print the output of `find / | grep all-MiniLM-L6-v2` before and after model initialization, so I know exactly what was there before and what got downloaded).

  • I understand. Try copying the file to the task root, setting the env variable to /tmp, and copying the file from the root to /tmp in the init. This will probably be faster than downloading the model.

  • @Uri No, this is terribly slow. At least in the initial invoke, it times out three times -- 90 sec in total, all of which are billed! Quite unexpected for copying 300 MB.

  • I understand. You can try downloading it from S3. It will probably be quicker than downloading it from the internet somewhere.
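A rough sketch of the S3 approach from the last comment (bucket and prefix are placeholders; the boto3 calls are the standard S3 client API, imported lazily so the path helper stays importable anywhere):

```python
import os

def local_path_for(key: str, prefix: str, dest: str) -> str:
    """Map an S3 key under `prefix` to a path under `dest`."""
    rel = key[len(prefix):].lstrip("/")
    return os.path.join(dest, rel)

def download_model(bucket: str, prefix: str, dest: str = "/tmp/model") -> str:
    import boto3  # deferred: only needed when actually downloading
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            target = local_path_for(obj["Key"], prefix, dest)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3.download_file(bucket, obj["Key"], target)
    return dest
```

Calling `download_model("my-models-bucket", "models/all-MiniLM-L6-v2")` during init would mirror that prefix into /tmp; whether this beats copying from the image is worth benchmarking.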
