Including a transformer model in the container for AWS Lambda


To speed up cold starts of my AWS Lambda function, I tried to include the HuggingFace transformer model that the function uses in the container image by writing the following in the Dockerfile:

ENV HF_HOME=${LAMBDA_TASK_ROOT}/huggingface
ENV TRANSFORMERS_CACHE=${LAMBDA_TASK_ROOT}
ENV XDG_CACHE_HOME=${LAMBDA_TASK_ROOT}
COPY .cache/torch ${LAMBDA_TASK_ROOT}/torch
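An alternative I have seen suggested is to bake the model into the image at build time instead of copying a host-side cache directory. A sketch only, assuming the base image has Python and pip available and that the model is all-MiniLM-L6-v2 (the save path is illustrative):

```dockerfile
# Sketch: download and save the model during the image build, so nothing
# needs to be downloaded or written to a cache at runtime.
RUN pip install sentence-transformers
RUN python -c "from sentence_transformers import SentenceTransformer; \
    SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2').save('${LAMBDA_TASK_ROOT}/model')"
```

The function could then load the model directly from ${LAMBDA_TASK_ROOT}/model instead of resolving it through the cache.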

The function works and CloudWatch logs indicate that the model is not being downloaded. However, the logs contain the error:

There was a problem when trying to write in your cache folder (/var/task). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.

Even though the function works, I cannot ignore the error, because the initialization stage during a cold start is still slow and I am billed for it. That suggests a lot of work is still being done to initialize the model, which is precisely what I am trying to avoid.

When I set the environment variables to point to /tmp and copy the model there, the model is downloaded during the cold start. It looks like /tmp is cleaned before the cold start.

So, what is the correct way to include the model in the container to make the billed initialization phase during the cold start fast?
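One direction I am considering (a sketch only, with illustrative paths): point the writable-cache variables at /tmp but hand the baked-in directory straight to the loader, so the cache is never consulted:

```python
import os

# Writable scratch space for any lock/metadata files the libraries create
os.environ["HF_HOME"] = "/tmp/huggingface"
os.environ["HF_HUB_OFFLINE"] = "1"  # fail fast instead of silently re-downloading

# Load straight from the directory baked into the image; no cache lookup involved.
# The "model" subdirectory is a hypothetical layout, not something from my Dockerfile.
model_dir = os.path.join(os.environ.get("LAMBDA_TASK_ROOT", "/var/task"), "model")

# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer(model_dir)  # sketch only, not verified
```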

P.S. The question at SO: https://stackoverflow.com/q/77075765/2725810

2 Answers

My current solution is to copy the model from ${LAMBDA_TASK_ROOT} to /tmp in the initialization part of the Lambda function itself. I wonder whether there is a more performant solution.
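A minimal sketch of that copy step, assuming the model directory is baked in at ${LAMBDA_TASK_ROOT}/model (adjust the paths to your layout):

```python
import os
import shutil

def warm_model_cache(src: str, dst: str) -> str:
    """Copy the baked-in model directory to a writable path once per sandbox.

    On a warm start the destination already exists, so the copy is skipped
    and the call is essentially free.
    """
    if not os.path.isdir(dst):
        shutil.copytree(src, dst)
    return dst

# In the handler module, during the billed init phase (paths are illustrative):
# warm_model_cache(os.path.join(os.environ["LAMBDA_TASK_ROOT"], "model"), "/tmp/model")
```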

answered 9 months ago

Try copying the file to its original location and setting the environment variable to the /tmp folder.

Uri, AWS Expert
answered 9 months ago
  • If the environment variable points to /tmp, how will SentenceTransformers find the model in ${LAMBDA_TASK_ROOT}? Trying it...

  • It downloads the model into /tmp (I print out the results of find / | grep all-MiniLM-L6-v2 before and after model initialization, so I know exactly what was there before model initialization and what got downloaded).

  • I understand. Try copying the file to the task root, setting the env variable to /tmp, and copying the file from the root to /tmp in the init. This will probably be faster than downloading the model.

  • @Uri No, this is terribly slow. At least in the initial invoke, it times out three times -- 90 sec in total, all of which are billed! Quite unexpected for copying 300 MB.

  • I understand. You can try downloading it from S3. It will probably be quicker than downloading it from the internet.
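A sketch of that S3 approach, with the client passed in so the logic can be exercised without AWS access (the bucket and key names are hypothetical; in Lambda you would pass boto3.client("s3")):

```python
import os

def fetch_model_archive(s3_client, bucket: str, key: str, dest: str) -> str:
    """Download a model archive from S3 into a writable path such as /tmp.

    Skips the download when the file is already present (warm start).
    s3_client must provide download_file(bucket, key, filename), as
    boto3's S3 client does.
    """
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    if not os.path.exists(dest):
        s3_client.download_file(bucket, key, dest)
    return dest

# In the handler module (names are illustrative):
# fetch_model_archive(boto3.client("s3"), "my-model-bucket",
#                     "models/model.tar.gz", "/tmp/model.tar.gz")
```

Whether this beats copying the files out of the image would need measuring; a same-region S3 download at least avoids pulling the model from the public internet.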
