Here is the initialization part of my AWS Lambda function (packaged as a container):
import json
import subprocess
import time

def ms_now():
    return int(time.time_ns() / 1000000)

class Timer():
    def __init__(self):
        self.start = ms_now()

    def stop(self):
        return ms_now() - self.start

def shell_command(command):
    print("Executing: ", command, flush=True)
    result = subprocess.run(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    if result.returncode == 0:
        print("Success", flush=True)
    else:
        print("Error:", result.stderr, flush=True)

def copy():
    shell_command(f"cp -r /var/task/torch /tmp/")
    shell_command(f"cp -r /var/task/huggingface /tmp/")

# Copying models to /tmp
timer = Timer()
copy()
print("Time to copy:", timer.stop(), flush=True)

# Initializing the models
timer = Timer()
from sentence_transformers import SentenceTransformer
from punctuators.models import PunctCapSegModelONNX
model_name = "pcs_en"
model_sentences = PunctCapSegModelONNX.from_pretrained(model_name)
model_embeddings = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
print("Time to initialize:", timer.stop(), flush=True)
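To help tell whether a repeated run of this module-level code comes from the same process or from a fresh one, a marker along the following lines could be logged at the very top of the module. This is only a diagnostic sketch: `log_init_marker` and the sentinel file are hypothetical names that are not part of my code, and whether `/tmp` survives between two initialization attempts is an assumption this marker would itself confirm or refute.

```python
import os
import tempfile
import time

def log_init_marker(sentinel_path):
    """Print the PID and whether an earlier run already left the sentinel file."""
    already_ran = os.path.exists(sentinel_path)
    print(f"init pid={os.getpid()} previous_attempt={already_ran}", flush=True)
    if not already_ran:
        # Record when the first attempt started, for later comparison.
        with open(sentinel_path, "w") as f:
            f.write(str(time.time_ns()))
    return already_ran

# Demo in a throwaway directory; in the Lambda the sentinel would live
# somewhere under /tmp (hypothetical location).
demo_dir = tempfile.mkdtemp()
sentinel = os.path.join(demo_dir, "init_sentinel")
first = log_init_marker(sentinel)   # no sentinel yet
second = log_init_marker(sentinel)  # sentinel left by the first call
```

If two log lines show different PIDs, the module was imported by two separate processes; if the second line also reports `previous_attempt=True`, the filesystem state carried over between them.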
Here is what I see in CloudWatch for a cold start with the above initialization (this is not the first invocation after uploading the model, which I covered in another question):
However, the detailed output shows:
2023-09-11T19:03:26.948+03:00 Executing: cp -r /var/task/torch /tmp/
2023-09-11T19:03:32.251+03:00 Success
2023-09-11T19:03:32.251+03:00 Executing: cp -r /var/task/huggingface /tmp/
2023-09-11T19:03:33.962+03:00 Success
2023-09-11T19:03:33.962+03:00 Time to copy: 7015
2023-09-11T19:03:37.048+03:00 Executing: cp -r /var/task/torch /tmp/
2023-09-11T19:03:41.126+03:00 Success
2023-09-11T19:03:41.126+03:00 Executing: cp -r /var/task/huggingface /tmp/
2023-09-11T19:03:41.395+03:00 Success
2023-09-11T19:03:41.395+03:00 Time to copy: 4347
2023-09-11T19:03:43.516+03:00 The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
2023-09-11T19:03:43.519+03:00 0it [00:00, ?it/s] 0it [00:00, ?it/s]
2023-09-11T19:03:45.552+03:00 Time to initialize: 4157
Note two things:
- The copy operation is duplicated.
- The time shown in the first snapshot does not include the first copy.
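To put numbers on the gap between the two runs, the timestamps from the detailed output can be diffed. The sketch below uses only values copied from the log above; `elapsed_ms` is a hypothetical helper, and CloudWatch timestamps may reflect when a line was recorded rather than the exact moment it was printed, so the results are approximate.

```python
from datetime import datetime, timedelta

LOG_TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

def elapsed_ms(start, end):
    """Whole milliseconds between two CloudWatch-style timestamps."""
    delta = datetime.strptime(end, LOG_TS_FORMAT) - datetime.strptime(start, LOG_TS_FORMAT)
    return delta // timedelta(milliseconds=1)

# Timestamps copied from the detailed output above.
first_copy_start = "2023-09-11T19:03:26.948+03:00"
second_copy_start = "2023-09-11T19:03:37.048+03:00"
init_done = "2023-09-11T19:03:45.552+03:00"

gap_between_runs = elapsed_ms(first_copy_start, second_copy_start)
second_run_total = elapsed_ms(second_copy_start, init_done)

print("First copy start -> second copy start:", gap_between_runs)   # 10100
print("Second copy start -> init finished:", second_run_total)      # 8504
```

So the second run starts about 10.1 seconds after the first one, and from that point the copy plus model initialization takes about 8.5 seconds.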
What is going on here?