I have produced a simple experiment whereby a parent Lambda function invokes in parallel 10 instances of the child Lambda function. For some reason, one of the child instances does not begin for almost a second after the other 9 instances returned their response. Here is the code for easy reproduction of the problem:
The child Lambda function:
import json
import time
from stats import Timer, ms_now
def lambda_handler(event, context):
begin = ms_now()
timer = Timer()
time.sleep(0.5)
return {
'statusCode': 200,
'body': json.dumps(
{'time': timer.stop(),
'begin': begin,
'end': ms_now()
})
}
The parent Lambda function:
import boto3
import json
import concurrent.futures
from stats import Timer, ms_now
client = boto3.session.Session().client('lambda')
def do_lambda():
mybegin = ms_now()
timer = Timer()
response = client.invoke(
FunctionName = 'arn:aws:lambda:us-east-1:497857710590:function:wherewasit-search',
InvocationType = 'RequestResponse',
Payload = json.dumps({}))
payload = json.loads(json.load(response['Payload'])['body'])
print(f"Thread began: {mybegin} Child Lambda began at {payload['begin']} and took {payload['time']}ms Response after {timer.stop()}ms")
def do_work(n_threads):
timer = Timer()
with concurrent.futures.ThreadPoolExecutor(max_workers=n_threads + 1) as executor:
futures = []
for i in range(n_threads):
futures.append(executor.submit(do_lambda))
for future in concurrent.futures.as_completed(futures):
pass
print(f"Joined all: {timer.stop()}ms")
def lambda_handler(event, context):
do_work(10)
return {
'statusCode': 200,
'body': json.dumps('Hello from Lambda!')
}
Both Lambda functions use the stats
module for time measurements:
import time
def ms_now():
return int(time.time_ns() / 1000000)
class Timer():
def __init__(self, timestamp_function=ms_now):
self.timestamp_function = timestamp_function
self.start = self.timestamp_function()
def stop(self):
return self.timestamp_function() - self.start
Here is the output of the parent Lamda function:
Thread began: 1699451717589 Child Lambda began at 1699451717602 and took 500ms Response after 517ms
Thread began: 1699451717585 Child Lambda began at 1699451717602 and took 500ms Response after 522ms
Thread began: 1699451717587 Child Lambda began at 1699451717604 and took 501ms Response after 521ms
Thread began: 1699451717591 Child Lambda began at 1699451717606 and took 500ms Response after 518ms
Thread began: 1699451717593 Child Lambda began at 1699451717608 and took 501ms Response after 519ms
Thread began: 1699451717596 Child Lambda began at 1699451717610 and took 500ms Response after 517ms
Thread began: 1699451717594 Child Lambda began at 1699451717609 and took 500ms Response after 519ms
Thread began: 1699451717598 Child Lambda began at 1699451717615 and took 501ms Response after 521ms
Thread began: 1699451717600 Child Lambda began at 1699451717614 and took 501ms Response after 519ms
Thread began: 1699451717602 Child Lambda began at 1699451718837 and took 501ms Response after 1739ms
Notice how the timestamp of the beginning of the child Lambda in the last line is much later than the rest of the instances. It looks like I am hitting some limit, but I could not find anything relevant in the documentation. What is this limit and how do I work around it?
P.S. The question at SO