SnapStart-enabled Lambda fails with "Runtime restore hook execution timed out" even though we aren't using afterRestore()

Occasionally, we see the following error when calling a SnapStart-enabled Lambda function: Runtime restore hook execution timed out after 2 seconds (Service: Lambda, Status Code: 408, Request ID: 18a9e2cc-18b1-4197-968b-6ab1fa738ca6)

From what I see in the docs, there is a 2-second timeout for restore hooks ("Restore hooks (afterRestore()) time out after 2 seconds"). However, afterRestore() does not exist in our code. Is there some other kind of restore hook, perhaps in our code, that might be causing this even though we don't use afterRestore()?

Also, note that when this happens, there are no Lambda CloudWatch log entries containing that message. I see the error in the result returned to the caller (in this case, a step function).

I do see several high restore duration values during the timeframe where the error happened (which I can't explain), but those invocations succeeded. [Figure: restore-duration]

asked a year ago, 784 views
3 Answers
Hello,

I understand that you are using a SnapStart-enabled Lambda function and are intermittently seeing the following error:

Runtime restore hook execution timed out after 2 seconds (Service: Lambda, Status Code: 408)

Please confirm whether you have implemented a runtime hook in your application. If so, the issue could be that the interface method afterRestore() is timing out.

"Resource – An interface with two methods, beforeCheckpoint() and afterRestore(). Use these methods to implement the code that you want to run before a snapshot and after a restore."

Also, when SnapStart is activated, your initialization code can run for up to 15 minutes. Checkpoint hooks (beforeCheckpoint()) count toward the 15 minutes. Restore hooks (afterRestore()) time out after 2 seconds.
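For readers who do implement hooks, the following is a minimal sketch of keeping afterRestore() work inside a time budget rather than letting the platform time it out. It is an illustration under stated assumptions, not AWS's recommended implementation. The real interface is org.crac.Resource, whose methods take a Context argument and which is registered with Core.getGlobalContext().register(...). To stay self-contained, the sketch uses a hypothetical stand-in interface (SnapshotHooks); the names BoundedRestoreHook, RefreshingHook, runWithinBudget, and the 1,500 ms budget are likewise assumptions made for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedRestoreHook {

    // Hypothetical stand-in for org.crac.Resource so the sketch compiles
    // without the dependency; the real methods also take a Context argument.
    interface SnapshotHooks {
        void beforeCheckpoint() throws Exception;
        void afterRestore() throws Exception;
    }

    // Assumed budget, chosen to stay under the 2-second restore-hook limit.
    static final long RESTORE_BUDGET_MILLIS = 1_500;

    // Runs the task on a daemon worker thread and reports whether it
    // finished within the budget. On timeout the worker is interrupted.
    static boolean runWithinBudget(Runnable task, long budgetMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "restore-hook-worker");
            t.setDaemon(true);
            return t;
        });
        try {
            executor.submit(task).get(budgetMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("restore task failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    // Example hook: refreshes state after restore, but gives up early
    // instead of running past the budget.
    static final class RefreshingHook implements SnapshotHooks {
        private final Runnable refresh;
        volatile boolean refreshed;

        RefreshingHook(Runnable refresh) {
            this.refresh = refresh;
        }

        @Override
        public void beforeCheckpoint() {
            // State captured in the snapshot is treated as stale after restore.
            refreshed = false;
        }

        @Override
        public void afterRestore() {
            // If this ends up false, the handler can redo the refresh lazily.
            refreshed = runWithinBudget(refresh, RESTORE_BUDGET_MILLIS);
        }
    }
}
```

With this shape, a handler could check the refreshed flag on its first invocation and repeat the work there if the hook gave up, so slow work is moved out of the restore phase.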

If this does not address your issue, we will need non-public details for further troubleshooting. Please open a support case with AWS using this link, and make sure to include the Lambda function ARN, request IDs, and timestamps.

AWS
SUPPORT ENGINEER
Isha_K
answered a year ago
  • Thanks for the follow-up.

    "Please confirm whether you have implemented a runtime hook in your application."

    We definitely did not have afterRestore() in our code. That would have required the CRaC library dependency, which is not included in our JAR.

    At this point, we have opted not to use SnapStart for the time being.
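One way to double-check a claim like this is to ask the class loader at runtime whether the CRaC classes are present, since a library could in principle arrive through a transitive dependency. The sketch below is only an illustration: the class and method names (CracClasspathCheck, isOnClasspath) are hypothetical, and org.crac.Resource is the interface name from the org.crac library.

```java
final class CracClasspathCheck {

    // Returns true if the named class can be located by this class's
    // class loader. The class is not initialized, so no static code runs.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, CracClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Builds a one-line message suitable for logging during initialization.
    static String describe(String className) {
        return className + (isOnClasspath(className) ? " is" : " is not") + " on the classpath";
    }
}
```

Logging CracClasspathCheck.describe("org.crac.Resource") once during function initialization would show whether the interface is loadable in the deployed package. A negative result only rules out hooks written against that interface; it says nothing about other causes of a slow restore.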

Hello, the 2-second timeout limit applies to both loading the runtime (JVM) and running the afterRestore() hooks. Please find the updated documentation here. Can you please let us know whether a support ticket has been submitted that we can investigate to understand why your restore is timing out?

AWS
Tarun
answered a year ago
  • I encountered this today as well. Is there a CTI I can submit a ticket to?

"Can you please let us know if there's a support ticket submitted that we can investigate to understand why your restore is timing out?"

We have not created a ticket since our team decided not to spend more time on SnapStart for now.

The only potentially relevant detail I can think of about the failed invocation is that there were about 500 concurrent invocations of the same Lambda function at the time of the error. We were not using afterRestore() for this function.

answered a year ago
