SnapStart-enabled Lambda returns "Runtime restore hook execution timed out" even though we aren't using afterRestore()

Occasionally, we see the following error when calling a SnapStart-enabled Lambda: Runtime restore hook execution timed out after 2 seconds (Service: Lambda, Status Code: 408, Request ID: 18a9e2cc-18b1-4197-968b-6ab1fa738ca6)

From what I see in the docs, there is a 2-second timeout for restore hooks ("Restore hooks (afterRestore()) time out after 2 seconds"). However, afterRestore() does not exist in our code. Is there some other kind of restore hook, perhaps in our code, that might be causing this even though we don't use afterRestore()?

Also, note that when this happens, there are no Lambda CloudWatch log entries containing that message. I see it only in the result returned to the caller (in this case, a step function).

I do see several high restore duration values (that I can't explain) during the timeframe where the error happened, but those invocations succeeded.

Asked 1 year ago, 825 views
3 answers

Hello,

I understand that you are using a SnapStart-enabled Lambda function and are intermittently seeing the following error:

Runtime restore hook execution timed out after 2 seconds (Service: Lambda, Status Code: 408)

Please confirm whether you have implemented a runtime hook in your application. If you have, the issue could be that the interface method afterRestore() is timing out.

"Resource – An interface with two methods, beforeCheckpoint() and afterRestore(). Use these methods to implement the code that you want to run before a snapshot and after a restore."

Also, when SnapStart is activated, your initialization code can run for up to 15 minutes. Checkpoint hooks (beforeCheckpoint()) count towards the 15 minutes. Restore hooks (afterRestore()) time out after 2 seconds.
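
For readers unfamiliar with the runtime hooks described above, here is a minimal sketch of what an implementation might look like. So that it compiles on its own, it declares local stand-ins that mirror the shape of org.crac.Resource and org.crac.Context; in a real function you would instead import those types from the org.crac library and register the resource with Core.getGlobalContext().register(...). The class name ConnectionRefresher and its timing fields are hypothetical and exist only for illustration.

```java
import java.util.concurrent.TimeUnit;

public class RestoreHookSketch {

    // Local stand-ins mirroring the shape of org.crac.Context and org.crac.Resource.
    // Assumption: a real function imports these from the org.crac dependency instead.
    interface Context<R extends Resource> {}

    interface Resource {
        void beforeCheckpoint(Context<? extends Resource> context) throws Exception;
        void afterRestore(Context<? extends Resource> context) throws Exception;
    }

    // Hypothetical resource that records how long its restore hook takes.
    static class ConnectionRefresher implements Resource {
        // The 2-second restore hook limit quoted from the documentation.
        static final long RESTORE_BUDGET_MILLIS = 2_000;

        boolean checkpointed = false;
        long lastRestoreMillis = -1;

        @Override
        public void beforeCheckpoint(Context<? extends Resource> context) {
            // Runs before the snapshot is taken; counts towards the 15-minute init limit.
            checkpointed = true;
        }

        @Override
        public void afterRestore(Context<? extends Resource> context) {
            long start = System.nanoTime();
            // Work that must happen after a restore (for example, reopening
            // connections) would go here and should stay well under the budget.
            lastRestoreMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        }

        boolean withinBudget() {
            return lastRestoreMillis >= 0 && lastRestoreMillis < RESTORE_BUDGET_MILLIS;
        }
    }

    public static void main(String[] args) {
        ConnectionRefresher refresher = new ConnectionRefresher();
        refresher.beforeCheckpoint(null); // simulated checkpoint
        refresher.afterRestore(null);     // simulated restore
        System.out.println("restore hook took " + refresher.lastRestoreMillis
                + " ms, within budget: " + refresher.withinBudget());
    }
}
```

If nothing like this exists in your code and the org.crac dependency is not on your classpath, then no afterRestore() hook of your own is running.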

If this does not address your issue, further troubleshooting requires details that are not public information. Please open a support case with AWS using this link, and make sure to include the Lambda function ARN, request IDs, and timestamps.

AWS
Support Engineer
Isha_K
Answered 1 year ago
  • Thanks for the follow-up.

    "Please confirm whether you have implemented a runtime hook in your application."

    We definitely did not have afterRestore() in our code. That would have required the crac library dependency, which is not included in our jar.

    At this point, we have opted not to use SnapStart for the time being.

Hello, the 2-second timeout limit applies to loading the runtime (JVM) as well as to the afterRestore() hooks. Please find the updated documentation here. Can you please let us know if a support ticket has been submitted that we can investigate to understand why your restore is timing out?

AWS
Tarun
Answered 1 year ago
  • I encountered this too today. Is there a CTI I can submit a ticket to?

"Can you please let us know if there's a support ticket submitted that we can investigate to understand why your restore is timing out?"

We have not created a ticket since our team decided not to spend more time on SnapStart for now.

Note that the only thing I can think of that might be relevant about the failed invocation is that there were about 500 concurrent invocations of the same Lambda at the time of the error. We were not using afterRestore() for this Lambda.

Answered 1 year ago