CloudHSM connection from Lambda going stale


I'm trying to use the CloudHSM JCE provider (SDK 5) from a Java Lambda function. The function is deployed as a Docker image, which is built with the JCE provider dependencies, similar to what is outlined in this blog post.

When the function is deployed, the initial call succeeds, as do subsequent calls made with little delay. However, if there is a longer delay (> 3 minutes), the connection goes stale and the call fails. I understand that this is due to Lambda purging inactive connections. However, the logs show that the CloudHSM provider is attempting to make keep-alive calls, and these are failing:

[cloudhsm_provider::common::keep_alive] Keep-alive failed for <HSM_IP>. Internal Error: Internal error occurred. Error: Failed to send request to the HSM. Failed to send request to the Server. Error: Failed to append unique_id to packet.

Can someone help me understand why these keep-alive calls are failing? Or is there a workaround for resetting the connection manually?

  • In which part of the Lambda execution lifecycle do you initiate the CloudHSM connection? See the AWS documentation on the execution lifecycle: https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtime-environment.html

    Does it make any difference if you always initiate the connection in the "Invoke" phase?

  • The connection is initiated during the invoke phase. There is static code that adds CloudHSM as a java.security provider. The provider must only be added once, or it will throw an exception. Adding the provider initializes the connection. The storage and management of this connection are handled by the CloudHSM JCE library, so once it's initialized, there's no interface for managing the connection.

    As far as I can tell, there's no way to initiate the connection once per invocation. It only happens once per execution environment. And once it goes stale, the calls will fail until there is a long enough delay that the execution environment shuts down.

  • This is related to a CloudHSM library issue with Lambda "warm starts", which was fixed in CloudHSM SDK 5.8.0. Please upgrade to the latest JCE SDK if possible.
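
For reference, the once-per-execution-environment registration described in the comments above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: `ProviderRegistrar` and `ensureRegistered` are hypothetical names, and the provider is passed in through a factory so that the sketch compiles without the CloudHSM JCE jar. Only the standard `java.security.Security` calls are used.

```java
import java.security.Provider;
import java.security.Security;
import java.util.function.Supplier;

// Hypothetical helper (not part of the CloudHSM SDK): registers a JCE
// provider at most once per JVM, i.e. once per Lambda execution environment.
final class ProviderRegistrar {
    private ProviderRegistrar() {}

    // Returns true if the provider was added by this call, false if a
    // provider with the same name was already registered. The factory is
    // only invoked when registration is needed, so the provider is never
    // constructed twice.
    static synchronized boolean ensureRegistered(String name, Supplier<Provider> factory) {
        if (Security.getProvider(name) != null) {
            return false; // already registered during an earlier (warm) invocation
        }
        Security.addProvider(factory.get());
        return true;
    }
}
```

In a handler, the factory would construct the CloudHSM provider; the provider name and constructor to use are assumptions to check against the SDK documentation. This guard only prevents duplicate registration. It does not refresh a stale connection, which per the last comment is addressed by upgrading to SDK 5.8.0 or later.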

asked a year ago · 141 views