
Lambda Web Adapter Layer and Python Snapstart

1

I have a Python web application running on Lambda. It uses the Lambda Web Adapter (LWA) as a layer, which runs my application and repackages the responses from the endpoints for API Gateway so that I can get streaming. (https://github.com/aws/aws-lambda-web-adapter/tree/main)

Prior to switching to LWA, I had SnapStart working. To use the LWA and achieve streaming responses, I had to change my Lambda function entry point from my Python script to a bash script, because of how the LWA works. This change has two parts: 1) AWS_LAMBDA_EXEC_WRAPPER is set to /opt/bootstrap (to call into the LWA starter), and 2) my handler property is set to "run.sh".

After switching this, the functions I have registered via register_after_restore are no longer called.

Is there a way I can use register_after_restore this way? Or is there another mechanism by which my python scripts can detect that they are running from a previously frozen snapshot, and regenerate any uniqueness necessary?

I'd say that I could reset everything unique the first time a web request is received, but the LWA pings the application to ensure it's running, so that won't work exactly.

Any advice from anyone on this would be greatly appreciated.

asked a month ago, 42 views
3 Answers
3

As I understand it, the issue arises because the Lambda Web Adapter (LWA) and your custom run.sh entry point change the process hierarchy. When you use AWS_LAMBDA_EXEC_WRAPPER, your Python application runs as a child process. Consequently, the native register_after_restore hooks often fail to receive the necessary signals from the Lambda Runtime because they are being "shadowed" or intercepted by the wrapper and the shell script.

Instead of relying on the native Python runtime hooks, you can leverage the LWA's own lifecycle behavior to achieve the same result more reliably:

Since LWA must "ping" your application to ensure it is ready to handle traffic after a restore, you can use that first request as your trigger:

  1. Define a dedicated Health Endpoint: Ensure your Python web app (FastAPI, Flask, etc.) has a specific /health or readiness endpoint.

  2. Use a Global State Flag: Initialize a global variable (e.g., _NEEDS_RESTORE = True) outside your request handlers.

  3. Execute Logic on First Ping: In your health check logic, check this flag. If it is True, run your "after restore" logic (regenerating tokens, re-establishing connections, etc.) and then set the flag to False.

To me, this approach feels more robust because it doesn't depend on the underlying Lambda Runtime signal propagation, which can be disrupted by the LWA/Bash wrapper. It ensures your uniqueness logic runs exactly once before any real user traffic hits your functional endpoints.

EXPERT
answered a month ago
2

Thanks for the responses in here already. Much appreciated.

For anyone else wondering how to do this in FastAPI, here's what I came up with. I install this middleware, and it handles things for me. I suspect it only works with the Lambda Web Adapter, but the logic would apply to any other web server application.

from fastapi                        import Request
from starlette.middleware.base      import BaseHTTPMiddleware
from enum                           import Enum

class SnapstartStage(Enum):
    TAKING_SNAP_SHOT = "TAKING_SNAP_SHOT"
    SNAP_SHOT_FINISHED = "SNAP_SHOT_FINISHED"
    SNAP_SHOT_RESTORED = "SNAP_SHOT_RESTORED"

class SnapstartStateMonitor(BaseHTTPMiddleware):
    def __init__(self, app):
        super().__init__(app)
        
        self._state = SnapstartStage.TAKING_SNAP_SHOT

    async def dispatch(self, request: Request, call_next):

        # This is all done with the assumption that we are using a Lambda Web Adapter layer set up in api-stack.ts.
        # On deployed lambda servers, Snapstart will take a snapshot of the running lambda machine state. During that process,
        # the LWA runs and pings our FastAPI app repeatedly until it responds. This happens during the snapshotting process.
        # So the first request to our FastAPI app that succeeds is during the snapshot process.
        # The first request after that is sent to a lambda that is from a reloaded snapshot image.
        # We use that second request as an indicator that we're loading from a snapshot and we reset anything
        # that needed to be unique or reset, such as database connections
        #
        # This is easier to do from a method registered with snapshot_restore_py.register_after_restore
        # but those don't work when using the LWA layer, so we do all this instead.

        if self._state == SnapstartStage.TAKING_SNAP_SHOT:
            self._state = SnapstartStage.SNAP_SHOT_FINISHED
        elif self._state == SnapstartStage.SNAP_SHOT_FINISHED:
            self.after_lambda_snapshot_restore()
            
            self._state = SnapstartStage.SNAP_SHOT_RESTORED


        response = await call_next(request)

        return response
    
    def after_lambda_snapshot_restore(self):
        # reset_database_connections() is assumed to be defined elsewhere in the application.
        reset_database_connections()

And in my FastAPI application initialization code:

import os

if os.environ.get("AWS_LAMBDA_INITIALIZATION_TYPE", "missing") == "snap-start":
    # if we're using snapstart, as indicated by the initialization type, we have to monitor the state
    # so we can properly reset database connections when reloaded from a snapshot.
    api.add_middleware(SnapstartStateMonitor)
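
For anyone who wants to check the transition logic without FastAPI or Lambda, the same three-state idea can be sketched as a small standalone class. This is an illustrative variant, not the middleware itself: `SnapstartStateTracker`, `on_request`, and the `on_restore` callback are hypothetical names, and the lock is an addition for servers that handle requests on multiple threads. It keeps the same assumption as the middleware, namely that exactly one request succeeds before the snapshot is taken.

```python
import threading
from enum import Enum
from typing import Callable


class SnapstartStage(Enum):
    TAKING_SNAP_SHOT = "TAKING_SNAP_SHOT"
    SNAP_SHOT_FINISHED = "SNAP_SHOT_FINISHED"
    SNAP_SHOT_RESTORED = "SNAP_SHOT_RESTORED"


class SnapstartStateTracker:
    """Counts requests: the first is assumed to arrive during snapshotting,
    the second is treated as the first request after a restore."""

    def __init__(self, on_restore: Callable[[], None]):
        self._state = SnapstartStage.TAKING_SNAP_SHOT
        self._on_restore = on_restore
        self._lock = threading.Lock()

    def on_request(self) -> SnapstartStage:
        with self._lock:
            if self._state is SnapstartStage.TAKING_SNAP_SHOT:
                # First successful request: the readiness ping before the snapshot.
                self._state = SnapstartStage.SNAP_SHOT_FINISHED
            elif self._state is SnapstartStage.SNAP_SHOT_FINISHED:
                # Second request: assumed to come from a restored environment.
                self._on_restore()
                self._state = SnapstartStage.SNAP_SHOT_RESTORED
            return self._state


# Usage: simulate three requests and record when the callback fires.
calls = []
tracker = SnapstartStateTracker(on_restore=lambda: calls.append("restored"))
stages = [tracker.on_request() for _ in range(3)]
```

After the three simulated requests, the callback has fired once, on the second request, and later requests leave the state unchanged.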

This would be a little easier and cleaner if Lambda had some way to indicate that it's running from a SnapStart snapshot, or something else that could be hooked into. The Python hooks for snapshot_restore_py.register_after_restore don't actually do the work; they just register callbacks that something else hooks into. Even documenting where and how that other thing works would be useful, if it's not just baked directly into the version of Python being used by the Lambdas.

answered a month ago
EXPERT
reviewed a month ago
1

Based on the information available about Lambda runtime hooks and snapshots, the issue you're experiencing is likely related to how the Lambda Web Adapter changes your function's execution model. Runtime hooks like register_after_restore are designed to work with Lambda's native handler invocation model, where your Python code is the direct entry point.

When you switch to using a bash script as your handler with the Lambda Web Adapter, you're fundamentally changing how Lambda initializes and invokes your function. The runtime hooks are typically registered during the Python runtime's initialization phase, but with LWA acting as an intermediary through the bash script, the Python runtime may not be initializing in the expected way for these hooks to be properly registered and triggered.
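
For contrast, this is roughly what the native model looks like when the Python handler is the direct entry point. It is a minimal sketch: `register_after_restore` comes from the `snapshot_restore_py` library named in the question, while `instance_id`, `refresh_instance_id`, and the `ImportError` fallback (a local stand-in so the sketch runs outside Lambda) are assumptions for illustration.

```python
import uuid

try:
    # Available in the Lambda managed Python runtime.
    from snapshot_restore_py import register_after_restore
except ImportError:
    # Local stand-in (assumption) so the sketch can run outside Lambda.
    def register_after_restore(func, *args, **kwargs):
        return func


# Generated during initialization, so it would be captured in the snapshot.
instance_id = uuid.uuid4().hex


def refresh_instance_id() -> None:
    """Regenerate the value that must not be shared between restored copies."""
    global instance_id
    instance_id = uuid.uuid4().hex


# Registration only records the callback; the runtime is what invokes it after a restore.
register_after_restore(refresh_instance_id)


def lambda_handler(event, context):
    return {"instance_id": instance_id}
```

With a bash script as the handler, there is no Python handler module for the runtime to load in this way, which is consistent with the registered callbacks never firing.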

Here are some potential approaches to work around this:

  1. Manual initialization detection: Since runtime hooks aren't being called, you could implement your own detection mechanism. Consider using a file-based flag in the /tmp directory (which persists across invocations within the same execution environment). On your application startup, check if a specific marker file exists in /tmp. If it doesn't exist, you know this is a fresh start (potentially from a snapshot restore), and you can perform your initialization logic and create the marker file. If it does exist, skip the initialization.

  2. Lazy initialization on first real request: While you mentioned the LWA pings your application, you could differentiate between health check requests and actual user requests. Implement a flag that tracks whether you've processed a real user request in this execution environment, and perform your uniqueness regeneration on the first non-health-check request.

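  3. Application-level startup hooks: Instead of relying on Lambda's runtime hooks, implement your own startup logic within your web application framework. Most web frameworks have their own initialization hooks or startup events that run when the application process starts. Note that under SnapStart the process starts during initialization, before the snapshot is taken, so these hooks do not run again on restore by themselves and need to be combined with a per-request trigger such as the one in approach 2.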

The key challenge is that by using the bash script entry point, you've moved outside the standard Python Lambda execution model where runtime hooks operate. You'll need to implement snapshot-aware initialization logic at your application level rather than relying on Lambda's built-in runtime hook mechanism.
Sources
Implement code before or after Lambda function snapshots - AWS Lambda
Understanding the Lambda programming model - AWS Lambda

answered a month ago
EXPERT
reviewed a month ago
  • Thanks for the breakdown and suggestions. I came up with a way to do this from FastAPI, based on what you suggested and others in this thread.
