API Gateway HTTP API Authorizer Caching 500 Errors

0

We have an HTTP API using a custom Lambda authorizer with caching enabled as described here. It works great except when an unexpected error occurs in the Lambda (e.g. a transient database connection error). When that happens, authorization cannot be determined and the request needs to be retried. Ideally, the user would get a 500 or 401 and try again. However, API Gateway caches this 500 failure! So our users are locked out until the TTL expires or they get a new token. The fact that API Gateway will cache 500s from an authorizer renders this feature unusable in any production environment. Is there a workaround? Is this a bug?

2 Answers
2

This is not a bug but a limitation in how API Gateway handles authorizer caching. The behavior you're experiencing, with API Gateway caching 500 errors from a Lambda authorizer, is a recognized limitation of authorization caching. API Gateway indiscriminately caches all responses from the authorizer, including error responses, which may result in the issue you've observed.

Please consider:

  • Return 401 or 403 instead of 500
  • Disable Caching Temporarily
  • Reduce Cache TTL
  • Customize your own error handling
EXPERT
answered a month ago
  • None of those options work.

    • "Return 401 or 403 instead of 500": When you are unable to determine whether the request is authorized at all, such as when the database is down for a few seconds, the request needs to be retried. As you pointed out, the response is cached indiscriminately, so it doesn't matter what code you return.
    • "Disable Caching Temporarily; Reduce Cache TTL": These are manual interventions, not something you could or would do automatically, unless there's something the auth Lambda can return in its response to temporarily disable the cache.
    • "Customize your own error handling": It doesn't matter what you do in your own code; anything can happen. AWS Lambda could hit an error before it even gets to your code, and evidently API Gateway would cache that too.
0

This is a very insightful observation, and you're absolutely right that the current behavior of API Gateway caching any response from a Lambda authorizer, including transient 500s, can cause significant issues in production environments. While this is not technically a bug, it is a serious limitation in the design.

Since there is no way for the authorizer to signal "do not cache this response", and the cache indiscriminately stores even failure results, we're left in a position where the architecture must be reconsidered for fault-tolerance and resilience.

Instead of offering a single definitive solution, here are a few directions you might consider, each with trade-offs, depending on how much flexibility you have in your system design:


Option 1: Disable caching by setting authorizerResultTtlInSeconds to 0
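
As a rough illustration of Option 1, the sketch below sets the authorizer result TTL to 0 through the API Gateway V2 update_authorizer call. The helper name disable_authorizer_cache and the example IDs are made up for illustration; the client is passed in so the helper can be exercised without AWS access.

```python
from typing import Any, Protocol


class AuthorizerClient(Protocol):
    """The one method this sketch needs from a boto3 "apigatewayv2" client."""

    def update_authorizer(self, **kwargs: Any) -> dict: ...


def disable_authorizer_cache(
    client: AuthorizerClient, api_id: str, authorizer_id: str
) -> dict:
    """Set the authorizer result TTL to 0, which turns result caching off."""
    return client.update_authorizer(
        ApiId=api_id,
        AuthorizerId=authorizer_id,
        AuthorizerResultTtlInSeconds=0,
    )


# Usage, assuming boto3 is installed and AWS credentials are configured
# (the IDs below are placeholders, not real resources):
#   import boto3
#   disable_authorizer_cache(boto3.client("apigatewayv2"), "a1b2c3d4", "abc123")
```

The trade-off is that the authorizer Lambda then runs on every request, which adds latency and invocation cost.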


Option 2: Use a JWT authorizer (e.g., Cognito or OIDC)

  • If you're issuing JWTs, consider configuring an Amazon Cognito authorizer or a generic JWT authorizer.
  • These perform verification directly in API Gateway — no Lambda function is involved.
  • This avoids the issue of authorizer Lambda failures being cached, because there's no Lambda execution to fail.
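
A minimal sketch of what Option 2 could look like, assuming the tokens are JWTs from an OIDC-compliant issuer such as a Cognito user pool. The function jwt_authorizer_params and the authorizer name are hypothetical; the returned dictionary is meant to be passed as keyword arguments to the API Gateway V2 create_authorizer call.

```python
def jwt_authorizer_params(api_id: str, issuer: str, audience: list[str]) -> dict:
    """Build keyword arguments for apigatewayv2 create_authorizer (JWT type)."""
    return {
        "ApiId": api_id,
        "Name": "jwt-authorizer",  # hypothetical name, pick your own
        "AuthorizerType": "JWT",
        # Read the token from the Authorization header of each request.
        "IdentitySource": ["$request.header.Authorization"],
        "JwtConfiguration": {"Issuer": issuer, "Audience": audience},
    }


# Usage, assuming boto3 is installed and AWS credentials are configured.
# For Cognito, the issuer has the form
#   https://cognito-idp.<region>.amazonaws.com/<userPoolId>
# and the audience is the app client ID:
#   import boto3
#   params = jwt_authorizer_params("a1b2c3d4", issuer_url, [app_client_id])
#   boto3.client("apigatewayv2").create_authorizer(**params)
```

This only fits if everything the authorizer needs to decide is already in the token; anything that requires a database lookup still has to live somewhere else.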

Option 3: Separate authentication and authorization logic

  • Let API Gateway handle authentication (via JWT verification or Cognito), and move fine-grained authorization checks to your backend services.
  • This decouples transient infrastructure issues (e.g., DB connectivity) from the authentication phase, which is where the caching issue arises.
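
One way Option 3 might look in a backend Lambda, assuming a JWT authorizer in front of it and payload format version 2.0. TransientStoreError, lookup_permissions, and the "orders:read" permission are hypothetical names used only for illustration.

```python
import json


class TransientStoreError(Exception):
    """Hypothetical error raised when the permission store is briefly down."""


def make_handler(lookup_permissions):
    """Build a Lambda handler around a permission lookup function.

    lookup_permissions(subject) is a hypothetical callable that returns a set
    of permission strings or raises TransientStoreError.
    """

    def handler(event, context):
        # With a JWT authorizer, the verified claims arrive in the request context.
        claims = event["requestContext"]["authorizer"]["jwt"]["claims"]
        try:
            permissions = lookup_permissions(claims["sub"])
        except TransientStoreError:
            # This is an ordinary backend response, not an authorizer result,
            # so the client can retry right away.
            return {
                "statusCode": 503,
                "headers": {"Retry-After": "1"},
                "body": json.dumps({"message": "temporarily unavailable"}),
            }
        if "orders:read" not in permissions:  # hypothetical permission name
            return {"statusCode": 403, "body": json.dumps({"message": "forbidden"})}
        return {"statusCode": 200, "body": json.dumps({"orders": []})}

    return handler
```

Because the permission check runs inside the integration, a failed database lookup surfaces as a plain 503 that the next request can recover from, rather than as a cached authorizer result.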

answered a month ago
