API Gateway endpoint stopped working


We have physical gateways deployed all around the country, they've been working for months, and out of the blue they've stopped posting data.

Our architecture:

  1. nRF thingy91 as the gateway
  2. Serverless with API gateway and Lambdas
  3. HTTPS POST request from the gateway to the endpoint
  4. MongoDB as the database

**Things we've discovered:**

  1. Gateways stopped sending data a week ago
  2. Testing the gateway, we found that it does post, but receives an error from the backend
  3. No changes were made to the backend and no redeployments
  4. All gateways have enough data on their sims
  5. Serverless offline works
  6. Lambdas aren't being invoked, but the API gateway is getting some traffic, though I'm having trouble deciphering what and how exactly
  7. CloudWatch shows no new logs
  8. Postman doesn't work either
  9. Don't see any billing issues
  10. Redeploying doesn't resolve the issue
  11. Receiving and handling requests works just fine and dandy from our mobile and web apps, which communicate with different endpoints but are under the same stage.
  • Have you enabled both execution and access logs, or X-Ray? If API Gateway is hit, you should at least get entries there.

  • Yes, I can see execution logs for the relevant stage, nothing is received from the gateways. However (and I'll add this to my original question), I am receiving and handling requests just fine from the mobile and web app, which communicate with different endpoints but are under the same stage.

  • You mentioned execution logs, but what about access logs? Those usually indicate whether your API Gateway is being called at all. If there is no call, the request is blocked before reaching API Gateway, maybe by a resource policy or another security mechanism.

  • I see I do have access logs enabled

1 Answer

Hello. To resolve this, you can troubleshoot in the order below.

If it's a REST API, go to the API Gateway console, select the API, click on the POST method, and run a "TEST" from the console. After you enter the headers and body and click TEST, logs are printed on the right-hand side. Review them carefully and check whether they show the Lambda function being invoked. If the function is not invoked, API Gateway's permission to invoke the Lambda function may have been removed.

To add the access back, go to the "Integration Method" and toggle the "proxy integration" checkbox: if it is not selected, select it, click OK, and then unselect it; if it is already selected, clear it and then select it again. This re-adds the permission for API Gateway to invoke the Lambda function.
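The console toggle is one way to restore the permission; the same grant can also be made explicitly through Lambda's `AddPermission` API. Below is a minimal sketch in Python that only builds the arguments for that call. The function name, region, account ID, API ID and resource path are hypothetical placeholders, not values from the question, and the final boto3 call is left as a comment because it needs credentials.

```python
def build_invoke_permission(function_name, region, account_id, api_id,
                            method="POST", resource_path="/data"):
    """Build keyword arguments for Lambda's AddPermission call.

    Every value passed in here is a placeholder; substitute your own.
    """
    # SourceArn layout: arn:aws:execute-api:region:account:api-id/stage/METHOD/path
    # The "*" in the stage position allows every stage of this API.
    source_arn = (
        f"arn:aws:execute-api:{region}:{account_id}:{api_id}"
        f"/*/{method}/{resource_path.lstrip('/')}"
    )
    return {
        "FunctionName": function_name,
        "StatementId": f"apigateway-invoke-{api_id}",
        "Action": "lambda:InvokeFunction",
        "Principal": "apigateway.amazonaws.com",
        "SourceArn": source_arn,
    }


kwargs = build_invoke_permission(
    "ingest-handler", "eu-west-1", "123456789012", "a1b2c3d4e5"
)
print(kwargs["SourceArn"])

# With boto3 installed and credentials configured, the grant would be applied with:
#   import boto3
#   boto3.client("lambda").add_permission(**kwargs)
```

If a statement with the same `StatementId` already exists on the function, the call is rejected with a conflict error, which in itself suggests the permission was not the missing piece.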

If the logs from the console test show that the function call was throttled, for example with a 429, check your account-level concurrency and the function's own concurrency settings.
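As a rough illustration of the concurrency check, the sketch below interprets the two responses involved: the account settings and the function's reserved concurrency. The dictionaries mimic the shape of what boto3's `get_account_settings` and `get_function_concurrency` return, but the numbers are made up for the example, and `concurrency_findings` is a hypothetical helper name.

```python
def concurrency_findings(account_settings, function_concurrency):
    """Summarise concurrency settings that could explain 429 throttling."""
    findings = []
    limits = account_settings["AccountLimit"]
    # The key is absent when no reserved concurrency is configured.
    reserved = function_concurrency.get("ReservedConcurrentExecutions")

    if reserved == 0:
        findings.append("reserved concurrency is 0, so every invocation is throttled")
    elif reserved is None:
        findings.append(
            "no reserved concurrency; the function shares the unreserved pool of "
            f"{limits['UnreservedConcurrentExecutions']}"
        )
    else:
        findings.append(f"reserved concurrency is capped at {reserved}")

    findings.append(f"account-level limit is {limits['ConcurrentExecutions']}")
    return findings


# Illustrative response shapes; real values would come from
#   boto3.client("lambda").get_account_settings()
#   boto3.client("lambda").get_function_concurrency(FunctionName="...")
sample_account = {
    "AccountLimit": {
        "ConcurrentExecutions": 1000,
        "UnreservedConcurrentExecutions": 900,
    }
}
sample_function = {"ReservedConcurrentExecutions": 0}

for finding in concurrency_findings(sample_account, sample_function):
    print(finding)
```

A reserved concurrency of 0 is worth ruling out first, since it throttles the function completely without any change to the code or the API.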

Lastly, if the function is being triggered, go to the function and check its logs to troubleshoot further.

It is also possible that you have configured an IAM role in the API Gateway "Integration Method" to invoke the Lambda function. If so, check that the role has the right permissions and the right trust policy.

If it's an HTTP API, there is no test functionality, so go to the Lambda function, click Configuration -> Permissions, scroll down, and check whether a resource-based policy has been added for the API.
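The same check can be done outside the console by reading the function's resource-based policy as JSON. The sketch below is a simplified inspection, not a full IAM policy evaluation: it looks for an Allow statement that lets `apigateway.amazonaws.com` call `lambda:InvokeFunction` with a source ARN mentioning the API ID. The sample policy, account ID, function name and API ID are hypothetical.

```python
import json


def allows_api_gateway(policy_json, api_id):
    """Return True if some statement lets this API invoke the function."""
    policy = json.loads(policy_json)
    for stmt in policy.get("Statement", []):
        principal = stmt.get("Principal", {})
        service = principal.get("Service") if isinstance(principal, dict) else None
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # Condition keys are matched case-insensitively to be tolerant of casing.
        arn_like = stmt.get("Condition", {}).get("ArnLike", {})
        source_arns = [v for k, v in arn_like.items() if k.lower() == "aws:sourcearn"]
        if (
            stmt.get("Effect") == "Allow"
            and service == "apigateway.amazonaws.com"
            and "lambda:InvokeFunction" in actions
            and any(f":{api_id}/" in arn for arn in source_arns)
        ):
            return True
    return False


# Made-up policy in the shape returned under the "Policy" key by
#   boto3.client("lambda").get_policy(FunctionName="...")
sample_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "apigateway-invoke",
        "Effect": "Allow",
        "Principal": {"Service": "apigateway.amazonaws.com"},
        "Action": "lambda:InvokeFunction",
        "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:ingest-handler",
        "Condition": {"ArnLike": {
            "AWS:SourceArn": "arn:aws:execute-api:eu-west-1:123456789012:a1b2c3d4e5/*/POST/data"
        }},
    }],
})

print(allows_api_gateway(sample_policy, "a1b2c3d4e5"))
print(allows_api_gateway(sample_policy, "zzzzzzzzzz"))
```

If the function has no resource-based policy at all, fetching it fails with a not-found error rather than returning an empty document, which points to the same conclusion as a missing statement.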

Additionally, regarding your comment about "no new CloudWatch logs": if you are referring to the API Gateway CloudWatch logs, review the IAM role that is used to send logs to CloudWatch. Verify that it exists in the account and has the correct permissions. Check out this re:Post KC Article.

Shivam
answered a year ago
