How to "health check" a DynamoDB connection (C++ SDK)


Dear Experts,

I have a long-running process that connects to DynamoDB using the C++ SDK.

I have Route 53 health checks enabled, to which this process responds. If the process crashes, or the whole server goes down, Route 53 notices and fails over to a backup.

However, if the process's connection to DynamoDB fails, the health checker doesn't currently notice, because the health check response doesn't exercise that code. This has recently caused some downtime.

I can of course modify the process to do a dummy DynamoDB action of some sort before responding to the health check. But I have a couple of reservations about that:

  • That's a lot of extra DynamoDB requests. I was wondering if there is some sort of simpler "ping" that I can send to DynamoDB. Or if there is a more generic service health ping common to all AWS services.

  • Alternatively, I could track the success/failure of the most recent non-health-check request and report that for the health check - but in that case, once Route 53 has failed over there are no more non-health-check requests, and it will be stuck in the failed state.

It won't be difficult for me to hack something together, but the danger is constantly adding code that detects the last failure but won't detect the next one. I am posting this to ask if anyone has any best practice advice for how to do this "properly".

Thanks, Phil.

asked 2 years ago, 1585 views
1 Answer

I don't know of any "dummy" call you could make to DynamoDB that would prove that the database is in any particular state. Calling the control plane to (say) describe a table isn't going to mean that data can be retrieved or stored in the table.

If it were me, I'd make a GetItem call at a period less than the Route 53 health check timeout (or, as you say, when the health check comes in). Note that even this isn't a 100% test, but it should be "good enough". And it isn't a guarantee that the next call won't fail, because "everything fails, all the time" (great quote from Werner Vogels).

Tracking the last call status is pretty good, but you've also pointed out the problem there: once Route 53 has failed over, no real requests arrive to update the status.
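
One way to combine the two ideas is to report the most recent real result while it is fresh and fall back to a dummy call once it goes stale, so the status cannot stay stuck after a failover. The following is only a sketch under assumptions: CachedHealthProbe, Record and Healthy are hypothetical names, not part of the AWS SDK, and the probe is whatever callable you supply (for example a wrapper around your own GetItem or DeleteItem call that returns true on success).

```cpp
#include <chrono>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical helper, not part of the AWS SDK. It remembers the outcome of
// the most recent DynamoDB call and only runs the caller-supplied probe when
// that outcome is missing or older than max_age.
class CachedHealthProbe {
public:
    using Clock = std::chrono::steady_clock;

    CachedHealthProbe(std::function<bool()> probe, Clock::duration max_age)
        : probe_(std::move(probe)), max_age_(max_age) {}

    // Call after every real (non-health-check) DynamoDB request.
    void Record(bool ok) {
        std::lock_guard<std::mutex> lock(mutex_);
        Store(ok);
    }

    // Call from the health check handler.
    bool Healthy() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!has_result_ || Clock::now() - last_time_ >= max_age_) {
            // No fresh result: exercise the connection with the probe.
            // The lock is held during the probe, which keeps the sketch
            // simple but briefly blocks concurrent Record() callers.
            Store(probe_());
        }
        return last_ok_;
    }

private:
    void Store(bool ok) {
        last_ok_ = ok;
        last_time_ = Clock::now();
        has_result_ = true;
    }

    std::function<bool()> probe_;
    Clock::duration max_age_;
    std::mutex mutex_;
    bool has_result_ = false;
    bool last_ok_ = false;
    Clock::time_point last_time_{};
};
```

With max_age set a little below the health check timeout, normal traffic keeps the status fresh for free, and the probe only runs when traffic stops, which is exactly the post-failover case. A recorded failure is reported until it ages out, after which the probe decides.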

Yes, that means extra calls and cost to get an up-to-date status. Whether that is worth it really depends on your situation.

AWS EXPERT
answered 2 years ago
  • Agreed, the cost of half an RCU for each health check is cheap. How many health checks do you get for a dollar? If using provisioned capacity, about 50 million calls, by my math. Health check every 5 seconds? That dollar buys you 8 years of health checks.

  • Regarding the cost, my calculation (after I posted the question) was:

    • I seem to get 32 Route 53 health check requests per minute.
    • That's about 1.4 million per month.
    • The main action that this process does on the database is DeleteItem, and DeleteItem("nonexistent-key") is a no-op, so that's what I'm using as the "ping".
    • I believe that this costs one WCU per action, so that's 1.4 million WCUs per month.
    • WCUs cost about $1.40 per million, so the total cost is about $2 per month.

    In my experience with AWS pricing, everything is always "ridiculously cheap" or "ridiculously expensive". Calculations like this just need to make sure they aren't wrong by a factor of a thousand or a million, due to e.g. confusing per-second and per-month prices!
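
    As a guard against exactly that kind of unit slip, the arithmetic in these comments can be written out with the units in the names. This is a sketch only: the function names are made up for illustration, the 30-day month is an assumption, and the prices are the figures quoted in this thread, not current AWS pricing.

```cpp
// Figures taken from the comments above; treat them as illustrative inputs.
constexpr double kMinutesPerMonth = 60.0 * 24.0 * 30.0;         // assumes a 30-day month
constexpr double kSecondsPerYear  = 365.25 * 24.0 * 3600.0;

// Requests per month for a given health check rate per minute.
constexpr double MonthlyRequests(double checks_per_minute) {
    return checks_per_minute * kMinutesPerMonth;
}

// Monthly cost in USD, given a price per million request units.
constexpr double MonthlyCostUsd(double requests, double usd_per_million) {
    return requests / 1.0e6 * usd_per_million;
}

// How many years a fixed number of calls lasts at one call every
// interval_seconds.
constexpr double YearsOfChecks(double calls, double interval_seconds) {
    return calls * interval_seconds / kSecondsPerYear;
}

// 32 checks per minute is 1,382,400 requests per month (about 1.4 million).
constexpr double kRequests = MonthlyRequests(32.0);

// At $1.40 per million WCUs and one WCU per DeleteItem, that is about $1.94.
constexpr double kCostUsd = MonthlyCostUsd(kRequests, 1.40);

// 50 million calls at one every 5 seconds last a little under 8 years.
constexpr double kYears = YearsOfChecks(50.0e6, 5.0);
```

    Both results land in the "a dollar or two" range, consistent with the estimates above.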
