How to alert if an event doesn't happen within a period of time?

0

Scenario: For compliance reasons a monitor needs to be in place to alert if a backup either fails or otherwise doesn't succeed.

How can we alert if the RDS event with category "backup" and ID "0002" [Backup Finished] hasn't been emitted in the last 48 hours?

3 Answers
1

Checking for a negative is always a little tricky.

In this case, I'd have something that is triggered by the positive event (Backup Finished) which stores a timestamp somewhere. Then have another process which checks that timestamp at specified intervals - this process would emit an alert if the timestamp is too old.
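
As a rough illustration of that idea (often called a heartbeat or dead man's switch), here is a minimal Python sketch. It assumes an in-memory dictionary as the timestamp store; in practice this would be something durable, and which service to use is left open. The names `record_success`, `is_overdue` and `MAX_AGE` are hypothetical, and the 48-hour limit is taken from the question.

```python
from datetime import datetime, timedelta, timezone

# Threshold from the question: alert after 48 hours without a success.
MAX_AGE = timedelta(hours=48)

# Stand-in for a durable store; an assumption for illustration only.
store = {}


def record_success(job, now=None):
    """Called by the positive event (e.g. Backup Finished): save when it happened."""
    store[job] = now or datetime.now(timezone.utc)


def is_overdue(job, now=None, max_age=MAX_AGE):
    """Called on a schedule: True if the job never succeeded or succeeded too long ago."""
    now = now or datetime.now(timezone.utc)
    last = store.get(job)
    return last is None or now - last > max_age
```

The checker treats a missing timestamp as overdue, so a backup that has never succeeded also raises an alert.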

EXPERT
answered 2 months ago
1

As the other contributors have mentioned, checking for a negative is tricky with managed events, and there is no built-in mechanism for this case. I would suggest the following workaround.

  1. Create a Lambda function.

  2. Use the script below, changing the instance details as needed.

  3. Make sure the IAM role associated with the Lambda function has permission to describe the snapshots and to publish to SNS.

  4. Schedule the Lambda function to run every 24 hours (for example, with an Amazon EventBridge schedule).

Python script to capture the latest snapshot details

    import datetime
    import json  # used by the SNS publish step

    import boto3

    rds = boto3.client('rds')

    def lambda_handler(event, context):
        db_instances = ['sandboxinstance']  # change to your instance identifiers
        for db_instance in db_instances:
            # Collect [ARN, creation time] for each available snapshot.
            # SnapshotType='manual' is kept from the original script;
            # automated backups are listed with SnapshotType='automated'.
            snaps = []
            response = rds.describe_db_snapshots(
                DBInstanceIdentifier=db_instance, SnapshotType='manual')
            for snapshot in response['DBSnapshots']:
                if snapshot['Status'] == 'available':
                    snaps.append([snapshot['DBSnapshotArn'],
                                  snapshot['SnapshotCreateTime']])
            if snaps:
                snaps.sort(key=lambda x: x[1], reverse=True)  # newest first
                print("RDS Snapshot name", snaps[0])
                snapshot_time = snaps[0][1]  # timezone-aware datetime
                right_now = datetime.datetime.now(datetime.timezone.utc)
                diff_minutes = (right_now - snapshot_time).total_seconds() / 60
            else:
                # No available snapshot at all: treat it as infinitely old.
                diff_minutes = float('inf')
            print(diff_minutes)

Trigger an email through SNS when the difference in minutes exceeds the allowed limit (48 hours in the question).

            # Still inside the per-instance loop of lambda_handler.
            if diff_minutes > 48 * 60:
                notification = "Here is the SNS notification for Lambda function tutorial."
                client = boto3.client('sns')
                response = client.publish(
                    TargetArn="arn:aws:sns:us-east-1:xxxxxx:RDSNote",
                    Message=json.dumps({'default': notification}),
                    MessageStructure='json'
                )
SUPPORT ENGINEER
answered 2 months ago
0

You could also create a file (or an object in a bucket) when the job finishes, and then check whether it exists once the time period has expired. You'd have to delete it at some point too, of course, for example just before the job starts.
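
A minimal sketch of this marker approach in Python, assuming a local file path rather than a bucket object; the same three steps would apply to an object store. The function names are hypothetical, and `missing_ok` needs Python 3.8 or later.

```python
import time
from pathlib import Path


def clear_marker(marker):
    """Run before the job starts: remove any marker left by a previous run."""
    Path(marker).unlink(missing_ok=True)


def write_marker(marker):
    """Run after the job finishes successfully: create (or refresh) the marker."""
    Path(marker).touch()


def marker_is_fresh(marker, max_age_seconds, now=None):
    """Run when the period expires: True if the marker exists and is recent enough."""
    path = Path(marker)
    if not path.exists():
        return False
    now = time.time() if now is None else now
    return now - path.stat().st_mtime <= max_age_seconds
```

Checking the marker's modification time as well as its existence guards against a stale marker surviving if the delete step was skipped.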

answered 2 months ago
