
AWS CloudWatch log alarm: which steps should I be looking at?

0

I have a CloudWatch log that contains messages from multiple cameras. Below is a snippet from one camera that shows it going offline and then coming back online. The log can contain messages from hundreds of cameras.

2024-11-15 13:10:11.234+00:00 [   167] WARNING    - bdab-b307-4df3-8596  CameraDecoder (testc00013.test0001ev) - Camera 1  Device communication error (NoDataException). Error: GetMediaDataBlock returned no data.

2024-11-15 13:10:18.602+00:00 [   167] INFO       - bdab-b307-4df3-8596  CameraDecoder (testc00013.test0001ev) - Camera 1  Device communication established

The unique cameraid is

testc00013.test0001ev

The offline string is "Device communication error" and the online string is "Device communication established".
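For reference, the two sample lines can be parsed with a regular expression along these lines. This is only a sketch: the pattern is inferred from the two lines shown above, and the `parse_line` helper and the "offline"/"online" labels are hypothetical names, so other log lines may need a different pattern.

```python
import re
from datetime import datetime

# Pattern inferred from the two sample lines only (an assumption).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{2}:\d{2})"
    r".*?CameraDecoder \((?P<camera_id>[^)]+)\)"
    r".*?Device communication (?P<state>error|established)"
)


def parse_line(line):
    """Return (timestamp, camera_id, state), or None for unrelated lines."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f%z")
    # "error" marks the camera as offline, "established" as online.
    state = "offline" if m.group("state") == "error" else "online"
    return ts, m.group("camera_id"), state
```

Calling `parse_line` on the first sample line should give the camera ID `testc00013.test0001ev` with state `offline`.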

What I would like to do is create an alarm for each camera if it goes offline and doesn't come back online within 5 minutes. I've looked at Logs Insights and metric filters, but I cannot see how to use those features to get an alarm for each unique camera ID. Which steps should I be looking at to achieve this?

4 Answers
1

Hi, did you look into Contributor Insights? If you create one rule to detect "communication error" per camera ID and another rule to detect "communication established" per camera ID, you can combine them using metric math (e.g. graph e2 - e1, where e2 is the ID of the second rule, which detects communication established, and e1 is the ID of the first rule). This would let you fetch the ones where the difference is positive; graphing with a 5-minute period should be close to what you are looking for. See https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ContributorInsights-GraphReportData.html for an example of a Contributor Insights setup.

The difficulty I see in your use case is finding a syntax that lets Contributor Insights pick out your value, because your log is not JSON structured. Rules feel easier to write when logs are structured, but that depends on how familiar you are with the way Contributor Insights parses logs.
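The per-camera comparison can be illustrated outside CloudWatch with a few lines of plain Python. This is not Contributor Insights rule syntax, only a sketch of the counting logic. It assumes the events for one 5-minute period are already available as `(camera_id, state)` pairs, and it flags cameras with more error events than established events, which is a sign convention chosen for this illustration. The function name is hypothetical.

```python
from collections import Counter


def cameras_with_unmatched_errors(events):
    """events: iterable of (camera_id, state) pairs from one period,
    where state is "offline" (error) or "online" (established)."""
    errors = Counter()
    recoveries = Counter()
    for camera_id, state in events:
        if state == "offline":
            errors[camera_id] += 1
        elif state == "online":
            recoveries[camera_id] += 1
    # A camera is flagged when its errors outnumber its recoveries.
    return sorted(cid for cid in errors if errors[cid] > recoveries[cid])
```

A count-based comparison like this cannot tell which event came last within the period, so it is an approximation of "still offline" rather than an exact check.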

AWS
answered a year ago
  • Thanks for the reply, I will look at the post. In the end we went with a Lambda function that stores offline states in DynamoDB and a second Lambda function that scans the table for cameras that have been offline longer than X minutes.

0
Accepted Answer

In the end we went with a Lambda function that stores offline states in DynamoDB and another Lambda function that scans the table for cameras that have been offline longer than X minutes.
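A minimal sketch of how that could look, under several assumptions: the first Lambda is fed by a CloudWatch Logs subscription filter (which delivers a base64-encoded, gzip-compressed JSON payload under `event["awslogs"]["data"]`), the regular expression is inferred from the sample lines in the question, and a plain dict stands in for the DynamoDB table so the logic can be read and run locally. All function names are hypothetical and this is not the poster's actual code.

```python
import base64
import gzip
import json
import re

# Inferred from the sample log lines in the question (an assumption).
CAMERA_RE = re.compile(
    r"CameraDecoder \((?P<camera_id>[^)]+)\).*?"
    r"Device communication (?P<state>error|established)"
)


def decode_subscription_event(event):
    """Unpack the payload CloudWatch Logs sends to a subscribed Lambda."""
    raw = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(raw))["logEvents"]


def apply_log_events(log_events, offline_since):
    """Update offline_since: camera_id -> epoch ms of first unrecovered error.

    offline_since is a dict standing in for the DynamoDB table.
    """
    for entry in log_events:
        m = CAMERA_RE.search(entry["message"])
        if m is None:
            continue
        camera_id = m.group("camera_id")
        if m.group("state") == "error":
            # Keep the earliest error so repeated errors do not reset the clock.
            offline_since.setdefault(camera_id, entry["timestamp"])
        else:
            # Camera recovered: drop its offline record.
            offline_since.pop(camera_id, None)
    return offline_since


def stale_cameras(offline_since, now_ms, max_offline_minutes=5):
    """Return camera IDs offline for longer than the threshold."""
    limit_ms = max_offline_minutes * 60 * 1000
    return sorted(
        cid for cid, since in offline_since.items() if now_ms - since > limit_ms
    )
```

In a real deployment the dict operations would become DynamoDB writes and deletes, and `stale_cameras` would run in the second, scheduled Lambda and publish its result somewhere such as an SNS topic.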

answered a year ago
0
  1. Create a CloudWatch Log Group: Ensure your camera logs are being sent to a CloudWatch Log Group.

  2. Create Metric Filters: You'll need to create two metric filters for each camera:

    • One to detect when a camera goes offline
    • Another to detect when a camera comes back online

    The filter pattern for offline events would look something like this (two quoted terms, both of which must appear in the log event): "testc00013.test0001ev" "Device communication error"

    And for online events: "testc00013.test0001ev" "Device communication established"

    You'll need to create these filters for each unique camera ID.

  3. Create Custom Metrics: When creating the metric filters, assign each a custom metric name. For example:

    • CameraOffline_testc00013_test0001ev
    • CameraOnline_testc00013_test0001ev
  4. Create CloudWatch Alarms: For each camera, create an alarm that triggers if the camera is offline for more than 5 minutes. You can do this by:

    • Using the "CameraOffline" metric as the alarm trigger
    • Setting the threshold to 1 (indicating an offline event occurred)
    • Setting the evaluation period to 5 minutes
    • Using the "CameraOnline" metric to clear the alarm, for example by alarming on a metric math expression that combines both metrics
  5. Set up Notifications: Configure Amazon SNS (Simple Notification Service) to send notifications when alarms are triggered.
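The per-camera filter and alarm definitions from the steps above could be generated rather than typed by hand. The sketch below only builds the keyword arguments for boto3's `logs.put_metric_filter` and `cloudwatch.put_metric_alarm` and does not call AWS. The `Cameras` namespace, the helper names, and the log group and topic ARN passed in are hypothetical. The filter pattern uses plain term matching (both quoted phrases must appear in the event), which is an assumption about what fits these log lines.

```python
def metric_name(prefix, camera_id):
    # Follows the naming in the example above: dots become underscores.
    return f"{prefix}_{camera_id.replace('.', '_')}"


def metric_filter_kwargs(log_group, camera_id, phrase, prefix, namespace="Cameras"):
    """Keyword arguments for boto3 logs.put_metric_filter (not called here)."""
    name = metric_name(prefix, camera_id)
    return {
        "logGroupName": log_group,
        "filterName": name,
        # Two quoted terms: an event must contain both to match.
        "filterPattern": f'"{camera_id}" "{phrase}"',
        "metricTransformations": [
            {"metricName": name, "metricNamespace": namespace, "metricValue": "1"}
        ],
    }


def offline_alarm_kwargs(camera_id, sns_topic_arn, namespace="Cameras"):
    """Keyword arguments for boto3 cloudwatch.put_metric_alarm (not called here)."""
    name = metric_name("CameraOffline", camera_id)
    return {
        "AlarmName": name,
        "Namespace": namespace,
        "MetricName": name,
        "Statistic": "Sum",
        "Period": 300,  # one 5-minute period
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }
```

With a boto3 client these would be applied as `logs.put_metric_filter(**kwargs)` and `cloudwatch.put_metric_alarm(**kwargs)`. Note that this alarm fires on any offline event within a 5-minute period; on its own it does not check whether the camera came back online.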

To implement this for hundreds of cameras, you may want to consider automating the process:

  1. Use CloudWatch Logs Insights to query and identify all unique camera IDs in your logs.

  2. Create a script or use AWS CloudFormation to automatically generate the necessary metric filters, custom metrics, and alarms for each camera ID.

  3. Use AWS Lambda in combination with Amazon EventBridge (formerly CloudWatch Events) to dynamically create new alarms when new camera IDs are detected in the logs.
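As a rough illustration of the first automation step, the sketch below holds a Logs Insights query that extracts camera IDs and a helper that reads them out of a `get_query_results` style response. The regular expression in the query is an assumption based on the sample lines in the question, and `camera_ids_from_results` is a hypothetical helper name.

```python
# Logs Insights query text; the regex is inferred from the sample log lines.
CAMERA_ID_QUERY = (
    r"parse @message /CameraDecoder \((?<cameraId>[^)]+)\)/ "
    "| filter ispresent(cameraId) "
    "| stats count(*) as events by cameraId"
)


def camera_ids_from_results(response):
    """Collect unique cameraId values from a get_query_results response.

    Each result row is a list of {"field": ..., "value": ...} cells.
    """
    ids = set()
    for row in response.get("results", []):
        for cell in row:
            if cell["field"] == "cameraId":
                ids.add(cell["value"])
    return sorted(ids)
```

With boto3, the query would be submitted through `logs.start_query(logGroupName=..., startTime=..., endTime=..., queryString=CAMERA_ID_QUERY)` and the response fetched with `logs.get_query_results(queryId=...)` once the query status is complete.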

This approach allows you to create individual alarms for each camera, monitoring their offline status and alerting you if they don't come back online within 5 minutes. The automation suggestions can help manage this at scale for hundreds of cameras.
Sources
Alarming on logs - Amazon CloudWatch
Use CloudWatch alarms to monitor CloudTrail events | AWS re:Post
AWS announces CloudWatch Logs Anomaly Detection and Pattern analysis

answered a year ago
EXPERT
reviewed a year ago
  • not really what I am after

0

Hello Andy, I saw your comment on the re:Post Agent post. Could you please share with me why the solution proposed does not work for you?

AWS
answered a year ago
  • The log file could potentially contain thousands of camera IDs, and it's unknown how many may go offline. Manually creating metric filters for each one is not feasible, and it would be costly. I want to automatically and dynamically create an alarm for each unique camera ID. The solution mentions using Lambda + CloudWatch Events but doesn't give any detail.
