CloudWatch seems to return inconsistent results

I log to CloudWatch using a Serilog sink. Yesterday I was querying with Logs Insights to track down a problem. I was refining my query, following a specific set of log entries, when at some point the query stopped returning those entries. At first I thought I needed to increase the time span of the query, so I opened it up to the full day. However, even as I removed filters and widened my search, the queries simply did not return the log entries that I had been looking at only a few minutes earlier. Does CloudWatch have a problem with losing logs? I was searching the whole log group, not just a single log stream. The logs were written using structured logging, so I was searching on fields found within the log entries.

jmarsch
asked 2 months ago, 338 views
2 Answers
Log entries can stop appearing in CloudWatch Logs Insights query results for several reasons:

  • The logs may not have finished being ingested into CloudWatch Logs yet. There is a delay of a few minutes for new logs to be available for querying.
  • The filter conditions in your query may have stopped matching the logs. Check that the filter is still valid.
  • Logs are only stored for a limited period depending on the log retention setting. Very old logs may no longer be available.
  • There could be an issue with log ingestion or the CloudWatch Logs service. Try querying without filters to see all logs or check the CloudWatch Logs dashboard for warnings.

To troubleshoot, first widen the time range and remove filters from the query, then compare the logs that are returned against what you expect. Also check the log retention setting and look for errors on the CloudWatch Logs dashboard or in the CloudWatch agent logs.
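
The "widen the time range and remove filters" step can be sketched as a small helper that builds the parameters for a Logs Insights query. This is only an illustration: the function name `build_unfiltered_query` and the log group `/my/app` are hypothetical, and the resulting dictionary is assumed to be passed to the `start_query` call of a boto3 CloudWatch Logs client, which takes start and end times as epoch seconds.

```python
from datetime import datetime, timedelta, timezone


def build_unfiltered_query(log_group, hours_back=24, limit=50, now=None):
    """Build parameters for a wide Logs Insights query with no filter clause.

    `log_group` is a hypothetical log group name supplied by the caller.
    """
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=hours_back)
    return {
        "logGroupName": log_group,
        # Logs Insights expects the time range as epoch seconds.
        "startTime": int(start.timestamp()),
        "endTime": int(now.timestamp()),
        # No filter clause: return the most recent events of any shape.
        "queryString": "fields @timestamp, @logStream, @message | sort @timestamp desc",
        "limit": limit,
    }


if __name__ == "__main__":
    params = build_unfiltered_query("/my/app")
    print(params)
    # Assumed usage with boto3 (not run here):
    #   boto3.client("logs").start_query(**params)
```

If the entries show up in this unfiltered query, the earlier query's filter or time range is the more likely cause than missing data.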
EXPERT
answered 2 months ago
By default, log data is stored in CloudWatch Logs indefinitely. Logs can be "lost" only under two circumstances:

1. A retention period was set on the log group, so older log events were deleted: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Working-with-log-groups-and-streams.html#SettingLogRetention
2. A DeleteLogGroup API call was made to remove the log group: https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_DeleteLogGroup.html
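
As a rough check of the retention case, the sketch below decides whether a log event of a given age could already have been removed by the retention setting. It assumes the input is one log group entry shaped like the DescribeLogGroups response, where `retentionInDays` is absent when logs never expire; the function name `expired_by_retention` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expired_by_retention(log_group, event_time, now=None):
    """Return True if an event at `event_time` is older than the retention period.

    `log_group` is assumed to be one entry from a DescribeLogGroups response.
    A missing `retentionInDays` key means the logs never expire.
    """
    now = now or datetime.now(timezone.utc)
    days = log_group.get("retentionInDays")
    if days is None:
        return False
    return now - event_time > timedelta(days=days)


if __name__ == "__main__":
    group = {"logGroupName": "/my/app", "retentionInDays": 7}  # hypothetical values
    an_hour_ago = datetime.now(timezone.utc) - timedelta(hours=1)
    print(expired_by_retention(group, an_hour_ago))
```

For entries that were visible only minutes earlier, this check will normally come back negative unless the events themselves were close to the retention boundary.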

From an AWS engineer's perspective, I would perform the following troubleshooting steps to investigate the issue:

1.) Check the time interval selector of the query in Logs Insights. If the selected interval does not cover the time the logs were written, the query will not return them.

2.) Ensure the region of the query is correct. Check that the region shown for the CloudWatch service (next to the logging details) matches the one in the "Choose region" drop-down box under the search bar.

3.) Query results remain available for 7 days (Query results availability - https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html), so I would look in CloudTrail Event History for the "StartQuery" event name (https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_StartQuery.html) and take note of the queryId for both the successful and the unsuccessful queries.

4.) With the GetQueryResults CLI command or API call, bring up the results of both queries - https://docs.aws.amazon.com/cli/latest/reference/logs/get-query-results.html

CLI query will look like this: aws logs get-query-results --query-id XXXXX

5.) Compare the queries and see what changed.
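
The comparison in step 5 can be sketched as a small diff over the two queries' parameters. This assumes each query has been written down as a dictionary shaped like the StartQuery request parameters (for example, copied from the CloudTrail events found in step 3); the function name `diff_queries` and the sample values are hypothetical.

```python
def diff_queries(good, bad, keys=("queryString", "startTime", "endTime",
                                  "logGroupName", "logGroupNames")):
    """Return {key: (good_value, bad_value)} for every parameter that differs."""
    changes = {}
    for key in keys:
        if good.get(key) != bad.get(key):
            changes[key] = (good.get(key), bad.get(key))
    return changes


if __name__ == "__main__":
    # Hypothetical parameters for a query that worked and one that did not.
    good = {
        "logGroupName": "/my/app",
        "startTime": 1700000000,
        "endTime": 1700003600,
        "queryString": "fields @message | filter Level = 'Error'",
    }
    bad = dict(good, queryString="fields @message | filter level = 'Error'")
    for key, (before, after) in diff_queries(good, bad).items():
        print(f"{key}: {before!r} -> {after!r}")
```

A difference in the time range or the log group points at the query setup, while a difference only in the query string points at the filter, such as a changed field name.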

Do not hesitate to open a case with AWS Support if you would like us to help you investigate further.

AWS
Katya_Z
answered 2 months ago
EXPERT
reviewed a month ago