CloudWatch seems to return inconsistent results

I log to CloudWatch using a Serilog sink. Yesterday I was querying with Logs Insights to track down a problem. I was refining my query, following a specific set of log entries, when at some point the query stopped returning them. At first I thought I needed to widen the time range, so I opened it up to the full day. However, even as I removed filters and widened the search, the queries simply did not return the log entries I had been examining only a few minutes earlier. Does CloudWatch have a problem with losing logs? I was searching the whole log group, not just a single log stream. The logs were written using structured logging, so I was searching on fields found within the log entries.

jmarsch
Asked 2 months ago, 347 views
2 Answers

Log entries can go missing from CloudWatch Logs Insights query results for several reasons:

  • The logs may not have finished being ingested into CloudWatch Logs yet. New logs can take a few minutes to become available for querying.
  • The filter conditions in your query may no longer match the logs. Check that each filter is still valid.
  • Logs are stored only for as long as the log group's retention setting allows. Very old logs may no longer be available.
  • There could be an issue with log ingestion or with the CloudWatch Logs service itself. Try querying without filters to see all logs, or check the CloudWatch Logs dashboard for warnings.
  • To troubleshoot, first widen the time range and remove filters from the query, then compare the logs returned against what you expect. Also check the log retention settings and look for errors on the CloudWatch Logs dashboard or in the CloudWatch agent logs.
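
The "widen the time range and remove filters" check can also be scripted. The following is a minimal sketch, assuming Python and a boto3 CloudWatch Logs client passed in as `logs_client`; the function name `count_unfiltered`, its parameters, and the polling defaults are illustrative and not part of the original answer. It runs a query with no filter clause over a chosen window and returns how many records matched.

```python
import time


def count_unfiltered(logs_client, log_group, start_epoch, end_epoch,
                     poll_seconds=1.0, max_polls=60):
    """Run a filter-free Logs Insights query and return recordsMatched.

    logs_client is assumed to be a boto3 "logs" client, or any stand-in
    exposing the same start_query / get_query_results methods.
    """
    started = logs_client.start_query(
        logGroupName=log_group,
        startTime=int(start_epoch),  # epoch seconds
        endTime=int(end_epoch),      # epoch seconds
        queryString="fields @timestamp, @message | sort @timestamp desc",
        limit=20,
    )
    query_id = started["queryId"]

    # Logs Insights queries run asynchronously, so poll for a terminal status.
    for _ in range(max_polls):
        response = logs_client.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            return int(response["statistics"]["recordsMatched"])
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query {query_id} ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")
```

With boto3 installed, something like `boto3.client("logs", region_name="us-east-1")` would supply the client (the region here is a placeholder). A count far below what you expect for the window suggests an ingestion or retention problem rather than a problem with the query's filters.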
Expert
Answered 2 months ago

By default, log data is stored in CloudWatch Logs indefinitely. Logs can be "lost" only under two circumstances:

1. A retention period was set that deletes the logs: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Working-with-log-groups-and-streams.html#SettingLogRetention
2. A DeleteLogGroup API call removed the log group: https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_DeleteLogGroup.html
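
Both circumstances can be checked programmatically. The sketch below assumes Python and a boto3 CloudWatch Logs client passed in as `logs_client`; the helper name `retention_days` is hypothetical. It returns the configured retention period, returns `None` when logs are kept indefinitely, and raises if the log group cannot be found at all.

```python
def retention_days(logs_client, log_group_name):
    """Return the retention period in days, or None if logs never expire.

    Raises LookupError if the log group is not found, for example because
    it was deleted or the client points at a different region.
    """
    kwargs = {"logGroupNamePrefix": log_group_name}
    while True:
        page = logs_client.describe_log_groups(**kwargs)
        for group in page.get("logGroups", []):
            # The prefix filter can return sibling groups, so match exactly.
            if group["logGroupName"] == log_group_name:
                # "retentionInDays" is absent when events are kept indefinitely.
                return group.get("retentionInDays")
        token = page.get("nextToken")
        if not token:
            raise LookupError(f"log group {log_group_name!r} not found")
        kwargs["nextToken"] = token
```

If the returned number of days is shorter than the age of the entries you were investigating, retention is the likely explanation.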

From an AWS engineer's perspective, I would take the following troubleshooting steps to investigate the issue:

1.) Check the time interval selector of the query in Logs Insights. If the selected interval does not cover the log entries' timestamps, the query returns no results.
2.) Ensure the region of the query is correct. Check that the region shown for the CloudWatch service (next to the logging details) matches the one in the "Choose region" drop-down box under the search bar.
3.) Query results remain available for 7 days (Query results availability - https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html), so I would look in CloudTrail Event History for the "StartQuery" event name (https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_StartQuery.html) and note the queryId of both the successful and the unsuccessful queries.
4.) Use the GetQueryResults CLI command or API call to bring up the results of both queries - https://docs.aws.amazon.com/cli/latest/reference/logs/get-query-results.html

The CLI command looks like this: aws logs get-query-results --query-id XXXXX

5.) Compare the queries and see what changed.
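
The comparison in the last two steps can be sketched in Python. This assumes you already have the two query strings (however you recovered them) and, for `result_count`, a boto3 CloudWatch Logs client passed in as `logs_client`. The helper names and the sample queries in the usage note are hypothetical.

```python
import difflib


def diff_queries(working_query, failing_query):
    """Return a unified diff (as a list of lines) between two query strings."""
    return list(difflib.unified_diff(
        working_query.splitlines(),
        failing_query.splitlines(),
        fromfile="working",
        tofile="failing",
        lineterm="",
    ))


def result_count(logs_client, query_id):
    """Return (status, number of result rows) for a given queryId."""
    response = logs_client.get_query_results(queryId=query_id)
    return response["status"], len(response.get("results", []))
```

For example, `diff_queries("fields @message\n| filter Level = 'Error'", "fields @message\n| filter level = 'Error'")` would flag the changed filter line, and calling `result_count` for each queryId shows whether the two runs really returned different numbers of rows.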

Do not hesitate to open a case with AWS Support if you would like us to help you investigate further.

AWS
Katya_Z
Answered 2 months ago
Expert
Reviewed 1 month ago
