Why are aggregate results in a Logs Insights query nonsensical (count < count_distinct for the same variable)?

The following Logs Insights query on a single log group returns negative numbers for the variable @distinct_unique_keys_delta:

parse @message /(?<@unique_key>Processing key: \w+\/[\w=_-]+\/\w+\.\d{4}-\d{2}-\d{2}-\d{2}\.[\w-]+\.\w+\.\w+)/
| filter @message like /Processing key: \w+\/[\w=_-]+\/\w+\.\d{4}-\d{2}-\d{2}-\d{2}\.[\w-]+\.\w+\.\w+/
| stats count(@unique_key) - count_distinct(@unique_key) as @distinct_unique_keys_delta
        by datefloor(@timestamp, 1d) as @_datefloor 
| sort @_datefloor asc

My understanding is that the number of unique values of a variable can never exceed the total number of values of that variable, so count(@unique_key) - count_distinct(@unique_key) should never be negative. When I ran this query I was concerned that I might be misunderstanding the correct usage of datefloor, so I tried the same query without the grouping:

parse @message /(?<@unique_key>Processing key: \w+\/[\w=_-]+\/\w+\.\d{4}-\d{2}-\d{2}-\d{2}\.[\w-]+\.\w+\.\w+)/
| filter @message like /Processing key: \w+\/[\w=_-]+\/\w+\.\d{4}-\d{2}-\d{2}-\d{2}\.[\w-]+\.\w+\.\w+/
| stats count(@unique_key) - count_distinct(@unique_key) as @distinct_unique_keys_delta

For the time range I chose (a whole day), this query returned -20,347 for the @distinct_unique_keys_delta variable.

To me this result seems completely nonsensical. Am I doing something wrong, am I misinterpreting the results, or is there a bug in the code running this Logs Insights query?

Asked 2 years ago, 4122 views
1 Answer

I have discovered that the count_distinct function in Logs Insights queries does not always return an exact distinct count! According to the documentation:

Returns the number of unique values for the field. If the field has very high cardinality (contains many unique values), the value returned by count_distinct is just an approximation.
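To get a feel for how an approximate distinct counter can miss the true value, here is a minimal Python sketch of a K-Minimum-Values estimator. This is purely an illustration: the documentation does not say which algorithm count_distinct uses, and the function names and the choice of k below are my own assumptions.

```python
import hashlib
import heapq


def hash01(value: str) -> float:
    """Map a string to a pseudo-random float in [0, 1) using SHA-256."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_estimate(values, k=256):
    """Estimate the number of distinct values while keeping only k hashes.

    Illustrative only: this is NOT claimed to be what count_distinct does.
    """
    heap = []     # negated hashes, so heap[0] is the largest hash kept
    kept = set()  # the same hashes, used to skip duplicates
    for value in values:
        h = hash01(value)
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the largest kept one: swap them.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct values seen: exact count
    # The k-th smallest hash sits near k / n, so n is roughly (k - 1) / hash.
    return int((k - 1) / -heap[0])
```

With fewer than k distinct values the sketch is exact. Above that it returns an estimate that can land on either side of the true count, so subtracting such an estimate from an exact count can produce a negative number.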

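Apparently I can't just assume that a function returns an exact result: with this many unique keys, the approximation can come out higher than count, hence the negative delta.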

The documentation page.
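If an exact figure is needed, one option is to count outside the query. Below is a minimal Python sketch, assuming the matching log messages are already available as plain strings (how they get exported is not covered here). The sample messages and the count_keys helper are hypothetical, and the capture group is named unique_key because Python's re module does not allow "@" in group names.

```python
import re
from collections import Counter

# Same pattern as in the query, adapted to Python's named-group syntax.
KEY_PATTERN = re.compile(
    r"(?P<unique_key>Processing key: \w+/[\w=_-]+/\w+"
    r"\.\d{4}-\d{2}-\d{2}-\d{2}\.[\w-]+\.\w+\.\w+)"
)


def count_keys(messages):
    """Return (total, distinct) counts of unique_key matches in messages."""
    counts = Counter()
    for message in messages:
        match = KEY_PATTERN.search(message)
        if match:
            counts[match.group("unique_key")] += 1
    total = sum(counts.values())  # every match, duplicates included
    distinct = len(counts)        # exact number of different keys
    return total, distinct


# Hypothetical sample data: one key appears twice, one line does not match.
sample = [
    "Processing key: bucket/dt=2023-01-01/file.2023-01-01-00.part-0.json.gz",
    "Processing key: bucket/dt=2023-01-01/file.2023-01-01-00.part-0.json.gz",
    "Processing key: bucket/dt=2023-01-01/file.2023-01-01-01.part-1.json.gz",
    "Some unrelated log line",
]

if __name__ == "__main__":
    total, distinct = count_keys(sample)
    print(total - distinct)
```

Because both numbers come from the same exact tally, the difference computed this way can never be negative.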

Answered 2 years ago
