Picking the correct OpenSearch index date from the Kinesis delivery stream


When using Kinesis Firehose to deliver to OpenSearch / Elasticsearch, the delivery stream is super convenient, but one major limitation I find is that one cannot override the timestamp field that is used to decide on the destination index (it always uses the estimated arrival time). This means that for backfill jobs (which are super common for us) all the data ends up in the current daily index, which reduces read/write scalability and also makes index-based management much more difficult (e.g. compacting or archiving indices older than N days). Another example is dashboard queries: some indices hold data for a large range of dates and end up becoming a performance bottleneck.

Ideally one could pick a timestamp field from the events, or set it as part of the Lambda processing, so that a record with date D goes to the index for date D. Thanks for any suggestions!
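
To make the desired behavior concrete, here is a minimal Python sketch of the mapping "record with date D goes to index of date D". It is only an illustration under assumptions: the event is assumed to carry an ISO 8601 `timestamp` field, the indices are assumed to be named `<prefix>-YYYY-MM-DD` (daily rotation), and the function name `index_for_event` and the `events` prefix are hypothetical. It only computes the target index name; it does not change how Firehose picks the index.

```python
from datetime import datetime, timezone


def index_for_event(event: dict, prefix: str = "events", field: str = "timestamp") -> str:
    """Return the daily index name derived from the event's own timestamp.

    Assumptions (not from Firehose itself): the event has an ISO 8601
    timestamp in `field`, and indices are named "<prefix>-YYYY-MM-DD".
    """
    ts = datetime.fromisoformat(event[field])
    # Treat timestamps without an offset as UTC (an assumption for this sketch).
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    # Normalize to UTC so the day boundary is the same for every producer.
    day = ts.astimezone(timezone.utc).date()
    return f"{prefix}-{day.isoformat()}"


# A backfilled record from March 2021 maps to the March 2021 index,
# regardless of when it is actually delivered.
backfilled = {"timestamp": "2021-03-04T23:30:00+00:00", "value": 1}
print(index_for_event(backfilled))
```

With something like this available in the delivery path, a backfill job would spread its records across the historical daily indices instead of piling them into today's index.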

