Best way to move data from Athena to OpenSearch

What is the best way to get the data from an Athena query result to the OpenSearch index?

Right now we use a combination of Athena to S3, then Step Functions, and Lambdas and so on, but it's too fragile and costly.

Are there any better ways to do this? I am surprised there isn't some native service that can do this. Or maybe I am overlooking it.

m0ltar
asked 10 months ago, 556 views
2 Answers
Accepted Answer

To ingest data from S3 into OpenSearch, there are two options:

  • Using the S3 source plugin. This requires setting up an SQS queue that receives S3 Event Notifications when new objects arrive.
  • Using a Lambda-driven approach. Lambda is triggered every time a new object arrives and loads it into OpenSearch. You can see an example reference architecture here (look at the log analytics use case) and an example of the Lambda function here.

The second option suits cases where you need to transform incoming data before loading it into OpenSearch, whereas the first option loads the data with no transformation.
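To make the Lambda-driven option a bit more concrete, here is a minimal sketch of the two pure-Python pieces such a function might need: picking the Athena result objects out of an S3 event notification, and turning the CSV result into a body for the OpenSearch `_bulk` API. The function names, the `id_column` parameter, and the `athena-results` index name are hypothetical, chosen only for illustration. It also assumes the results are in Athena's default CSV output, so every field value is indexed as a string unless you convert it first.

```python
import csv
import io
import json
from urllib.parse import unquote_plus


def result_objects(event):
    """Yield (bucket, key) for each CSV object in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3["object"]["key"])
        # Athena writes a .csv.metadata file next to each result; skip it.
        if key.endswith(".csv"):
            yield s3["bucket"]["name"], key


def csv_to_bulk_body(csv_text, index, id_column=None):
    """Convert CSV text (header row first) into an NDJSON _bulk request body."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        action = {"index": {"_index": index}}
        if id_column is not None:
            # Reusing a stable id makes re-processing the same file idempotent.
            action["index"]["_id"] = row[id_column]
        lines.append(json.dumps(action))
        lines.append(json.dumps(row))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    sample = '"id","name"\n"1","alpha"\n"2","beta"\n'
    print(csv_to_bulk_body(sample, "athena-results", id_column="id"), end="")
```

In a real handler you would still need to download each object (for example with boto3's `get_object`) and POST the body to the domain's `_bulk` endpoint with a signed request and `Content-Type: application/x-ndjson`; those parts are left out here because they depend on your domain and authentication setup. For large result files you would probably also want to send the rows in several smaller bulk requests rather than one.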

AWS
answered 10 months ago
  • The S3 source looks great. When we developed our solution, Data Prepper had not been released yet, so we had no idea about it. Great pointer! Thanks.

Amazon Athena is an ad hoc, interactive query service and is not designed to sit in the middle of your data pipeline. In short, something has to trigger Athena queries: either another service through the SDK or a person running them interactively. If you want SQL queries to run automatically whenever new data arrives in your S3 data source, you can use Glue jobs that run the same query as Spark SQL, write the assembled data set to S3, and from there use the native OpenSearch integration with S3 to pull the data in. Depending on the velocity of your data (batch vs. streaming), you might need different components, e.g. Glue streaming jobs instead of regular Glue jobs, or Kinesis Data Firehose delivery to OpenSearch instead of the S3 integration.

AWS
answered 10 months ago
  • Ok, I think I miscommunicated my question. We do trigger Athena and the results land in S3. That works well, and we have no issues there. The question is what happens after. How do we get these Athena results into OpenSearch? We can output any Athena-supported format.
