Write a Spark DataFrame to OpenSearch via a Kinesis Data Firehose delivery stream

I have a Spark transformation batch job that runs on EMR. I want to write the final transformed data, held in a Spark DataFrame, to OpenSearch via AWS Kinesis Data Firehose. In the Firehose documentation, I do not see a Spark DataFrame listed as a possible data source when creating a Firehose delivery stream. https://docs.aws.amazon.com/firehose/latest/dev/create-name.html

Question: Does a Firehose delivery stream support a Spark DataFrame as a data source? If so, what is the best way to implement it?

Jin
asked 9 months ago
447 views
1 Answer

I don't think EMR can be configured as a data source for a Kinesis Data Firehose delivery stream.
Instead, you would need a mechanism that first writes the data from EMR to S3 and then sends the S3 data to Kinesis Data Firehose, for example from a Lambda function.
The following document describes how to configure output from EMR to S3:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-output.html
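
As a rough illustration of the EMR to S3 to Lambda to Firehose flow described above, here is a minimal Python sketch. It is not a tested production setup and rests on several assumptions that are not in the original post: the Spark job writes newline-delimited JSON, the Lambda function is triggered by S3 object-created events, and the delivery stream is called `my-delivery-stream` (a placeholder). The helper names `write_df_to_s3`, `chunked`, and `send_lines` are made up for this example. The boto3 call used is `put_record_batch`, which accepts up to 500 records per request.

```python
import json
from urllib.parse import unquote_plus

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call
STREAM_NAME = "my-delivery-stream"  # placeholder, replace with your stream


def write_df_to_s3(df, s3_path):
    """Step 1 (in the Spark job on EMR): write the DataFrame as JSON lines."""
    df.write.mode("overwrite").json(s3_path)


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_lines(firehose, stream_name, lines, max_attempts=3):
    """Step 2: send JSON lines to Firehose; return records that still failed."""
    failed = []
    non_empty = [line.strip() for line in lines if line.strip()]
    for batch in chunked(non_empty, MAX_BATCH):
        records = [{"Data": line.encode("utf-8")} for line in batch]
        for _ in range(max_attempts):
            resp = firehose.put_record_batch(
                DeliveryStreamName=stream_name, Records=records
            )
            if resp["FailedPutCount"] == 0:
                records = []
                break
            # Keep only the records whose response entry carries an error.
            records = [
                rec
                for rec, res in zip(records, resp["RequestResponses"])
                if "ErrorCode" in res
            ]
        failed.extend(records)
    return failed


def lambda_handler(event, context):
    """Assumed entry point for a Lambda triggered by S3 object-created events."""
    import boto3  # imported here so the helpers above work without boto3

    s3 = boto3.client("s3")
    firehose = boto3.client("firehose")
    for rec in event["Records"]:
        bucket = rec["s3"]["bucket"]["name"]
        key = unquote_plus(rec["s3"]["object"]["key"])  # keys arrive URL-encoded
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        failed = send_lines(firehose, STREAM_NAME, body.splitlines())
        if failed:
            raise RuntimeError(f"{len(failed)} records from {key} were not delivered")
```

The sketch reads each object fully into memory and does not check the per-request size limit, so it only suits modestly sized part files; larger outputs would need streaming reads and size-aware batching.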

EXPERT
answered 9 months ago
