Write Spark data frame to OpenSearch via Kinesis Data Firehose delivery stream


I have a Spark transformation batch job that runs on EMR. I want to write the final transformed data, held in a Spark DataFrame, to OpenSearch via AWS Kinesis Data Firehose. From the Firehose documentation, I do not see that a Spark DataFrame can be used as a data source when creating a Firehose delivery stream. https://docs.aws.amazon.com/firehose/latest/dev/create-name.html

Question: Does a Firehose delivery stream support a Spark DataFrame as a data source? If so, what is the best way to implement it?

Jin
Asked 10 months ago · 471 views
1 Answer

I don't think EMR can be configured as a data source for Kinesis Data Firehose.
Therefore, you would need a two-step mechanism: first write the data from EMR to S3, then send the S3 data to Kinesis Data Firehose from Lambda or a similar service.
The following document describes how to configure output from EMR to S3.
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-output.html
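
As a rough illustration of the Lambda half of this approach, here is a minimal sketch rather than a tested production setup. It assumes the Spark job has written the DataFrame to S3 as uncompressed JSON Lines (for example with `df.write.json(...)`), that the Lambda function is triggered by S3 object-created events, and that the delivery stream is called `my-opensearch-delivery-stream`. The stream name and the helper functions `to_firehose_records` and `chunked` are hypothetical names chosen for this example.

```python
import json
from typing import Iterable, Iterator, List
from urllib.parse import unquote_plus

# Firehose PutRecordBatch accepts at most 500 records per call.
MAX_RECORDS_PER_BATCH = 500

# Hypothetical stream name; replace with your own delivery stream.
DELIVERY_STREAM_NAME = "my-opensearch-delivery-stream"


def to_firehose_records(lines: Iterable[str]) -> Iterator[dict]:
    """Turn non-empty JSON Lines rows into Firehose record dicts."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        json.loads(line)  # raises ValueError if the row is not valid JSON
        yield {"Data": line.encode("utf-8")}


def chunked(items: Iterable, size: int = MAX_RECORDS_PER_BATCH) -> Iterator[List]:
    """Group items into lists of at most `size` elements."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def lambda_handler(event, context):
    """Forward each newly created S3 object to Firehose, line by line."""
    # Imported here so the helpers above can be used without the AWS SDK.
    import boto3

    s3 = boto3.client("s3")
    firehose = boto3.client("firehose")
    failed = 0

    for notification in event["Records"]:
        bucket = notification["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(notification["s3"]["object"]["key"])

        body = s3.get_object(Bucket=bucket, Key=key)["Body"]
        lines = (raw.decode("utf-8") for raw in body.iter_lines())

        for batch in chunked(to_firehose_records(lines)):
            # Note: this sketch does not enforce the per-call size limit
            # and does not retry records that Firehose rejects.
            response = firehose.put_record_batch(
                DeliveryStreamName=DELIVERY_STREAM_NAME,
                Records=batch,
            )
            failed += response["FailedPutCount"]

    return {"failed_records": failed}
```

In practice you would probably also want to retry the records counted in `FailedPutCount`, and to filter the S3 trigger by prefix or suffix so that only the data files Spark writes invoke the function.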

Expert
Answered 10 months ago
