Write a Spark DataFrame to OpenSearch via a Kinesis Data Firehose delivery stream


I have a Spark batch transformation job that runs on EMR. I want to write the final transformed data, held in a Spark DataFrame, to OpenSearch via AWS Kinesis Data Firehose. The Firehose documentation does not list a Spark DataFrame as a possible data source when creating a Firehose delivery stream: https://docs.aws.amazon.com/firehose/latest/dev/create-name.html

Question: Does a Firehose delivery stream support a Spark DataFrame as a data source? If so, what is the best way to implement it?

Jin
Asked 10 months ago, viewed 472 times
1 Answer

I don't think EMR can be configured as a data source for Kinesis Data Firehose.
You would therefore need an intermediate step: write the data from EMR to S3 first, then send the S3 data to Kinesis Data Firehose from Lambda or a similar service.
The following document describes how to configure output from EMR to S3:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-output.html
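
For the second half of that pipeline (sending the S3 data on to Firehose), here is a minimal sketch of what the sending code might look like, not a definitive implementation. It assumes the rows have already been read from S3 and parsed into Python dicts, and that `firehose_client` is a boto3 Firehose client. The helper names `to_firehose_records`, `chunk`, and `send_rows` are hypothetical. The sketch only respects the 500-records-per-call limit of `PutRecordBatch`; it does not check the per-call or per-record size limits, and it returns failed records rather than retrying them.

```python
import json

# PutRecordBatch accepts at most 500 records per call.
MAX_RECORDS_PER_CALL = 500


def to_firehose_records(rows):
    """Serialize dict rows into Firehose record entries, one JSON document each."""
    return [{"Data": json.dumps(row).encode("utf-8")} for row in rows]


def chunk(records, size=MAX_RECORDS_PER_CALL):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def send_rows(firehose_client, stream_name, rows):
    """Send rows with PutRecordBatch and return the records Firehose rejected."""
    failed = []
    for batch in chunk(to_firehose_records(rows)):
        response = firehose_client.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=batch,
        )
        if response["FailedPutCount"] > 0:
            # RequestResponses is in the same order as the submitted records;
            # entries for rejected records carry an ErrorCode.
            for record, result in zip(batch, response["RequestResponses"]):
                if "ErrorCode" in result:
                    failed.append(record)
    return failed
```

In a Lambda function this could be called as `send_rows(boto3.client("firehose"), "my-delivery-stream", rows)`, where `my-delivery-stream` is a placeholder for your delivery stream name. A partially failed batch does not raise an exception, so the caller should inspect the returned list and decide whether to retry.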

Expert
Answered 10 months ago