Write Spark data frame to OpenSearch via Kinesis Data Firehose delivery stream

I have a Spark batch transformation job that runs on EMR. I want to write the final transformed data, held in a Spark data frame, to OpenSearch via AWS Kinesis Data Firehose. In the Firehose documentation, I do not see a Spark data frame listed as a possible data source when creating a Firehose delivery stream: https://docs.aws.amazon.com/firehose/latest/dev/create-name.html

Question: Does a Firehose delivery stream support a Spark data frame as a data source? If so, what is the best way to implement it?

Jin
asked 10 months ago, 472 views
1 Answer
I don't think EMR can be configured as a data source for Kinesis Data Firehose.
Therefore, you would need a mechanism that first writes the data from EMR to S3 and then sends the S3 data to Kinesis Data Firehose, for example from a Lambda function.
The following page describes how to configure output from EMR to S3:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-output.html
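
As a rough illustration of the S3-then-Lambda approach, here is a minimal sketch in Python. It assumes the Spark job has already written the data frame as newline-delimited JSON (for example with `df.write.json(...)` to an S3 prefix), that the Lambda function is triggered by S3 object-created events, and that each JSON line is small enough for a batch to stay within the Firehose request size limit. The delivery stream name, the `DELIVERY_STREAM_NAME` environment variable, and the helper names are hypothetical; only the `boto3` calls (`get_object`, `put_record_batch`) are real APIs.

```python
import os
from urllib.parse import unquote_plus

# Hypothetical stream name; set DELIVERY_STREAM_NAME on the Lambda function.
STREAM_NAME = os.environ.get("DELIVERY_STREAM_NAME", "my-delivery-stream")
MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def to_batches(lines, size=MAX_BATCH):
    """Yield lists of Firehose records, at most `size` records per list."""
    batch = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        batch.append({"Data": line.encode("utf-8")})
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def send_lines(firehose, stream_name, lines):
    """Send JSON lines to Firehose; return how many records were rejected."""
    failed = 0
    for batch in to_batches(lines):
        resp = firehose.put_record_batch(
            DeliveryStreamName=stream_name, Records=batch
        )
        # A production job would retry the rejected records, not just count them.
        failed += resp.get("FailedPutCount", 0)
    return failed


def handler(event, context):
    """Lambda entry point for S3 object-created notifications."""
    import boto3  # imported here so the helpers above work without boto3

    s3 = boto3.client("s3")
    firehose = boto3.client("firehose")
    total_failed = 0
    for rec in event["Records"]:
        bucket = rec["s3"]["bucket"]["name"]
        key = unquote_plus(rec["s3"]["object"]["key"])  # keys arrive URL-encoded
        body = s3.get_object(Bucket=bucket, Key=key)["Body"]
        lines = (raw.decode("utf-8") for raw in body.iter_lines())
        total_failed += send_lines(firehose, STREAM_NAME, lines)
    return {"failed": total_failed}
```

The Firehose client is passed into `send_lines` so the batching logic can be exercised locally with a stub before deploying the function.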

EXPERT
answered 10 months ago