EMR Serverless application configuration (Spark defaults) is not applied to jobs submitted using Airflow


Hi Team,

We are invoking our EMR Serverless jobs using the Airflow EMR job submit operator. The EMR application is configured with a set of Spark default runtime configurations. When we invoke a job from Airflow, those configurations are not listed in the job configuration section for that job run. However, if we submit a job directly from the console, the configurations from the application are picked up and listed for the specific job run. How do we enable this for jobs triggered from Airflow? Basically, the application configuration specifies Java 17, but if the job is triggered from Airflow it still uses Java 8.
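One way to make the job run independent of whether application defaults are inherited is to restate the Java 17 settings explicitly in the job driver passed from the DAG. A minimal sketch, assuming the `JAVA_HOME` path documented for Java 17 on EMR Serverless; the entry point is a placeholder:

```python
# Sketch: explicitly pass the Spark runtime settings in the job run so they
# do not depend on the application-level defaults being inherited.
# JAVA_17_HOME is the path documented for Java 17 on EMR Serverless images.
JAVA_17_HOME = "/usr/lib/jvm/java-17-amazon-corretto.x86_64/"

spark_submit_parameters = " ".join(
    f"--conf {key}={JAVA_17_HOME}"
    for key in (
        "spark.emr-serverless.driverEnv.JAVA_HOME",  # driver JVM
        "spark.executorEnv.JAVA_HOME",               # executor JVMs
    )
)

job_driver = {
    "sparkSubmit": {
        "entryPoint": "s3://my-bucket/scripts/job.py",  # placeholder
        "sparkSubmitParameters": spark_submit_parameters,
    }
}

# In the DAG, this dict would be handed to the Amazon provider's operator,
# e.g. EmrServerlessStartJobRunOperator(job_driver=job_driver, ...).
```

This sidesteps the question of why the defaults are dropped, at the cost of duplicating the configuration in the DAG.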

2 Answers

Hello,

Could you share your DAG code after removing any sensitive information, if any?

Thanks!

AWS
SUPPORT ENGINEER
Nitin_S
answered a month ago

Hello,

In general, as mentioned in the documentation [1], there are three options when overriding the configurations in a job run for an EMR Serverless application:

  1. Override an existing configuration
  2. Add an additional configuration
  3. Remove an existing configuration
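The three cases can be illustrated with the shape of the `configurationOverrides` payload sent in `StartJobRun`. A hedged sketch, assuming the merge behavior described in the defaults documentation (re-declaring a classification with empty properties removes the application-level defaults for it); the property names and values are examples only:

```python
# Illustrative StartJobRun "configurationOverrides" payloads for the three cases.

# 1. Override: same classification, a property re-declared with a new value.
override_existing = {
    "applicationConfiguration": [
        {
            "classification": "spark-defaults",
            "properties": {"spark.driver.cores": "2"},  # replaces the app default
        }
    ]
}

# 2. Add: a property (or classification) not present at the application level.
add_additional = {
    "applicationConfiguration": [
        {
            "classification": "spark-defaults",
            "properties": {"spark.executor.instances": "4"},
        }
    ]
}

# 3. Remove: re-declare the classification with an empty property set,
#    which drops the application-level defaults for that classification.
remove_existing = {
    "applicationConfiguration": [
        {"classification": "spark-defaults", "properties": {}}
    ]
}
```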

I suspect that the MWAA EMR Serverless job submit operator might be removing an existing configuration by passing an empty set, which overrides the default configuration defined at the application level.

To confirm this, I would recommend comparing the CloudTrail `StartJobRun` events for a job invoked from the console and one invoked from MWAA.
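The comparison above can be scripted. A sketch using boto3's standard CloudTrail `lookup_events` API: the helper parses the request parameters out of an event body, and the main block (which needs AWS credentials) prints the `configurationOverrides` each caller actually sent:

```python
import json


def job_run_config_overrides(cloudtrail_event: str) -> dict:
    """Extract configurationOverrides from a StartJobRun CloudTrail event body."""
    request = json.loads(cloudtrail_event).get("requestParameters") or {}
    return request.get("configurationOverrides", {})


if __name__ == "__main__":
    # Requires AWS credentials with CloudTrail read access.
    import boto3

    events = boto3.client("cloudtrail").lookup_events(
        LookupAttributes=[
            {"AttributeKey": "EventName", "AttributeValue": "StartJobRun"}
        ],
        MaxResults=10,
    )["Events"]
    for event in events:
        # An empty or missing applicationConfiguration here would support the
        # theory that the operator is clearing the application defaults.
        print(job_run_config_overrides(event["CloudTrailEvent"]))
```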

References: [1] https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/default-configs.html#default-configs-override

AWS
answered a month ago
EXPERT
verified a month ago
