How to ensure EMR processes steps in sequence?

I'm submitting multiple steps with the AWS Python SDK (boto3):

steps = []
for job in jobs:
    args = [
        'spark-submit',
        '--py-files',
        's3://bucket/scripts/*',
        's3://bucket/scripts/main.py',
    ]
    args = args + job.params
    step = {
        'Name': job.name,
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': args
        }
    }
    steps.append(step)
response = self.client.add_job_flow_steps(JobFlowId=job_flow_id, Steps=steps)

But EMR does not process the steps in the same order as the steps array. Is there a way to ensure it processes them in sequence?

asked 2 years ago, 1030 views
1 Answer

As far as I know, steps are submitted and run in the order given. To confirm, I tested this on a test cluster using your code with minor changes:

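import boto3

client = boto3.client('emr')
# clusterId must hold the ID of an existing EMR cluster (set it before running).

steps = []
for i in range(1, 10):  # nine test steps: TestStepOrder1 .. TestStepOrder9
    args = [
        'spark-example',
        '--deploy-mode',
        'cluster',
        'SparkPi',
        '10'
    ]
    step = {
        'Name': "TestStepOrder" + str(i),
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': args
        }
    }
    steps.append(step)
# Submit all steps in a single call; the list order is the submission order.
response = client.add_job_flow_steps(JobFlowId=clusterId, Steps=steps)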

I can confirm the order is maintained as expected.
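
For anyone who wants to repeat this check, here is a minimal sketch of one way to compare start order with submission order. It assumes the response shape of boto3's `list_steps` (each step has a `Name` and a `Status.Timeline` that gains a `StartDateTime` once the step starts). The helper names `fetch_steps`, `names_by_start_time` and `started_in_order`, and the `demo_steps` data, are made up for illustration and are not part of the original test.

```python
from datetime import datetime, timedelta


def fetch_steps(client, cluster_id):
    """Collect all steps of a cluster via list_steps, following the Marker."""
    steps, kwargs = [], {'ClusterId': cluster_id}
    while True:
        page = client.list_steps(**kwargs)
        steps.extend(page['Steps'])
        if 'Marker' not in page:
            return steps
        kwargs['Marker'] = page['Marker']


def names_by_start_time(steps):
    """Names of the steps that have started, sorted by StartDateTime."""
    started = [s for s in steps if 'StartDateTime' in s['Status']['Timeline']]
    started.sort(key=lambda s: s['Status']['Timeline']['StartDateTime'])
    return [s['Name'] for s in started]


def started_in_order(steps, submitted_names):
    """True if the started steps began in the order they were submitted."""
    started = names_by_start_time(steps)
    expected = [name for name in submitted_names if name in started]
    return started == expected


# Hypothetical data shaped like a list_steps response (newest step first).
t0 = datetime(2022, 1, 1, 12, 0, 0)
demo_steps = [
    {'Name': 'TestStepOrder4',
     'Status': {'State': 'PENDING', 'Timeline': {'CreationDateTime': t0}}},
    {'Name': 'TestStepOrder3',
     'Status': {'State': 'RUNNING', 'Timeline': {
         'CreationDateTime': t0, 'StartDateTime': t0 + timedelta(minutes=3)}}},
    {'Name': 'TestStepOrder2',
     'Status': {'State': 'COMPLETED', 'Timeline': {
         'CreationDateTime': t0, 'StartDateTime': t0 + timedelta(minutes=2)}}},
    {'Name': 'TestStepOrder1',
     'Status': {'State': 'COMPLETED', 'Timeline': {
         'CreationDateTime': t0, 'StartDateTime': t0 + timedelta(minutes=1)}}},
]
print(names_by_start_time(demo_steps))
```

Against a real cluster, something like `started_in_order(fetch_steps(client, clusterId), [s['Name'] for s in steps])` should do the same comparison. Steps that share an identical start time are not ordered reliably by this sketch.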

I ran a second round of tests with step concurrency set to 5 to see whether that has any impact. Judging by each step's Start Time, the order is still maintained.
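
If strictly one-at-a-time execution matters, the cluster's step concurrency level is worth checking, since a level above 1 lets several steps run at once. The sketch below assumes an EMR release that supports step concurrency and uses boto3's `describe_cluster` and `modify_cluster`. The function name `ensure_sequential_steps` is hypothetical, and treating a missing `StepConcurrencyLevel` as 1 is an assumption.

```python
def ensure_sequential_steps(client, cluster_id):
    """Set the step concurrency level to 1 if it is higher.

    Returns the level found before any change was made.
    """
    cluster = client.describe_cluster(ClusterId=cluster_id)['Cluster']
    # Assumption: if the field is absent, steps already run one at a time.
    level = cluster.get('StepConcurrencyLevel', 1)
    if level > 1:
        client.modify_cluster(ClusterId=cluster_id, StepConcurrencyLevel=1)
    return level
```

Here `client` would be `boto3.client('emr')`. Note that with `'ActionOnFailure': 'CONTINUE'` a later step still runs after an earlier one fails; `'CANCEL_AND_WAIT'` may be a better fit when each step depends on the previous one succeeding.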

I'd be interested to know how the order gets mixed up in your case. Please share steps to reproduce the behavior you are observing.

Note: I'm using the latest boto3 version (1.20.26); I'm not sure whether that makes any difference.

AWS
SUPPORT ENGINEER
answered 2 years ago
