How to ensure EMR processes steps in sequence?

I'm submitting multiple steps with the AWS Python SDK (boto3):

steps = []
for job in jobs:
    args = [
        'spark-submit',
        '--py-files',
        's3://bucket/scripts/*',
        's3://bucket/scripts/main.py',
    ]
    args = args + job.params
    step = {
        'Name': job.name,
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': args
        }
    }
    steps.append(step)
response = self.client.add_job_flow_steps(JobFlowId=job_flow_id, Steps=steps)

But EMR does not process the steps in the same order as the steps array. Is there a way to ensure they are processed in sequence?

asked 2 years ago, 1030 views
1 answer

I believe steps are submitted and run in the order given. To confirm, I tested this on a test cluster using your code with minor changes, as shown below.

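import boto3

# Assumes AWS credentials and a default region are already configured.
client = boto3.client('emr')
# clusterId must be set to the ID of the target test cluster.

steps = []
for i in range(1, 10):  # nine steps: TestStepOrder1 .. TestStepOrder9
    args = [
        'spark-example',
        '--deploy-mode',
        'cluster',
        'SparkPi',
        '10'
    ]
    step = {
        'Name': "TestStepOrder" + str(i),
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': args
        }
    }
    steps.append(step)
# All steps are submitted in a single call, in list order.
response = client.add_job_flow_steps(JobFlowId=clusterId, Steps=steps)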

I can confirm the order is maintained as expected.

I ran a second round of tests with step concurrency set to 5 to see whether that has any impact. Judging by the start time of each step, the order is still maintained in that case.
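The start-time check described above can also be done programmatically. The following is a minimal sketch, not an official recipe: the helper names fetch_steps and names_by_start_time are hypothetical, and client is assumed to be boto3.client("emr"). One thing worth keeping in mind is that list_steps returns steps in reverse order by default, so the raw listing can look out of sequence even when execution was not.

```python
from datetime import datetime, timezone


def fetch_steps(client, cluster_id):
    """Return every step description for a cluster.

    `client` is assumed to be boto3.client("emr"); the list_steps
    paginator is used so long step lists are not truncated.
    """
    steps = []
    for page in client.get_paginator("list_steps").paginate(ClusterId=cluster_id):
        steps.extend(page["Steps"])
    return steps


def names_by_start_time(steps):
    """Return step names ordered by start time.

    Steps that have not started yet are placed last, ordered by
    creation time.
    """
    far_future = datetime.max.replace(tzinfo=timezone.utc)

    def sort_key(step):
        timeline = step["Status"]["Timeline"]
        return (timeline.get("StartDateTime", far_future),
                timeline["CreationDateTime"])

    return [step["Name"] for step in sorted(steps, key=sort_key)]
```

Calling names_by_start_time(fetch_steps(client, clusterId)) and comparing the result with the names in the submitted steps list shows whether the steps started in submission order.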

I'd be interested to know more about how the order gets mixed up on your side. Please share the steps needed to reproduce the behavior you are observing.

Note: I'm using the latest boto3 version (1.20.26); I'm not sure whether that makes any difference.
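If the order still comes out wrong on your side, a stricter fallback is to submit one step at a time and wait for it to finish before adding the next. This is only a sketch under assumptions: run_steps_in_sequence is a hypothetical helper, and client is assumed to be boto3.client("emr"). It relies on the "step_complete" waiter, which raises an error if a step fails or is cancelled, so later steps are not submitted in that case.

```python
def run_steps_in_sequence(client, cluster_id, steps):
    """Submit steps one by one, blocking until each one completes.

    `client` is assumed to be boto3.client("emr"). Returns the step IDs
    in the order the steps were submitted.
    """
    waiter = client.get_waiter("step_complete")
    step_ids = []
    for step in steps:
        # Submit a single step so nothing else can be queued ahead of it.
        response = client.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
        step_id = response["StepIds"][0]
        # Block until this step finishes before submitting the next one.
        waiter.wait(ClusterId=cluster_id, StepId=step_id)
        step_ids.append(step_id)
    return step_ids
```

The trade-off is that the submitting process has to stay alive for the whole run, whereas a single add_job_flow_steps call hands the full queue over to the cluster.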

AWS
SUPPORT ENGINEER
answered 2 years ago
