How to pass parameters from an event rule through a Glue workflow trigger to a job

I have an event rule that triggers a Glue job. I would like to pass information from the event details as a parameter to the Glue job. For example, say my event contains {"details": {"database_name": "my_database"}} and my job has a parameter --DATABASE_NAME. Is it possible to pass database_name from the event as the --DATABASE_NAME parameter of the job? I haven't been able to find anything in the documentation or online on how to do it.

Thanks!

asked 2 years ago · 5081 views
2 Answers
There is no native option to pass EventBridge event details to a Glue job, but I used the following workaround:

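  1. Get the event ID from the Glue workflow run properties. The aws:eventIds value is a bracketed string, so [1:-1] strips the brackets.
import datetime
import json
import sys
import boto3
from awsglue.utils import getResolvedOptions
# Glue supplies WORKFLOW_NAME and WORKFLOW_RUN_ID to jobs started by a workflow.
args = getResolvedOptions(sys.argv, ['WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
glue_client = boto3.client('glue')
event_client = boto3.client('cloudtrail')  # lookup_events is a CloudTrail API
event_id = glue_client.get_workflow_run_properties(Name=args['WORKFLOW_NAME'],
                       RunId=args['WORKFLOW_RUN_ID'])['RunProperties']['aws:eventIds'][1:-1]
  2. Get all NotifyEvent events from CloudTrail for the last several minutes. It's up to you to decide how much time can pass between the workflow start and your job start.
now = datetime.datetime.now(datetime.timezone.utc)  # timezone-aware, so the window is unambiguous
response = event_client.lookup_events(LookupAttributes=[{'AttributeKey': 'EventName',
                                                         'AttributeValue': 'NotifyEvent'}],
                                      StartTime=now - datetime.timedelta(minutes=5),
                                      EndTime=now)['Events']
  3. Find the CloudTrail record whose enclosed event carries the event ID obtained from the Glue workflow.
event = None  # stays None if no matching event is found
for record in response:
    event_payload = json.loads(record['CloudTrailEvent'])['requestParameters']['eventPayload']
    if event_payload['eventId'] == event_id:
        event = json.loads(event_payload['eventBody'])
        break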

The event variable then holds the full content of the event that triggered the Glue workflow.
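If it helps, the matching step can be pulled into small helpers that run without any AWS access, which makes the parsing easy to try locally. This is only a sketch: the helper names are made up, the sample record is hand-built to mimic the CloudTrail shape used in the steps above, and the assumption that aws:eventIds may hold several comma-separated IDs (for batched triggers) is mine, not something stated in this answer. The event body uses the shape from the question, so adjust the key path to whatever your real event looks like.

```python
import json


def parse_event_ids(raw):
    """Split an aws:eventIds value such as "[id-1, id-2]" into a list of IDs.

    Assumption: the property is a bracketed, comma-separated string.
    """
    return [part.strip() for part in raw.strip("[]").split(",") if part.strip()]


def find_triggering_events(cloudtrail_records, event_ids):
    """Return the decoded event bodies whose eventId is one of event_ids."""
    wanted = set(event_ids)
    matches = []
    for record in cloudtrail_records:
        trail_event = json.loads(record["CloudTrailEvent"])
        payload = (trail_event.get("requestParameters") or {}).get("eventPayload") or {}
        if payload.get("eventId") in wanted:
            matches.append(json.loads(payload["eventBody"]))
    return matches


# Hand-built record for illustration only; real records come from lookup_events.
sample_body = {"details": {"database_name": "my_database"}}
sample_records = [
    {
        "CloudTrailEvent": json.dumps(
            {
                "requestParameters": {
                    "eventPayload": {
                        "eventId": "abc-123",
                        "eventBody": json.dumps(sample_body),
                    }
                }
            }
        )
    }
]

events = find_triggering_events(sample_records, parse_event_ids("[abc-123]"))
database_name = events[0]["details"]["database_name"]
print(database_name)
```

In a real job you would pass the ['Events'] list from lookup_events and the raw aws:eventIds string instead of the sample data.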

answered a year ago
To pass a parameter to a workflow, use the StartWorkflowRun[1] API with RunProperties, which takes key-value pairs. Any job in the workflow can read them using GetWorkflowRunProperties[2] and, if required, modify them using PutWorkflowRunProperties[3].
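As a rough illustration of those three calls with boto3, here is a minimal sketch. The function names, the workflow name and the property keys are placeholders of mine; the client is passed in so the helpers can be exercised with a stand-in object, and in real use it would be boto3.client("glue").

```python
def start_workflow_with_properties(glue_client, workflow_name, properties):
    """Start a workflow run with key-value run properties; return its run ID."""
    response = glue_client.start_workflow_run(
        Name=workflow_name,
        RunProperties=properties,  # values must be strings
    )
    return response["RunId"]


def read_run_properties(glue_client, workflow_name, run_id):
    """Read the run properties from inside any job of the workflow."""
    response = glue_client.get_workflow_run_properties(
        Name=workflow_name,
        RunId=run_id,
    )
    return response["RunProperties"]


def update_run_properties(glue_client, workflow_name, run_id, properties):
    """Add or override run properties for later jobs in the same run."""
    glue_client.put_workflow_run_properties(
        Name=workflow_name,
        RunId=run_id,
        RunProperties=properties,
    )


# Example use (requires AWS credentials; names are placeholders):
#   import boto3
#   glue = boto3.client("glue")
#   run_id = start_workflow_with_properties(
#       glue, "my_workflow", {"database_name": "my_database"})
#   props = read_run_properties(glue, "my_workflow", run_id)
```

Inside a job, the workflow name and run ID would normally come from the WORKFLOW_NAME and WORKFLOW_RUN_ID job arguments rather than being hard-coded.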

If you are triggering an ETL job instead of a workflow, use StartJobRun[4] and set the job arguments. To access these job arguments in the script, use getResolvedOptions[5].
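One way this could look in practice is sketched below. It assumes something you control (for example a small Lambda function targeted by the event rule, which this answer does not mention) receives the event and makes the StartJobRun call itself. The function names are hypothetical, the event shape is the one from the question, and the client is passed in; in real use it would be boto3.client("glue").

```python
import sys


def build_job_arguments(event):
    """Map the event field from the question to the job's --DATABASE_NAME argument."""
    # Key path follows the question's example; adjust it to your real event.
    return {"--DATABASE_NAME": event["details"]["database_name"]}


def start_job_for_event(glue_client, job_name, event):
    """Start the Glue job with arguments derived from the event; return the run ID."""
    response = glue_client.start_job_run(
        JobName=job_name,
        Arguments=build_job_arguments(event),
    )
    return response["JobRunId"]


def read_database_name(argv=None):
    """Inside the Glue job script: resolve --DATABASE_NAME from the job arguments."""
    # awsglue is only available in the Glue runtime, so import it lazily.
    from awsglue.utils import getResolvedOptions

    args = getResolvedOptions(argv if argv is not None else sys.argv, ["DATABASE_NAME"])
    return args["DATABASE_NAME"]


# Example use from the caller's side (requires AWS credentials; names are placeholders):
#   import boto3
#   glue = boto3.client("glue")
#   start_job_for_event(glue, "my_job", {"details": {"database_name": "my_database"}})
```

Note that getResolvedOptions takes the argument name without the leading dashes, while StartJobRun expects the key with them.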

[1] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-workflow.html#aws-glue-api-workflow-StartWorkflowRun

[2] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-workflow.html#aws-glue-api-workflow-GetWorkflowRunProperties

[3] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-workflow.html#aws-glue-api-workflow-PutWorkflowRunProperties

[4] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-runs.html#aws-glue-api-jobs-runs-StartJobRun

[5] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-extensions-get-resolved-options.html

AWS
answered 2 years ago
  • Hi Sachin,

    The one part that isn't clear to me from your answer: if I have a trigger configured in Glue based on an EventBridge event, how would you map the event contents to the job as parameters? The example in the question is the field database_name. If that value can be different for each event, how do you configure the trigger to take that field and pass it in as a key/value pair? If you can't configure it in the trigger, is there a way, once the job starts, to get the event body in the job and extract the info?
