How to pass parameters from an event rule through a Glue workflow trigger to a job

1

I have an event rule that triggers a Glue job. I would like to pass information from the event details as a parameter to the Glue job. For example, say my event contains {"details": {"database_name": "my_database"}} and my job has a parameter --DATABASE_NAME. Is it possible to pass database_name from the event as the --DATABASE_NAME parameter in the job? I haven't been able to find anything in the documentation or online on how to do it.

Thanks!

asked 2 years ago, 5081 views
2 Answers
1

There is no native option to pass EventBridge event details to a Glue job, but I used the following workaround:

  1. Get the event ID from the Glue workflow run properties. The slice [1:-1] strips the surrounding brackets from the stored value.
import datetime
import json

event_id = glue_client.get_workflow_run_properties(
    Name=self.args['WORKFLOW_NAME'],
    RunId=self.args['WORKFLOW_RUN_ID'])['RunProperties']['aws:eventIds'][1:-1]
  2. Get all NotifyEvent events from CloudTrail (event_client is a CloudTrail client) for the last several minutes. It's up to you to decide how much time can pass between the workflow start and your job start.
now = datetime.datetime.now()
response = event_client.lookup_events(
    LookupAttributes=[{'AttributeKey': 'EventName', 'AttributeValue': 'NotifyEvent'}],
    StartTime=now - datetime.timedelta(minutes=5),
    EndTime=now)['Events']
  3. Check which event encloses an event with the event ID we got from the Glue workflow.
for record in response:
    event_payload = json.loads(record['CloudTrailEvent'])['requestParameters']['eventPayload']
    if event_payload['eventId'] == event_id:
        # eventBody is itself a JSON string holding the original event
        event = json.loads(event_payload['eventBody'])
        break

The event variable then holds the full content of the event that triggered the Glue workflow.
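
The three steps above can be collected into small helpers that are easy to test without AWS access. This is only a sketch, assuming the record shapes used in the answer (a CloudTrailEvent JSON string containing requestParameters.eventPayload); the function names parse_event_id and find_triggering_event are hypothetical, and the "details" key follows the example in the question.

```python
import json


def parse_event_id(run_properties):
    """Strip the surrounding brackets from the 'aws:eventIds' run property."""
    return run_properties["aws:eventIds"][1:-1]


def find_triggering_event(cloudtrail_events, event_id):
    """Return the parsed body of the record whose payload matches event_id.

    cloudtrail_events is the 'Events' list returned by CloudTrail
    lookup_events. Returns None when no record matches.
    """
    for record in cloudtrail_events:
        detail = json.loads(record["CloudTrailEvent"])
        payload = (detail.get("requestParameters") or {}).get("eventPayload") or {}
        if payload.get("eventId") == event_id:
            # eventBody is a JSON string holding the original event
            return json.loads(payload["eventBody"])
    return None
```

With the response list from step 2, find_triggering_event(response, event_id) would return the original event, so the value from the question could be read as event["details"]["database_name"].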

answered a year ago
0

To pass a parameter to a workflow, use the StartWorkflowRun[1] API with RunProperties, which takes key-value pairs. Any job in the workflow can read these properties using GetWorkflowRunProperties[2] and, if required, modify them using PutWorkflowRunProperties[3].
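
As a rough illustration of these three calls, the sketch below wraps them in small functions that take a boto3 Glue client as an argument. The function names and the "database_name" property key are hypothetical; the client methods are the boto3 counterparts of the APIs named above.

```python
def start_workflow_with_properties(glue_client, workflow_name, properties):
    """Start a workflow run with key-value run properties; return its run ID."""
    response = glue_client.start_workflow_run(
        Name=workflow_name,
        RunProperties=properties,
    )
    return response["RunId"]


def read_run_property(glue_client, workflow_name, run_id, key):
    """Read one run property from inside a job; None if the key is absent."""
    response = glue_client.get_workflow_run_properties(
        Name=workflow_name,
        RunId=run_id,
    )
    return response["RunProperties"].get(key)


def update_run_property(glue_client, workflow_name, run_id, key, value):
    """Add or overwrite one run property for later jobs in the workflow."""
    glue_client.put_workflow_run_properties(
        Name=workflow_name,
        RunId=run_id,
        RunProperties={key: value},
    )
```

Inside a job, the workflow name and run ID would come from the job's WORKFLOW_NAME and WORKFLOW_RUN_ID arguments, as in the other answer, and glue_client would be created with boto3.client("glue").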

If you are triggering an ETL job instead of a workflow, use StartJobRun[4] and set the job arguments. To access these job arguments in the script, use getResolvedOptions[5].
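
For the job case, a minimal sketch of mapping the event field from the question onto the job argument might look like the following. It assumes that some code of your own receives the event and calls StartJobRun (for example a Lambda function targeted by the rule, which is an assumption and not part of the answer); the function names are hypothetical and the "details" key follows the example in the question.

```python
def build_job_arguments(event):
    """Map the event's database_name onto the job's --DATABASE_NAME argument."""
    return {"--DATABASE_NAME": event["details"]["database_name"]}


def start_job_from_event(glue_client, job_name, event):
    """Start the Glue job with arguments derived from the event; return the run ID."""
    response = glue_client.start_job_run(
        JobName=job_name,
        Arguments=build_job_arguments(event),
    )
    return response["JobRunId"]


# Inside the Glue job script, the argument can then be read with:
#   import sys
#   from awsglue.utils import getResolvedOptions
#   args = getResolvedOptions(sys.argv, ["DATABASE_NAME"])
#   database_name = args["DATABASE_NAME"]
```

Note that the argument key carries the leading dashes when the job is started, while getResolvedOptions takes the name without them.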

[1] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-workflow.html#aws-glue-api-workflow-StartWorkflowRun

[2] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-workflow.html#aws-glue-api-workflow-GetWorkflowRunProperties

[3] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-workflow.html#aws-glue-api-workflow-PutWorkflowRunProperties

[4] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-runs.html#aws-glue-api-jobs-runs-StartJobRun

[5] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-extensions-get-resolved-options.html

AWS
answered 2 years ago
  • Hi Sachin,

    The one part that isn't clear to me from your answer: if I have a trigger configured in Glue based on an EventBridge event, how would you map the event contents to the job as parameters? The example in the question is the field database_name. If that value can be different for each event, how do you configure the trigger to take that field and pass it in as a key/value pair? If you can't configure it in the trigger, is there a way, once the job starts, to get the event body in the job and extract the info?
