How to retrieve the event captured by EventBridge that triggered my SageMaker pipeline


Hi there! I have configured my S3 bucket and created a rule in EventBridge to trigger my SageMaker pipeline when a new object is created in the S3 bucket. I would like to retrieve the event (more specifically, details on the newly created object in the S3 bucket) that triggered the SageMaker pipeline, from within the pipeline itself. Is this possible? If so, how do I retrieve the event?

Many thanks in advance

  • I've added a SageMaker pipeline parameter in the configuration of my EventBridge rule as follows:

    • name: x-events-object-key
    • value: $.detail.object.key

    But now the rule is broken and does not trigger the pipeline anymore. I chose that value to try to extract the object key from the S3 event, according to the message format documented here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/ev-events.html. Is this the correct way to do it? (For what I am trying to set up on the pipeline side, see the sketch below.)
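
    To illustrate the pipeline side of what I am attempting (a rough sketch with the SageMaker Python SDK; the parameter name and default value are placeholders):

    from sagemaker.workflow.parameters import ParameterString

    # The parameter name must exactly match the Name configured on the
    # EventBridge rule target; I would also double-check which characters
    # are allowed in pipeline parameter names.
    event_object_key = ParameterString(
        name="x-events-object-key",
        default_value="",
    )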

1 Answer

It is possible to retrieve information about the event that triggered your SageMaker pipeline from within the pipeline itself. When an object-created event occurs in the S3 bucket and is routed through EventBridge, an event payload is generated that includes details about the event, such as the bucket name and object key. In your SageMaker pipeline, you can access this event payload to get information about the newly created object in the S3 bucket. Here are the general steps to retrieve the event information within your SageMaker pipeline:

Configure EventBridge Rule: Ensure that your EventBridge rule is correctly configured to trigger when a new object is created in the S3 bucket. The rule should be set up to send events to the SageMaker pipeline.
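
As a rough sketch of that configuration with boto3 (the rule name, bucket, ARNs, and parameter mapping below are placeholders to adapt; the bucket must have EventBridge notifications enabled):

import json
import boto3

events = boto3.client("events")

# Match "Object Created" events that S3 delivers to EventBridge for the bucket.
events.put_rule(
    Name="s3-new-object-to-sagemaker",
    EventPattern=json.dumps({
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": ["my-input-bucket"]}},
    }),
    State="ENABLED",
)

# Point the rule at the SageMaker pipeline. Each entry in PipelineParameterList
# must correspond to a parameter declared in the pipeline definition.
events.put_targets(
    Rule="s3-new-object-to-sagemaker",
    Targets=[
        {
            "Id": "sagemaker-pipeline",
            "Arn": "arn:aws:sagemaker:eu-west-1:111122223333:pipeline/my-pipeline",
            "RoleArn": "arn:aws:iam::111122223333:role/EventBridgeStartPipelineRole",
            "SageMakerPipelineParameters": {
                "PipelineParameterList": [
                    # Whether EventBridge resolves a JSON path such as
                    # "$.detail.object.key" for this target type is worth
                    # verifying in the EventBridge documentation.
                    {"Name": "x-events-object-key", "Value": "$.detail.object.key"},
                ]
            },
        }
    ],
)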

Define Input and Output Artifacts in SageMaker Pipeline: In your SageMaker pipeline definition, define input and output artifacts for the steps that need to access information from the S3 event. Specify the S3 URI where the event information will be stored as an artifact.
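
With the SageMaker Python SDK this could look roughly as follows (a sketch only; the parameter name, default URI, and input name are assumptions, and it presumes the event details have been written to a JSON file in S3):

from sagemaker.workflow.parameters import ParameterString
from sagemaker.processing import ProcessingInput

# Pipeline parameter holding the S3 URI of the JSON file with the event details.
event_info_s3_uri = ParameterString(
    name="EventInfoS3Uri",
    default_value="s3://my-input-bucket/events/event.json",
)

# Input artifact: the event file is downloaded into the processing container
# at this destination, which is where the script further below reads it from.
event_input = ProcessingInput(
    input_name="event",
    source=event_info_s3_uri,
    destination="/opt/ml/processing/input",
)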

Access Event Information in SageMaker Script: Within your SageMaker script or code, you can access the S3 event information from the input artifact. Depending on the language or framework you are using in your SageMaker script, the approach to accessing event information may vary.

Here's an example in Python that reads the event payload inside the processing container:

import os
import json

# Path where the event JSON is expected inside the processing container,
# i.e. the destination directory of the corresponding processing input.
event_file_path = '/opt/ml/processing/input/event.json'

if os.path.exists(event_file_path):
    with open(event_file_path, 'r') as event_file:
        event_data = json.load(event_file)
        # Now 'event_data' contains information about the S3 event
        print(event_data)
else:
    print("Event file not found.")

In this example, it's assumed that the event information is stored as a JSON file named 'event.json' in the input artifact directory. Adjust the code based on the actual format and location of the event information in your pipeline. Remember to configure your SageMaker processing step to consume the S3 event information as an input artifact. This involves specifying the S3 URI where the event information is stored in the SageMaker processing input configuration.
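
Continuing the sketch from above (the processor type, framework version, role ARN, and names are assumptions; event_input is the ProcessingInput defined earlier):

from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.steps import ProcessingStep
from sagemaker.workflow.pipeline import Pipeline

role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder

processor = SKLearnProcessor(
    framework_version="1.2-1",
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=1,
)

# The step downloads the event file under /opt/ml/processing/input, where the
# script shown above looks for event.json.
step_read_event = ProcessingStep(
    name="ReadS3Event",
    processor=processor,
    inputs=[event_input],
    code="read_event.py",  # the script from the example above
)

# Registering the parameter with the pipeline lets whoever starts the
# execution (for example the EventBridge rule) supply its value.
pipeline = Pipeline(
    name="my-pipeline",
    parameters=[event_info_s3_uri],
    steps=[step_read_event],
)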

AWS
answered 5 months ago
EXPERT
reviewed 22 days ago
