Dynamic s3 prefix in s3 trigger for Lambda Function

0

Hi Team, I've just started exploring S3 and AWS Lambda. I have an S3 folder with a dynamic date/time prefix, i.e. s3://mybucket/hostData/yyyy/mm/dd/hh/mm, where yyyy = 2023, mm = 08, dd = 01, hh = 01, mm = 00. What is the appropriate prefix to handle the dynamic date/time when configuring the S3 trigger for the Lambda function? I want to call an API based on the data arriving in the above S3 folder. Also, how can I mark a file as processed to avoid duplicate API calls? Thanks a lot in advance.

asked 9 months ago · 1,176 views
1 Answer
1
Accepted Answer

Hello.
Lambda can be invoked when an object is created in S3 by configuring an S3 event notification, as described in the following document.
https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html

It is also a good idea to tag processed objects.
Tag the object once processing is complete using "put_object_tagging", as described in the following documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/put_object_tagging.html
Because the tag is not yet set on objects that have not been processed, the handler can check whether the tag is present (for example, with "get_object_tagging" and an if statement) and skip objects that already carry it.
Alternatively, moving files to a different folder once processing is complete also works well.
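To make the tag check concrete, here is a minimal sketch. The tag key "processed" and the helper names are assumptions for illustration, not part of the AWS sample; any agreed-upon tag key/value pair works.

```python
def is_processed(tag_set):
    """Return True if the S3 tag list already marks this object as processed.

    The 'processed' tag key is an assumed convention for this sketch.
    """
    return any(t['Key'] == 'processed' and t['Value'] == 'true' for t in tag_set)


def handle_once(bucket, key, call_api):
    """Call the API for an object at most once, using an object tag as the marker."""
    import boto3  # imported here so is_processed() stays dependency-free
    s3 = boto3.client('s3')

    tags = s3.get_object_tagging(Bucket=bucket, Key=key)['TagSet']
    if is_processed(tags):
        return False  # already handled; skip the duplicate API call

    call_api(bucket, key)  # your API invocation goes here

    # Note: put_object_tagging replaces the object's entire tag set.
    s3.put_object_tagging(
        Bucket=bucket, Key=key,
        Tagging={'TagSet': [{'Key': 'processed', 'Value': 'true'}]},
    )
    return True
```

Keep in mind that a tag check followed by a tag write is not atomic, so a rare concurrent duplicate invocation is still possible; for strict exactly-once behavior a conditional write to a DynamoDB table is the usual pattern.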

EXPERT
answered 9 months ago
  • Thanks Riku. Appreciate your help. What about the dynamic date arguments in the S3 prefix?

  • S3 event notification prefix filters do not support wildcards, so set the prefix to the static portion of the path, "hostData/". Lambda is then invoked whenever an object is created anywhere under "s3://mybucket/hostData/yyyy/mm/dd/hh/mm". When the Lambda runs, the notification message is passed to the handler as the "event" argument and contains the bucket and full object key, so the dynamic date parts are read from the event at runtime rather than configured in the trigger. The sample code shows that the object key is obtained with "key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')".

    import json
    import urllib.parse
    import boto3
    
    print('Loading function')
    
    s3 = boto3.client('s3')
    
    
    def lambda_handler(event, context):
        #print("Received event: " + json.dumps(event, indent=2))
    
        # Get the object from the event and show its content type
        bucket = event['Records'][0]['s3']['bucket']['name']
        key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
        try:
            response = s3.get_object(Bucket=bucket, Key=key)
            print("CONTENT TYPE: " + response['ContentType'])
            return response['ContentType']
        except Exception as e:
            print(e)
            print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
            raise e
    
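    As an alternative to tagging, the processed file can be moved to a different folder, as suggested in the answer above. A minimal sketch follows; the "processedData/" destination prefix and the helper names are assumptions for illustration.

    ```python
    def processed_key(key, src_prefix='hostData/', dst_prefix='processedData/'):
        """Rewrite an object key from the inbound prefix to the processed prefix.

        'processedData/' is an assumed destination folder for this sketch.
        """
        if not key.startswith(src_prefix):
            raise ValueError(f'unexpected key: {key!r}')
        return dst_prefix + key[len(src_prefix):]


    def move_to_processed(bucket, key):
        """Copy the object to the processed folder, then delete the original.

        S3 has no rename operation, so a 'move' is a copy followed by a delete.
        """
        import boto3  # imported here so processed_key() stays dependency-free
        s3 = boto3.client('s3')
        s3.copy_object(Bucket=bucket, Key=processed_key(key),
                       CopySource={'Bucket': bucket, 'Key': key})
        s3.delete_object(Bucket=bucket, Key=key)
    ```

    Because the moved object no longer matches the "hostData/" prefix filter, it will not retrigger the function, which also prevents duplicate API calls.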
  • Thanks a lot Riku. I am able to invoke the API via Lambda.
