Lambda - Delete file from S3 || Default VPC attached to Lambda


Hello AWS experts, I am new to AWS and facing a challenge.

What I am trying to do

  • On event: a file is posted to the S3 bucket
  • The Lambda function executes; within it, copy the data from the CSV into a Redshift table and, lastly, delete the CSV file from the S3 bucket

Challenge: My code works perfectly up to copying the data into the Redshift Serverless table. The problem is only at the last step, when I try to delete the processed file from the S3 bucket. I used the code below to delete the file. For simplicity, I removed the part that copies data from the CSV to Redshift, and hardcoded a few things like the bucket name and filename.

FYI... I observed that when I attach the VPC to my Lambda function, I am not able to delete the file from the S3 bucket; as soon as I detach it, the delete works fine, but then the other part of the code (copying data into Redshift) stops working.

Please let me know what I missed or am doing wrong. Any alternative is most welcome.

--Lambda function -- default VPC attached -- LambdaRole (below) attached:

```python
import os
import boto3

def lambda_handler(event, context):
    s3_client = boto3.client("s3")
    for record in event['Records']:
        try:
            keyString = record['s3']['object']['key']
            s3_bucket = record['s3']['bucket']['name']
            # Hardcoded for testing; these override the values from the event
            s3_bucket = "datalake-anaxi"
            keyString = "test.csv.gz"
            print(s3_bucket)
            print(keyString)
            response = s3_client.delete_object(Bucket=s3_bucket, Key=keyString)
            print(response)
        except Exception as e:
            print(e)
            print("Error")
        print("Done")
    print("after for")
```

--LambdaRole -- Role attached to the Lambda function, with:

  • AmazonS3FullAccess
  • AWSLambdaFullAccess
  • CloudWatchFullAccess
  • LambdaPolicy

--LambdaPolicy -- Policy attached to LambdaRole:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeNetworkInterfaces",
        "ec2:CreateNetworkInterface",
        "ec2:DeleteNetworkInterface",
        "ec2:DescribeInstances",
        "ec2:AttachNetworkInterface"
      ],
      "Resource": "*"
    }
  ]
}
```

asked 8 months ago · 295 views
4 Answers

Hello.

To access S3 when Lambda is connected to a VPC, you will need to set up an S3 VPC endpoint or a NAT Gateway.
https://www.cloudtechsimplified.com/aws-lambda-vpc-s3/

If you don't need to go out to the public network from Lambda, try setting up an S3 VPC endpoint.
Please check the document below for how to create it.
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html#create-gateway-endpoint-s3
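As a sketch of the endpoint-creation step (the VPC and route-table IDs shown are hypothetical placeholders, not values from this thread), the key points when calling the EC2 API with boto3 are the `Gateway` endpoint type and associating the route tables used by the Lambda function's subnets:

```python
def s3_gateway_endpoint_params(vpc_id, route_table_ids, region="us-east-1"):
    """Build the keyword arguments for ec2.create_vpc_endpoint for an
    S3 gateway endpoint (gateway endpoints carry no hourly charge)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Route tables of the subnets the Lambda function is attached to
        "RouteTableIds": route_table_ids,
    }

# Usage (IDs below are hypothetical placeholders):
#   import boto3
#   ec2 = boto3.client("ec2")
#   ec2.create_vpc_endpoint(**s3_gateway_endpoint_params("vpc-0abc", ["rtb-0def"]))
```

Creating the endpoint in the console as the linked document describes achieves the same thing; the code only illustrates which pieces (VPC, service name, route tables) have to line up.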

answered 8 months ago
Accepted Answer

When a Lambda function is placed within a VPC, it will only have access to resources within that VPC by default. This means it won't have internet access, and therefore won't be able to access AWS services like S3 which are outside of the VPC, unless you configure it to do so.

Here are the steps you can take to ensure your Lambda function can access both S3 and Redshift while it's inside a VPC:

1. Create a VPC Endpoint for S3:

  • Navigate to the VPC Dashboard in the AWS Console.
  • Go to "Endpoints" and click "Create Endpoint".
  • Choose the service name that corresponds to S3 (com.amazonaws.region.s3).
  • Associate it with the VPC that your Lambda function is connected to.
  • Add the endpoint to the route table(s) used by your Lambda function's subnets. Note that an S3 gateway endpoint is controlled through route tables and an endpoint policy rather than a security group; what you should check is that the Lambda function's own security group allows outbound connections on port 443 (HTTPS). By creating this VPC endpoint, resources within your VPC (like your Lambda function) can communicate with S3 without requiring internet access.

2. Update Lambda Security Group Rules:

  • Allow outbound HTTPS connections so the function can reach Redshift and S3 (through the S3 VPC endpoint, or through a NAT Gateway if you need general internet access).
  • If you're using a NAT Gateway to provide internet access to resources inside the VPC, ensure that the Lambda function's security group allows outbound connections to the NAT Gateway. Additionally, the NAT Gateway should be in a public subnet with a route to the internet.

3. Lambda Execution Role:

  • Ensure that the execution role attached to the Lambda function has the necessary permissions to perform the desired operations on S3 and Redshift.

4. Lambda VPC Configuration:

  • Ensure that the Lambda function is associated with the appropriate subnets and security groups within the VPC. If you're using a NAT Gateway, make sure the Lambda function is placed in a private subnet that routes outbound traffic through the NAT Gateway.

5. Logging and Monitoring:

  • Use CloudWatch Logs to monitor the execution of your Lambda function. This will help you identify any issues or errors that might occur during execution.

6. Error Handling:

  • Improve the error handling in your Lambda function. Instead of just printing the error, consider logging it to CloudWatch Logs or sending a notification when an error occurs.

Lastly, remember that when you're testing, the hardcoded values of s3_bucket and keyString will always overwrite the values extracted from the S3 event, so when you're ready for production, you might want to remove or comment out the hardcoded values.
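Related to removing those hardcoded values: object keys arrive URL-encoded in S3 event notifications (for example, spaces become `+`), so once you switch to the real event values it is worth decoding the key before calling the S3 API. A minimal sketch (the helper name and the fake event record are illustrative, not from the original code):

```python
from urllib.parse import unquote_plus

def bucket_and_key(record):
    """Extract the bucket name and decoded object key from one S3 event record.

    S3 notifications URL-encode the key, so decode it before passing it
    to delete_object / get_object."""
    bucket = record["s3"]["bucket"]["name"]
    key = unquote_plus(record["s3"]["object"]["key"])
    return bucket, key

# Example with a minimal fake event record:
record = {"s3": {"bucket": {"name": "datalake-anaxi"},
                 "object": {"key": "my+file.csv.gz"}}}
print(bucket_and_key(record))  # ('datalake-anaxi', 'my file.csv.gz')
```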

Try these steps and see if your Lambda function can successfully delete files from S3 while inside a VPC.
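One way to act on steps 5 and 6 together (a sketch, not the poster's code; the helper name is illustrative) is to use the `logging` module, which Lambda wires to CloudWatch Logs automatically, and to re-raise failures instead of swallowing them so the invocation is recorded as failed and can be retried:

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def delete_object(s3_client, bucket, key):
    """Delete one object, logging the outcome to CloudWatch.

    Re-raises on failure so Lambda marks the invocation as failed
    (enabling retries / dead-letter handling) instead of silently
    printing the error."""
    try:
        response = s3_client.delete_object(Bucket=bucket, Key=key)
        logger.info("Deleted s3://%s/%s (HTTP %s)", bucket, key,
                    response["ResponseMetadata"]["HTTPStatusCode"])
        return response
    except Exception:
        logger.exception("Failed to delete s3://%s/%s", bucket, key)
        raise
```

The `logger.exception` call captures the stack trace in CloudWatch, which makes VPC/permission problems like the one in this thread much easier to spot than a bare `print(e)`.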

answered 8 months ago

Thanks Jose and Ercan, very much appreciated. You wouldn't believe it; I spent a full day trying to find the issue. Thanks again.

answered 8 months ago

Hi, just another question related to the above Lambda scenario. Two big CSVs are posted --> the S3 bucket notifies the Lambda function --> the Lambda function terminates mid-process, while uploading the files to Redshift, due to the 15-minute time limit. The upload does not complete in the given timeframe. In this case, how can we process the remaining files?

answered 8 months ago
