To convert all files in an AWS S3 bucket to zip files

KARTHIK
Asked 8 months ago · 7,292 views
2 Answers
4
Accepted Answer

S3 is an object store, not a file system. So unless you do the operation in memory, you'll have to download the files from S3 first, zip them, and then upload them back to S3. Once you verify that the zip upload succeeded for all the objects, you can consider archiving or deleting the originals based on data criticality.

You can do this from an EC2 instance, a Lambda function, or perhaps your local machine (through the CLI). With Lambda there is a limit of 900 seconds maximum execution time. Consider the local machine for downloading/uploading as a last option, as that may add data transfer cost.
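As a minimal sketch of the in-memory operation mentioned above (the function name, bucket names, and keys here are illustrative assumptions, not from the original answer):

```python
import io
import zipfile

def zip_object_bytes(key, data):
    """Zip one object's bytes in memory and return the archive as bytes,
    ready to upload back to S3 without touching local disk."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        # Store the object under its original key inside the archive
        zf.writestr(key, data)
    return buf.getvalue()

# Hypothetical boto3 usage (client and bucket names are assumptions):
# s3 = boto3.client('s3')
# body = s3.get_object(Bucket='source-bucket', Key='file.txt')['Body'].read()
# s3.put_object(Bucket='target-bucket', Key='file.txt.zip',
#               Body=zip_object_bytes('file.txt', body))
```

This keeps everything in memory, so it only suits objects that fit within your Lambda's memory allocation.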

Hope this explanation helps and gives you a direction for how to move forward.

Comment here if you have additional questions.

Happy to help.

Abhishek
AWS Expert
Answered 8 months ago
Reviewed by iBehr (AWS Expert) 8 months ago
1

To zip the files in your S3 bucket you can use AWS Lambda (Python) with the AWS SDK for Python (Boto3).

  • The code below converts each object in the bucket into its own zip file. (If you have 10 objects, you will get 10 zip files.)
import boto3
import zipfile
import io

s3 = boto3.client('s3')

def lambda_handler(event, context):
    source_bucket = 'your-source-bucket'
    target_bucket = 'your-target-bucket'
   
    # List objects in the source bucket
    response = s3.list_objects_v2(Bucket=source_bucket)
   
    if 'Contents' in response:
        objects = response.get('Contents', [])
       
        for obj in objects:
            key = obj.get('Key')
            if key:
                # Get object content
                response = s3.get_object(Bucket=source_bucket, Key=key)
                content = response['Body'].read()
               
                # Create a zip file in-memory
                zip_buffer = io.BytesIO()
                with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zipf:
                    zipf.writestr(key, content)
               
                # Upload the zip file to the target bucket
                zip_buffer.seek(0)
                target_key = key + '.zip'
                s3.upload_fileobj(zip_buffer, target_bucket, target_key)
    else:
        print("No objects found in the source bucket.")

  • To combine all objects in the bucket into one single zip file, you can use this code.
import boto3
import zipfile
import io

s3 = boto3.client('s3')

def lambda_handler(event, context):
    source_bucket = 'your-source-bucket'
    target_bucket = 'your-target-bucket'
    zip_file_name = 'all_files.zip'  # Name of the zip file
    
    # List objects in the source bucket
    response = s3.list_objects_v2(Bucket=source_bucket)
   
    if 'Contents' in response:
        objects = response.get('Contents', [])
       
        # Create a zip file in-memory
        zip_buffer = io.BytesIO()
        with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zipf:
            for obj in objects:
                key = obj.get('Key')
                if key:
                    # Get object content
                    response = s3.get_object(Bucket=source_bucket, Key=key)
                    content = response['Body'].read()
                    
                    # Write the file to the zip archive
                    zipf.writestr(key, content)
            
        # Upload the zip file to the target bucket
        zip_buffer.seek(0)
        s3.upload_fileobj(zip_buffer, target_bucket, zip_file_name)
    else:
        print("No objects found in the source bucket.")
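One caveat worth noting for both scripts: `list_objects_v2` returns at most 1,000 keys per call, so for larger buckets the listing step would need pagination. A sketch of that, assuming a boto3-style S3 client (the function name is illustrative):

```python
def iter_all_keys(s3_client, bucket):
    """Yield every object key in the bucket, following continuation
    pages, since list_objects_v2 returns at most 1,000 keys per call."""
    paginator = s3_client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket):
        # 'Contents' is absent on empty pages, so default to an empty list
        for obj in page.get('Contents', []):
            yield obj['Key']

# In the handlers above, the listing could then become:
# for key in iter_all_keys(s3, source_bucket):
#     ...
```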

Answered 8 months ago
Reviewed by an AWS Expert 1 month ago
    1. Did you get the descriptions of the scripts backward? The top one looks like it produces individual zip files and the bottom one a single zip, based on the variable holding the zip file name in the bottom one and on target_key = key + '.zip' in the top.

    2. If my bucket contained a collection of folders, would each folder be zipped into a separate zip file in the target bucket? Or would it recursively go through each folder and make a thousand little zips?

    Thanks.

  • I used your script to zip up select folders. The script only handled a single folder path inside the target folder: if folderA contains six subfolders, only the first subfolder was copied into the zip. It does give me a starting point, though. Going to try something with shutil.make_archive.
