S3 is an object store, not a file system. So unless you do the operation in memory, you'll have to download the files from S3, zip them, and then upload the archive back to S3. Once you verify that the zip upload succeeded for all the objects, you can consider archiving or deleting the originals based on data criticality.
You can do this through an EC2 instance, a Lambda function, or maybe your local machine (through the CLI). Note that Lambda has a maximum execution time of 900 seconds. Consider the local machine as a last option for downloading/uploading, as that may add data transfer cost.
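For the local-machine/CLI route, the workflow can be sketched in a few shell commands. The bucket names are placeholders, and the aws steps need the AWS CLI configured with credentials, so they are gated behind an environment flag here; only the local zip step runs unconditionally.

```shell
set -eu
# Hypothetical bucket names -- replace with your own.
SRC_BUCKET="your-source-bucket"
DST_BUCKET="your-target-bucket"

# Step 1: download everything locally (skipped unless RUN_AWS=1,
# since it needs AWS credentials).
mkdir -p ./s3-dump
if [ "${RUN_AWS:-}" = "1" ]; then
  aws s3 sync "s3://$SRC_BUCKET" ./s3-dump
fi
printf 'hello' > ./s3-dump/example.txt  # stand-in for a downloaded object

# Step 2: zip the downloaded tree (zipfile's CLI avoids needing a zip binary).
python3 -m zipfile -c all_files.zip ./s3-dump

# Step 3: upload the archive back to S3 (also gated).
if [ "${RUN_AWS:-}" = "1" ]; then
  aws s3 cp all_files.zip "s3://$DST_BUCKET/all_files.zip"
fi
```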
Hope this explanation helps and gives you a direction for how to move forward.
Comment here if you have additional questions.
Happy to help.
Abhishek
To convert the files in your S3 bucket into zip files, you can use AWS Lambda (Python) with the AWS SDK for Python (Boto3).
- The code below converts each object in the bucket into its own zip file. (If you have 10 objects, you will get 10 zip files.)
import boto3
import zipfile
import io

s3 = boto3.client('s3')

def lambda_handler(event, context):
    source_bucket = 'your-source-bucket'
    target_bucket = 'your-target-bucket'

    # List objects in the source bucket
    response = s3.list_objects_v2(Bucket=source_bucket)
    if 'Contents' in response:
        objects = response.get('Contents', [])
        for obj in objects:
            key = obj.get('Key')
            if key:
                # Get object content
                response = s3.get_object(Bucket=source_bucket, Key=key)
                content = response['Body'].read()
                # Create a zip file in-memory
                zip_buffer = io.BytesIO()
                with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zipf:
                    zipf.writestr(key, content)
                # Upload the zip file to the target bucket
                zip_buffer.seek(0)
                target_key = key + '.zip'
                s3.upload_fileobj(zip_buffer, target_bucket, target_key)
    else:
        print("No objects found in the source bucket.")
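The per-object in-memory zip pattern above can be exercised locally without any AWS calls, which makes it easy to verify the round trip before wiring it to S3. The object key and contents here are made up for illustration:

```python
import io
import zipfile

def zip_single_object(key, content):
    """Mirror the loop body above: wrap one object's bytes in an
    in-memory zip archive named after its key."""
    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zipf:
        zipf.writestr(key, content)
    zip_buffer.seek(0)
    return zip_buffer

# Round-trip check: unzip the buffer and compare contents.
buf = zip_single_object('reports/jan.csv', b'a,b\n1,2\n')
with zipfile.ZipFile(buf) as zipf:
    assert zipf.namelist() == ['reports/jan.csv']
    assert zipf.read('reports/jan.csv') == b'a,b\n1,2\n'
```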
- To combine all objects into one single zip file, you can use this code. (If you have 10 objects, you will get one zip file containing all 10.)
import boto3
import zipfile
import io

s3 = boto3.client('s3')

def lambda_handler(event, context):
    source_bucket = 'your-source-bucket'
    target_bucket = 'your-target-bucket'
    zip_file_name = 'all_files.zip'  # Name of the zip file

    # List objects in the source bucket
    response = s3.list_objects_v2(Bucket=source_bucket)
    if 'Contents' in response:
        objects = response.get('Contents', [])
        # Create a zip file in-memory
        zip_buffer = io.BytesIO()
        with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zipf:
            for obj in objects:
                key = obj.get('Key')
                if key:
                    # Get object content
                    response = s3.get_object(Bucket=source_bucket, Key=key)
                    content = response['Body'].read()
                    # Write the file to the zip archive
                    zipf.writestr(key, content)
        # Upload the zip file to the target bucket
        zip_buffer.seek(0)
        s3.upload_fileobj(zip_buffer, target_bucket, zip_file_name)
    else:
        print("No objects found in the source bucket.")
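One caveat with both scripts: list_objects_v2 returns at most 1,000 keys per call, so large buckets need pagination or objects past the first page are silently skipped. Below is a sketch of the single-zip variant using boto3's standard paginator. Bucket names are still placeholders, and the client is passed in as a parameter so the function can be checked against a stand-in client:

```python
import io
import zipfile

def zip_all_objects(s3, source_bucket):
    """Stream every object in the bucket -- across all list pages --
    into one in-memory zip archive and return the buffer."""
    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zipf:
        paginator = s3.get_paginator('list_objects_v2')
        for page in paginator.paginate(Bucket=source_bucket):
            for obj in page.get('Contents', []):
                body = s3.get_object(Bucket=source_bucket, Key=obj['Key'])['Body']
                zipf.writestr(obj['Key'], body.read())
    zip_buffer.seek(0)
    return zip_buffer

# With a real client: zip_all_objects(boto3.client('s3'), 'your-source-bucket')
```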
Did you get the descriptions of the scripts backward? The top one looks like it zips individual files and the bottom one looks like a single zip, based on the variable holding the zip file name in the bottom one and on target_key = key + '.zip' in the top.
If my bucket contained a collection of folders, would each folder be zipped into a separate zip file in the target bucket? Or would it recursively go through each folder and make a thousand little zips?
Thanks.
I used your script to zip up select folders. The script only zipped a single folder path inside the target folder: if folderA contains six subfolders, only the first subfolder is copied into the zip. It does give me a starting point, though. Going to try something with shutil.make_archive.
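For the shutil.make_archive route mentioned above, here is a minimal local sketch (folder names are made up). make_archive recurses through every subfolder under root_dir, so a tree with several subfolders ends up fully in the zip; downloading the folder from S3 first (e.g. with aws s3 sync) is assumed and not shown.

```python
import os
import shutil
import tempfile
import zipfile

def zip_folder(src_dir, archive_basename):
    """Recursively zip src_dir, subfolders included, and return the
    path of the created .zip file."""
    return shutil.make_archive(archive_basename, 'zip', root_dir=src_dir)

# Demo on a throwaway tree with three subfolders.
with tempfile.TemporaryDirectory() as tmp:
    for sub in ('one', 'two', 'three'):
        os.makedirs(os.path.join(tmp, 'folderA', sub))
        with open(os.path.join(tmp, 'folderA', sub, 'f.txt'), 'w') as fh:
            fh.write(sub)
    archive = zip_folder(os.path.join(tmp, 'folderA'),
                         os.path.join(tmp, 'folderA_backup'))
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
# names now holds entries from every subfolder, not just the first one.
```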