How can I copy a list of specific files from one bucket directly to another bucket, keeping the same folder structure in the target bucket, using the AWS CLI, without having to download the files first?


Hello, I'm trying to find a procedure to copy a list of files from a source-bucket to a target-bucket, keeping the same folder structure. I'm using the AWS CLI as it's a large number of files (over 3 million). I extracted specific files that meet certain criteria but sit in different folders in the source-bucket, and I want to copy them directly to the target-bucket with the same folder structure, without having to download the files to a local directory on my computer first.

source-bucket content:

source-bucket/file1.txt
source-bucket/innerfolder/file2.txt

desired target-bucket content:

target-bucket/file1.txt
target-bucket/innerfolder/file2.txt

Any help will be deeply appreciated. Thanks.

Letty
asked 3 months ago · 202 views
2 Answers
Accepted Answer

Here's how you can do it:

Ensure those files are listed in a text file where each line contains the S3 path of one file to be copied. For example, your list might look like this:

s3://source-bucket/file1.txt
s3://source-bucket/innerfolder/file2.txt

Then, you'll need to use a script to read through your list of files and use the AWS CLI to copy each one.

#!/bin/bash

# Text file with one s3:// path per line
INPUT_LIST=path_to_your_list/list.txt
TARGET_BUCKET=target-bucket

# The "|| [ -n "$line" ]" guard also processes the final line
# when the list file does not end with a newline.
while IFS= read -r line || [ -n "$line" ]
do
    # Strip the "s3://<source-bucket>/" prefix, keeping only the object key
    # so the folder structure is preserved in the target bucket.
    FILE_PATH=$(echo "$line" | sed 's/s3:\/\/[^\/]*\///')

    aws s3 cp "$line" "s3://$TARGET_BUCKET/$FILE_PATH"
done < "$INPUT_LIST"

This script reads each line from your list, extracts the file path excluding the bucket name, and then copies the file to the target bucket, preserving the folder structure.
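
For example, the sed expression removes only the s3://bucket-name/ prefix, so a key inside a subfolder keeps its folder path. Using the hypothetical paths from your question:

echo "s3://source-bucket/innerfolder/file2.txt" | sed 's/s3:\/\/[^\/]*\///'
# prints: innerfolder/file2.txt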

Then make the script executable and run it:

chmod +x path_to_your_script/script.sh
./path_to_your_script/script.sh
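
Before running it against the full list (over 3 million files), you can sanity-check a single entry with the AWS CLI's --dryrun flag, which shows what would be copied without actually copying anything. Using the hypothetical paths from your question:

aws s3 cp --dryrun "s3://source-bucket/innerfolder/file2.txt" "s3://target-bucket/innerfolder/file2.txt"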

If this has answered your question or was helpful, accepting the answer would be greatly appreciated. Thank you!

EXPERT
answered 3 months ago
  • Hello Mina, many thanks for your kind answer. I tested the script changing the txt file path and target-bucket name, however I get these errors: "script.sh: 2: : not found", "script.sh: 5: : not found", "script.sh: 10: Syntax error: "done" unexpected (expecting "do")". I'm running the script on Ubuntu 22.04.3 LTS on Windows 10. Many thanks again for your kind assistance.

  • Hello, I was able to run the script successfully, however the script copies only the file at the first level of the bucket, s3://source-bucket/file1.txt, but it doesn't copy the file under the inner folder, s3://source-bucket/innerfolder/file2.txt, to the target-bucket, nor does it create the folder. I've been researching the "sed" command but haven't figured out how it should work. Please, any help would be really appreciated. Many thanks in advance!

AWS
EXPERT
kentrad
answered 3 months ago
  • Thank you for the information, but it doesn't show a way to use a list of already-filtered files, nor does it use the AWS CLI; on top of that, there are charges for using replication.

  • Dear Team, I found that the issue was that the last line of the list.txt file was not being processed; I corrected it and it worked great!! Many thanks for your kind, valuable and amazing assistance!!
