How can I copy a list of specific files from one bucket directly to another bucket, keeping the same folder structure in the target bucket, using the AWS CLI, without having to download the files first?


Hello, I'm trying to find a procedure to copy a list of files from a source-bucket to a target-bucket, keeping the same folder structure. I'm using the AWS CLI because it's a large number of files (over 3 million). I extracted specific files that meet certain criteria but sit in different folders in source-bucket, and I want to copy them directly to a target-bucket with the same folder structure, without having to download the files to a local directory first.

source-bucket content:

source-bucket/file1.txt
source-bucket/innerfolder/file2.txt

desired target-bucket content:

target-bucket/file1.txt
target-bucket/innerfolder/file2.txt

Any help will be deeply appreciated. Thanks.

Letty
Asked 3 months ago · 225 views
2 Answers
Accepted Answer

Here's how you can do it:

Ensure those files are listed in a text file where each line contains the S3 path of one file to be copied. For example, your list might look like this:

s3://source-bucket/file1.txt
s3://source-bucket/innerfolder/file2.txt

Then, you'll need to use a script to read through your list of files and use the AWS CLI to copy each one.

#!/bin/bash

INPUT_LIST=path_to_your_list/list.txt
TARGET_BUCKET=target-bucket

# The `|| [ -n "$line" ]` clause ensures the last line is processed even
# if the list file does not end with a newline.
while IFS= read -r line || [ -n "$line" ]
do
    # Strip the leading "s3://<bucket>/" so only the object key remains.
    FILE_PATH=$(echo "$line" | sed 's/s3:\/\/[^\/]*\///')

    aws s3 cp "$line" "s3://$TARGET_BUCKET/$FILE_PATH"
done < "$INPUT_LIST"

This script reads each line from your list, extracts the file path excluding the bucket name, and then copies the file to the target bucket, preserving the folder structure.
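To see what the sed expression does, here is a quick sanity check with one of the example paths from the question: it deletes the leading "s3://<bucket>/" portion, leaving only the object key, so nested keys like innerfolder/file2.txt keep their folder prefix.

```shell
# Demonstrate the key extraction used by the script above.
stripped=$(echo "s3://source-bucket/innerfolder/file2.txt" | sed 's/s3:\/\/[^\/]*\///')
echo "$stripped"   # innerfolder/file2.txt
```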

Then make the script executable and run it:

chmod +x path_to_your_script/script.sh
./path_to_your_script/script.sh
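One more thought, since you mentioned over 3 million files: a plain loop issues one `aws s3 cp` call at a time, which can take a very long time. A common workaround is to fan the copies out with `xargs -P`. This is only a sketch; the paths, the `-P 16` concurrency value, and the `copy_one` helper name are assumptions you should adapt (and watch your API rate limits).

```shell
#!/bin/bash
# Hypothetical parallel variant of the copy loop above.

INPUT_LIST=path_to_your_list/list.txt
TARGET_BUCKET=target-bucket
export TARGET_BUCKET

copy_one() {
    line=$1
    # Strip "s3://<bucket>/" with shell parameter expansion (no sed needed).
    key=${line#s3://*/}
    aws s3 cp "$line" "s3://$TARGET_BUCKET/$key"
}
export -f copy_one

# Run up to 16 copies in parallel; skip cleanly if the list is missing.
if [ -f "$INPUT_LIST" ]; then
    xargs -r -n 1 -P 16 bash -c 'copy_one "$0"' < "$INPUT_LIST"
fi
```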

If this has answered your question or was helpful, accepting the answer would be greatly appreciated. Thank you!

Expert
Answered 3 months ago
  • Hello Mina, many thanks for your kind answer. I tested the script after changing the txt file path and the target-bucket name, but I get these errors: ": not found" and "script.sh: 10: Syntax error: \"done\" unexpected (expecting \"do\")". I'm running the script on Ubuntu 22.04.3 LTS on Windows 10. Many thanks again for your kind assistance.

  • Hello, I was able to run the script successfully; however, the script copies only the file on the first level of the bucket (s3://source-bucket/file1.txt). It doesn't copy the file under the inner folder (s3://source-bucket/innerfolder/file2.txt) to the target-bucket, nor does it create the folder. I've been researching the "sed" command but haven't figured out how it should work. Please, any help would be really appreciated. Many thanks in advance!

AWS Expert
kentrad
Answered 3 months ago
  • Thank you for the information, but it doesn't show a way to use a list of already-filtered files, nor does it use the AWS CLI; on top of that, there are charges for using replication.

  • Dear Team, I found that the issue was that the last line of the list.txt file was not being processed. I corrected it and it worked great!! Many thanks for your kind, valuable and amazing assistance!!
