How can I copy a list of specific files from one bucket directly to another bucket, keeping the same folder structure in the target bucket, using the AWS CLI, without having to download the files first?

Hello, I'm trying to find a procedure to copy a list of files from a source-bucket to a target-bucket, keeping the same folder structure. I'm using the AWS CLI because there is a large number of files (over 3 million). I extracted the files that met specific criteria; they sit in different folders in source-bucket, and I want to copy them directly to a target-bucket with the same folder structure, without having to download them to a local directory first.

source-bucket content:

source-bucket/file1.txt
source-bucket/innerfolder/file2.txt

desired target-bucket content:

target-bucket/file1.txt
target-bucket/innerfolder/file2.txt

Any help will be deeply appreciated. Thanks.

Letty
asked 3 months ago, 226 views
2 answers
Accepted answer

Here's how you can do it:

Ensure those files are listed in a text file where each line contains the S3 path of one file to be copied. For example, your list might look like this:

s3://source-bucket/file1.txt
s3://source-bucket/innerfolder/file2.txt

Then, you'll need to use a script to read through your list of files and use the AWS CLI to copy each one.

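#!/bin/bash

# Text file with one S3 URI per line, e.g. s3://source-bucket/innerfolder/file2.txt
INPUT_LIST="path_to_your_list/list.txt"
TARGET_BUCKET="target-bucket"

# '|| [ -n "$line" ]' also processes a last line that has no trailing newline
while IFS= read -r line || [ -n "$line" ]
do
    line=$(printf '%s' "$line" | tr -d '\r')   # drop carriage returns (Windows line endings)
    [ -z "$line" ] && continue                 # skip blank lines

    # Strip "s3://<bucket>/" so only the object key (folder path + file name) remains
    FILE_PATH=$(echo "$line" | sed 's|^s3://[^/]*/||')

    aws s3 cp "$line" "s3://$TARGET_BUCKET/$FILE_PATH"
done < "$INPUT_LIST"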

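This script reads each line from your list, strips the s3://bucket-name/ prefix to get the file path (the object key), and then copies the file to the same key in the target bucket, which preserves the folder structure. Because both the source and the destination are S3 URIs, the copy happens within S3 and the files are not downloaded to your machine.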
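The path extraction can also be checked on its own before copying anything. The following is a minimal sketch, assuming every line has the form s3://bucket/key; the function name key_from_uri is hypothetical (it is not part of the AWS CLI), and the loop only prints the aws s3 cp commands with echo instead of running them.

```shell
#!/bin/bash
# Hypothetical helper: strip "s3://<bucket>/" from an S3 URI and print the object key.
key_from_uri() {
    local uri=$1
    uri=${uri#s3://}             # drop the scheme, e.g. "source-bucket/innerfolder/file2.txt"
    printf '%s\n' "${uri#*/}"    # drop the bucket name, up to and including the first "/"
}

# Dry run: print the copy command for each sample path instead of executing it.
for src in "s3://source-bucket/file1.txt" "s3://source-bucket/innerfolder/file2.txt"; do
    echo aws s3 cp "$src" "s3://target-bucket/$(key_from_uri "$src")"
done
```

If the printed destinations keep the innerfolder/ part, the same expansion can replace the sed call in the script.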

Then make the script executable and run it:

chmod +x path_to_your_script/script.sh
./path_to_your_script/script.sh

If this has answered your question or was helpful, accepting the answer would be greatly appreciated. Thank you!

EXPERT
answered 3 months ago
  • Hello Mina, many thanks for your kind answer. I tested the script after changing the txt file path and the target-bucket name, but I get these errors: : not found2: : not found5: script.sh: 10: Syntax error: "done" unexpected (expecting "do"). I'm running the script on Ubuntu 22.04.3 LTS on Windows 10. Many thanks again for your kind assistance.

  • Hello, I was able to run the script successfully, but it copies only the file at the first level of the bucket, s3://source-bucket/file1.txt. It doesn't copy the file under the innerfolder, s3://source-bucket/innerfolder/file2.txt, to the target-bucket, nor does it create the folder. I've been researching the "sed" command but haven't figured out how it should work. Any help would be really appreciated. Many thanks in advance!

AWS
EXPERT
kentrad
answered 3 months ago
  • Thank you for the information. It doesn't show a way to use a list of files that has already been filtered, nor does it use the AWS CLI; on top of that, there are charges for using replication.

  • Dear Team, I found that the issue was that the last line of the list.txt file was not being processed. I corrected it and it worked great! Many thanks for your kind, valuable and amazing assistance!
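For anyone hitting the same problem: a last line that is skipped usually means the list file does not end with a newline, and a list saved on Windows may also carry carriage returns at the end of each line. The thread does not confirm which of these applied here, so the following is only a sketch under those assumptions. It builds a temporary sample list (the contents are made up for the demonstration) and shows a read loop that tolerates both cases; it prints what would be copied rather than calling the AWS CLI.

```shell
#!/bin/bash
# Sample list with Windows (CRLF) line endings and no newline after the last entry.
list=$(mktemp)
printf 's3://source-bucket/file1.txt\r\ns3://source-bucket/innerfolder/file2.txt' > "$list"

count=0
# '|| [ -n "$line" ]' keeps the final line even when it lacks a trailing newline.
while IFS= read -r line || [ -n "$line" ]; do
    line=$(printf '%s' "$line" | tr -d '\r')   # strip the carriage return left by CRLF
    [ -z "$line" ] && continue                 # skip blank lines
    count=$((count + 1))
    echo "would copy: $line"
done < "$list"

echo "lines processed: $count"
rm -f "$list"
```

Both entries are reported, including the last one. Note that this only covers the list file; if the script file itself was saved with Windows line endings, it would need to be converted separately.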
