S3 - date folder structure for ETL data processes (Airflow, etc.)


Trying to verify the best practice. Say I have a "folder" structure like so:

s3://my-bucket/events/2022/09/01
s3://my-bucket/events/2022/09/02
...

For example, on 9/3 the client should export into the "folder" s3://my-bucket/events/2022/09/02 (that is, once the 9/2 day is over).

We have a client who runs their ETL on 9/3 and then places the files in the 2022/09/03 folder, and we're trying to convince them to put the files in the 9/2 folder instead.

Am I correct in saying that this is the best practice?

1 Answer

Hello,

AWS does not prescribe any general guidance or best practice for this, so it is a judgment call that depends on the use case. Personally, I agree with you: I would encourage your client to put the ETL-processed data in the folder named for the date on which the data was produced, not the date on which the ETL ran.
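
As a rough illustration of that convention, here is a minimal Python sketch that derives the target prefix from the date the data covers. The function name `partition_prefix` and the fixed one-day offset are assumptions made for this example; only the bucket path comes from the question. It assumes a daily job that always processes the previous calendar day.

```python
from datetime import date, timedelta


def partition_prefix(run_date: date, base: str = "s3://my-bucket/events") -> str:
    """Return the S3 prefix for the day of data handled by a run on run_date.

    Assumption: the job runs once a day and processes the previous
    calendar day, so the data date is run_date minus one day.
    """
    data_date = run_date - timedelta(days=1)
    # Name the "folder" after the data date, not the run date.
    return f"{base}/{data_date:%Y/%m/%d}"


# A run on 2022-09-03 writes the 2022-09-02 data.
print(partition_prefix(date(2022, 9, 3)))
```

Keying the prefix on the data date means a late run or a backfill still lands in the folder for the day it describes. Schedulers such as Airflow have traditionally labeled a daily run by the start of the interval it covers, which leads to the same layout.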

AWS
SUPPORT ENGINEER
anil_d
answered 2 years ago
