Trying to surface daily CSVs from S3 in Redshift via AWS Glue Studio, but databases aren't showing up

I am trying to use AWS Glue Studio to build a simple ETL workflow. I have a bunch of CSV files in different directories in S3. I want those CSVs to be accessible via a database and have chosen Redshift for the job. The directories will be updated every day with new CSV files. The file structure is:

YYYY-MM-DD (e.g. 2023-03-07)
|---- groupName1
|     |---- groupName1.csv
|---- groupName2
|     |---- groupName2.csv
...
|---- groupNameN
      |---- groupNameN.csv

We will be keeping historical data, so every day I will have a new date-based directory.
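To make the layout concrete, here is a minimal Python sketch that turns one object key into a (date, group) pair. It assumes keys look exactly like `YYYY-MM-DD/groupName/groupName.csv` relative to the bucket (or prefix) root. The names `DailyCsv` and `parse_key` are made up for illustration and are not part of any AWS API.

```python
import re
from datetime import date
from typing import NamedTuple, Optional


class DailyCsv(NamedTuple):
    """Hypothetical record describing one daily CSV object."""
    day: date
    group: str


# Assumed key shape: YYYY-MM-DD/<group>/<group>.csv
_KEY_PATTERN = re.compile(r"^(\d{4})-(\d{2})-(\d{2})/([^/]+)/([^/]+)\.csv$")


def parse_key(key: str) -> Optional[DailyCsv]:
    """Return the date and group encoded in an S3 key, or None if it does not fit."""
    match = _KEY_PATTERN.match(key)
    if match is None:
        return None
    year, month, day, group_dir, file_stem = match.groups()
    # The layout names the file after its directory; reject anything else.
    if group_dir != file_stem:
        return None
    try:
        parsed_day = date(int(year), int(month), int(day))
    except ValueError:
        # Looks like a date but is not a real calendar day (e.g. 2023-02-30).
        return None
    return DailyCsv(day=parsed_day, group=group_dir)
```

With this in place, each group name could map to one target table and the date could become a column or partition value, though that mapping is a design choice rather than something the layout dictates.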

I've read that AWS Glue can automatically copy data on a schedule, but I can't see my Redshift databases or tables in Glue Studio (screenshot below). I'm using my AWS admin account and I do have the AWSGlueConsoleFullAccess permission (screenshot below).

[Screenshot: Glue Studio database and table selection]

[Screenshot: permissions showing AWSGlueConsoleFullAccess]

1 Answer

Those databases and tables are from the Glue Catalog, not Redshift.
The intended workflow is to run a crawler that maps the Redshift tables to Catalog tables; once crawled, they will be listed there for you to use.
Sorry for the inconvenience, the team is aware that this is something to improve.
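
As a rough illustration of the crawler approach, the following Python sketch builds the request for a Glue crawler with a JDBC target and optionally submits it with boto3. It assumes a Glue connection to the Redshift cluster already exists and that the IAM role can use it. Every name here (crawler, role ARN, catalog database, connection, include path) is a placeholder, and the helper functions are hypothetical, not part of boto3.

```python
from typing import Any, Dict


def build_crawler_request(
    crawler_name: str,
    role_arn: str,
    catalog_database: str,
    connection_name: str,
    include_path: str,
) -> Dict[str, Any]:
    """Build keyword arguments for the Glue create_crawler call.

    include_path follows the JDBC include-path form database/schema/table,
    where % acts as a wildcard (e.g. "dev/public/%").
    """
    return {
        "Name": crawler_name,
        "Role": role_arn,
        # Glue Data Catalog database that will receive the table definitions.
        "DatabaseName": catalog_database,
        "Targets": {
            "JdbcTargets": [
                {"ConnectionName": connection_name, "Path": include_path}
            ]
        },
    }


def create_and_start_crawler(request: Dict[str, Any]) -> None:
    """Submit the request to Glue. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# Placeholder values for illustration only.
example_request = build_crawler_request(
    crawler_name="redshift-tables-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    catalog_database="redshift_catalog",
    connection_name="example-redshift-connection",
    include_path="dev/public/%",
)
```

Once the crawler has run, the crawled tables should appear under the chosen Catalog database in Glue Studio.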

AWS (Expert), answered 1 year ago
  • So if I have hundreds of new .csv files every day in new directories in S3, what is a recommended approach to scalably load that data into Redshift tables? Also, what is the best way of creating those hundreds of Redshift tables to begin with?
