Trying to surface daily CSVs from S3 in Redshift via AWS Glue Studio, but the databases aren't showing up


I am trying to use AWS Glue Studio to build a simple ETL workflow. Basically, I have a bunch of CSV files in different directories in S3, and I want those CSVs to be accessible via a database; I have chosen Redshift for the job. The directories will be updated every day with new CSV files. The file structure is:

YYYY-MM-DD (e.g. 2023-03-07)
|---- groupName1
|     |---- groupName1.csv
|---- groupName2
|     |---- groupName2.csv
...
|---- groupNameN
|     |---- groupNameN.csv

We will be keeping historical data, so every day I will have a new date-based directory.

I've read that AWS Glue can automatically copy data on a schedule, but I can't see my Redshift databases or tables (screenshot below). I'm using my AWS admin account, and I do have the AWSGlueConsoleFullAccess permission (screenshot below).

[Screenshot: AWS Glue Studio showing no Redshift databases or tables]

[Screenshot: IAM permissions including AWSGlueConsoleFullAccess]
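
For reference, the kind of scheduled load I've been reading about looks roughly like this (a minimal boto3 sketch; the trigger name, job name, and cron expression are placeholders, and it assumes a Glue ETL job that does the S3-to-Redshift copy already exists):

```python
import boto3

glue = boto3.client("glue")

# Hypothetical names: "s3-to-redshift-load" is an existing Glue ETL job that
# copies the day's CSVs from S3 into Redshift; the trigger only schedules it.
glue.create_trigger(
    Name="daily-s3-to-redshift",
    Type="SCHEDULED",
    Schedule="cron(0 6 * * ? *)",  # run every day at 06:00 UTC
    Actions=[{"JobName": "s3-to-redshift-load"}],
    StartOnCreation=True,
)
```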

Asked a year ago · 188 views
1 Answer

Those databases and tables are from the Glue Catalog, not Redshift.
The way it is intended to work is that a crawler maps the Redshift tables to Catalog tables, and those are then listed there for you to use.
Sorry for the inconvenience; the team is aware that this is something to improve.
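
A minimal boto3 sketch of that setup, assuming a Glue JDBC connection to your Redshift cluster already exists (the crawler, connection, role, database, and schema names below are placeholders):

```python
import boto3

glue = boto3.client("glue")

# Hypothetical names: the crawler reads table definitions from Redshift over
# an existing Glue JDBC connection and writes them into the Glue Data Catalog.
glue.create_crawler(
    Name="redshift-to-catalog",
    Role="GlueCrawlerRole",
    DatabaseName="redshift_catalog_db",  # Catalog database that will hold the mapped tables
    Targets={
        "JdbcTargets": [
            {
                "ConnectionName": "my-redshift-connection",
                "Path": "dev/public/%",  # database/schema/% crawls every table in the schema
            }
        ]
    },
)
glue.start_crawler(Name="redshift-to-catalog")
```

Once the crawl finishes, the Redshift tables appear as Catalog tables and can be selected in Glue Studio.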

AWS
EXPERT
Answered a year ago
  • So if I have hundreds of new .csv files every day in new directories in S3, what is a recommended approach to scalably load that data into Redshift tables? Also, what is the best way of creating those hundreds of Redshift tables to begin with?
