Hello, you can use Amazon Athena to query the CSV data in your S3 buckets by creating external tables. Keep the CSV files that belong to each of the two tables in two separate folders in S3. Then create a database (or reuse an existing one) in the Athena console and create a table over each folder of CSV files.
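As a rough sketch of the external-table approach, the snippet below builds the DDL for a CSV-backed table and a join query, and submits SQL through boto3. The database, table, column, and bucket names (`sales_db`, `orders`, `customers`, `example-bucket`) are made up for illustration, and it assumes plain comma-separated files with one header row.

```python
def create_csv_table_ddl(database, table, columns, s3_location):
    """Build an Athena CREATE EXTERNAL TABLE statement for CSV files with a header row.

    columns is a list of (name, type) pairs. This uses the default delimited
    format, which does not handle quoted fields containing commas; for such
    files a CSV SerDe such as OpenCSVSerde would be needed instead.
    """
    column_defs = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {column_defs}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        "FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'\n"
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


# Hypothetical join across the two tables once both exist.
JOIN_QUERY = """
SELECT o.order_id, c.name, o.amount
FROM sales_db.orders AS o
JOIN sales_db.customers AS c
  ON o.customer_id = c.customer_id
""".strip()


def run_athena_query(sql, output_location):
    """Submit a statement to Athena and return its query execution ID."""
    import boto3  # imported here so the helpers above work without AWS access

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

Each table's `LOCATION` should point at the folder holding that table's files, not at an individual file, which is why the two sets of CSV files need separate folders.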
Alternatively, you can combine AWS Glue and Athena to achieve this. Create a crawler in Glue, add a data source pointing to the S3 buckets, and run the crawler; it will create the tables for you automatically. These tables then appear in the Athena console, where you can query and join them with SQL.
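The crawler route could look roughly like this with boto3. This is only a sketch: the crawler name, role ARN, database name, and S3 paths are placeholders, and the IAM role is assumed to already exist with permission to read the buckets and write to the Glue Data Catalog.

```python
def crawler_config(name, role_arn, database, s3_paths):
    """Build the keyword arguments for Glue's create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # One S3 target per table folder, so each folder becomes its own table.
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
    }


def create_and_start_crawler(config):
    """Create the crawler and start one run of it."""
    import boto3  # imported here so crawler_config works without AWS access

    glue = boto3.client("glue")
    glue.create_crawler(**config)
    glue.start_crawler(Name=config["Name"])


# Hypothetical values for illustration only.
example_config = crawler_config(
    name="csv-tables-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="sales_db",
    s3_paths=["s3://example-bucket/orders/", "s3://example-bucket/customers/"],
)
```

Calling `create_and_start_crawler(example_config)` would kick off the crawl; the run is asynchronous, so the tables only show up in Athena after it finishes.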
The following questions cover similar use cases: https://stackoverflow.com/questions/73864748/query-to-multiple-csv-fles-at-s3-through-athena
https://stackoverflow.com/questions/74009011/access-s3-csv-file-in-amazon-athena
Please refer here for the IAM policies needed to access data catalogs. You can also attach the AmazonAthenaFullAccess AWS managed policy.
Additional resources for your reference: https://docs.aws.amazon.com/athena/latest/ug/data-sources-glue.html
https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html
Athena stores the query result in an S3 query result location. You can also ask Athena to create a table from the result of the query (CTAS).
https://docs.aws.amazon.com/athena/latest/ug/ctas.html
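For instance, a CTAS statement that saves the joined result as a new table might be assembled like this. It is a minimal sketch: the table names, columns, and S3 location are invented, and Parquet is just one possible output format.

```python
def ctas_statement(database, new_table, select_sql, external_location, file_format="PARQUET"):
    """Wrap a SELECT in an Athena CREATE TABLE AS SELECT statement."""
    return (
        f"CREATE TABLE {database}.{new_table}\n"
        f"WITH (format = '{file_format}', external_location = '{external_location}')\n"
        "AS\n"
        f"{select_sql.strip()}"
    )


# Hypothetical join whose result is materialized as a new table.
joined_sql = """
SELECT o.order_id, c.name, o.amount
FROM sales_db.orders AS o
JOIN sales_db.customers AS c
  ON o.customer_id = c.customer_id
"""

statement = ctas_statement(
    database="sales_db",
    new_table="orders_with_customers",
    select_sql=joined_sql,
    external_location="s3://example-bucket/joined/",
)
print(statement)
```

The resulting statement can be run from the Athena console or submitted through the API like any other query.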
If you use Glue Studio, you need to write the result of the join to a sink such as an S3 path or a table. If you just want to do the join without extra transformations or connections, Athena is probably more practical, provided you are familiar with SQL.
I believe you are looking for a way to join two tables whose data files are in CSV format. You can follow the solutions below.