Athena Compilation Error: HIVE_PATH_ALREADY_EXISTS with Partitioned S3 Objects


I'm experiencing an issue with AWS Athena when trying to compile a table from millions of small, partitioned S3 objects. I haven't specified where the table should be saved, and I'm encountering a HIVE_PATH_ALREADY_EXISTS error. The error message suggests that the target directory for my table already exists. Here's the error log:

Runtime Error in model table (models/compress_tables/table.sql)
HIVE_PATH_ALREADY_EXISTS: Target directory for table 'database.table' already exists: s3://aws-athena-query-results-1234556789-us-east-1/dbt/tables/dbt/table/e365789-fd5f-4343-b1ca-c234244f1307. You may need to manually clean the data at location...

I'm not sure why this error is occurring, as I didn't manually specify the output location for the table, and I thought Athena managed these paths automatically. Has anyone faced this issue before, and how did you resolve it? Is there a way to force Athena to overwrite the existing directory or to specify a new output location that avoids this issue?

Jim
asked 2 months ago, 162 views
1 Answer

Greetings,

Thank you for reaching out to AWS via this post.

From your post I understand that you are receiving a "HIVE_PATH_ALREADY_EXISTS" error when trying to create a table in Athena.

Please correct me if I have misunderstood anything.

This error usually occurs when a CREATE TABLE AS SELECT (CTAS) query in Amazon Athena specifies an "external_location" that is not empty. If you use the "external_location" parameter in a CTAS query, make sure it points to an empty Amazon Simple Storage Service (Amazon S3) location. See resources [1] and [2] for more information.
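As a rough illustration of the empty-location requirement, the Python sketch below checks whether an S3 location already holds objects before a CTAS statement is built for it. The helper names (split_s3_uri, location_is_empty, build_ctas) and any bucket or table names are hypothetical, and the s3_client argument is assumed to be a boto3 S3 client created by the caller.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix/' into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def location_is_empty(s3_client, uri):
    """Return True when no object exists under the given S3 location."""
    bucket, prefix = split_s3_uri(uri)
    # MaxKeys=1 is enough: a single hit already means the location is not empty.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return response.get("KeyCount", 0) == 0


def build_ctas(table, select_sql, uri):
    """Build a CTAS statement that writes to an explicit external_location."""
    return (
        f"CREATE TABLE {table} "
        f"WITH (format = 'PARQUET', external_location = '{uri}') "
        f"AS {select_sql}"
    )


# Possible usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   s3 = boto3.client("s3")
#   if location_is_empty(s3, target_uri):
#       query = build_ctas("my_db.my_table", "SELECT * FROM my_db.source", target_uri)
```

If the check returns False, either choose a different location or clean up the existing objects before rerunning the query.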

Additionally, please refer to resource [3] for guidance on creating tables in Amazon Athena.

If you are using the AWS SDK for pandas to run the query, note that "ctas_approach" defaults to True in read_sql_query, as explained in resource [4], which means the query is wrapped in a CTAS statement. If that applies to you, consider setting "ctas_approach=False".
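A minimal sketch of turning off the CTAS approach in the AWS SDK for pandas, assuming the awswrangler package is installed. The wrapper name read_without_ctas and its optional wr argument are hypothetical and exist only so the library can be swapped out. The underlying call is awswrangler.athena.read_sql_query from resource [4].

```python
def read_without_ctas(sql, database, wr=None):
    """Run an Athena query through the AWS SDK for pandas with CTAS disabled."""
    if wr is None:
        # Imported lazily so this module loads even without awswrangler installed.
        import awswrangler as wr
    # ctas_approach=False runs a regular Athena query and parses its result
    # instead of wrapping the query in a temporary CTAS table.
    return wr.athena.read_sql_query(sql=sql, database=database, ctas_approach=False)
```

With this setting no temporary CTAS table is created, so no table directory has to be empty beforehand. The trade-off is that reading regular query results tends to be slower for large outputs.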

I hope you found the information provided helpful.

Have a great day!

Resources:

[1] Resolving the "HIVE_PATH_ALREADY_EXISTS" error - https://repost.aws/knowledge-center/athena-hive-path-already-exists

[2] Considerations and limitations for CTAS queries - https://docs.aws.amazon.com/athena/latest/ug/ctas-considerations-limitations.html

[3] Creating tables in Athena - https://docs.aws.amazon.com/athena/latest/ug/creating-tables.html

[4] AWS SDK for pandas - https://aws-sdk-pandas.readthedocs.io/en/stable/stubs/awswrangler.athena.read_sql_query.html

AWS
answered 2 months ago
EXPERT
reviewed a month ago
