Governed Tables not deleting smaller files after compaction

I'm working on a POC to understand how compaction works for Governed Tables. After the governed table is created and data is loaded into it by a Glue job, compaction is triggered automatically and creates a larger file, but it doesn't seem to delete the smaller files that are already there.

Could you please let me know whether any other configuration changes have to be made for the compaction process to delete the files after the run completes?

asked 2 years ago, 304 views
1 Answer

Hello,

In AWS Lake Formation, Governed Tables offer a way to manage and compact smaller files into larger ones to enhance query performance. It appears from your question that the smaller files aren't being deleted automatically after compaction.

As far as I know, AWS Lake Formation doesn't include a built-in feature to delete smaller files post-compaction. The compaction process is primarily intended to enhance read performance by minimizing the number of files that need to be read, and it doesn't necessarily aim to conserve space.

If it's necessary to delete the smaller files, you can do this manually. Be cautious with this process, as it could potentially result in data loss if not executed properly. You should ensure that the data is correctly written into the larger file before the smaller ones are deleted.
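As a rough illustration of the "verify before you delete" step, the sketch below only computes which files would be candidates for removal and deletes nothing. It assumes you have already gathered two lists yourself: every object URI under the table's S3 location, and the URIs the table currently treats as active (for example from Lake Formation's GetTableObjects API, which is an assumption about your setup). The function name, bucket, and paths are hypothetical.

```python
from typing import Iterable, List


def find_deletion_candidates(all_uris: Iterable[str],
                             active_uris: Iterable[str]) -> List[str]:
    """Return URIs that exist in storage but are not in the active set.

    This is a dry-run helper: it performs no deletion and makes no AWS calls.
    """
    active = set(active_uris)
    # Sort so the result is stable and easy to review by hand.
    return sorted(uri for uri in set(all_uris) if uri not in active)


# Hypothetical example data: three small files plus one compacted file.
all_uris = [
    "s3://example-bucket/table/part-0001.parquet",
    "s3://example-bucket/table/part-0002.parquet",
    "s3://example-bucket/table/part-0003.parquet",
    "s3://example-bucket/table/compacted-0001.parquet",
]
# Assumption: after compaction only the larger file is still referenced.
active_uris = ["s3://example-bucket/table/compacted-0001.parquet"]

candidates = find_deletion_candidates(all_uris, active_uris)
for uri in candidates:
    print("candidate (not deleted):", uri)
```

Reviewing this list, and confirming that a query against the table still returns the expected rows, is a reasonable check before removing anything by hand.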

For step-by-step instructions on how to manually delete files, you may need to consult the AWS documentation or forums, as providing these instructions is beyond the scope of this response.

If you need more specific guidance, consider reaching out to the AWS support team or the AWS community. They should be able to provide up-to-date information and potentially inform you about new features or best practices for handling your situation.

Here are some helpful links for your reference:

https://docs.aws.amazon.com/lake-formation/latest/dg/what-is-lake-formation.html
https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-glue-data-catalog-hive.html
https://forums.aws.amazon.com/index.jspa
https://aws.amazon.com/premiumsupport/knowledge-center/

As a general rule, always ensure your data is backed up before performing any deletion operations.

AWS
answered 9 months ago
