The situation you're experiencing with aws_s3.table_import_from_s3 hanging on Glacier Deep Archive files is indeed challenging. Let's address your questions one by one:
- Terminating the process: Unfortunately, if pg_cancel_backend and pg_terminate_backend are not working, there may be no straightforward way to terminate the process from within PostgreSQL itself. These are normally the most effective methods for ending long-running queries (a minimal sketch of locating and signalling the backend appears after this list). The fact that they return true while the process persists suggests the backend is blocked in a state where it cannot respond to the termination signal, most likely while it waits on the S3 request itself.
As a last resort, you might need to restart the PostgreSQL instance. This comes with significant drawbacks, including downtime for all connected applications and the rollback of any in-flight transactions, so it should only be considered after careful evaluation of the impact.
- Altering aws_s3.table_import_from_s3 to handle Glacier Deep Archive: The aws_s3.table_import_from_s3 function is part of the AWS-provided aws_s3 extension and cannot be directly modified by users. However, you can create a wrapper function or procedure that performs additional checks before calling aws_s3.table_import_from_s3.
- Checking storage class before import: To avoid this issue in the future, check the storage class of the S3 object before attempting to import it. A high-level approach:
  a. Use the AWS SDK or CLI to check the storage class of the S3 object.
  b. If the storage class is not Glacier or Glacier Deep Archive, proceed with the import.
  c. If it is, either skip the file or initiate a restore request and wait for it to complete before importing.
You could implement this as a stored procedure in PostgreSQL that wraps aws_s3.table_import_from_s3, or as an external script. The wrapper would first check the storage class via an AWS API call (through a PostgreSQL extension such as aws_lambda if available, or from outside the database) and only proceed with the import if the file is readily accessible. A sketch of this check appears below, after this list.
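For the first point, here is a minimal sketch of how you would normally locate and signal the stuck backend from a client session. It assumes the psycopg2 driver and placeholder connection parameters, and, as noted above, the signal may have no visible effect if the backend is blocked outside PostgreSQL.

```python
# Minimal sketch: find and terminate backends running the stuck import.
# Assumes psycopg2 and placeholder connection details.
import psycopg2

conn = psycopg2.connect(host="your-rds-endpoint", dbname="yourdb",
                        user="youruser", password="yourpassword")
conn.autocommit = True

with conn.cursor() as cur:
    # Locate sessions still executing the import call.
    cur.execute("""
        SELECT pid, state, query_start
        FROM pg_stat_activity
        WHERE query ILIKE '%table_import_from_s3%'
          AND pid <> pg_backend_pid();
    """)
    for pid, state, query_start in cur.fetchall():
        print(f"Terminating pid {pid} (state={state}, started {query_start})")
        # Returns true when the signal is sent; the backend may still
        # linger if it is blocked outside PostgreSQL (e.g. on the S3 call).
        cur.execute("SELECT pg_terminate_backend(%s);", (pid,))

conn.close()
```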
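For the second and third points, the sketch below combines the storage-class check with the import call in an external Python script rather than a stored procedure, since inspecting S3 metadata from inside PostgreSQL would require something like aws_lambda. The bucket, key, table, and region values are placeholders, and the '(format csv)' options string is only an example.

```python
# Minimal sketch: only call aws_s3.table_import_from_s3 when the object
# is readily accessible; otherwise request a restore from Deep Archive.
# Assumes boto3, psycopg2, and placeholder names throughout.
import boto3
import psycopg2

s3 = boto3.client("s3")

def import_if_accessible(conn, bucket, key, table, region="us-east-1"):
    head = s3.head_object(Bucket=bucket, Key=key)
    storage_class = head.get("StorageClass", "STANDARD")

    if storage_class in ("GLACIER", "DEEP_ARCHIVE"):
        restore_status = head.get("Restore", "")
        if 'ongoing-request="true"' in restore_status:
            print(f"s3://{bucket}/{key} is still being restored; retry later.")
            return False
        if 'ongoing-request="false"' not in restore_status:
            # No restored copy yet: request one and skip the import for now.
            # Deep Archive restores take up to ~12 hours (Standard tier)
            # or ~48 hours (Bulk tier).
            s3.restore_object(
                Bucket=bucket,
                Key=key,
                RestoreRequest={"Days": 7,
                                "GlacierJobParameters": {"Tier": "Standard"}},
            )
            print(f"Restore requested for s3://{bucket}/{key}; retry later.")
            return False
        # A temporary restored copy exists, so the import can proceed.

    with conn.cursor() as cur:
        cur.execute(
            "SELECT aws_s3.table_import_from_s3(%s, '', '(format csv)', %s, %s, %s);",
            (table, bucket, key, region),
        )
    conn.commit()
    return True

if __name__ == "__main__":
    conn = psycopg2.connect(host="your-rds-endpoint", dbname="yourdb",
                            user="youruser", password="yourpassword")
    import_if_accessible(conn, "your-bucket", "imports/data.csv", "your_table")
    conn.close()
```

If the object has already been transitioned, you would rerun the script once the restore completes; polling head_object for ongoing-request="false" tells you when the restored copy is available.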
To prevent future occurrences, consider implementing a lifecycle policy on your S3 bucket that aligns with your data import needs. This could involve keeping data in Standard storage for a certain period before transitioning to Glacier, or using S3 Intelligent-Tiering for automatic cost optimization while maintaining immediate accessibility.
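If you go the lifecycle-policy route, something along these lines could serve as a starting point; the bucket name, prefix, and 90-day window below are purely illustrative assumptions to adapt to your own import schedule.

```python
# Minimal sketch: keep new objects in Standard long enough for imports,
# then transition them to Deep Archive. All names and periods are placeholders.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="your-import-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-after-import-window",
                "Filter": {"Prefix": "imports/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 90, "StorageClass": "DEEP_ARCHIVE"}
                ],
            }
        ]
    },
)
```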
Remember to thoroughly test any new procedures in a non-production environment before implementing them in your production database.
Sources
- How to terminate aws_s3.table_import_from_s3 calls? | AWS re:Post
- End long-running queries in Amazon RDS for PostgreSQL | AWS re:Post
- Importing data from Amazon S3 to your RDS for PostgreSQL DB instance - Amazon Relational Database Service