Yes, AWS Glue does support reading from Data Catalog tables that are backed by PostgreSQL databases. However, the error you're encountering suggests there might be an issue with how the connection is being utilized or configured.
When working with PostgreSQL in AWS Glue, it's important to ensure that the connection is properly set up and that the necessary information is being passed to the Glue job. Here are a few things to check:
- Verify that the connection type for your PostgreSQL RDS instance is correctly set to "JDBC" in the Glue connection settings.
- Make sure the JDBC URL, username, and password are correctly configured in the connection properties.
- Check that the VPC, subnet, and security group settings in the Glue connection match those of your RDS instance.
- Ensure that the IAM role associated with your Glue job has the necessary permissions to access both the Glue Data Catalog and the RDS instance.
- When creating your dynamic frame, you may need to explicitly specify the connection type. Try modifying your code to include the connection_type parameter (it is not listed among the documented from_catalog parameters, so it may have no effect):
  node_name = glueContext.create_dynamic_frame.from_catalog(database="default", table_name="my_table_name", transformation_ctx="ctx_name", connection_type="postgresql")
- If you're using a custom JDBC driver for PostgreSQL, make sure it's properly uploaded and configured in your Glue environment.
- Check if there are any version compatibility issues between your PostgreSQL database and the JDBC driver being used by Glue.
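Several of the checks above can be scripted rather than clicked through in the console. The following is a minimal sketch, assuming boto3 and valid AWS credentials are available; `summarize_glue_setup` is a hypothetical helper (not part of any AWS SDK), and the connection, database, and table names in the usage comment are placeholders taken from this thread. It only reads metadata through the Glue `get_connection` and `get_table` calls.

```python
def summarize_glue_setup(glue_client, connection_name, database, table_name):
    """Collect the connection and table metadata that the checklist asks about.

    glue_client is expected to behave like boto3.client("glue").
    """
    # HidePassword=True keeps the stored password out of the response.
    conn = glue_client.get_connection(
        Name=connection_name, HidePassword=True
    )["Connection"]
    table = glue_client.get_table(DatabaseName=database, Name=table_name)["Table"]

    props = conn.get("ConnectionProperties", {})
    network = conn.get("PhysicalConnectionRequirements", {})
    return {
        # Connection type as stored on the Glue connection (e.g. JDBC or POSTGRESQL).
        "connection_type": conn.get("ConnectionType"),
        # May be None if the connection stores its URL under a different key.
        "jdbc_url": props.get("JDBC_CONNECTION_URL"),
        "subnet_id": network.get("SubnetId"),
        "security_groups": network.get("SecurityGroupIdList", []),
        # Crawler-written table parameters, useful for spotting a missing
        # or unexpected connection reference on the catalog table.
        "table_parameters": table.get("Parameters", {}),
        "table_location": table.get("StorageDescriptor", {}).get("Location"),
    }


# Example usage (placeholder names; requires AWS credentials):
#   import boto3
#   info = summarize_glue_setup(
#       boto3.client("glue"), "PostgreSQL Connection", "default", "my_table_name"
#   )
#   print(info)
```

Comparing the reported connection type and the table parameters against what the job expects can help narrow down whether the problem lies in the connection itself or in how the catalog table refers to it.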
If you've verified all these points and are still encountering issues, it may be worth testing the connection directly using the AWS Glue connection test feature or the AWSSupport-TroubleshootGlueConnection runbook in AWS Systems Manager. These tools can help identify any network, IAM, or credential issues that might be preventing the connection from working properly.
Remember that AWS Glue Studio creates unified connections for PostgreSQL data sources, which may require additional steps for accessing Secrets Manager and VPC resources. Make sure you've completed any necessary additional configuration for these unified connections.
If the problem persists after checking these items, you may need to contact AWS support for further assistance, as there could be an underlying issue with the Glue service or how it's interacting with your specific PostgreSQL setup.
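While the catalog path is being investigated, one possible stopgap is to bypass the Data Catalog table and read through the Glue connection directly with `from_options`. This is a minimal sketch, assuming a Glue version whose JDBC connection options accept `useConnectionProperties` and `connectionName`; `build_jdbc_options` and `read_via_connection` are hypothetical helpers, and the connection name, table name, and `from_db` context name are placeholders.

```python
def build_jdbc_options(connection_name, dbtable):
    """Build connection_options that reuse a Glue connection's stored settings."""
    return {
        # Ask Glue to take the JDBC URL and credentials from the named connection.
        "useConnectionProperties": "true",
        "connectionName": connection_name,
        # Table to read, e.g. "public.my_table_name" (placeholder).
        "dbtable": dbtable,
    }


def read_via_connection(glue_context, connection_name, dbtable, ctx="from_db"):
    """Read a PostgreSQL table directly, without going through the Data Catalog.

    glue_context is expected to be an awsglue.context.GlueContext.
    """
    return glue_context.create_dynamic_frame.from_options(
        connection_type="postgresql",
        connection_options=build_jdbc_options(connection_name, dbtable),
        transformation_ctx=ctx,
    )


# Example usage inside a Glue job (placeholder names):
#   frame = read_via_connection(glueContext, "PostgreSQL Connection", "public.my_table_name")
#   frame.printSchema()
```

Because this path does not consult the catalog table at all, it does not explain the catalog error, but it can keep a job running and confirms that the connection itself is usable.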
Sources
AWS Glue connection properties - AWS Glue
Troubleshoot AWS Glue connection to a JDBC data source | AWS re:Post
Using custom connectors and connections with AWS Glue Studio - AWS Glue
As mentioned, I have created a connection to the DB. I have created Glue jobs that can successfully read from the DB, directly. I have created a Crawler that can read from the DB and has created a Data Catalog table with the correct schema.
The problem is that I am unable to read from that table. Here is a more complete example:
This successfully prints out the from_db schema but fails after defining from_catalog. In the logs, I see:
DataCatalogWrapper: got connection from glue client with name: PostgreSQL Connection and type: POSTGRESQL
Just before:
getCatalogSource: We don't support this connection type: null