How to connect a SageMaker notebook to the Glue Catalog


A customer wants to connect a SageMaker notebook to the Glue Catalog but, because of security constraints, is not allowed to use developer endpoints.

I can't seem to find documentation on the Glue Catalog API that would allow this, or examples of how this might be done. Any links or pointers would be greatly appreciated.

Accepted Answer

The Glue Catalog API lets you describe databases, tables, and so on. Documentation for the calls and data structures can be found here:

Boto3 for get_table
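As a rough illustration of what a get_table call looks like from a notebook, here is a minimal sketch. The helper name summarize_table, the database "sales_db", the table "orders", and the region are all made up for the example; the notebook's execution role is assumed to have glue:GetTable permission on the relevant resources.

```python
def summarize_table(glue_client, database, table):
    """Return location, columns and partition keys for one Glue Catalog table."""
    # get_table is the boto3 Glue client call; it returns a dict with a "Table" key.
    response = glue_client.get_table(DatabaseName=database, Name=table)
    table_def = response["Table"]
    storage = table_def.get("StorageDescriptor", {})
    return {
        "location": storage.get("Location"),
        "columns": [(c["Name"], c.get("Type")) for c in storage.get("Columns", [])],
        "partition_keys": [k["Name"] for k in table_def.get("PartitionKeys", [])],
    }


def main():
    # Imported here so the helper above can be exercised without boto3 installed.
    import boto3

    # Hypothetical region, database and table names; replace with real ones.
    glue = boto3.client("glue", region_name="us-east-1")
    print(summarize_table(glue, "sales_db", "orders"))
```

Calling main() from a notebook cell prints the S3 location and schema, which is often all that is needed to read the underlying data directly with Pandas or another reader.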

If they have a restrictive security posture (as the avoidance of dev endpoints suggests), you could also suggest a Glue VPC endpoint:
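For reference, an interface VPC endpoint for Glue could be requested with boto3 roughly as follows. This is only a sketch: the helper glue_endpoint_request is invented for the example, the VPC, subnet, and security group IDs are placeholders, and in practice the customer's networking team may prefer to create the endpoint through CloudFormation or the console.

```python
def glue_endpoint_request(region, vpc_id, subnet_ids, security_group_ids):
    """Build the arguments for ec2.create_vpc_endpoint for the Glue service."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # Glue's interface endpoint service name follows this regional pattern.
        "ServiceName": f"com.amazonaws.{region}.glue",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # With private DNS, the default Glue hostname resolves inside the VPC.
        "PrivateDnsEnabled": True,
    }


def main():
    # Imported here so the helper above can be exercised without boto3 installed.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    # Placeholder IDs; substitute the customer's own VPC resources.
    params = glue_endpoint_request(
        "us-east-1",
        "vpc-0123456789abcdef0",
        ["subnet-0123456789abcdef0"],
        ["sg-0123456789abcdef0"],
    )
    response = ec2.create_vpc_endpoint(**params)
    print(response["VpcEndpoint"]["VpcEndpointId"])
```

With the endpoint in place and private DNS enabled, catalog calls from a notebook attached to that VPC should stay on the private network without any change to the client code.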

I would ask what they are accessing the catalog for, since a dev endpoint isn't really about the Glue Catalog but about the compute resources and SparkMagic.

Also consider steering them towards AWS Data Wrangler for interacting with the Glue Catalog if they are using Pandas. Helpful snippets can be found here:
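To give a flavour of what that looks like, here is a small sketch using the awswrangler package. The database "sales_db", the table "orders", and the helper preview_query are hypothetical. The Athena call assumes the notebook role is allowed to run Athena queries and write results to S3, which may not hold in a locked-down account.

```python
def preview_query(table, limit=10):
    """Build a small Athena query that previews a catalog table."""
    return f'SELECT * FROM "{table}" LIMIT {int(limit)}'


def main():
    # Imported here so the helper above can be exercised without awswrangler installed.
    import awswrangler as wr

    # List the databases registered in the Glue Catalog as a Pandas DataFrame.
    print(wr.catalog.databases())

    # Column names, types and partition flags for one table (hypothetical names).
    print(wr.catalog.table(database="sales_db", table="orders"))

    # Pull a few rows into Pandas through Athena, using the catalog for the schema.
    df = wr.athena.read_sql_query(preview_query("orders"), database="sales_db")
    print(df.head())
```

Everything comes back as a Pandas DataFrame, so no Spark session or dev endpoint is involved.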

