Hi all,
I followed the instructions at https://docs.aws.amazon.com/athena/latest/ug/connect-data-source-serverless-app-repo.html to deploy Timestream as an additional data source for Athena, and I can successfully query Timestream data via the Athena console using the catalog "TimestreamCatalog" that I added.
Now I need to use the same catalog "TimestreamCatalog" when building a Glue job.
I run:
DataCatalogtable_node1 = glueContext.create_dynamic_frame.from_catalog(
    catalog_id="TimestreamCatalog",
    database="mydb",
    table_name="mytable",
    transformation_ctx="DataCatalogtable_node1",
)
and run into this error, even when the role in question has an Administrator policy (i.e. Action: *, Resource: *) attached for the sake of experiment:
An error occurred while calling o86.getCatalogSource. User: arn:aws:sts::*******:assumed-role/AWSGlueServiceRole-andrei/GlueJobRunnerSession is not authorized to perform: glue:GetTable on resource: arn:aws:glue:eu-central-1:TimestreamCatalog:catalog (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: 36d7e411-8ca9-4993-9066-b6ca1d7ea4a3; Proxy: null)
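As a quick sanity check on the error above, here is a minimal sketch that splits the ARN from the message into its fields. The helper name split_arn is hypothetical, and the field layout is an assumption based on the general arn:partition:service:region:account-id:resource format.

```python
# Hypothetical helper: split an ARN into named fields.
# Assumes the general layout arn:partition:service:region:account-id:resource.
def split_arn(arn):
    fields = ["prefix", "partition", "service", "region", "account_id", "resource"]
    # maxsplit=5 keeps any colons inside the resource part intact
    return dict(zip(fields, arn.split(":", 5)))

# ARN copied verbatim from the AccessDeniedException above
parts = split_arn("arn:aws:glue:eu-central-1:TimestreamCatalog:catalog")
print(parts["account_id"])
print(parts["resource"])
```

The account-id field comes out as "TimestreamCatalog", which suggests (though I have not confirmed it) that Glue is interpreting catalog_id as an AWS account ID rather than as an Athena data source name.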
When calling aws athena list-data-catalogs, I get:
{
    "DataCatalogsSummary": [
        {
            "CatalogName": "AwsDataCatalog",
            "Type": "GLUE"
        },
        {
            "CatalogName": "TimestreamCatalog",
            "Type": "LAMBDA"
        }
    ]
}
I am not sure whether using the data source name as catalog_id is correct here, so any hint on what catalog_id is supposed to be for a custom data source, or on how to resolve the issue above, is appreciated.
Thanks, Andrei