Hi,
- I have set up a working connection to Snowflake in AWS Glue, following the instructions from here.
The URL looks like this:
jdbc:snowflake://account_name.snowflakecomputing.com/?user=user_name&db=sample&role=role_name&warehouse=warehouse_name
This matches the format given in the documentation.
- I created a crawler that uses this connection for a selected table from Snowflake.
- The crawler ran successfully, and the table and its schema were loaded correctly.
- In the Glue job (created from a script, not the visual builder), calling the create_dynamic_frame_from_catalog method gives the following error:
23/06/22 10:58:18 ERROR ProcessLauncher: Error from Python:Traceback (most recent call last):
  File "/tmp/sample.py", line 94, in <module>
    GluePythonSampleJob().run()
  File "/tmp/sample.py", line 52, in run
    dyf = self.read_data_from_catalog(self.context)
  File "/tmp/sample.py", line 63, in read_data_from_s3
    table_name='test-table',
  File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 186, in create_dynamic_frame_from_catalog
    makeOptions(self._sc, additional_options), catalog_id),
  File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.IllegalArgumentException: No group with name <host>
Method body:
dyf = glue_context.create_dynamic_frame_from_catalog(
    database='db-snowflake',
    table_name='test-table',
    transformation_ctx="datasource0")
When the table is created by a crawler from a CSV file in S3, everything works fine.
To me this looks like a problem with the Snowflake connection. However, I do not understand why it occurs, since the crawler read all the data correctly through the same connection.
UPDATE
Exactly the same happens for a visual job based on this catalog.
What can I do?
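As a sanity check on the URL itself, it can be split into its parts to confirm that the host and the query parameters are what the connection is expected to carry. This is a minimal sketch using only the Python standard library. The helper name parse_snowflake_jdbc_url is hypothetical, and nothing here reflects how Glue parses the URL internally.

```python
from urllib.parse import urlsplit, parse_qs


def parse_snowflake_jdbc_url(jdbc_url):
    """Split a Snowflake JDBC URL into scheme, host and query parameters.

    Hypothetical helper for inspection only; not part of Glue or Snowflake.
    """
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        raise ValueError("not a JDBC URL: " + jdbc_url)
    # Drop the "jdbc:" prefix so urlsplit sees "snowflake://host/?query".
    parts = urlsplit(jdbc_url[len(prefix):])
    # parse_qs returns a list per key; keep the first value of each parameter.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"scheme": parts.scheme, "host": parts.hostname, "params": params}


# The URL from the question, with its placeholder values.
url = ("jdbc:snowflake://account_name.snowflakecomputing.com/"
       "?user=user_name&db=sample&role=role_name&warehouse=warehouse_name")
info = parse_snowflake_jdbc_url(url)
print(info["host"])
print(sorted(info["params"]))
```

If the host and parameters come out as expected, the URL is at least well-formed, which would point away from a typo in the connection and towards how the job reads it.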
Sounds like a bug in the URL handling, since Snowflake works with an account instead of a host. Do you have the rest of the exception stack trace, to see what is trying to get the host?
@twood Did you solve this issue? I'm getting the same error. Can you post your answer here?
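To illustrate the kind of failure this hypothesis describes, here is a small Python analogue. It assumes, without confirmation, that some code asks a URL pattern for a named group called host, and that the pattern chosen for Snowflake defines only an account group. Both patterns and the extract_host function are made up for the illustration and are not Glue's real ones.

```python
import re

# Hypothetical patterns, for illustration only.
# One captures a "host" group, the other captures an "account" group instead.
HOST_PATTERN = re.compile(r"jdbc:(?P<vendor>\w+)://(?P<host>[^/:?]+)")
ACCOUNT_PATTERN = re.compile(r"jdbc:(?P<vendor>\w+)://(?P<account>[^/.]+)\.")

URL = ("jdbc:snowflake://account_name.snowflakecomputing.com/"
       "?user=user_name&db=sample&role=role_name&warehouse=warehouse_name")


def extract_host(pattern, jdbc_url):
    """Return the "host" named group, or raise ValueError if it is unavailable."""
    match = pattern.match(jdbc_url)
    if match is None:
        raise ValueError("URL does not match the pattern")
    try:
        return match.group("host")
    except IndexError as exc:
        # Python raises IndexError when the pattern defines no such group;
        # the message below mirrors the one seen in the question.
        raise ValueError("No group with name <host>") from exc


print(extract_host(HOST_PATTERN, URL))
try:
    extract_host(ACCOUNT_PATTERN, URL)
except ValueError as err:
    print("failed:", err)
```

The point is only that the URL can be perfectly valid and still trigger this error if the code reading it expects a group that the matching pattern never defines. The full stack trace would be needed to confirm whether that is what happens here.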