Hi, I have run a crawler that connects to Snowflake over JDBC and creates a table in the Glue Data Catalog database; that works fine.
Now I want to use Glue Studio to read that source (AWS Glue Data Catalog, with the table created by the crawler above), apply some transformations, and then write the result to an S3 bucket. The flow is:
AWS Glue Data Catalog
|
Filter (C_CUSTKEY=1)
|
S3
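For reference, the flow above corresponds roughly to the following Glue script. This is a minimal sketch, not the script Glue Studio actually generated: the output path `OUTPUT_PATH`, the JSON output format, and the function names `keep_customer` and `run_job` are hypothetical, and the column name casing (`C_CUSTKEY`) is assumed to match the catalog table. The traceback in the logs points at the `from_catalog` call, so the failure is in the read step, before the filter or the S3 write run.

```python
# Hypothetical output location; replace with a real bucket and prefix.
OUTPUT_PATH = "s3://example-bucket/customer-filtered/"


def keep_customer(row):
    """Filter predicate: keep only rows whose C_CUSTKEY equals 1."""
    return row["C_CUSTKEY"] == 1


def run_job():
    # Imported lazily because these modules exist only inside the Glue runtime.
    from awsglue.context import GlueContext
    from awsglue.transforms import Filter
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Source node: the catalog table created by the crawler.
    source = glue_context.create_dynamic_frame.from_catalog(
        database="testsf",
        table_name="snowflake_sample_data_tpch_sf1_customer",
        transformation_ctx="source",
    )

    # Filter node: C_CUSTKEY = 1.
    filtered = Filter.apply(
        frame=source, f=keep_customer, transformation_ctx="filtered"
    )

    # Target node: write the filtered rows to S3 (format is an assumption).
    glue_context.write_dynamic_frame.from_options(
        frame=filtered,
        connection_type="s3",
        connection_options={"path": OUTPUT_PATH},
        format="json",
        transformation_ctx="sink",
    )

# In a Glue job script, run_job() would be called at module level.
```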
CloudWatch shows the following errors:
2023-05-17T14:12:10.801+02:00
23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis]
{
"Event": "GlueETLJobExceptionEvent",
"Timestamp": 1684325530798,
"Failure Reason": "Traceback (most recent call last):\n File "/tmp/SFCatalogToS3.py", line 20, in <module>\n transformation_ctx="AWSGlueDataCatalog_node1684308467551",\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", line 787, in from_catalog\n return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 186, in create_dynamic_frame_from_catalog\n makeOptions(self._sc, additional_options), catalog_id),\n File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in call\n answer, self.gateway_client, self.target_id, self.name)\n File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco\n raise converted from None\npyspark.sql.utils.IllegalArgumentException: No group with name <host>",
"Stack Trace": [
{
"Declaring Class": "deco",
"Method Name": "raise converted from None",
"File Name": "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py",
"Line Number": 117
},
{
"Declaring Class": "call",
"Method Name": "answer, self.gateway_client, self.target_id, self.name)",
"File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py",
"Line Number": 1305
},
{
"Declaring Class": "create_dynamic_frame_from_catalog",
"Method Name": "makeOptions(self._sc, additional_options), catalog_id),",
"File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py",
"Line Number": 186
},
{
"Declaring Class": "from_catalog",
"Method Name": "return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)",
"File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py",
"Line Number": 787
},
{
"Declaring Class": "<module>",
"Method Name": "transformation_ctx="AWSGlueDataCatalog_node1684308467551",",
"File Name": "/tmp/SFCatalogToS3.py",
"Line Number": 20
}
],
"Last Executed Line number": 20,
"script": "SFCatalogToS3.py"
}
2023-05-17T14:12:10.934+02:00
23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] Last Executed Line number from script SFCatalogToS3.py: 20
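Since the exception event is logged as JSON, the root-cause line can be pulled out programmatically instead of being read from the full traceback. The helper below is only a sketch: `summarize_glue_exception` is a hypothetical name, and `sample` is an abbreviated stand-in for the event above, assuming the real log line is valid JSON with properly escaped quotes.

```python
import json


def summarize_glue_exception(event_json: str) -> dict:
    """Return the final error line plus script name and line number."""
    event = json.loads(event_json)
    reason = event.get("Failure Reason", "")
    # The last non-empty line of a Python traceback is the exception itself.
    lines = [ln.strip() for ln in reason.splitlines() if ln.strip()]
    return {
        "error": lines[-1] if lines else "",
        "script": event.get("script"),
        "line": event.get("Last Executed Line number"),
    }


# Abbreviated sample built from the fields shown in the log above.
sample = json.dumps({
    "Event": "GlueETLJobExceptionEvent",
    "Failure Reason": (
        "Traceback (most recent call last):\n"
        '  File "/tmp/SFCatalogToS3.py", line 20, in <module>\n'
        "pyspark.sql.utils.IllegalArgumentException: "
        "No group with name <host>"
    ),
    "Last Executed Line number": 20,
    "script": "SFCatalogToS3.py",
})

summary = summarize_glue_exception(sample)
print(summary["error"])
```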
Is it possible to do what I am trying to do? Cheers.
Hi, no, I'm not passing anything. My only source is "AWS Glue Data Catalog", I apply one filter, and the target is S3. The table was created by a crawler and exists in the catalog:
Name: snowflake_sample_data_tpch_sf1_customer
Database: testsf
Location: SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER
Connection: snowflake-glue-jdbc-connection2
Please open a support ticket; it sounds like there is something in the table or connection configuration that the job is not able to handle correctly.