Glue: Snowflake table cataloged, send to S3 after some transformations


Hi, I ran a crawler that connects to Snowflake over JDBC, and it created a table in the Glue catalog database; that part works nicely. Now I want to use Glue Studio to take that source (AWS Glue Data Catalog, with the table created by the crawler above), apply some transformations, and then write to an S3 bucket. The flow is: AWS Glue Data Catalog | Filter (C_CUSTKEY = 1) | S3. In CloudWatch it shows the following errors:

23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] {"Event":"GlueETLJobExceptionEvent","Timestamp":1684325530798,"Failure Reason":"Traceback (most recent call last):\n File "/tmp/SFCatalogToS3.py", line 20, in <module>\n transformation_ctx="AWSGlueDataCatalog_node1684308467551",\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", line 787, in from_catalog\n return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 186, in create_dynamic_frame_from_catalog\n makeOptions(self._sc, additional_options), catalog_id),\n File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in call\n answer, self.gateway_client, self.target_id, self.name)\n File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco\n raise converted from None\npyspark.sql.utils.IllegalArgumentException: No group with name <host>","Stack Trace":[{"Declaring Class":"deco","Method Name":"raise converted from None","File Name":"/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py","Line Number":117},{"Declaring Class":"call","Method Name":"answer, self.gateway_client, self.target_id, self.name)","File Name":"/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py","Line Number":1305},{"Declaring Class":"create_dynamic_frame_from_catalog","Method Name":"makeOptions(self._sc, additional_options), catalog_id),","File Name":"/opt/amazon/lib/python3.6/site-packages/awsglue/context.py","Line Number":186},{"Declaring Class":"from_catalog","Method Name":"return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)","File Name":"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py","Line Number":787},{"Declaring Class":"<module>","Method Name":"transformation_ctx="AWSGlueDataCatalog_node1684308467551",","File Name":"/tmp/SFCatalogToS3.py","Line Number":20}],"Last Executed Line number":20,"script":"SFCatalogToS3.py"}

23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] Last Executed Line number from script SFCatalogToS3.py: 20
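
For reference, the script Glue Studio generates for a flow like this is roughly equivalent to the sketch below (the S3 output path is a placeholder, not my real bucket; the database and table names are the ones from my catalog, and the source transformation_ctx is the one shown in the traceback):

    import sys
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.transforms import Filter
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glueContext = GlueContext(SparkContext())
    job = Job(glueContext)
    job.init(args["JOB_NAME"], args)

    # Source: the catalog table created by the JDBC crawler.
    # This is the call (around line 20) where the job fails.
    source = glueContext.create_dynamic_frame.from_catalog(
        database="testsf",
        table_name="snowflake_sample_data_tpch_sf1_customer",
        transformation_ctx="AWSGlueDataCatalog_node1684308467551",
    )

    # Transform: keep only the row(s) where C_CUSTKEY = 1.
    filtered = Filter.apply(
        frame=source,
        f=lambda row: row["C_CUSTKEY"] == 1,
        transformation_ctx="Filter_node",
    )

    # Target: write the filtered rows to S3 (placeholder path).
    glueContext.write_dynamic_frame.from_options(
        frame=filtered,
        connection_type="s3",
        connection_options={"path": "s3://my-output-bucket/customer/"},
        format="parquet",
        transformation_ctx="S3_node",
    )

    job.commit()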

Is it possible to do what I'm trying to do? Cheers.

Willi5
asked a year ago · 431 views
1 Answer

It sounds like you are passing an incorrect option to create_dynamic_frame_from_catalog. Are you specifying anything in addition to the catalog database and table?

AWS
EXPERT
answered a year ago
  • Hi, no, I'm not passing anything; my source is just "AWS Glue Data Catalog", I apply one filter, and the target is S3. The table was created by a crawler and exists in the catalog:

    Name: snowflake_sample_data_tpch_sf1_customer
    Database: testsf
    Location: SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER
    Connection: snowflake-glue-jdbc-connection2

  • Please open a support ticket; it sounds like there is something in the table or connection configuration that the job is not able to handle correctly. As a stopgap, a direct read from Snowflake might work; a sketch follows below.
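
    One possible workaround (a sketch, not a confirmed fix: it assumes the Snowflake Spark connector and JDBC driver JARs are attached to the job, and every connection value below is a placeholder) is to bypass the catalog read that fails here, load the table directly with Spark, and convert back to a DynamicFrame so the Filter and S3 steps stay unchanged:

        # Sketch: read directly via the Snowflake Spark connector, skipping
        # the catalog lookup that fails with "No group with name <host>".
        # Assumes the Snowflake Spark connector and JDBC driver JARs are
        # attached to the job; every value below is a placeholder.
        from awsglue.dynamicframe import DynamicFrame

        spark = glueContext.spark_session

        sf_options = {
            "sfURL": "<account>.snowflakecomputing.com",
            "sfUser": "<user>",
            "sfPassword": "<password>",  # better: fetch from AWS Secrets Manager
            "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
            "sfSchema": "TPCH_SF1",
            "sfWarehouse": "<warehouse>",
        }

        df = (
            spark.read.format("net.snowflake.spark.snowflake")
            .options(**sf_options)
            .option("dbtable", "CUSTOMER")
            .load()
        )

        # Convert back to a DynamicFrame so the rest of the flow is unchanged.
        source = DynamicFrame.fromDF(df, glueContext, "snowflake_source")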
