Glue: Snowflake table cataloged, send to S3 after some transformations


Hi, I ran a crawler that connects via JDBC to Snowflake, and it creates a table in the Glue catalog database; that works nicely. Now I want to use Glue Studio to take the source (AWS Glue Data Catalog, with the table created by the crawler above), do some transformations, and then write to an S3 bucket. The flow is: AWS Glue Data Catalog | Filter (C_CUSTKEY=1) | S3. In CloudWatch it's showing the following errors:

2023-05-17T14:12:10.801+02:00

23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] GlueETLJobExceptionEvent (script SFCatalogToS3.py):

Traceback (most recent call last):
  File "/tmp/SFCatalogToS3.py", line 20, in <module>
    transformation_ctx="AWSGlueDataCatalog_node1684308467551",
  File "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", line 787, in from_catalog
    return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)
  File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 186, in create_dynamic_frame_from_catalog
    makeOptions(self._sc, additional_options), catalog_id),
  File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.IllegalArgumentException: No group with name <host>

2023-05-17T14:12:10.934+02:00

23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] Last Executed Line number from script SFCatalogToS3.py: 20

Is it possible to do what I'm trying to do? Cheers.
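For reference, a Glue Studio job with a Catalog-source | Filter | S3-target flow generates a script along the lines of the sketch below. Only the script name (SFCatalogToS3.py) and the source node's transformation_ctx are confirmed by the traceback; the database and table names are the ones reported later in the thread, and the filter node, output path, and remaining node names are assumed boilerplate.

    import sys
    from awsglue.transforms import Filter
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glueContext = GlueContext(sc)
    spark = glueContext.spark_session
    job = Job(glueContext)
    job.init(args["JOB_NAME"], args)

    # Source: the catalog table created by the JDBC crawler.
    # The traceback's line 20 falls inside this call, so the job dies here,
    # before the filter or the S3 write ever run.
    source = glueContext.create_dynamic_frame.from_catalog(
        database="testsf",
        table_name="snowflake_sample_data_tpch_sf1_customer",
        transformation_ctx="AWSGlueDataCatalog_node1684308467551",
    )

    # Transform: keep only the row(s) with C_CUSTKEY = 1
    filtered = Filter.apply(
        frame=source,
        f=lambda row: row["C_CUSTKEY"] == 1,
        transformation_ctx="Filter_node",  # placeholder node name
    )

    # Target: write the result to S3 (bucket/path is a placeholder)
    glueContext.write_dynamic_frame.from_options(
        frame=filtered,
        connection_type="s3",
        connection_options={"path": "s3://my-target-bucket/customer/"},
        format="parquet",
        transformation_ctx="S3_node",  # placeholder node name
    )

    job.commit()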

  • Did you solve this issue? I'm getting the same error. Can you post your answer here?

Willi5
asked a year ago · 447 views
2 Answers

It sounds like you are passing an incorrect option to create_dynamic_frame_from_catalog. Are you specifying something in addition to the catalog and table?
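For context, this is what passing extra options would look like (using glueContext as in the sketch above). The from_catalog signature, including additional_options, is visible in the traceback; the option values here are hypothetical examples, using Glue's documented JDBC parallel-read options, and the asker reports passing none of them.

    # Hypothetical illustration only; the asker passes no additional_options.
    dyf = glueContext.create_dynamic_frame.from_catalog(
        database="testsf",
        table_name="snowflake_sample_data_tpch_sf1_customer",
        additional_options={
            "hashexpression": "C_CUSTKEY",  # column used to split a JDBC read
            "hashpartitions": "4",          # number of parallel JDBC partitions
        },
        transformation_ctx="AWSGlueDataCatalog_node1684308467551",
    )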

AWS
EXPERT
answered a year ago
  • Hi, no, I'm not passing anything; my source is just "AWS Glue Data Catalog", I apply one filter, and the target is S3. The table was created by a crawler and exists in the catalog:

        Name:       snowflake_sample_data_tpch_sf1_customer
        Database:   testsf
        Location:   SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER
        Connection: snowflake-glue-jdbc-connection2

  • Please open a support ticket; it sounds like there is something in the table or connection configuration that the job is not able to handle correctly (see the sketch below for one guess at what that could be).
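A note on what that configuration issue might be: "No group with name <host>" is the JVM's complaint when code asks a regex matcher for a named group that the compiled pattern does not define. That points at the parsing of the JDBC URL stored in the Snowflake connection, whose shape (no host:port/database segment) differs from classic JDBC URLs. This is a guess at the mechanism, not a statement about Glue internals; a minimal sketch of how such an error arises:

    import re

    # Assumed stand-in patterns; Glue's real internal URL handling is not public.
    with_host = re.compile(
        r"jdbc:(?P<vendor>\w+)://(?P<host>[^:/]+):(?P<port>\d+)/(?P<db>\w+)"
    )
    without_host = re.compile(r"jdbc:(?P<vendor>\w+)://(?P<rest>.+)")

    url = "jdbc:snowflake://myaccount.snowflakecomputing.com/?db=SNOWFLAKE_SAMPLE_DATA"

    print(with_host.match(url))    # None: no ":port" segment, so the
                                   # conventional pattern does not match
    m = without_host.match(url)    # this looser pattern does match...
    print(m.group("host"))         # ...but asking for "host" fails: Python
                                   # raises IndexError ("no such group"); the
                                   # JVM equivalent is the
                                   # IllegalArgumentException seen in the log

If that is the cause, the fix would lie in how the connection's JDBC URL is written, rather than in the job script itself.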


@Willi5 Were you able to fix this? I'm getting the same error; can you post your answer here?

sb
answered 3 days ago
