Glue: Snowflake table cataloged, write to S3 after some transformations.

0

Hi, I ran a crawler that connects over JDBC to Snowflake and creates a table in the Glue catalog database; that part works fine. Now I want to use Glue Studio to take that source (AWS Glue Data Catalog, with the table created by the crawler above), apply some transformations, and then write to an S3 bucket. The flow is: AWS Glue Data Catalog | Filter (C_CUSTKEY=1) | S3. In CloudWatch it shows the following errors:

2023-05-17T14:12:10.801+02:00

23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] { "Event": "GlueETLJobExceptionEvent", "Timestamp": 1684325530798, "Failure Reason": "Traceback (most recent call last):\n File "/tmp/SFCatalogToS3.py", line 20, in <module>\n transformation_ctx="AWSGlueDataCatalog_node1684308467551",\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", line 787, in from_catalog\n return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)\n File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 186, in create_dynamic_frame_from_catalog\n makeOptions(self._sc, additional_options), catalog_id),\n File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in call\n answer, self.gateway_client, self.target_id, self.name)\n File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco\n raise converted from None\npyspark.sql.utils.IllegalArgumentException: No group with name <host>", "Stack Trace": [ { "Declaring Class": "deco", "Method Name": "raise converted from None", "File Name": "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", "Line Number": 117 }, { "Declaring Class": "call", "Method Name": "answer, self.gateway_client, self.target_id, self.name)", "File Name": "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", "Line Number": 1305 }, { "Declaring Class": "create_dynamic_frame_from_catalog", "Method Name": "makeOptions(self._sc, additional_options), catalog_id),", "File Name": "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", "Line Number": 186 }, { "Declaring Class": "from_catalog", "Method Name": "return self._glue_context.create_dynamic_frame_from_catalog(db, table_name, redshift_tmp_dir, transformation_ctx, push_down_predicate, additional_options, catalog_id, **kwargs)", "File Name": 
"/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", "Line Number": 787 }, { "Declaring Class": "<module>", "Method Name": "transformation_ctx="AWSGlueDataCatalog_node1684308467551",", "File Name": "/tmp/SFCatalogToS3.py", "Line Number": 20 } ], "Last Executed Line number": 20, "script": "SFCatalogToS3.py" }


2023-05-17T14:12:10.934+02:00

23/05/17 12:12:10 ERROR GlueExceptionAnalysisListener: [Glue Exception Analysis] Last Executed Line number from script SFCatalogToS3.py: 20

Is it possible to do what I'm trying to do? Cheers.
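For context on the failure: the `IllegalArgumentException: No group with name <host>` suggests that somewhere in the Spark/Glue JDBC layer a regex match against the connection URL is asked for a named `host` group that the Snowflake-style URL does not satisfy. A minimal Python analogue of that failure mode (the pattern below is a simplified stand-in for illustration, not Spark's actual regex):

```python
import re

# Hypothetical, simplified JDBC URL pattern with a named <host> group,
# similar in spirit to what a JDBC layer might use to split a URL.
URL_PATTERN = re.compile(r"jdbc:\w+://(?P<host>[^:/]+)(?::(?P<port>\d+))?")

def extract_host(jdbc_url: str):
    """Return the host part of a JDBC URL, or None if the URL doesn't match."""
    m = URL_PATTERN.match(jdbc_url)
    return m.group("host") if m else None

# A URL with the expected "://host" section parses fine:
print(extract_host("jdbc:snowflake://myaccount.snowflakecomputing.com:443"))
# -> myaccount.snowflakecomputing.com

# A URL missing the "//" (a form that sometimes ends up in connection
# definitions) does not match at all, so no host can be extracted:
print(extract_host("jdbc:snowflake:myaccount.snowflakecomputing.com"))
# -> None
```

On the Java side the equivalent lookup of a missing `host` group raises the `No group with name <host>` exception seen in the CloudWatch log, which points at the connection URL rather than at the filter or S3 steps.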

  • Did you solve this issue? I'm getting the same error. Can you post your answer here?

Willi5
asked a year ago · 447 views
2 Answers
0

It sounds like you are passing an incorrect option to create_dynamic_frame_from_catalog. Are you specifying anything in addition to the catalog and table?

AWS
EXPERT
answered a year ago
  • Hi, no, I'm not passing anything. My only source is "AWS Glue Data Catalog", I apply one filter, and the target is S3. The table was created by a crawler and exists in the catalog:

        Name: snowflake_sample_data_tpch_sf1_customer
        Database: testsf
        Location: SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER
        Connection: snowflake-glue-jdbc-connection2

  • Please open a support ticket; it sounds like there is something in the table or connection configuration that the job is not able to handle correctly.
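Before (or alongside) a support ticket, it may be worth sanity-checking the JDBC URL stored in the Glue connection (`snowflake-glue-jdbc-connection2` in this thread). A rough local check of the `jdbc:snowflake://<host>[:<port>]/` shape that JDBC URL parsers generally expect; the pattern and the example URLs below are illustrative assumptions, not values taken from this question:

```python
import re

# Illustrative pattern for the usual Snowflake JDBC URL shape:
#   jdbc:snowflake://<account>.snowflakecomputing.com[:<port>]/
EXPECTED = re.compile(
    r"^jdbc:snowflake://(?P<host>[\w.-]+\.snowflakecomputing\.com)"
    r"(?::(?P<port>\d+))?/?"
)

def looks_valid(url: str) -> bool:
    """True if the URL matches the expected jdbc:snowflake://host[:port]/ shape."""
    return EXPECTED.match(url) is not None

# Well-formed URL (placeholder account name):
assert looks_valid("jdbc:snowflake://myaccount.eu-west-1.snowflakecomputing.com:443/")
# Missing the "//host" section, the kind of URL that can break host extraction:
assert not looks_valid("jdbc:snowflake:myaccount.snowflakecomputing.com")
```

If the stored URL fails a check like this, editing the Glue connection's JDBC URL may be faster than waiting on a ticket.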

0

@Willi5 Were you able to fix this? I'm getting the same error; can you post your answer here?

sb
answered 2 days ago
