1 Answer
You can modify your AWS Glue script to convert the struct column to a JSON string before writing:
import pyspark.sql.functions as F
from awsglue.dynamicframe import DynamicFrame  # needed for DynamicFrame.fromDF below
dfc = ChangeSchema_node1685651062990.toDF()  # convert dynamic frame to dataframe
dfc = dfc.withColumn("request_payload", F.to_json("request_payload"))  # convert struct to json string
ChangeSchema_node1685651062990 = DynamicFrame.fromDF(dfc, glueContext, "ChangeSchema_node1685651062990")  # convert back to dynamic frame
Script:
AWSGlueDataCatalog_node1685651050820 = glueContext.create_dynamic_frame.from_catalog(
    database="dynamodb-to-analyticsdb",
    table_name="prompthistory",
    transformation_ctx="AWSGlueDataCatalog_node1685651050820",
)

PromptHistory_df = AWSGlueDataCatalog_node1685651050820.toDF()
PromptHistory_df = PromptHistory_df.withColumn("request_payload", F.to_json("request_payload"))
PromptHistory_df = PromptHistory_df.withColumn("created", F.to_timestamp("created"))
AWSGlueDataCatalog_node1685651050820 = DynamicFrame.fromDF(PromptHistory_df, glueContext, "AWSGlueDataCatalog_node1685651050820")

AWSGlueDataCatalog_node1685651092780 = glueContext.write_dynamic_frame.from_catalog(
    frame=AWSGlueDataCatalog_node1685651050820,
    database="ablt-ai-analytics-12-14",
    table_name="postgres_public_prompthistory_06e0ad580a8f6d2ff0e57e074d377329",
    transformation_ctx="AWSGlueDataCatalog_node1685651092780",
)
job.commit()
This is close to what I need, but there are still several issues:
The conversion actually has to be done one level up, on "AWSGlueDataCatalog_node1685651050820" rather than on "ChangeSchema_node1685651062990", because to_json does not accept a string as its input type but does accept a struct.
So I rewrote the script, but the write still fails. Checking the schema of both the DataFrame and the DynamicFrame shows the data type of request_payload as string, so when I write to Postgres I still get the error: "An error occurred while calling o110.pyWriteDynamicFrame. ERROR: column "request_payload" is of type json but expression is of type character varying"
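One thing that may be worth trying for the "json but expression is of type character varying" error: the PostgreSQL JDBC driver has a stringtype connection parameter, and with stringtype=unspecified it sends string parameters untyped so the server can infer json for the target column. The sketch below only shows how that parameter could be appended to a JDBC URL. The helper name and the host/database in the usage lines are hypothetical, and it assumes you are able to edit the JDBC URL of the Glue connection behind the target catalog table, which I have not verified for this job.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_unspecified_stringtype(jdbc_url: str) -> str:
    """Return jdbc_url with stringtype=unspecified set (hypothetical helper).

    stringtype is a PostgreSQL JDBC driver parameter; "unspecified" makes the
    driver send string values untyped so the server infers the column type.
    """
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        raise ValueError("expected a URL starting with 'jdbc:'")

    # Strip the "jdbc:" prefix so the rest parses like an ordinary URL.
    parts = urlsplit(jdbc_url[len(prefix):])

    # Keep any existing parameters and add or overwrite stringtype.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params["stringtype"] = "unspecified"

    return prefix + urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical host and database names, for illustration only.
print(with_unspecified_stringtype("jdbc:postgresql://host:5432/analytics"))
print(with_unspecified_stringtype("jdbc:postgresql://host:5432/analytics?ssl=true"))
```

With that parameter in place the F.to_json step stays as it is; the column is still a string on the Spark side and Postgres is left to do the cast on insert.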