By using AWS re:Post, you agree to the Terms of Use

[Pandas] How to write data into JSON column of Postgres SQL

0

Hi,

I'm trying to write a dataframe into Postgres SQL table that has JSON column ("details"), using the following code

results = []
details_string = '{"name": "test"}'
json_object = json.loads(details_string)
results.append([1, json_object])
mySchema = StructType([ \
            StructField("event_id",IntegerType(), True), \
            StructField("details", StructType([StructField('name', StringType(), True)]), True) \
myResult = glueContext.createDataFrame(data = pd.DataFrame(results, columns=['event_id', 'details']), schema=mySchema)])
... then write to DB

However, there seems the issue with the mySchema field for JSON type. I've tried StructType, MapType, ArrayType, but each time I get different errors

this is for MapType

Job aborted due to stage failure: Task 4 in stage 182.0 failed 4 times, most recent failure: Lost task 4.3 in stage 182.0 (TID 1802, 172.36.213.211, executor 2): java.lang.IllegalArgumentException: Can't get JDBC type for map<string,string>

and this one for StructField("details", StructType([StructField('name', StringType(), True)]), True)

Job aborted due to stage failure: Task 3 in stage 211.0 failed 4 times, most recent failure: Lost task 3.3 in stage 211.0 (TID 2160, 172.36.18.91, executor 4): java.lang.IllegalArgumentException: Can't get JDBC type for structname:string

Does anyone have an example how to construct the schema for Dataframe to write the JSON into JSON Postgres SQL column?

1 Answer
0

Did you try using to_json function ?

from pyspark.sql.functions import to_json, struct
df.select(to_json(struct(struct([df[x] for x in df.columns]))).alias("jsonobject")
profile picture
answered a month ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions