[Pandas] How to write data into JSON column of Postgres SQL

0

Hi,

I'm trying to write a dataframe into Postgres SQL table that has JSON column ("details"), using the following code

results = []
details_string = '{"name": "test"}'
json_object = json.loads(details_string)
results.append([1, json_object])
mySchema = StructType([ \
            StructField("event_id",IntegerType(), True), \
            StructField("details", StructType([StructField('name', StringType(), True)]), True) \
myResult = glueContext.createDataFrame(data = pd.DataFrame(results, columns=['event_id', 'details']), schema=mySchema)])
... then write to DB

However, there seems the issue with the mySchema field for JSON type. I've tried StructType, MapType, ArrayType, but each time I get different errors

this is for MapType

Job aborted due to stage failure: Task 4 in stage 182.0 failed 4 times, most recent failure: Lost task 4.3 in stage 182.0 (TID 1802, 172.36.213.211, executor 2): java.lang.IllegalArgumentException: Can't get JDBC type for map<string,string>

and this one for StructField("details", StructType([StructField('name', StringType(), True)]), True)

Job aborted due to stage failure: Task 3 in stage 211.0 failed 4 times, most recent failure: Lost task 3.3 in stage 211.0 (TID 2160, 172.36.18.91, executor 4): java.lang.IllegalArgumentException: Can't get JDBC type for structname:string

Does anyone have an example how to construct the schema for Dataframe to write the JSON into JSON Postgres SQL column?

Michael
已提问 2 年前656 查看次数
1 回答
0

Did you try using to_json function ?

from pyspark.sql.functions import to_json, struct
df.select(to_json(struct(struct([df[x] for x in df.columns]))).alias("jsonobject")
profile pictureAWS
已回答 2 年前

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则