Questions tagged with AWS Glue

Data Mesh on AWS Lake Formation

Hi, I'm building a data mesh in AWS Lake Formation. The idea is to have 4 accounts:

- account 0: main account
- account 1: central data governance
- account 2: data producer
- account 3: data consumer

I have been looking for information on how to implement the mesh in AWS, and I'm following some tutorials that are very similar to what I'm doing:

- https://catalog.us-east-1.prod.workshops.aws/workshops/78572df7-d2ee-4f78-b698-7cafdb55135d/en-US/lakeformation-basics/cross-account-data-mesh
- https://aws.amazon.com/blogs/big-data/design-a-data-mesh-architecture-using-aws-lake-formation-and-aws-glue/
- https://aws.amazon.com/blogs/big-data/build-a-data-sharing-workflow-with-aws-lake-formation-for-your-data-mesh/

However, after creating the bucket and uploading some CSV data to it (in the producer account), I don't know whether I have to register the data in the Glue Data Catalog in the producer account first, or whether I should do it only in Lake Formation, as described here: https://catalog.us-east-1.prod.workshops.aws/workshops/78572df7-d2ee-4f78-b698-7cafdb55135d/en-US/lakeformation-basics/databases (Does this depend on whether one uses Glue permissions or Lake Formation permissions in the Lake Formation configuration?)

So far I have created the database and the table in Glue first. When I then open the database and table sections in Lake Formation, the database and table created from Glue appear there without my doing anything. They appear even if I disable these options:

- "Use only IAM access control for new databases"
- "Use only IAM access control for new tables in new databases"

Do you know whether Glue and Lake Formation share the Data Catalog? And am I doing this correctly?

Thanks, John
0 answers, 0 votes, 12 views, asked a day ago
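
Lake Formation is built on the AWS Glue Data Catalog, so a database created in Glue showing up in the Lake Formation console is expected. One way to see which permission model applies to a database is to look at its default grants: the "Use only IAM access control" settings correspond to an `ALL` grant to the `IAM_ALLOWED_PRINCIPALS` group. The following is a minimal sketch of that check, not a complete setup procedure. The function names are hypothetical, and `glue_client` is assumed to be a boto3 Glue client (for example `boto3.client("glue")`) in the producer account.

```python
IAM_ALLOWED = "IAM_ALLOWED_PRINCIPALS"


def uses_iam_only_defaults(database: dict) -> bool:
    """Return True if new tables in this database default to IAM-only access.

    `database` is the "Database" dict from a Glue get_database response.
    """
    for grant in database.get("CreateTableDefaultPermissions", []):
        principal = grant.get("Principal", {}).get("DataLakePrincipalIdentifier")
        if principal == IAM_ALLOWED and "ALL" in grant.get("Permissions", []):
            return True
    return False


def check_database(glue_client, name: str) -> bool:
    """Fetch one database from the Data Catalog and inspect its default grants."""
    response = glue_client.get_database(Name=name)
    return uses_iam_only_defaults(response["Database"])
```

If this returns `True` for the producer database, tables created in it are still governed by IAM policies alone, and Lake Formation grants will not restrict access until that default grant is removed.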

[Pandas] How to write data into a JSON column in PostgreSQL

Hi, I'm trying to write a DataFrame into a PostgreSQL table that has a JSON column ("details"), using the following code:

```
import json
import pandas as pd
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

results = []
details_string = '{"name": "test"}'
json_object = json.loads(details_string)
results.append([1, json_object])

# Schema with "details" declared as a nested struct (this is what fails on write)
mySchema = StructType([
    StructField("event_id", IntegerType(), True),
    StructField("details", StructType([StructField("name", StringType(), True)]), True),
])
myResult = glueContext.createDataFrame(data=pd.DataFrame(results, columns=["event_id", "details"]), schema=mySchema)
# ... then write to DB
```

However, there seems to be an issue with the type of the "details" field in mySchema. I've tried StructType, MapType, and ArrayType, but each one gives a different error.

This is the error for MapType:

> Job aborted due to stage failure: Task 4 in stage 182.0 failed 4 times, most recent failure: Lost task 4.3 in stage 182.0 (TID 1802, 172.36.213.211, executor 2): java.lang.IllegalArgumentException: Can't get JDBC type for map<string,string>

And this is the error for StructField("details", StructType([StructField('name', StringType(), True)]), True):

> Job aborted due to stage failure: Task 3 in stage 211.0 failed 4 times, most recent failure: Lost task 3.3 in stage 211.0 (TID 2160, 172.36.18.91, executor 4): java.lang.IllegalArgumentException: Can't get JDBC type for struct<name:string>

Does anyone have an example of how to construct the DataFrame schema so that the JSON is written into a PostgreSQL JSON column?
0 answers, 0 votes, 25 views, asked 6 days ago
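
The errors above come from Spark's JDBC writer, which has no mapping from map or struct columns to a JDBC type. One common workaround, sketched here under assumptions rather than as a confirmed fix for this job, is to serialize "details" to a JSON string, declare the column as `StringType`, and add the PostgreSQL JDBC driver's `stringtype=unspecified` connection parameter so the server can cast the text to `json` on insert. The helper names, and the `spark`, `jdbc_url`, `table`, and `properties` arguments, are hypothetical placeholders.

```python
import json


def to_jdbc_rows(results):
    """Serialize the dict in each [event_id, details] pair to a JSON string."""
    return [(event_id, json.dumps(details)) for event_id, details in results]


def with_unspecified_stringtype(jdbc_url):
    """Append stringtype=unspecified so the driver sends strings as untyped values."""
    separator = "&" if "?" in jdbc_url else "?"
    return f"{jdbc_url}{separator}stringtype=unspecified"


def write_events(spark, results, jdbc_url, table, properties):
    """Write the rows with "details" as JSON text (assumes an active SparkSession)."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark.sql.types import IntegerType, StringType, StructField, StructType

    schema = StructType([
        StructField("event_id", IntegerType(), True),
        StructField("details", StringType(), True),  # JSON text, not a struct
    ])
    df = spark.createDataFrame(to_jdbc_rows(results), schema=schema)
    df.write.jdbc(
        url=with_unspecified_stringtype(jdbc_url),
        table=table,
        mode="append",
        properties=properties,
    )
```

In a Glue job, `spark` would typically be `glueContext.spark_session`. Without the `stringtype=unspecified` parameter, PostgreSQL usually rejects the insert because the value arrives typed as `character varying` rather than `json`.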