AWS GLUE - ISSUE: REGEX EXTRACTOR NOT CREATING A COLUMN IN THE FINAL TABLE

I am trying to create an AWS Glue routine that consumes a database table from the Data Catalog and a CSV file, then joins them on two columns (one from each table). After that, I added a regex extractor to create a new column based on another column, removing the first character from it.

It works in the data preview, but when I run the job and save the result to an S3 bucket, the new column does not appear. Does anybody know what could be happening?

# Convert both DynamicFrames to Spark DataFrames so they can be joined
csv = csv.toDF()
baseaws = baseaws.toDF()
# Left join on one key column from each table, then convert back to a DynamicFrame
join_table = DynamicFrame.fromDF(csv.join(baseaws, (csv['key1'] == baseaws['key2']), "left"), glueContext, "Join_node1719507151724")

# Regex extractor: derive the new column "TOTAL" from "column_needed"
regex_add_column = join_table.gs_regex_extract(colName="column_needed", regex="^[+]", newCols="TOTAL")

# Write the result to S3 as Parquet and update the Data Catalog table
final_table = glueContext.getSink(path="url/Results/", connection_type="s3", updateBehavior="UPDATE_IN_DATABASE", partitionKeys=[], enableUpdateCatalog=True, transformation_ctx="final_table")
final_table.setCatalogInfo(catalogDatabase="url", catalogTableName="final_table")
final_table.setFormat("glueparquet", compression="snappy")
final_table.writeFrame(regex_add_column)
job.commit()
asked 2 years ago, 305 views
1 Answer

Check the logs of your Glue job. If there were any errors during the execution of the job, they would appear in the logs. Make sure you’re looking for a .parquet file in the S3 bucket.
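
It may also be worth checking what the pattern from the question actually matches. The sketch below uses Python's standard `re` module, not the Glue transform, so it says nothing certain about how `gs_regex_extract` treats a pattern without a capture group; that part is an assumption to verify against the job output. The sample value `"+1500"`, the helper names, and the alternative pattern with a capture group are all hypothetical, introduced only for illustration.

```python
import re
from typing import Optional

# Pattern copied from the question: a literal "+" at the start of the value.
PATTERN_FROM_QUESTION = r"^[+]"

# Hypothetical alternative: capture everything that follows a leading "+".
PATTERN_WITH_GROUP = r"^[+](.*)"


def matched_text(pattern: str, value: str) -> Optional[str]:
    """Return the text matched by the whole pattern, or None if nothing matches."""
    match = re.search(pattern, value)
    return match.group(0) if match else None


def strip_leading_plus(value: str) -> str:
    """Remove a single leading "+"; values without one are returned unchanged."""
    return re.sub(r"^[+]", "", value)


sample = "+1500"  # made-up sample value, not taken from the real table

# The original pattern matches only the "+" itself, not the rest of the value.
print(matched_text(PATTERN_FROM_QUESTION, sample))        # prints +

# With a capture group, the remainder of the value is available as group 1.
print(re.search(PATTERN_WITH_GROUP, sample).group(1))     # prints 1500

# Substitution is another way to drop the first character when it is a "+".
print(strip_leading_plus(sample))                         # prints 1500
```

In plain Python, `^[+]` matches only the leading plus sign, and values that do not start with `+` produce no match at all. If the goal is the value without its first character, a pattern with a capture group or a substitution step may be closer to the intent.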

EXPERT
answered 2 years ago
