Convert into date datatype for dynamic frame


Hello All Experts,

Please help with the below scenario.

Data is stored in the raw zone, and the column "ga4_dt" is extracted as a string in the format 'yyyymmdd', for example 20230108. I can't change the way the data is extracted.

I am using ApplyMapping.apply to rename attributes and cast columns to the proper data types on a DynamicFrame. One example mapping (source, source type, target, target type) is (engagementrate, string, engagement_rate, double).


I want to convert the ga4_dt column to the date data type in the format yyyy-mm-dd: (ga4_dt, string, ga4_date, date).

When I apply the date cast directly, all columns are populated as null.


I am aware that I can convert the DynamicFrame into a DataFrame and apply the transformation there, with something like df.select(col("ga4_dt"), to_date(col("ga4_dt"), "yyyyMMdd")).show()

However, I am looking for a solution that uses ApplyMapping.apply: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-extensions-dynamic-frame.html#pyspark-apply_mapping-example

I have not been able to find this in any documentation either. Please help.

Thanks

asked 8 months ago, 1036 views
2 Answers
Accepted Answer

ApplyMapping doesn't surface errors that occur in your code, which is probably why you get the empty columns.
Try debugging the function in plain Python, or catch any exception inside the function and write the message to a string column so you can see it.
I think the DataFrame alternative you mention is easier and more robust.
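
To illustrate the debugging advice above, here is a minimal sketch of a per-record function that can be tested in plain Python and that records any exception in a string column instead of silently producing nulls. The function name add_ga4_date and the column ga4_date_error are hypothetical; only ga4_dt and ga4_date come from the question.

```python
from datetime import datetime


def add_ga4_date(record):
    """Parse record['ga4_dt'] ('yyyymmdd' string) into an ISO 'yyyy-mm-dd' string.

    On failure, the exception text goes into the hypothetical
    'ga4_date_error' column and 'ga4_date' is left unset.
    """
    try:
        parsed = datetime.strptime(record["ga4_dt"].strip(), "%Y%m%d")
        record["ga4_date"] = parsed.strftime("%Y-%m-%d")
        record["ga4_date_error"] = ""
    except Exception as exc:
        # Keep the message so bad rows can be inspected later.
        record["ga4_date_error"] = f"{type(exc).__name__}: {exc}"
    return record


# Debug with plain Python before wiring it into a Glue job.
print(add_ga4_date({"ga4_dt": "20230108"}))
print(add_ga4_date({"ga4_dt": "not-a-date"}))
```

In a Glue job, a function like this could presumably be applied with Map.apply(frame=dyf, f=add_ga4_date), where dyf is your DynamicFrame. Since the result is an ISO-formatted string, a later ApplyMapping entry such as (ga4_date, string, ga4_date, date) should then have a format it can cast.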

AWS
EXPERT
answered 8 months ago

ApplyMapping casting works for dates in one of the ISO variants, e.g. 2023-01-08. For custom formats, you can convert the DynamicFrame to a DataFrame and specify the format, as you are already aware.

Just posting for your reference: https://sparkbyexamples.com/spark/spark-date-functions-how-to-parse-and-format-date/#Parsing-Date-from-String-object-to-Spark-DateType

As Gonzalo Herreros suggested, converting to a DataFrame and applying the transformation there is less hassle and more robust.
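
A minimal sketch of that DataFrame round trip, assuming a Glue job where a GlueContext is already available. The function name cast_ga4_date, the parameter names, and the frame name "ga4_with_date" are hypothetical. Note that the Spark pattern is "yyyyMMdd": uppercase MM means month, while lowercase mm means minutes.

```python
SOURCE_FORMAT = "yyyyMMdd"  # Spark datetime pattern for values like 20230108


def cast_ga4_date(dyf, glue_context):
    """Return a new DynamicFrame with ga4_dt parsed into a date column ga4_date."""
    # Imported here so this module can still be loaded outside a Glue/Spark runtime.
    from awsglue.dynamicframe import DynamicFrame
    from pyspark.sql.functions import col, to_date

    # DynamicFrame -> Spark DataFrame
    df = dyf.toDF()

    # Parse the string with an explicit pattern, then drop the raw column.
    df = df.withColumn("ga4_date", to_date(col("ga4_dt"), SOURCE_FORMAT)).drop("ga4_dt")

    # Spark DataFrame -> DynamicFrame, so the rest of the Glue job is unchanged.
    return DynamicFrame.fromDF(df, glue_context, "ga4_with_date")
```

The remaining renames and casts (for example engagementrate to engagement_rate) can stay in ApplyMapping.apply, either before or after this step.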

AWS
SUPPORT ENGINEER
answered 8 months ago
