AWS Glue, crawlers and issue with money datatype
I'm using a AWS Glue crawler for a RDS Aurora PostgreSQL, the table contains some columns with the money datatype, the crawler create the table with those columns as double.
AWSGlueDataCatalog_node1652381535100 = glueContext.create_dynamic_frame.from_catalog( database="tmp_db", table_name="postgres_public_list", transformation_ctx="AWSGlueDataCatalog_node1652381535100",)
But then when I start to working on the AWS Glue, any action, even a simple AWSGlueDataCatalog_node1652381535100.count() gives me this kind of error:
... Caused by: org.postgresql.util.PSQLException: Bad value for type double : 2,400.00 ...
Doesn't matter if I try to change the columns datatype to string, keeps saying the same error.
Any ideas how to solve this issue?
I would like to inform you that MONEY is not supported due to the JDBC driver for PostgreSQL can’t handle those types properly. To process this kind of data, please cast the column to decimal type and make sure to verify it using printschema. B
For dynamic frame you can use Applymapping:
## To verify the datatype of the column datasource0.printtSchema() ## Cast it to decimal using ApplyMapping applymapping1 = ApplyMapping.apply(frame = datasource0, mappings = [("v1", "string", "v1", "string"),("XX_threshold", "double", "XX_threshold", "decimal(25,10)")], transformation_ctx = "applymapping1") ##Verify the schma if it is counted to Decimal or not. applymapping1.printSchema() applymapping1.toDF().show()
For dataframe you can use cast function:
from pyspark.sql.types import * df=datasource0.toDF() df2 = df.withColumn("age",col("age").cast("decimal(25,10)")) df2.printSchema() df2.show()
Thank you your response. However this doesn't work, I tried the ApplyMapping before (even casting to string), and executing applymapping1.toDF().show() gives me the same error:
... Caused by: org.postgresql.util.PSQLException: Bad value for type double : ...
Is like even doing transformations AWS Glue keeps the original datatype from the crawler, at least thats the behaviour that I experience until now.
AWS Glue, crawlers and issue with money datatypeasked a month ago
backslash in CSV with glueasked 5 months ago
Partition schema mismatch in Glue Tableasked 24 days ago
AWS Glue crawler creating multiple tablesasked 4 months ago
AWS Glue crawlerasked a month ago
AWS Glue Datatype Numeric giving issueasked 6 months ago
Crawler generates table with more columns than expectedasked 3 months ago
How could we have Glue to get data from csv as String?Accepted Answerasked 2 months ago
Can Glue crawler be configured to include only the most recent partition in a table?asked a month ago
Glue Crawler error: Folder partition keys do not match table partition keysasked 4 months ago