Data issue when pushing data from s3 to redshift

0

Hey all, I currently have a data file in s3 which i am trying to output to redshift through glue visual studio. My data successfully got sent to redshift tables, but the amount field has changed the values to scientific type For example- 500303030 became 5.558543197E7 in redshift. How can i solve this issue? the column is varchar type in redshift, and string type in the glue node before which it gets copied to redshift tables. Thanks!

sg03
asked 4 months ago208 views
2 Answers
0
Accepted Answer

Hello,

Thanks for reaching out. For this scenario, a workaround is to add an intermediate node called "Change Schema" in Glue studio, which converts the input column into "string" before writing data to Redshift.

So the data flow is as below:

Data source, S3 => Change schema => Data target, Redshift.

The generated code for "Change schema" looks like below, for sample only:

# Script generated for node Change Schema
ChangeSchema_node1702948869240 = ApplyMapping.apply(
    frame=AmazonS3_node1702946282328,
    mappings=[
        ("invoiceno", "string", "invoiceno", "string"),
        ("amount", "bingint", "amount", "string"),
    ],
    transformation_ctx="ChangeSchema_node1702948869240",
)

Hope it helps.

AWS
Thi_N
answered 4 months ago
0

You may need to leverage CONVERT function to convert from scientific number format to float: https://docs.aws.amazon.com/redshift/latest/dg/r_CONVERT_function.html

profile picture
EXPERT
answered 4 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions