Data issue when pushing data from s3 to redshift

0

Hey all, I currently have a data file in s3 which i am trying to output to redshift through glue visual studio. My data successfully got sent to redshift tables, but the amount field has changed the values to scientific type For example- 500303030 became 5.558543197E7 in redshift. How can i solve this issue? the column is varchar type in redshift, and string type in the glue node before which it gets copied to redshift tables. Thanks!

sg03
已提問 5 個月前檢視次數 218 次
2 個答案
0
已接受的答案

Hello,

Thanks for reaching out. For this scenario, a workaround is to add an intermediate node called "Change Schema" in Glue studio, which converts the input column into "string" before writing data to Redshift.

So the data flow is as below:

Data source, S3 => Change schema => Data target, Redshift.

The generated code for "Change schema" looks like below, for sample only:

# Script generated for node Change Schema
ChangeSchema_node1702948869240 = ApplyMapping.apply(
    frame=AmazonS3_node1702946282328,
    mappings=[
        ("invoiceno", "string", "invoiceno", "string"),
        ("amount", "bingint", "amount", "string"),
    ],
    transformation_ctx="ChangeSchema_node1702948869240",
)

Hope it helps.

AWS
Thi_N
已回答 5 個月前
0

You may need to leverage CONVERT function to convert from scientific number format to float: https://docs.aws.amazon.com/redshift/latest/dg/r_CONVERT_function.html

profile picture
專家
已回答 5 個月前

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南