Dynamic Frame writing extra columns to Redshift.


I read the data from S3 as follows.

sec_id_dyf = glueContext.create_dynamic_frame.from_options(
    connection_type = 's3',
    connection_options={'paths':['s3://<path>/sector_id_mappings.csv']},
    format = "csv",
    format_options={ "withHeader" :True}
)

Then I apply the necessary transformations and cast the types to match the Redshift table. Finally, I load the data into the AWS Redshift table as follows.

from awsglue.dynamicframe import DynamicFrame

sec_id_dyf_ct = sec_id_dyf.resolveChoice(specs=[("last_letter_cell_name", "cast:string"), ("sector_id", "cast:byte")])

my_conn_opt = {
    "dbtable":"public.Q_DATA",
    "database":"dev"
}


redshift_write = glueContext.write_dynamic_frame.from_jdbc_conf(
    frame = sec_id_dyf_ct, 
    catalog_connection = "redshift-conn", 
    connection_options = my_conn_opt, 
    redshift_tmp_dir = "s3://<path2>/", 
    transformation_ctx = "redshift_write"
)

The problem is that even though all the types match, Glue still creates new columns in Redshift.
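One common way to avoid leftover choice columns is to replace the `resolveChoice` call with an explicit `ApplyMapping`, so every column reaches the sink pinned to exactly one target type. The sketch below only builds and sanity-checks the mapping list (column names taken from the `resolveChoice` call above); the actual `ApplyMapping` call, shown in comments, needs a Glue runtime.

```python
# Each mapping tuple is (source_name, source_type, target_name, target_type).
# Source types are "string" because the CSV reader brings columns in as text.
mappings = [
    ("last_letter_cell_name", "string", "last_letter_cell_name", "string"),
    ("sector_id", "string", "sector_id", "byte"),
]

# In the Glue job this mapping would be applied instead of resolveChoice:
#   from awsglue.transforms import ApplyMapping
#   sec_id_dyf_ct = ApplyMapping.apply(frame=sec_id_dyf, mappings=mappings)

# Sanity check: every column is pinned to one unambiguous target type,
# so the Redshift sink cannot split it into <name>_<type> columns.
for m in mappings:
    assert len(m) == 4
    assert "choice" not in m[3]
```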

[Screenshots: the Redshift table showing the extra columns Glue created]

How can I avoid this behavior in AWS Glue and Redshift? Any answers to this problem would be much appreciated. Thank you.

  • That's odd; columns suffixed with a type name usually mean there is still a choice to resolve, but I can see you have resolved it. What do you get if you print the schema just before the sink?
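To act on that suggestion, printing the schema right before the sink would look roughly like this (a sketch that only runs inside the Glue job; frame name taken from the question):

```python
# An unresolved choice shows up in the output as a "choice" entry
# listing both member types.
sec_id_dyf_ct.printSchema()
sec_id_dyf_ct.toDF().printSchema()  # the Spark view of the same schema
```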

No answers
