Glue Jobs have no access to current schemas (Glue Catalog)

Hi,

Context: I would like to extract a table from an Oracle database and write the data to S3 in Parquet format. I use a Glue connection, a Glue database, and a Glue crawler. All of this works fine.

Issue: Glue sets decimal(38,0) as the column type in the Data Catalog rather than string. I updated the Data Catalog with the new column type. Nevertheless, in my Glue ETL job the column type is still "decimal" (I can see this because I print the schema). I extract the data using create_dynamic_frame.from_catalog(database="", table_name="", transformation_ctx=""). The job role has full access to Glue, S3, CloudWatch, and EC2. When I print the schema, there is no difference after changing the column type in the Data Catalog.

Could you help me?

MehdiE
Asked 2 years ago, 347 views
1 Answer

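Glue DynamicFrames are self-describing, so no schema is required on creation. Instead, the schema is computed on the fly from the data being read, with inconsistent types encoded as choice types. This is why the column type you edited in the Data Catalog does not show up when you print the schema.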

In your case, you can use the ApplyMapping class to specify the column type after reading the catalog table.
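
As a rough sketch of that approach (not a tested job script): the helper below builds the mapping tuples that ApplyMapping expects, keeping every column as it is except the ones you want to cast. The column names `ID` and `NAME`, the helper names `build_mappings` and `cast_columns`, and the `"decimal"` source type string are assumptions for illustration; use the names and types that your own printSchema() output shows.

```python
def build_mappings(columns, overrides):
    """Build ApplyMapping tuples: (source_name, source_type, target_name, target_type).

    columns   -- list of (name, type) pairs as reported by the DynamicFrame schema
    overrides -- dict of column name -> desired target type
    """
    return [
        (name, src_type, name, overrides.get(name, src_type))
        for name, src_type in columns
    ]


def cast_columns(dyf, mappings):
    """Apply the mappings to a DynamicFrame (only works inside a Glue job)."""
    # Imported lazily so build_mappings can be used and checked outside Glue.
    from awsglue.transforms import ApplyMapping

    return ApplyMapping.apply(
        frame=dyf,
        mappings=mappings,
        transformation_ctx="cast_columns",
    )


# Hypothetical columns; replace with the ones from your own table.
columns = [("ID", "decimal"), ("NAME", "string")]
mappings = build_mappings(columns, {"ID": "string"})
```

In the job you would call `cast_columns(dyf, mappings)` on the frame returned by create_dynamic_frame.from_catalog(...) and print the schema of the result to confirm the column is now a string before writing to S3. Note that ApplyMapping drops any column that is not listed in the mappings, which is why the helper passes unchanged columns through explicitly.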

AWS
Answered 2 years ago
