Glue Jobs have no access to current schemas (Glue Catalog)


Hi,

Context: I would like to extract a table from an Oracle database and write the data to S3 in Parquet format. I use a Glue connection, a Glue database, and a Glue crawler. All of this works fine.

Issue: Glue set decimal(38,0) as the column type in the Data Catalog rather than string. I updated the Data Catalog with the new column type. Nevertheless, in my Glue ETL job the column type is still "decimal" (I can see this because I print the schema). I extract the data using create_dynamic_frame.from_catalog(database="", table_name="", transformation_ctx=""). I gave the job role full access to Glue, S3, CloudWatch, and EC2. When I print the schema, there is no difference after changing the column type in the Data Catalog.

Could you help me?

MehdiE
asked 2 years ago, 347 views
1 Answer

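Glue DynamicFrames are self-describing, so no schema is required on creation. Instead, the schema is computed on the fly from the data, with inconsistent types encoded as choice types. This is why editing the column type in the Data Catalog does not change the schema your job prints.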

In your case, you can use the ApplyMapping class to specify the column type after reading the catalog table.
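
As a rough illustration of that suggestion, here is a minimal sketch. The helper names (`build_mappings`, `read_with_string_columns`) and the column names in the usage note are hypothetical and not from your job. It assumes the source type strings you pass in match what `printSchema()` reports for each column.

```python
def build_mappings(source_types, cast_to_string):
    """Build ApplyMapping tuples: (source name, source type, target name, target type).

    source_types: dict of column name -> type as reported by printSchema()
    cast_to_string: names of the columns that should become strings
    """
    return [
        (name, src_type, name, "string" if name in cast_to_string else src_type)
        for name, src_type in source_types.items()
    ]


def read_with_string_columns(glue_context, database, table_name,
                             source_types, cast_to_string):
    """Hypothetical helper: read a catalog table, then cast the chosen columns."""
    # Imported lazily because awsglue only exists inside the Glue job runtime.
    from awsglue.transforms import ApplyMapping

    dyf = glue_context.create_dynamic_frame.from_catalog(
        database=database,
        table_name=table_name,
        transformation_ctx="source",
    )
    mapped = ApplyMapping.apply(
        frame=dyf,
        mappings=build_mappings(source_types, cast_to_string),
        transformation_ctx="cast_columns",
    )
    mapped.printSchema()  # the selected columns should now show as string
    return mapped
```

For example, `build_mappings({"ID": "decimal", "NAME": "string"}, {"ID"})` maps `ID` to string and leaves `NAME` unchanged. Note that ApplyMapping drops any column that is not listed in the mappings, so every column you want to keep needs an entry.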

AWS
answered 2 years ago
