Glue Jobs have no access to current schemas (Glue Catalog)

Hi,

Context: I would like to extract a table from an Oracle database and write the data to S3 in Parquet format. I use a Glue connection, a Glue database, and a Glue crawler. All of that works fine.

Issue: Glue sets decimal(38,0) as the column type in the Data Catalog rather than string. I updated the Data Catalog with the new column type. Nevertheless, in my Glue ETL job the column type is still "decimal" (I can see this because I print the schema). I extract the data using create_dynamic_frame.from_catalog(database="", table_name="", transformation_ctx=""). The job role has full access to Glue, S3, CloudWatch, and EC2. When I print the schema, there is no difference after changing the column type in the Data Catalog.

Could you help me?

MehdiE
asked 2 years ago, 347 views
1 Answer

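Glue DynamicFrames are self-describing, so no schema is required on creation. Instead, the schema is computed on the fly from the data itself, with inconsistent types encoded as choice types. This is likely why editing the column type in the Data Catalog does not change the schema your job prints.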

In your case, you can use the ApplyMapping class to specify the column type after reading the catalog table.
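
A minimal sketch of that suggestion follows. The column names (customer_id, name), the helper names, and the exact source type string "decimal(38,0)" are assumptions for illustration, not taken from the question; replace them with what printSchema() actually reports for your table. ApplyMapping expects tuples of (source column, source type, target column, target type).

```python
def build_mappings(schema, overrides):
    """Build ApplyMapping tuples: (source, source_type, target, target_type).

    schema    -- list of (column_name, current_type) pairs
    overrides -- dict of column_name -> desired target type
    Columns without an override keep their current type.
    """
    return [
        (name, src_type, name, overrides.get(name, src_type))
        for name, src_type in schema
    ]


# Hypothetical schema, as it might be reported by dyf.printSchema().
SCHEMA = [("customer_id", "decimal(38,0)"), ("name", "string")]

# Cast only the decimal column to string.
MAPPINGS = build_mappings(SCHEMA, {"customer_id": "string"})


def cast_columns(dyf, mappings=MAPPINGS):
    """Apply the mappings to a DynamicFrame read with from_catalog()."""
    # Imported lazily: awsglue is only available inside the Glue job runtime.
    from awsglue.transforms import ApplyMapping

    return ApplyMapping.apply(
        frame=dyf,
        mappings=mappings,
        transformation_ctx="cast_columns",
    )
```

Inside the job you would call cast_columns() on the frame returned by create_dynamic_frame.from_catalog(...) and print the schema of the result to confirm the new type. Note that ApplyMapping keeps only the columns listed in the mappings, so every column you want in the output has to appear there.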

AWS
answered 2 years ago
