Catalog Dataframe - AWS Glue


Hello, I am creating a DataFrame by consuming a Glue Catalog table. This table has fields of type bigint that can be null. When such a field is entirely null, the DataFrame drops it, which breaks the rest of the code, since I use this table to merge into the destination. Is there a solution for this problem? Below is a snippet of the code:

IncrementalInputDyF = glueContext.create_dynamic_frame.from_catalog(
    database="litio_sqlserver",
    table_name="crawler_operation",
    transformation_ctx="IncrementalInputDyF",
)
IncrementalInputDF = IncrementalInputDyF.toDF()

Asked 2 months ago · Viewed 127 times
1 Answer
Accepted Answer

The issue is that you are not reading a DataFrame; you are reading a DynamicFrame, which infers the schema dynamically (and therefore omits the all-null column) and is then converted to a DataFrame. Read a DataFrame directly instead, either with the corresponding GlueContext API call or through the Spark API (as long as you are not using LakeFormation permissions).
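A minimal sketch of reading the table directly as a DataFrame, reusing the database and table names from the question. This assumes the standard Glue job setup where `spark` and `glueContext` are already defined and the Spark session uses the Glue Data Catalog as its metastore; it only runs inside a Glue job, and neither variant supports job bookmarks.

```python
# Option 1: Spark API. Querying the catalog table directly keeps the full
# catalog schema, including bigint columns that are entirely null.
IncrementalInputDF = spark.sql(
    "SELECT * FROM litio_sqlserver.crawler_operation"
)

# Option 2: GlueContext's DataFrame reader, which skips the DynamicFrame
# (and its schema inference) entirely.
IncrementalInputDF = glueContext.create_data_frame.from_catalog(
    database="litio_sqlserver",
    table_name="crawler_operation",
)
```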

AWS
EXPERT
Answered 2 months ago
  • Ah ok, I understand. With this solution, would I still be able to use the job bookmark option? For context: after creating this DynamicFrame, the code merges it into an Iceberg table, updating the records, so I use the Job Bookmark to avoid processing when there are no new rows in the source.

  • No, in that case you lose bookmarks. What I would do in your case is add the missing columns myself: compare the DynamicFrame schema (with bookmarks) against an empty DataFrame read from the table (e.g. with a predicate that matches no data), then add any columns that are missing. However, if you are writing to Iceberg, it shouldn't matter if empty columns are not present, especially if the table already exists with those columns.
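The schema-comparison step described above can be sketched as a small helper. The schemas are represented here as plain name-to-type dicts for illustration; the column names and types are hypothetical, not taken from the actual table.

```python
def missing_columns(frame_schema, table_schema):
    """Given the inferred frame schema and the full catalog table schema
    (both as {column_name: type_name} dicts), return the columns that
    schema inference dropped, with their expected types."""
    return {name: dtype for name, dtype in table_schema.items()
            if name not in frame_schema}

# Example: schema inference lost the all-null bigint column "amount".
table_schema = {"id": "bigint", "name": "string", "amount": "bigint"}
frame_schema = {"id": "bigint", "name": "string"}
print(missing_columns(frame_schema, table_schema))  # {'amount': 'bigint'}
```

In the Glue job itself, each returned column could then be added to the DataFrame as a typed null, e.g. with `df.withColumn(name, lit(None).cast(dtype))` from `pyspark.sql.functions`, before the merge into the Iceberg table.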
