pyspark.sql.utils.AnalysisException: Cannot resolve column name "previous_project_id" among ()


Hi, I'm relatively new to AWS Glue and am having trouble with the following transformation code:

DataSource4 = glueContext.create_dynamic_frame.from_catalog(database = "beta", table_name = "[table_name]", transformation_ctx = "DataSource4")
Transform9 = ApplyMapping.apply(frame = DataSource4, mappings = [("project_unique_id", "int", "(src) project_unique_id", "int")], transformation_ctx = "Transform9")

Transform9DF = Transform9.toDF()
Transform3DF = Transform3.toDF()
Transform12 = DynamicFrame.fromDF(Transform3DF.join(Transform9DF, (Transform3DF['project_unique_id'] == Transform9DF['previous_project_id']), "leftanti"), glueContext, "Transform12")

The job is failing with the error: AnalysisException: 'Cannot resolve column name "previous_project_id" among ((src) project_unique_id);'. On checking the tables, both columns "project_unique_id" and "previous_project_id" are filled with NULL values. Could that be the reason for the above error?

1 Answer

Hi,

Could you please clarify this statement:

on checking the tables, both columns "project_unique_id" and "previous_project_id" are filled with NULL values

Which table? The source table? Or which DataFrame, Transform9DF or Transform3DF?

Could you post the schemas of these two DataFrames?

I might be mistaken, but I think the ApplyMapping you are using drops every field other than "(src) project_unique_id", so when you try to join on Transform9DF['previous_project_id'], that field no longer exists and cannot be resolved.
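To make that concrete, here is a plain-Python sketch (not the actual Glue API) of how ApplyMapping projects a frame down to only the mapped fields, which is why the join key disappears. The second mapping list shows the kind of fix I would expect to work, i.e. also listing previous_project_id in the mappings; the exact types and names there are assumptions, since you haven't posted the schema.

```python
# Illustration only: a dict-based stand-in for ApplyMapping's projection
# behavior, not the awsglue library itself.

def apply_mapping(rows, mappings):
    """Keep only the mapped source fields, renamed to their target names.
    Each mapping is (source_name, source_type, target_name, target_type)."""
    return [{dst: row.get(src) for src, _, dst, _ in mappings} for row in rows]

rows = [
    {"project_unique_id": 1, "previous_project_id": 10},
    {"project_unique_id": 2, "previous_project_id": None},
]

# Your current mapping: only project_unique_id survives the projection.
mapped = apply_mapping(
    rows, [("project_unique_id", "int", "(src) project_unique_id", "int")]
)
print(sorted(mapped[0].keys()))  # ['(src) project_unique_id']

# Assumed fix: also carry previous_project_id through the mapping,
# so the later join on that column can resolve it.
fixed = apply_mapping(rows, [
    ("project_unique_id", "int", "(src) project_unique_id", "int"),
    ("previous_project_id", "int", "previous_project_id", "int"),
])
print(sorted(fixed[0].keys()))  # ['(src) project_unique_id', 'previous_project_id']
```

In real Glue code the equivalent change would be adding the extra tuple to the `mappings` argument of `ApplyMapping.apply` for Transform9. NULL values in the column would not cause a "cannot resolve column" error; that error is about the column being absent from the schema, not about its contents.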

thank you

AWS
EXPERT
answered 2 years ago
