pyspark.sql.utils.AnalysisException: Cannot resolve column name "previous_project_id" among ()


Hi, I'm relatively new to AWS Glue and am having trouble with the following transformation code:

DataSource4 = glueContext.create_dynamic_frame.from_catalog(database = "beta", table_name = "[table_name]", transformation_ctx = "DataSource4")
Transform9 = ApplyMapping.apply(frame = DataSource4, mappings = [("project_unique_id", "int", "(src) project_unique_id", "int")], transformation_ctx = "Transform9")

Transform9DF = Transform9.toDF()
Transform3DF = Transform3.toDF()
Transform12 = DynamicFrame.fromDF(Transform3DF.join(Transform9DF, (Transform3DF['project_unique_id'] == Transform9DF['previous_project_id']), "leftanti"), glueContext, "Transform12")

The job is failing with the error: 'Cannot resolve column name "previous_project_id" among ((src) project_unique_id);' (raised from raise AnalysisException(s.split(': ', 1)[1], stackTrace)). On checking the tables, both columns "project_unique_id" and "previous_project_id" are filled with NULL values. Could that be the reason for the above error?

1 Answer

Hi,

Could you please clarify this statement:

on checking the tables, both columns "project_unique_id" and "previous_project_id" are filled with NULL values

Which table do you mean: the source table? And which DataFrame, Transform9DF or Transform3DF?

Could you post the schemas of these two DataFrames?
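You can print both schemas directly in the job script with standard Spark DataFrame calls, for example:

Transform3DF.printSchema()
Transform9DF.printSchema()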

I might be mistaken, but I think the ApplyMapping you are using drops every field other than "(src) project_unique_id", so when you try to join on Transform9DF['previous_project_id'] that field is not found. Also note that NULL values in the data would not cause this error: "cannot resolve column name" is raised at analysis time from the DataFrame schema, not from the data itself.
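If that is the case, a minimal sketch of a fix would be to carry previous_project_id through the mapping as well (this assumes the column exists in the source table and is an int; adjust the name and type to your actual schema):

Transform9 = ApplyMapping.apply(
    frame = DataSource4,
    mappings = [
        # keep the renamed key used elsewhere in the job
        ("project_unique_id", "int", "(src) project_unique_id", "int"),
        # carry previous_project_id through so the join can resolve it (assumed name/type)
        ("previous_project_id", "int", "previous_project_id", "int")
    ],
    transformation_ctx = "Transform9"
)

With previous_project_id carried through, Transform9DF will contain that column and the left-anti join condition should resolve.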

Thank you

AWS
EXPERT
answered 2 years ago
