pyspark.sql.utils.AnalysisException: Cannot resolve column name "previous_project_id" among ()


Hi, I'm relatively new to AWS Glue and am having trouble with the following transformation code:

DataSource4 = glueContext.create_dynamic_frame.from_catalog(database = "beta", table_name = "[table_name]", transformation_ctx = "DataSource4")
Transform9 = ApplyMapping.apply(frame = DataSource4, mappings = [("project_unique_id", "int", "(src) project_unique_id", "int")], transformation_ctx = "Transform9")

Transform9DF = Transform9.toDF()
Transform3DF = Transform3.toDF()
Transform12 = DynamicFrame.fromDF(Transform3DF.join(Transform9DF, (Transform3DF['project_unique_id'] == Transform9DF['previous_project_id']), "leftanti"), glueContext, "Transform12")

The job fails with this error:

raise AnalysisException:(s.split(': ',1)[1], stackTrace) 'Cannot resolve column name "previous_project_id" among ((src) project_unique_id);'

When I check the tables, both columns, "project_unique_id" and "previous_project_id", are filled with NULL values. Could that be the reason for the error above?

1 Answer

Hi,

Could you please clarify this statement:

on checking the tables, both columns "project_unique_id" and "previous_project_id" are filled with NULL values

Which table do you mean: the source table, or one of the DataFrames (Transform9DF or Transform3DF)?

Could you also post the schemas of these two DataFrames?

I might be mistaken, but I think the ApplyMapping you are using drops every field other than "(src) project_unique_id", so when you try to join on Transform9DF['previous_project_id'], that field is not found.
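To make the point above concrete, here is a minimal plain-Python sketch (no Glue or Spark needed) that compares the target names in an ApplyMapping-style list with the columns a later join expects. The helper names `output_columns` and `missing_columns` are hypothetical, and the "int" type in the fixed mapping is an assumption, not something confirmed by the question. The check looks only at column names, because this exception is raised when Spark resolves the name against the DataFrame schema, before any row values (NULL or otherwise) are read.

```python
# ApplyMapping-style tuples: (source name, source type, target name, target type).
# This list mirrors the single mapping shown in the question.
mappings = [
    ("project_unique_id", "int", "(src) project_unique_id", "int"),
]


def output_columns(mappings):
    """Return the target column names listed in the mapping (hypothetical helper)."""
    return [target for _source, _source_type, target, _target_type in mappings]


def missing_columns(mappings, required):
    """Return the required columns that the mapping does not produce (hypothetical helper)."""
    produced = set(output_columns(mappings))
    return [name for name in required if name not in produced]


# The join in the question looks up "previous_project_id" on Transform9DF.
print(missing_columns(mappings, ["previous_project_id"]))

# Possible fix: also map the column the join needs.
# ASSUMPTION: the type "int" is a guess; use the type from your catalog table.
fixed_mappings = mappings + [
    ("previous_project_id", "int", "previous_project_id", "int"),
]
print(missing_columns(fixed_mappings, ["previous_project_id"]))
```

If the first check reports a missing column, either add that column to the mappings passed to ApplyMapping or join on a column that the mapped frame does contain. Printing the schema of Transform9DF in the job is a quick way to confirm which columns actually survive the mapping.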

Thank you.

AWS
EXPERT
answered 2 years ago
