AWS Glue script: toDF().sort() method gives exception

Hi All,

I am facing an issue while running a PySpark script in an AWS Glue job.

The code is as follows:

DynamicFrame.toDF().orderBy(["col1", "col2"])

This code fails with the error AnalysisException: cannot resolve 'col1' given input columns: []; even though the DynamicFrame has 200 columns. The error appears only after the conversion to a DataFrame. The same code works fine in a Jupyter notebook.

Please advise how to solve this problem.

Asked 2 years ago, 1319 views
1 Answer
Hello,

This exception generally occurs when Spark cannot find the columns referenced in an expression (here, the sort columns) in the dataset. The empty list in "given input columns: []" means the DataFrame being sorted had no columns at all.

To confirm, I tested the sort and orderBy functions in a Glue job, and both work as expected. Please find the sample code below:

++++++++++
datasource0 = glueContext.create_dynamic_frame.from_catalog(database="testdb", table_name="nycflights13_csv", transformation_ctx="datasource0")

datasource0.toDF().sort('year', 'month').show(5)

datasource0.toDF().orderBy('year', 'month').show(5)
++++++++++

Please verify the schema once more and print some sample data right after creating the DynamicFrame, before calling sort or orderBy:

++++++++++
DynamicFrame.printSchema()

# The call above should print the columns you want to use in sort or orderBy

DynamicFrame.toDF().show()

# The call above should return rows

DynamicFrame.toDF().sort('year', 'month').show(5)

DynamicFrame.toDF().orderBy('year', 'month').show(5)
++++++++++
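The checks suggested above can also be folded into a small helper that fails early with a readable message instead of an AnalysisException. This is only a sketch: `missing_columns` and `order_by_checked` are hypothetical names, not part of PySpark or AWS Glue, and the helper assumes only that the object exposes `columns` and `orderBy(*cols)` the way a PySpark DataFrame does. It compares top-level column names case-insensitively, on the assumption that Spark's default setting (`spark.sql.caseSensitive` is false) is in effect.

```python
from typing import Iterable, List


def missing_columns(available: Iterable[str], requested: Iterable[str]) -> List[str]:
    """Return the requested names that are absent from `available`, in order.

    Names are compared case-insensitively (assumption: Spark's default,
    spark.sql.caseSensitive=false). Only top-level columns are considered.
    """
    available_lower = {name.lower() for name in available}
    return [name for name in requested if name.lower() not in available_lower]


def order_by_checked(df, sort_cols: List[str]):
    """Sort `df` by `sort_cols` after verifying that the columns exist.

    `df` is assumed to behave like a PySpark DataFrame: it has a `columns`
    list and an `orderBy(*cols)` method. Hypothetical helper for illustration.
    """
    if not df.columns:
        # Matches the "given input columns: []" symptom from the question.
        raise ValueError(
            "DataFrame has no columns; check that the source DynamicFrame "
            "actually contains data before sorting"
        )
    missing = missing_columns(df.columns, sort_cols)
    if missing:
        raise ValueError(
            f"Sort columns not found: {missing}; available: {list(df.columns)}"
        )
    return df.orderBy(*sort_cols)
```

In a Glue job this might be called as `order_by_checked(datasource0.toDF(), ['year', 'month']).show(5)`, so that an empty or unexpected schema is reported before Spark tries to resolve the sort columns.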

If you still face any issue, please feel free to reach out to AWS Premium Support with sample data, and we will be happy to help.

Have a nice day!

AWS
Answered 2 years ago
  • Hi @Shubham_P, is there a way to sort() or orderBy() a DynamicFrame without going through .toDF()?

    Thanks
