スキップしてコンテンツを表示

Glue DynamicFrame show method yields nothing

0

I'm using a Notebook together with a Glue Dev Endpoint to load data from S3 into a Glue DynamicFrame. The printSchema method works fine but the show method yields nothing although the dataframe is not empty. Converting the DynamicFrame into a Spark DataFrame actually yields a result (df.toDF().show()). Here the dummy code that I'm using

glueContext = GlueContext(spark.sparkContext)

df = glueContext.create_dynamic_frame_from_options(
    connection_type="s3", 
    connection_options = {
        "paths": [f"s3://bucketname/filename"]},
    format="json",
    format_options={"multiline": True}
)

df.printSchema()
df.show()

This problem was posted on Stackoverflow before: https://stackoverflow.com/questions/56013334/spark-dynamic-frame-show-method-yields-nothing

Why does the show method yield nothing? Is this a bug or is there something that I'm missing to make this work?

質問済み 3年前4621ビュー
1回答
0

Hello,

I would like to inform DynamicFrame is similar to a DataFrame, except that each record is self-describing, so no schema is required initially. Instead, AWS Glue computes a schema on-the-fly when required. Basically Glue DynamicFrame is based on RDD due to which show() method does not work directly and you need to convert dynamic frame to dataframe first to check the data in tabular format.

dyf.printSchema()
dyf.toDF().show()

AWS
回答済み 3年前

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

関連するコンテンツ