1 Answer
I believe the attachFilename option doesn't work for all formats.
Try using format="glueparquet". Otherwise, you could use the first example but read directly into a Spark DataFrame (and then convert it to a DynamicFrame if you want to):
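from pyspark.sql.functions import input_file_name
newdf = spark.read.parquet("s3://bucket/")
# input_file_name() returns the path of the file each row was read from
newdf = newdf.withColumn('filename2', input_file_name())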
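The two options above could be wrapped up roughly as follows. This is only a sketch, not a verified fix: the helper names (build_s3_read_args, read_with_filename, spark_fallback), the transformation_ctx value "read_parquet" and the frame name "with_filename" are made up for illustration. Whether format="glueparquet" actually honours attachFilename on read is the assumption being tested here, and the Glue-dependent functions only run inside a Glue job. As far as I know, job bookmarks key on the transformation_ctx of reads made through the GlueContext, which is why the first path passes one.

```python
def build_s3_read_args(path, filename_column, fmt="glueparquet", transformation_ctx=None):
    """Assemble keyword arguments for glueContext.create_dynamic_frame.from_options."""
    args = {
        "connection_type": "s3",
        "connection_options": {"paths": [path]},
        "format": fmt,
        # attachFilename names the column that should receive each record's source file
        "format_options": {"attachFilename": filename_column},
    }
    if transformation_ctx is not None:
        # Only added when given; assumed to be what job bookmarks track
        args["transformation_ctx"] = transformation_ctx
    return args


def read_with_filename(glue_context, path, filename_column="filename2"):
    """Option 1: read through Glue (needs a Glue job runtime)."""
    read_args = build_s3_read_args(
        path, filename_column, transformation_ctx="read_parquet"
    )
    return glue_context.create_dynamic_frame.from_options(**read_args)


def spark_fallback(glue_context, path, filename_column="filename2"):
    """Option 2: read with Spark, add the file name, convert to a DynamicFrame."""
    # Imported here so the module can still be loaded outside a Glue job
    from awsglue.dynamicframe import DynamicFrame
    from pyspark.sql.functions import input_file_name

    df = glue_context.spark_session.read.parquet(path)
    df = df.withColumn(filename_column, input_file_name())
    return DynamicFrame.fromDF(df, glue_context, "with_filename")
```

In a job you would call read_with_filename(glueContext, "s3://bucket/") first and fall back to spark_fallback(glueContext, "s3://bucket/") only if the filename column comes back empty or missing.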
But if I read through Spark directly, would I still be able to use job bookmarks?
The documentation says that attachFilename can be used with any format. I've tested it with CSV and it works. Do you have any idea why it's different for Parquet?
This is the link to the section describing the attachFilename option: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format.html#aws-glue-programming-etl-format-shared-reference