How to update the Parquet format_version locally?

We are having an issue with one table in our ETL process that accumulates Parquet files.

In DMS we can set the parquet format_version to 1_0 or 2_0. How would changing it affect production, and is there a way to update the version locally? I've already downloaded the files but can't find a way to change the version.

Thanks.

Marcelo
asked 2 months ago · 97 views
1 Answer

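You can try pyarrow, which lets you choose the format version when it writes a file. Read the downloaded file into a table, then write it back out with the version you want: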

import pyarrow as pa
import pyarrow.parquet as pq

# Read Parquet file (version 2.0)
table = pq.read_table('input.parquet')

# Write Parquet file (version 1.0)
pq.write_table(table, 'output.parquet', version='1.0')

AWS
answered 2 months ago
EXPERT
reviewed 2 months ago
