How to update Parquet format_version locally?


We are having an issue with one table in our ETL process that stacks Parquet files.

The question is: in DMS we can set the Parquet format_version to 1_0 or 2_0. How can this change impact production, and is there a way to update the version locally? I've already downloaded the files but can't find a way to update the version.


asked 23 days ago · 85 views
1 Answer

You can try pyarrow: read the file, then write it back out with the version argument set to the format version you want.

import pyarrow as pa
import pyarrow.parquet as pq

# Read Parquet file (version 2.0)
table = pq.read_table('input.parquet')

# Write Parquet file (version 1.0)
pq.write_table(table, 'output.parquet', version='1.0')

answered 23 days ago
reviewed 21 days ago
