Hi ignacio,
The error "Inconsistent data type results in choice type" typically occurs when a column in the input CSV files does not hold the same data type in every row, for example a column that is numeric in most rows but contains text in a few. AWS Glue then cannot settle on a single type for that column and treats it as a choice type.
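To make the cause concrete, here is a minimal sketch that scans CSV text and reports which columns hold more than one kind of value. It uses only the Python standard library. The `infer_type` and `column_types` helpers, the type labels, and the sample data are all assumptions made for illustration; this is a rough approximation and not how Glue itself infers types.

```python
import csv
import io
from collections import defaultdict


def infer_type(value):
    """Classify a raw CSV cell as 'null', 'long', 'double', or 'string'.

    Hypothetical helper: the labels only loosely mirror Glue type names.
    """
    text = (value or "").strip()
    if text == "":
        return "null"
    try:
        int(text)
        return "long"
    except ValueError:
        pass
    try:
        float(text)
        return "double"
    except ValueError:
        return "string"


def column_types(csv_text):
    """Return {column name: set of non-null type labels seen in that column}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = defaultdict(set)
    for row in reader:
        for name, value in row.items():
            if name is None:
                # Extra cells beyond the header row; ignored in this sketch.
                continue
            kind = infer_type(value)
            if kind != "null":
                seen[name].add(kind)
    return dict(seen)


# Made-up sample: 'amount' mixes integers, decimals, and text.
SAMPLE = "id,amount\n1,10\n2,12.5\n3,N/A\n"

types_by_column = column_types(SAMPLE)
mixed = {name: kinds for name, kinds in types_by_column.items() if len(kinds) > 1}
print(mixed)
```

With this sample, only `amount` is reported, because it contains an integer, a decimal, and a text value. Columns flagged this way are the likely candidates for the error.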
To fix this issue, you could try the following steps:
- Analyze the input data: Inspect the CSV files to identify the columns that have inconsistent data types. You can use tools like pandas or AWS Glue's built-in data preview feature to analyze the data.
- Data cleaning and transformation: Depending on the nature of the inconsistency, you may need to perform data cleaning and transformation steps. For example, if a column is supposed to be numeric but contains some non-numeric values, you can either remove those rows or replace the non-numeric values with a default value (e.g., null or 0).
- Define the schema: After cleaning the data, define the schema for your Glue table explicitly. This will ensure that Glue interprets the data types correctly and consistently across all rows.
- Use the defined schema in the Glue job: When creating the Glue job to convert CSV to Parquet, specify the defined schema to ensure that the data types are correctly interpreted and converted.
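The cleaning and schema steps above can be sketched in plain Python before the data ever reaches Glue. This is only an illustration under stated assumptions: the `SCHEMA` mapping, the `clean_rows` helper, and the sample data are hypothetical, and values that cannot be cast are replaced with `None` (null) rather than dropped.

```python
import csv
import io

# Hypothetical explicit schema: column name -> Python type to cast to.
SCHEMA = {"id": int, "amount": float}


def clean_rows(csv_text, schema):
    """Cast every column to its declared type; uncastable values become None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        out = {}
        for name, cast in schema.items():
            raw = (row.get(name) or "").strip()
            try:
                out[name] = cast(raw)
            except ValueError:
                # Non-numeric or empty cell: use null as the default value.
                out[name] = None
        cleaned.append(out)
    return cleaned


# Made-up sample with one non-numeric 'amount' value.
SAMPLE = "id,amount\n1,10\n2,12.5\n3,N/A\n"

rows = clean_rows(SAMPLE, SCHEMA)
for row in rows:
    print(row)
```

After this pass every `amount` is either a float or null, so the column has a single consistent type. Inside a Glue job itself, the `resolveChoice` transform on a DynamicFrame (for example with a spec such as `("amount", "cast:double")`, where the column name is again just an example) may be a more direct way to get the same effect.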
You can also find further resources following this link.
