Column names being automatically generated as "column#1", "column#2", etc. in your Glue catalog table usually means the schema information is not being passed or recognized when the DynamicFrame is written to S3.
To resolve this issue, you can try the following approach:
- Ensure that your `spark_df` (Spark DataFrame) has the correct column names before converting it to a DynamicFrame. You can verify this by printing the schema with `spark_df.printSchema()`.
- When converting the Spark DataFrame to a DynamicFrame, note that `fromDF` takes only the DataFrame, the GlueContext, and a name; the DataFrame's schema, including its column names, is carried over automatically:

```python
from awsglue.dynamicframe import DynamicFrame

# spark_df is your Spark DataFrame with the correct column names;
# fromDF(dataframe, glue_ctx, name) preserves its schema
DyF = DynamicFrame.fromDF(spark_df, glueContext, "etl_convert")
```
- Before writing the frame, you can explicitly map the generated names to the correct names and types with the `ApplyMapping` transform:

```python
from awsglue.transforms import ApplyMapping

# Each mapping is (source_name, source_type, target_name, target_type)
DyF = ApplyMapping.apply(
    frame=DyF,
    mappings=[
        ("column#1", "string", "column1_name", "string"),
        ("column#2", "int", "column2_name", "int"),
        # ... add all your columns here
    ],
)
```
- If the above steps don't work, you can rename the columns in your DynamicFrame before writing:

```python
from awsglue.transforms import RenameField

# The correct column names, in the same order as the generated ones
column_names = ["column1_name", "column2_name"]  # ... list all 21 names

for i, new_name in enumerate(column_names, start=1):
    DyF = RenameField.apply(frame=DyF, old_name=f"column#{i}", new_name=new_name)
```
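If you prefer a single `ApplyMapping` call to repeated `RenameField` calls, the list of mapping tuples can be generated rather than written out by hand. A minimal sketch in plain Python, assuming hypothetical column names and all-string source and target types:

```python
# Hypothetical column names; replace with your real schema, in order
column_names = ["year", "version", "amount"]

# Build ApplyMapping-style tuples that rename the auto-generated
# "column#1", "column#2", ... to the intended names
mappings = [
    (f"column#{i}", "string", name, "string")
    for i, name in enumerate(column_names, start=1)
]
print(mappings[0])  # → ('column#1', 'string', 'year', 'string')
```

The generated list can then be passed as the `mappings` argument to `ApplyMapping.apply`, which renames all columns in one pass.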
By implementing these steps, you should be able to maintain the correct column names in your Glue catalog table while still adding new partitions for year and version. Remember to adjust the column names and data types to match your actual schema.
Sources
AWS Glue Scala DynamicFrame class - AWS Glue
DynamicFrameReader class - AWS Glue