1 Answer
To remove the unnamed column from a dynamic frame created from the catalog, you can use the ApplyMapping class from the awsglue.transforms module. ApplyMapping keeps only the columns listed in the mapping, so leaving the unnamed column out of the mapping drops it from the result.
Here's an example of how you can do this:
from awsglue.transforms import ApplyMapping
# Read the data from the catalog
demotable = glueContext.create_dynamic_frame.from_catalog(
    database="intraday",
    table_name="demo_table",
    push_down_predicate="bus_dt = 20180117",
    transformation_ctx="demotable"
)
# Define the schema mapping, excluding the unnamed column
mapping = [
    ("column1", "string", "column1", "string"),
    ("column2", "string", "column2", "string"),
    # Add all other columns that you want to keep, but exclude "Unnamed: 7"
]
# Apply the mapping
demotable_transformed = ApplyMapping.apply(
    frame=demotable,
    mappings=mapping,
    transformation_ctx="demotable_transformed"
)
# Continue to the code ....
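If the table has many columns, writing the mapping by hand gets tedious. As a rough sketch, the list could be generated from (name, type) pairs instead. The helper name build_identity_mapping and the column list here are hypothetical and only for illustration; substitute the real column names and types of demo_table.

```python
from typing import Iterable, List, Tuple

# One ApplyMapping entry: (source_name, source_type, target_name, target_type)
MappingEntry = Tuple[str, str, str, str]


def build_identity_mapping(
    columns: Iterable[Tuple[str, str]],
    exclude_prefix: str = "Unnamed",
) -> List[MappingEntry]:
    """Map every column to itself, skipping names that start with exclude_prefix."""
    return [
        (name, dtype, name, dtype)
        for name, dtype in columns
        if not name.startswith(exclude_prefix)
    ]


# Hypothetical column list (an assumption, not taken from the real table)
columns = [
    ("column1", "string"),
    ("column2", "string"),
    ("Unnamed: 7", "string"),
]

mapping = build_identity_mapping(columns)
print(mapping)
```

The resulting list can be passed as mappings= to ApplyMapping.apply in place of the hand-written one.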
Hi sdtslmn,
Thanks for your response!
The Glue job fails at the step below, while creating the dynamic frame. ApplyMapping would be the next step, but the job fails before it gets there.
Is there a parameter I can pass when creating the dynamic frame so that the unnamed column is excluded?
# Read the data from the catalog
demotable = glueContext.create_dynamic_frame.from_catalog(
    database="intraday",
    table_name="demo_table",
    push_down_predicate="bus_dt = 20180117",
    transformation_ctx="demotable"
)
I'm not sure how the table can have unnamed columns. It sounds more likely that the data doesn't really match the table definition and the DynamicFrame gets confused. If you read using a DataFrame (e.g. spark.sql()), Spark will enforce the table schema, though I'm not sure it will read the data correctly. I would try to solve the underlying mismatch between the table and the data.
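For reference, the DataFrame route could look roughly like the sketch below. It assumes the job is configured so that Spark SQL can see the Glue Data Catalog tables, and the helper names build_partition_query and read_as_dynamic_frame are made up for this example. Whether the rows come back correct still depends on the data actually matching the table definition.

```python
from typing import Sequence


def build_partition_query(
    database: str,
    table: str,
    columns: Sequence[str],
    partition_filter: str,
) -> str:
    """Build a SELECT that names only the wanted columns and filters one partition."""
    column_list = ", ".join(f"`{c}`" for c in columns)
    return f"SELECT {column_list} FROM `{database}`.`{table}` WHERE {partition_filter}"


def read_as_dynamic_frame(glue_context, query: str, name: str):
    """Run the query through Spark SQL and wrap the result in a DynamicFrame."""
    # Imported here so the query builder also works outside a Glue job
    from awsglue.dynamicframe import DynamicFrame

    df = glue_context.spark_session.sql(query)
    return DynamicFrame.fromDF(df, glue_context, name)


# Column names are placeholders (assumption); list the real ones you want to keep
query = build_partition_query(
    database="intraday",
    table="demo_table",
    columns=["column1", "column2"],
    partition_filter="bus_dt = 20180117",
)
print(query)
```

Inside the job, read_as_dynamic_frame(glueContext, query, "demotable") would then stand in for the from_catalog call.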