Hello,
In Glue, crawlers automatically detect the schema from files and create a table in the Glue catalog. For CSV files, the crawler reads either the first 100 records or the first 1 MB of data, whichever comes first, to detect the schema. [1]
Because the crawler infers the data types itself, it is not possible to load all CSV columns as string into the Glue catalog directly with this approach. You can consider two approaches for your use case:
- Create a crawler and run it on the CSV data. Once it creates the table in the Glue catalog with the inferred data types, modify the table schema to set every column to string.
- Read the data directly from the CSV files in a Glue ETL job, change the schema to string in ApplyMapping, and write the table to the catalog with the enableUpdateCatalog option. [2]
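As a rough illustration of both approaches, here is a minimal sketch in plain Python. The helper names (`string_mappings`, `columns_as_string`), the sample column list, and the table name `my_csv_table` are all hypothetical and only for illustration. The first helper builds the `(source, source_type, target, target_type)` tuples that ApplyMapping takes; the second rewrites the column types in a table definition shaped like the one the Glue catalog returns.

```python
from copy import deepcopy


def string_mappings(columns):
    """Build ApplyMapping tuples that cast every column to string.

    columns: iterable of (name, source_type) pairs, e.g. the inferred schema.
    """
    return [(name, src_type, name, "string") for name, src_type in columns]


def columns_as_string(table):
    """Return a copy of a Glue-style table definition with all column types set to string."""
    updated = deepcopy(table)  # leave the caller's dict untouched
    for col in updated["StorageDescriptor"]["Columns"]:
        col["Type"] = "string"
    return updated


# Hypothetical inferred schema for approach 2 (ETL job with ApplyMapping).
inferred = [("id", "long"), ("price", "double"), ("created", "string")]
mappings = string_mappings(inferred)

# Hypothetical crawler-created table definition for approach 1.
sample_table = {
    "Name": "my_csv_table",
    "StorageDescriptor": {
        "Columns": [
            {"Name": "id", "Type": "bigint"},
            {"Name": "price", "Type": "double"},
        ]
    },
}
string_table = columns_as_string(sample_table)

print(mappings)
print(string_table["StorageDescriptor"]["Columns"])
```

Inside a Glue job, the `mappings` list could then be passed as `ApplyMapping.apply(frame=..., mappings=mappings)` before writing with enableUpdateCatalog. For the first approach, the modified definition could be sent back with the boto3 Glue client's `update_table` call, assuming it is first trimmed to the keys that `TableInput` accepts.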
References:
[1] https://aws.amazon.com/premiumsupport/knowledge-center/glue-crawler-detect-schema/
[2] https://docs.aws.amazon.com/glue/latest/dg/update-from-job.html