Hi All,
I'm trying to use a crawler to add tables to a Glue database from CSV files. This works for most folders/tables, but if a file contains only strings separated by commas, the crawler can't identify the first line as the column names, and the columns get names like col1, col2, etc.

In the table properties of the tables with wrong schemas, I can't see this property: "skip.header.line.count": 1

Does anyone know how I can force the crawler to skip the first line?

Thank you.

I know this is an extremely old topic, but for those of you finding this result in a search engine, the proper way to solve this is by using a classifier on your crawler. You could either explicitly specify the column headings, or allow auto detection of the column headings within the classifier, more details here: .
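
For anyone who wants to see roughly what such a classifier definition looks like, here is a minimal sketch in TypeScript. It only builds a plain object shaped like the `CsvClassifier` property of the CloudFormation `AWS::Glue::Classifier` resource; the helper `buildCsvClassifier`, the classifier name, and the column names are hypothetical and only for illustration.

```typescript
// Allowed values for the ContainsHeader setting of a Glue CSV classifier.
type ContainsHeader = 'UNKNOWN' | 'PRESENT' | 'ABSENT';

// Subset of the CsvClassifier properties, in CDK-style camelCase.
interface CsvClassifierProps {
  name: string;
  delimiter: string;
  quoteSymbol: string;
  containsHeader: ContainsHeader;
  header?: string[];
}

// Hypothetical helper (not part of any AWS library).
// 'PRESENT' states that the first row is a header; 'UNKNOWN' leaves detection to Glue.
// Passing `columns` sets the column names explicitly.
function buildCsvClassifier(
  name: string,
  columns?: string[],
  containsHeader: ContainsHeader = 'PRESENT',
): CsvClassifierProps {
  const props: CsvClassifierProps = {
    name,
    delimiter: ',',
    quoteSymbol: '"',
    containsHeader,
  };
  if (columns && columns.length > 0) {
    props.header = columns.slice(); // copy so the caller's array is not shared
  }
  return props;
}

// Example values are made up.
const exampleClassifier = buildCsvClassifier('all-strings-csv', ['id', 'city', 'country']);
console.log(JSON.stringify({ csvClassifier: exampleClassifier }, null, 2));
```

With CDK, an object like this would be passed as the `csvClassifier` prop of `CfnClassifier` from `aws-cdk-lib/aws-glue`, and the classifier's name would then be listed in the crawler's `classifiers` array so the crawler tries it before the built-in classifiers.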

  • Thanks for taking the time in 2020 to answer this question, it made my day here in 2023 a lot easier.


Make sure that the CSV file contains mixed data types (string, numeric), then rerun the crawler.



Hi Shivan

Thank you for your answer.

This CSV file has only string data, not mixed data types.

Do you know why the crawler can't identify the header correctly in this case?

I don't know. However, you can manually add the skip-header table property and rename the columns, but that defeats the purpose of the crawler.

Seems like Classifiers don't help when there are multiple pre-amble lines (e.g. 6 lines) in the file before the headers and data begin (for CSV format, at least). This is a pity as we have to do some manual data-cleansing outside of Glue.
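
As a rough idea of the cleansing step done outside Glue, here is a minimal sketch in TypeScript that drops a fixed number of preamble lines so the header row comes first. The helper `stripPreamble` and the sample data are hypothetical, and the sketch assumes the preamble lines themselves contain no embedded newlines.

```typescript
// Hypothetical helper: removes the first `preambleLines` lines of a CSV text
// so that the header row becomes the first line.
function stripPreamble(csvText: string, preambleLines: number): string {
  if (preambleLines < 0 || Math.floor(preambleLines) !== preambleLines) {
    throw new RangeError('preambleLines must be a non-negative integer');
  }
  // Accept both Unix (\n) and Windows (\r\n) line endings.
  const lines = csvText.split(/\r?\n/);
  return lines.slice(preambleLines).join('\n');
}

// Made-up sample: two preamble lines, then the header and one data row.
const rawExport = [
  'Report: sales export',
  'Generated by: example system',
  'name,city',
  'Ana,Lisbon',
].join('\n');

console.log(stripPreamble(rawExport, 2));
```

The cleaned text would then be written back to the location the crawler scans. For large files a streaming approach is probably a better fit than loading the whole file into a string.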

If you are using AWS CDK as the IaC tool, you can use the following code to skip the header:

    // Reach the underlying L1 (CloudFormation) table resource to override raw properties
    const resource = table.node.defaultChild as cfnglue.CfnTable;
    resource.addPropertyOverride('TableInput.StorageDescriptor.SerdeInfo', {
      Parameters: {
        'skip.header.line.count': '1',
      },
    });

Add a table property of skip.header.line.count with a value of 1.
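
If you set this property from code rather than the console, a minimal sketch in TypeScript could look like the following. It only merges the property into a plain map of table parameters; the helper `withSkipHeader` and the sample parameters are hypothetical.

```typescript
// Glue table parameters are a string-to-string map.
type TableParameters = Record<string, string>;

// Hypothetical helper: returns a copy of the parameters with the header-skip
// property set, leaving the input object untouched.
function withSkipHeader(params: TableParameters, headerLines: number = 1): TableParameters {
  // Parameter values are strings, so the count is stringified.
  return { ...params, 'skip.header.line.count': String(headerLines) };
}

// Made-up existing parameters for illustration.
const existingParams: TableParameters = { classification: 'csv', delimiter: ',' };
console.log(withSkipHeader(existingParams));
```

The resulting map would go into the `Parameters` of the `TableInput` you send to Glue's UpdateTable operation, or under `TableInput.Parameters` in a CloudFormation or CDK table definition.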

