Glue crawler Serde serialization lib


I created a Glue CSV crawler in my DEV account. The CSV files are crawled correctly and the tables have these properties:

Name	tbl_csv_s_mytable
Database	db_rdsmydb
Classification	csv
Location	s3://xxxxxx
Connection	 Deprecated

Input format	org.apache.hadoop.mapred.TextInputFormat
Output format	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
**Serde serialization lib	org.apache.hadoop.hive.serde2.OpenCSVSerde**

Serde parameters

**quoteChar	"**
**separatorChar	,**

I did exactly the same thing in the stage account, but the tables are not crawled correctly: I get col0, col1, ... instead of the column names:

Name	tbl_csv_s_mytable
Database	db_rdsmydb
Classification	csv
Location	  s3://xxxxx
Connection	 Deprecated	No

Input format	org.apache.hadoop.mapred.TextInputFormat
Output format	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
**Serde serialization lib	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe**

Here the Serde parameters contain field.delim instead of quoteChar.

I used the same classifier configuration, so I am not sure why it works in DEV but not in stage. I followed exactly the same steps in both accounts.

Is the Serde serialization lib the issue?

I'm not sure why it is set to org.apache.hadoop.hive.serde2.OpenCSVSerde in DEV

and to org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe in the stage account.

Any ideas?

thank you!!

Jess
Asked 2 years ago, 1125 views
1 Answer

From the configuration you shared, the two crawlers are using two different classifiers, which is why you get different behavior.

In DEV you are probably using a custom CSV classifier; see this documentation page to understand how it is created and attached to the crawler. In the definition of the custom classifier you specify how the column delimiter and the quote character are handled, which matches the quoteChar and separatorChar parameters on your DEV table.
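If the custom classifier turns out to be missing in stage, it could be recreated along these lines. This is only a minimal sketch using boto3: the classifier name `csv_with_header`, the crawler name, and the `ContainsHeader` setting are assumptions for illustration, not values taken from your accounts. The delimiter and quote symbol mirror the separatorChar and quoteChar shown on the DEV table.

```python
from typing import Any, Dict


def build_csv_classifier_request(name: str) -> Dict[str, Any]:
    """Build the keyword arguments for glue.create_classifier (custom CSV classifier)."""
    return {
        "CsvClassifier": {
            "Name": name,                 # hypothetical classifier name
            "Delimiter": ",",             # matches separatorChar on the DEV table
            "QuoteSymbol": '"',           # matches quoteChar on the DEV table
            "ContainsHeader": "PRESENT",  # assumption: the first row holds column names
        }
    }


def create_and_attach(crawler_name: str, classifier_name: str) -> None:
    """Create the classifier and attach it to an existing crawler (needs AWS credentials)."""
    import boto3  # imported here so the request builder can be used without boto3

    glue = boto3.client("glue")
    glue.create_classifier(**build_csv_classifier_request(classifier_name))
    glue.update_crawler(Name=crawler_name, Classifiers=[classifier_name])


if __name__ == "__main__":
    # Prints the request only; call create_and_attach(...) yourself to apply it.
    print(build_csv_classifier_request("csv_with_header"))
```

After attaching the classifier, the crawler has to be run again before the table definition changes.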

In stage, by contrast, the crawler was created with the built-in classifier.

That is the difference you see in the Serde serialization lib.
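To confirm the difference, you could pull both table definitions from the Glue Data Catalog and compare the SerDe settings side by side. The following is a rough sketch, assuming boto3 and one named AWS profile per account (the profile names `dev` and `stage` are hypothetical); the dictionary keys follow the shape of the `get_table` response.

```python
from typing import Any, Dict


def serde_summary(table: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the SerDe library, SerDe parameters and column names from a Glue table dict."""
    sd = table.get("StorageDescriptor", {})
    serde = sd.get("SerdeInfo", {})
    return {
        "serialization_lib": serde.get("SerializationLibrary"),
        "serde_parameters": dict(serde.get("Parameters", {})),
        "columns": [col["Name"] for col in sd.get("Columns", [])],
    }


def fetch_table(database: str, name: str, profile_name: str) -> Dict[str, Any]:
    """Fetch one table definition from the account behind the given profile."""
    import boto3  # imported here so serde_summary can be used without boto3

    session = boto3.Session(profile_name=profile_name)
    response = session.client("glue").get_table(DatabaseName=database, Name=name)
    return response["Table"]


if __name__ == "__main__":
    # Profile names are assumptions; database and table names come from the question.
    for profile in ("dev", "stage"):
        table = fetch_table("db_rdsmydb", "tbl_csv_s_mytable", profile)
        print(profile, serde_summary(table))
```

The same check on the crawlers themselves is possible with `get_crawler`, whose response lists the attached custom classifiers under `Classifiers`.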

Hope this helps,

AWS
EXPERT
Answered 2 years ago
