How to change a column's data type in the Glue Data Catalog


I used a Glue crawler to crawl data (Parquet files) from S3. However, a column of type "boolean" is recognised as "string" when I check the table schema. Although I can edit the schema in the Data Catalog, doing so breaks queries in Athena, because the underlying data is not recognised as boolean. Is there any way to configure the SerDe parameters so that the Glue crawler recognises this type automatically?

Additional information:

  • I created the table using Python pandas. As a workaround, I converted the boolean column to string, because "True/False" was not recognised by Glue/Athena.
  • I tried creating the table in Athena to ingest the data from S3. However, no data was shown when querying.
  • I tried Visual ETL, but could not transform string to boolean.

Any suggestions would be appreciated.

Andy
asked a month ago, 529 views
1 Answer

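A Glue crawler does not infer Parquet column types from the values. Its built-in Parquet classifier reads the schema stored in the file itself, so a column that pandas wrote as a string is catalogued as "string". Custom classifiers (grok, XML, JSON, CSV) do not change this; in particular, an XML classifier only applies to XML files and will not help with Parquet. Editing the type in the Data Catalog by hand leaves a mismatch: the table declares boolean while the files still hold strings, which is the likely reason the Athena query breaks. The more reliable fix is to write the column as a real boolean when producing the Parquet file, then rerun the crawler.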
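Since a Parquet file carries its own schema, one way to get a boolean column into the catalog is to write it as a boolean from pandas in the first place. The following is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the string workaround produced "True"/"False" text, that pandas and pyarrow are installed, and the helper names (parse_bool, write_with_boolean_columns) are hypothetical.

```python
# Strings the pandas workaround is assumed to have produced ("True"/"False").
_BOOL_STRINGS = {"true": True, "false": False}


def parse_bool(value):
    """Turn 'True'/'False' text (any case) into a real bool.

    Existing bools and None pass through unchanged; anything else raises.
    """
    if value is None or isinstance(value, bool):
        return value
    key = str(value).strip().lower()
    if key not in _BOOL_STRINGS:
        raise ValueError(f"not a boolean string: {value!r}")
    return _BOOL_STRINGS[key]


def write_with_boolean_columns(df, columns, path):
    """Convert the named columns of a pandas DataFrame to booleans and write Parquet.

    Requires pandas with pyarrow installed; missing values stay missing.
    """
    out = df.copy()
    for name in columns:
        # na_action="ignore" leaves NaN/None untouched instead of parsing them.
        out[name] = out[name].map(parse_bool, na_action="ignore").astype("boolean")
    out.to_parquet(path, engine="pyarrow", index=False)
    return out
```

For example, write_with_boolean_columns(df, ["is_active"], "data.parquet"), where the column and file names are placeholders. After the rewritten file replaces the old one in S3, rerunning the crawler should pick up the boolean type from the file schema.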


Expert
answered a month ago
