HIVE_BAD_DATA: Error parsing field value '2022-12-14T06:51:14.710Z'

I get this error when querying a table from Athena; the data is stored in an S3 bucket. If I exclude the timestamp column from the SELECT statement, the query runs fine. Can anyone suggest a fix for this problem? Changing the access_at field in the log records may be difficult, since the task is to migrate log data from RDS.

HIVE_BAD_DATA: Error parsing field value '2022-12-14T06:51:14.710Z' for field 13: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]

The table definition is as follows:

CREATE EXTERNAL TABLE `user_report`(
  `type` int COMMENT 'from deserializer',
  `system_id` int COMMENT 'from deserializer',
  `id` string COMMENT 'from deserializer',
  `company_id` int COMMENT 'from deserializer',
  `user_id` int COMMENT 'from deserializer',
  `token` char(255) COMMENT 'from deserializer',
  `device_type` tinyint COMMENT 'from deserializer',
  `app_version` char(255) COMMENT 'from deserializer',
  `session_cnt` int COMMENT 'from deserializer',
  `requested_cnt` int COMMENT 'from deserializer',
  `scheduled_cnt` int COMMENT 'from deserializer',
  `rescheduled_cnt` int COMMENT 'from deserializer',
  `canceled_cnt` int COMMENT 'from deserializer',
  `access_at` timestamp COMMENT 'from deserializer')
PARTITIONED BY (
  `created_hour` string)
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION
  's3://demo-kinesis-athena/'
TBLPROPERTIES (
  'has_encrypted_data'='false',
  'projection.created_hour.format'='yyyy/MM/dd/HH',
  'projection.created_hour.interval'='1',
  'projection.created_hour.interval.unit'='HOURS',
  'projection.created_hour.range'='2018/01/01/00,NOW',
  'projection.created_hour.type'='date',
  'projection.enabled'='true',
  'storage.location.template'='s3://demo-kinesis-athena/${created_hour}',
  'transient_lastDdlTime'='1671014054')
Asked 1 year ago · 424 views
1 Answer
Accepted Answer

My suggestion would be to define the column as string, since you are unable to convert the timestamp before storing it. You could then use something similar to the query below to parse it into a timestamp with date_parse at query time, or use the CAST function:

SELECT date_parse(substr('2022-12-14T06:51:14.710Z', 1, 24), '%Y-%m-%dT%H:%i:%s.%fZ')
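
Applied to the table from the question, this could look roughly like the sketch below. It assumes access_at has been redeclared as string in the user_report DDL, with the other columns unchanged. Instead of date_parse it uses from_iso8601_timestamp, a built-in Athena (Presto/Trino) function that accepts ISO 8601 strings such as '2022-12-14T06:51:14.710Z' and returns a timestamp with time zone. The partition value in the WHERE clause and the output column aliases are illustrative only.

```sql
-- Sketch only: assumes `access_at` is now declared as `string` in user_report.
SELECT
  id,
  user_id,
  access_at                                            AS access_at_raw,  -- original ISO 8601 text
  from_iso8601_timestamp(access_at)                    AS access_at_tz,   -- timestamp with time zone
  CAST(from_iso8601_timestamp(access_at) AS timestamp) AS access_at_ts    -- plain timestamp
FROM user_report
WHERE created_hour = '2022/12/14/06'  -- illustrative partition value
LIMIT 10;
```

If many queries need the parsed value, wrapping this SELECT in a view would keep the conversion in one place without touching the data in S3.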
Answered 1 year ago by AWS
Reviewed 1 year ago by an AWS Expert
