HIVE_BAD_DATA: Error parsing field value '2022-12-14T06:51:14.710Z'

0

I face this problem when trying to query from Athena, the data is stored in S3 bucket. If I exclude the timestamp column from SELECT statement, it can still be queried. Can anyone suggest a fix for this problem? Changing the access_at field in log record maybe difficult since the task is migrating logs data from RDS.

HIVE_BAD_DATA: Error parsing field value '2022-12-14T06:51:14.710Z' for field 13: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]

the table format is as below

CREATE EXTERNAL TABLE `user_report`(
  `type` int COMMENT 'from deserializer',
  `system_id` int COMMENT 'from deserializer',
  `id` string COMMENT 'from deserializer',
  `company_id` int COMMENT 'from deserializer',
  `user_id` int COMMENT 'from deserializer',
  `token` char(255) COMMENT 'from deserializer',
  `device_type` tinyint COMMENT 'from deserializer',
  `app_version` char(255) COMMENT 'from deserializer',
  `session_cnt` int COMMENT 'from deserializer',
  `requested_cnt` int COMMENT 'from deserializer',
  `scheduled_cnt` int COMMENT 'from deserializer',
  `rescheduled_cnt` int COMMENT 'from deserializer',
  `canceled_cnt` int COMMENT 'from deserializer',
  `access_at` timestamp COMMENT 'from deserializer')
PARTITIONED BY (
  `created_hour` string)
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION
  's3://demo-kinesis-athena/'
TBLPROPERTIES (
  'has_encrypted_data'='false',
  'projection.created_hour.format'='yyyy/MM/dd/HH',
  'projection.created_hour.interval'='1',
  'projection.created_hour.interval.unit'='HOURS',
  'projection.created_hour.range'='2018/01/01/00,NOW',
  'projection.created_hour.type'='date',
  'projection.enabled'='true',
  'storage.location.template'='s3://demo-kinesis-athena/${created_hour}',
  'transient_lastDdlTime'='1671014054')
gefragt vor einem Jahr438 Aufrufe
1 Antwort
3
Akzeptierte Antwort

My suggestion would be to define the column as text as you are unable to convert before storing the timestamp. You could use something similar to the below query to parse it into a date or use the CAST function

select date_parse(substr('2022-12-14T06:51:14.710Z',1,24),'%Y-%m-%dT%H:%i:%s.%fZ') 
profile pictureAWS
beantwortet vor einem Jahr
AWS
EXPERTE
überprüft vor einem Jahr

Du bist nicht angemeldet. Anmelden um eine Antwort zu veröffentlichen.

Eine gute Antwort beantwortet die Frage klar, gibt konstruktives Feedback und fördert die berufliche Weiterentwicklung des Fragenstellers.

Richtlinien für die Beantwortung von Fragen