HIVE_BAD_DATA: Error parsing field value '2022-12-14T06:51:14.710Z'

0

I face this problem when trying to query from Athena, the data is stored in S3 bucket. If I exclude the timestamp column from SELECT statement, it can still be queried. Can anyone suggest a fix for this problem? Changing the access_at field in log record maybe difficult since the task is migrating logs data from RDS.

HIVE_BAD_DATA: Error parsing field value '2022-12-14T06:51:14.710Z' for field 13: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]

the table format is as below

CREATE EXTERNAL TABLE `user_report`(
  `type` int COMMENT 'from deserializer',
  `system_id` int COMMENT 'from deserializer',
  `id` string COMMENT 'from deserializer',
  `company_id` int COMMENT 'from deserializer',
  `user_id` int COMMENT 'from deserializer',
  `token` char(255) COMMENT 'from deserializer',
  `device_type` tinyint COMMENT 'from deserializer',
  `app_version` char(255) COMMENT 'from deserializer',
  `session_cnt` int COMMENT 'from deserializer',
  `requested_cnt` int COMMENT 'from deserializer',
  `scheduled_cnt` int COMMENT 'from deserializer',
  `rescheduled_cnt` int COMMENT 'from deserializer',
  `canceled_cnt` int COMMENT 'from deserializer',
  `access_at` timestamp COMMENT 'from deserializer')
PARTITIONED BY (
  `created_hour` string)
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION
  's3://demo-kinesis-athena/'
TBLPROPERTIES (
  'has_encrypted_data'='false',
  'projection.created_hour.format'='yyyy/MM/dd/HH',
  'projection.created_hour.interval'='1',
  'projection.created_hour.interval.unit'='HOURS',
  'projection.created_hour.range'='2018/01/01/00,NOW',
  'projection.created_hour.type'='date',
  'projection.enabled'='true',
  'storage.location.template'='s3://demo-kinesis-athena/${created_hour}',
  'transient_lastDdlTime'='1671014054')
已提问 1 年前424 查看次数
1 回答
3
已接受的回答

My suggestion would be to define the column as text as you are unable to convert before storing the timestamp. You could use something similar to the below query to parse it into a date or use the CAST function

select date_parse(substr('2022-12-14T06:51:14.710Z',1,24),'%Y-%m-%dT%H:%i:%s.%fZ') 
profile pictureAWS
已回答 1 年前
AWS
专家
已审核 1 年前

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则