HIVE_BAD_DATA: Error parsing field value '2022-12-14T06:51:14.710Z'

0

I face this problem when trying to query from Athena, the data is stored in S3 bucket. If I exclude the timestamp column from SELECT statement, it can still be queried. Can anyone suggest a fix for this problem? Changing the access_at field in log record maybe difficult since the task is migrating logs data from RDS.

HIVE_BAD_DATA: Error parsing field value '2022-12-14T06:51:14.710Z' for field 13: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]

the table format is as below

CREATE EXTERNAL TABLE `user_report`(
  `type` int COMMENT 'from deserializer',
  `system_id` int COMMENT 'from deserializer',
  `id` string COMMENT 'from deserializer',
  `company_id` int COMMENT 'from deserializer',
  `user_id` int COMMENT 'from deserializer',
  `token` char(255) COMMENT 'from deserializer',
  `device_type` tinyint COMMENT 'from deserializer',
  `app_version` char(255) COMMENT 'from deserializer',
  `session_cnt` int COMMENT 'from deserializer',
  `requested_cnt` int COMMENT 'from deserializer',
  `scheduled_cnt` int COMMENT 'from deserializer',
  `rescheduled_cnt` int COMMENT 'from deserializer',
  `canceled_cnt` int COMMENT 'from deserializer',
  `access_at` timestamp COMMENT 'from deserializer')
PARTITIONED BY (
  `created_hour` string)
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION
  's3://demo-kinesis-athena/'
TBLPROPERTIES (
  'has_encrypted_data'='false',
  'projection.created_hour.format'='yyyy/MM/dd/HH',
  'projection.created_hour.interval'='1',
  'projection.created_hour.interval.unit'='HOURS',
  'projection.created_hour.range'='2018/01/01/00,NOW',
  'projection.created_hour.type'='date',
  'projection.enabled'='true',
  'storage.location.template'='s3://demo-kinesis-athena/${created_hour}',
  'transient_lastDdlTime'='1671014054')
질문됨 일 년 전424회 조회
1개 답변
3
수락된 답변

My suggestion would be to define the column as text as you are unable to convert before storing the timestamp. You could use something similar to the below query to parse it into a date or use the CAST function

select date_parse(substr('2022-12-14T06:51:14.710Z',1,24),'%Y-%m-%dT%H:%i:%s.%fZ') 
profile pictureAWS
답변함 일 년 전
AWS
전문가
검토됨 일 년 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠