Does Athena geospatial support compression?


I'm using Athena to query some geospatial data encoded as GeoJSON. With uncompressed GeoJSON files it works fine, but if I compress those files with gzip I get:

HIVE_CURSOR_ERROR: Illegal character ((CTRL-CHAR, code 31)): only regular white space (\r, \n, \t) is allowed between tokens at [Source: org.apache.hadoop.fs.FSDataInputStream@6acf4fe5: org.apache.hadoop.fs.BufferedFSInputStream@2793fa25; line: 1, column: 2]
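A note on the error itself: character code 31 is 0x1F, which is the first byte of the gzip magic number (0x1F 0x8B). That suggests the JSON parser is being handed the raw compressed bytes rather than a decompressed stream. Below is a minimal local sketch of that check in Python; the `looks_gzipped` helper and the sample payload are hypothetical and only illustrate the byte values, not anything Athena does internally.

```python
import gzip

# The two-byte gzip magic number defined by the gzip format (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(data: bytes) -> bool:
    """Return True if the data starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


# Hypothetical stand-in for one of the GeoJSON files.
sample = b'{"id": 1, "geometry": {"type": "Point", "coordinates": [0, 0]}}'
compressed = gzip.compress(sample)

print(looks_gzipped(sample))      # plain JSON starts with '{'
print(looks_gzipped(compressed))  # gzip output starts with 0x1F 0x8B
print(compressed[0])              # first byte as an integer
```

If the first byte of an object under the table location is 31, the file is gzip-compressed and the reader would need to decompress it before JSON parsing.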

Is it possible to use compressed geospatial data on Athena?

EDIT: added the table create statement as requested:

CREATE EXTERNAL TABLE `locations`(
  `id` bigint COMMENT 'from deserializer', 
  `boundaryshape` binary COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'com.esri.hadoop.hive.serde.JsonSerde' 
STORED AS INPUTFORMAT 
  'com.esri.json.hadoop.EnclosedJsonInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://mydata/transformed/'
TBLPROPERTIES (
  'classification'='json', 
  'last_modified_by'='hadoop', 
  'last_modified_time'='1674393793', 
  'transient_lastDdlTime'='1674393793', 
  'write.compression'='GZIP')
  • Can you paste the result of SHOW CREATE TABLE <your table name>?

  • I have updated the question with the statement as requested.

  • Please try the following: (1) change GZIP to lowercase: 'compressionType'='gzip'; (2) make sure each gzip file contains only one JSON file.

  • I have made that change but am still getting the same error. The gzip file does contain only one file.

greg-eb
asked a year ago · 77 views
No answers
