I have successfully created a database using a Glue crawler, but ran into this error (or an extremely similar one) when trying to access it with even a basic Athena query.
GENERIC_INTERNAL_ERROR: IndexOutOfBoundsException thrown initializing deserializer org.openx.data.jsonserde.JsonSerDe. Cause: Index: 55, Size: 55
I have successfully run Athena queries on another database, which I believe is formatted the same way.
I tried taking just a single file out of the S3 bucket and using a new crawler to make a new table, but it created the same error. My best guess is that there's a special character or something being used in the JSON that this particular deserializer can't handle. A JSON linter read the file as a valid JSON. Are there reserved characters for the AWS SerDe? Where can I look to find more information on this particular error?
The data I am trying to assess belongs to a vendor, so I don't think I can post it here, but here is an example line from the file:
{"AE Weather Station (Standard w/ GHI) (Estimated GHI) (Watt hours/meter2)_Irradiance": "216.806", "AE Weather Station (Standard w/ GHI) (GHI sensor) (Watts/meter2)_GHI comparison": "0", "AE Weather Station (Standard w/ GHI) (GHI) (Watt hours/meter2)_Irradiance": "0", "AE Weather Station (Standard w/ GHI) - Ambient Temperature (Degrees Fahrenheit)_Temperature": "75.028", "AE Weather Station (Standard w/ GHI) - Module temperature (est) (Degrees Fahrenheit)_Temperature": "75.028", "AE Weather Station (Standard w/ GHI), Sun (GHI) (Watts/meter2)_Irradiance Sensor Degrade": "0", "AE Weather Station (Standard w/ GHI), Sun (GHI) (Watts/meter2)_Irradiance sensor orientation": "0", "AE Weather Station (Standard w/ GHI), Sun (GHI) * (Watts/meter2)_POA comparison": "0", "AE Weather Station (Standard w/ GHI): Clear Sky (E-W) (Watts/meter2)_Irradiance sensor orientation": "0", "AE Weather Station (Standard w/ GHI): Clear Sky GHI (Watts/meter2)_Irradiance sensor orientation": "0", "Ambient temperature (Degrees Celsius)_Clean average weather data": "23.904", "Ambient temperature Quality (Quality (asdf))_Clean average weather data": "100", "Bin Size": "Bin1Hour", "Blue-sky ratio: AE Weather Station (Standard w/ GHI), Sun (GHI) (Percent)_Irradiance Sensor Degrade": "104.792", "Blue-sky ratio: IMT 1 (1810) Ref Cell w/ Mod Temp, Irradiance (Percent)_Irradiance Sensor Degrade": "102.432", "Blue-sky ratio: IMT 2 (1550) Ref Cell w/ Mod Temp, Irradiance (Percent)_Irradiance Sensor Degrade": "111.587", "Blue-sky ratio: IMT 3 (1220) Ref Cell w/ Mod Temp, Irradiance (Percent)_Irradiance Sensor Degrade": "NaN", "GHI (Watts/meter2)_Clean average weather data": "0", "GHI Quality (Quality (asdf))_Clean average weather data": "100", "IMT 1 (1810) Ref Cell w/ Mod Temp (Estimated POA) (Watt hours/meter2)_Irradiance": "0", "IMT 1 (1810) Ref Cell w/ Mod Temp (Watt hours/meter2)_Irradiance": "0.090", "IMT 1 (1810) Ref Cell w/ Mod Temp - Irradiance weighted module temperature (Degrees Fahrenheit)_Temperature": "71.581", "IMT 1 (1810) Ref Cell w/ Mod Temp, Irradiance (Watts/meter2)_Irradiance Sensor Degrade": "0.090", "IMT 1 (1810) Ref Cell w/ Mod Temp, Irradiance (Watts/meter2)_Irradiance sensor orientation": "0.090", "IMT 1 (1810) Ref Cell w/ Mod Temp, Irradiance * (Watt hours/meter2)_GHI comparison": "0.090", "IMT 1 (1810) Ref Cell w/ Mod Temp, Irradiance * (Watts/meter2)_POA comparison": "0.090", "IMT 1 (1810) Ref Cell w/ Mod Temp: Clear Sky (E-W) (Watts/meter2)_Irradiance sensor orientation": "0", "IMT 1 (1810) Ref Cell w/ Mod Temp: Clear Sky GHI (Watts/meter2)_Irradiance sensor orientation": "0", "IMT 2 (1550) Ref Cell w/ Mod Temp (Watt hours/meter2)_Irradiance": "0.048", "IMT 2 (1550) Ref Cell w/ Mod Temp - Irradiance weighted module temperature (Degrees Fahrenheit)_Temperature": "71.331", "IMT 2 (1550) Ref Cell w/ Mod Temp, Irradiance (Watts/meter2)_Irradiance Sensor Degrade": "0.048", "IMT 2 (1550) Ref Cell w/ Mod Temp, Irradiance (Watts/meter2)_Irradiance sensor orientation": "0.048", "IMT 2 (1550) Ref Cell w/ Mod Temp, Irradiance * (Watt hours/meter2)_GHI comparison": "0.048", "IMT 2 (1550) Ref Cell w/ Mod Temp, Irradiance * (Watts/meter2)_POA comparison": "0.048", "IMT 2 (1550) Ref Cell w/ Mod Temp: Clear Sky (E-W) (Watts/meter2)_Irradiance sensor orientation": "0", "IMT 2 (1550) Ref Cell w/ Mod Temp: Clear Sky GHI (Watts/meter2)_Irradiance sensor orientation": "0", "IMT 3 (1220) Ref Cell w/ Mod Temp (Watt hours/meter2)_Irradiance": "0.052", "IMT 3 (1220) Ref Cell w/ Mod Temp - Irradiance weighted module temperature (Degrees Fahrenheit)_Temperature": "70.246", "IMT 3 (1220) Ref Cell w/ Mod Temp, Irradiance (Watts/meter2)_Irradiance Sensor Degrade": "0.052", "IMT 3 (1220) Ref Cell w/ Mod Temp, Irradiance (Watts/meter2)_Irradiance sensor orientation": "0.052", "IMT 3 (1220) Ref Cell w/ Mod Temp, Irradiance * (Watt hours/meter2)_GHI comparison": "0.052", "IMT 3 (1220) Ref Cell w/ Mod Temp, Irradiance * (Watts/meter2)_POA comparison": "0.052", "IMT 3 (1220) Ref Cell w/ Mod Temp: Clear Sky (E-W) (Watts/meter2)_Irradiance sensor orientation": "0", "IMT 3 (1220) Ref Cell w/ Mod Temp: Clear Sky GHI (Watts/meter2)_Irradiance sensor orientation": "0", "Max wind speed - AE Weather Station (Standard w/ GHI) (Miles/hour)_Wind": "0", "Module temperature (Degrees Celsius)_Clean average weather data": "21.652", "Module temperature Quality (Quality (asdf))_Clean average weather data": "100", "POA (Watts/meter2)_Clean average weather data": "0", "POA Quality (Quality (asdf))_Clean average weather data": "100", "Rain (Rain (Inches))_Rain, Humidity, Barometric Pressure": "0", "Variance (Percent)_Irradiance": "NaN", "datetime": "2021-06-01 00:00:00", "time_retrieved": "20221007T10:56:1665165415"}