1 Answer
After checking the article mentioned by @Tasio, I found that the data arriving in CloudWatch was not formatted properly (uneven spaces). Digging further, I traced the root cause to the CloudFormation template: the format string for the API Gateway access logs had uneven spaces. While sorting out the spacing issue, I stumbled upon another option: having API Gateway send access logs as CSV. So I made that change, and in Athena I set up the table with the following properties:
InputFormat: "org.apache.hadoop.mapred.TextInputFormat"
OutputFormat: "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
SerdeInfo:
  Parameters: { "separatorChar" : "," }
  SerializationLibrary: "org.apache.hadoop.hive.serde2.OpenCSVSerde"
Now everything is working, and I can query the logs using Athena.
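For anyone hitting the same thing, here is a rough local sketch of why the uneven spaces mattered and why CSV sidesteps the problem. It is not my actual log format or Athena regex: the pattern, field layout, and sample lines below are made up for illustration, and Python's `re` and `csv` modules only approximate what a regex-based SerDe and OpenCSVSerde do.

```python
import csv
import io
import re

# Hypothetical pattern that expects exactly one space between fields,
# similar in spirit to a regex-based table definition (not the real one).
PATTERN = re.compile(r'^(\S+) (\S+) "([^"]*)" (\d{3})$')


def parse_space_delimited(line):
    """Return the captured fields, or None when the line does not match."""
    match = PATTERN.match(line)
    return list(match.groups()) if match else None


def parse_csv(line):
    """Split one comma-separated log line, honoring double-quoted fields."""
    return next(csv.reader(io.StringIO(line)))


# Made-up sample lines (illustrative only).
good_line = '203.0.113.7 req-1 "GET /items" 200'
bad_line = '203.0.113.7  req-1 "GET /items" 200'  # two spaces after the IP
csv_line = '203.0.113.7,req-1,"GET /items",200'

if __name__ == "__main__":
    print(parse_space_delimited(good_line))  # fields are captured
    print(parse_space_delimited(bad_line))   # no match because of the extra space
    print(parse_csv(csv_line))               # fields split on commas
```

A single stray space is enough to make the whole line fail the pattern, which matches the symptom described above. With a comma separator, the field boundaries no longer depend on spacing in the format string.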
answered 2 years ago
Glad you could fix it.
Could you kindly share the pattern you're using? Also, did you check this article? https://aws.amazon.com/premiumsupport/knowledge-center/regexserde-error-athena-matching-groups/