By using AWS re:Post, you agree to the Terms of Use
/Parse application logs via Athena and Glue catalog/

Parse application logs via Athena and Glue catalog


I have log files in my onpremise application, Individual file size is 100 MB (file rollover/appender defined in Log4J in onpremise application) and 50 such files get generated on daily basis which is 5 GB total. We then zip these files at day end and push to S3 bucket manually today.

  1. We dont want to use cloudwatch agent in onpremise application to push logs to CloudWatch (as onpremise applications are not having CPU/Memory and running at peak)
  2. We zip the files at day end else we have to do 50-55 S3 Copy manually each day (yes we can create some script for this - but again not elegant)
  3. Yes Elasticsearch is an option but we are building solution which will take 2-3 months to ingest data to Elasticsearch and use ELK stack

Now till 1 year, whenever we get customer complaint we have to extract specific file. We know the date, but unfortunately we have to download all 5 GB for that day and extract the required file and content.

As part of this usecase wanted to check: 1. If Athena works for Log4J log files (from JBoss, Websphere) - do we have Serde's (serialization / deserialization libraries) for the same, glue catalog for the same

asked 5 months ago10 views
1 Answers

Have you had a look at the GROK SerDe? This may help you? Example 2 uses log4j format. Hope this helps.

answered 5 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions