Parse application logs via Athena and Glue catalog


I have log files from an on-premises application. Each file is 100 MB (rollover is defined by the Log4J appender in the application), and about 50 such files are generated every day, roughly 5 GB in total. At the end of each day we zip these files and push them to an S3 bucket manually.

  1. We don't want to use the CloudWatch agent on the on-premises servers to push logs to CloudWatch, because those servers have no spare CPU/memory and are already running at peak.
  2. We zip the files at day end; otherwise we would have to do 50-55 S3 copies manually each day (yes, we could write a script for this, but that is not elegant either).
  3. Yes, Elasticsearch is an option, but the solution we are building will take 2-3 months before it can ingest data into Elasticsearch and we can use the ELK stack.
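
One possible shape for the script mentioned in point 2 is sketched below. It is only an illustration under several assumptions: the rolled-over files match `*.log`, the key layout `app-logs/dt=YYYY-MM-DD/` and the bucket name are hypothetical, and `boto3` is installed on the machine doing the upload. Each file is gzipped separately rather than bundled into one .zip, because Athena can read gzip-compressed text objects directly but does not read .zip archives.

```python
import gzip
import shutil
from datetime import date
from pathlib import Path


def gzip_log(src: Path, dest_dir: Path) -> Path:
    """Compress one rolled-over log file to <name>.gz and return the new path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (src.name + ".gz")
    with src.open("rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


def s3_key_for(gz_path: Path, day: date, prefix: str = "app-logs") -> str:
    """Build a Hive-style, date-partitioned object key (this layout is an assumption)."""
    return f"{prefix}/dt={day.isoformat()}/{gz_path.name}"


def upload_day(log_dir: Path, day: date, bucket: str) -> list:
    """Gzip every *.log file in log_dir and upload each one as its own object."""
    import boto3  # third-party AWS SDK, imported lazily so the helpers above run without it

    s3 = boto3.client("s3")
    keys = []
    for src in sorted(log_dir.glob("*.log")):  # file pattern is an assumption
        gz = gzip_log(src, log_dir / "gz")
        key = s3_key_for(gz, day)
        s3.upload_file(str(gz), bucket, key)
        keys.append(key)
    return keys
```

With a date in the key, a complaint for a known day only requires the objects under that day's prefix instead of one 5 GB archive.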

For up to one year, whenever we get a customer complaint we have to extract a specific file. We know the date, but unfortunately we have to download the full 5 GB for that day and then extract the required file and content.

As part of this use case, I wanted to check:

  1. Does Athena work with Log4J log files (from JBoss, WebSphere)? Are there SerDes (serialization/deserialization libraries) for this format, and can the Glue catalog be used with them?
asked 2 years ago · 661 views
1 Answer

Have you had a look at the Grok SerDe (https://docs.aws.amazon.com/athena/latest/ug/grok-serde.html)? It may help. Example 2 on that page uses a Log4J format. Hope this helps.
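
To make the suggestion more concrete, here is a minimal sketch of what a Grok SerDe table definition could look like, written as a Python helper that only builds the DDL string. It is not taken from the question: the Log4J layout `%d{ISO8601} %p %c - %m%n`, the column names, the `dt` partition column, and the table and bucket names are all assumptions, so the Grok pattern would need adjusting to the actual appender layout (for example, `%-5p` pads the level with extra spaces). The regular expression is a rough Python approximation of the Grok pattern, included just to show which fields it would extract.

```python
import re

# Grok pattern for an ASSUMED Log4J layout: %d{ISO8601} %p %c - %m%n
GROK_FORMAT = (
    "%{TIMESTAMP_ISO8601:log_time} %{LOGLEVEL:loglevel} "
    "%{NOTSPACE:logger} - %{GREEDYDATA:message}"
)


def build_ddl(table: str, location: str) -> str:
    """Return an Athena CREATE TABLE statement that uses the Grok SerDe."""
    return f"""CREATE EXTERNAL TABLE {table} (
  log_time STRING,
  loglevel STRING,
  logger STRING,
  message STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT SERDE 'com.amazonaws.glue.serde.GrokSerDe'
WITH SERDEPROPERTIES (
  "input.format" = "{GROK_FORMAT}"
)
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '{location}';"""


# Rough regex equivalent of GROK_FORMAT, for local experimentation only.
LINE_RE = re.compile(
    r"^(?P<log_time>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}[.,]\d+) "
    r"(?P<loglevel>[A-Z]+) (?P<logger>\S+) - (?P<message>.*)$"
)


def parse_line(line: str):
    """Split one log line into fields, or return None if it does not match."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    # Hypothetical table name and S3 location.
    print(build_ddl("app_logs", "s3://example-bucket/app-logs/"))
    print(parse_line(
        "2024-01-15 10:23:45,123 ERROR com.example.OrderService - Payment failed"
    ))
```

If the objects are stored under Hive-style prefixes such as `dt=2024-01-15/`, running `MSCK REPAIR TABLE app_logs` in Athena registers the partitions, and a query filtered on `dt` then scans only that day's files.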

answered 2 years ago
