Create additional columns in athena while querying load balancer logs

0

I'm using Athena to Query load balancer logs (https://docs.aws.amazon.com/athena/latest/ug/application-load-balancer-logs.html)

In my request_url field I have additional parameters that I would like to use as separate columns in my table to make querying easier.

Examples of my request_url:

https://customer_01.example.com/api/users?agent=2.17.01&source=web

https://customer_123.example.com/api/books?agent=1.1

https://customer_42.example.com/api/locations?agent=2.02.13&source=android

I have a Route53 entry for *.example.com pointing to my load balancer. Almost all the requests contain agent and source parameters. I want to parse the url and add 4 new fields: customer, agent, agent_int, source in the table.

request_urlcustomeragentagent_intsource
https://customer_01.example.com/api/users?agent=2.17.01&source=web12.17.0121701web
https://customer_123.example.com/api/books?agent=1.11231.110100
https://customer_42.example.com/api/locations?agent=2.02.13&source=android422.02.1320213android

With the new fields like customer and agent_int it will be very simple for me to query the logs based on a range of values. Is it possible to parse the request_url and extract this information? Some amount of transformation is also required (e.g. agent_int = (X*100+Y)*100+Z where agent format is X.Y.Z)

No Answers

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions