Parsing Redshift Audit logs

We have Redshift's audit logs enabled, and they are set up to write to S3. We are attempting to parse these logs and ingest them into Splunk. In the logs, multi-line SQL queries are getting lost in the stream because there is no way to correlate the end of an SQL query with the preceding statements. How can we work around this?

1 Answer

Hello.

Parsing Redshift audit logs and ingesting them into Splunk can be challenging when SQL statements span multiple lines. However, you can use a combination of pre-processing and Splunk configuration to ensure that multi-line SQL queries are ingested as single events. Here's a suggested approach:

Log Structure Understanding: First, understand the structure of your audit logs. Typically, a Redshift log entry will have a timestamp, log level, process ID, user, database, and other metadata, followed by the actual SQL statement.
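
For illustration, a record in the user activity log typically begins with a quoted timestamp and a bracketed block of metadata, with the SQL statement following a LOG: marker (the values below are invented and the exact layout may differ, so check your own files):

'2023-10-01T12:34:56Z UTC [ db=dev user=analyst pid=12345 userid=100 xid=67890 ]' LOG: select count(*) from sales
where sale_date >= '2023-01-01';

Notice that the second line of the statement carries no timestamp of its own, which is exactly why naive line-by-line ingestion loses the correlation.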

Pre-processing: Before ingesting the logs into Splunk, you can write a script (in Python, for example) that goes through the raw files and combines each multi-line SQL statement into a single line.

The script can replace the newline characters within an SQL statement with a space or some other delimiter of your choice, so that each statement becomes a single line; a minimal sketch follows below.
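
As a rough sketch (assuming each record begins with an ISO-8601-style timestamp, optionally preceded by a single quote as in the user activity log; the script name, regex, and file handling are illustrative and should be adjusted to your actual format), the pre-processing could look like this:

import re
import sys

# Assumed record-start pattern: an optional leading quote followed by a
# timestamp like 2023-10-01T12:34:56 -- adjust to match your log format.
RECORD_START = re.compile(r"^'?\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")

def flatten_records(lines):
    """Yield one complete record per line, folding continuation lines
    (e.g. the rest of a multi-line SQL statement) into their record."""
    record = None
    for line in lines:
        line = line.rstrip("\r\n")
        if RECORD_START.match(line):
            if record is not None:
                yield record
            record = line
        elif record is not None:
            # Continuation of the previous record: join with a space
            # instead of the original newline.
            record += " " + line.strip()
        # Anything before the first timestamped line is ignored.
    if record is not None:
        yield record

if __name__ == "__main__":
    # Usage: python flatten_redshift_logs.py useractivitylog.txt > processed.log
    with open(sys.argv[1], encoding="utf-8") as raw:
        for rec in flatten_records(raw):
            print(rec)

Reusing the same record-start pattern as the LINE_BREAKER in the Splunk configuration below keeps both stages in agreement about where one event ends and the next begins.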

Splunk Configuration: Use Splunk's props.conf to define a custom source type for the Redshift logs (transforms.conf is only needed if you also want field extractions or routing).

In props.conf, define the new source type; let's call it redshift_processed:

[redshift_processed]
SHOULD_LINEMERGE=false
# Assuming each record starts with a timestamp of the form YYYY-MM-DDTHH:MM:SS, optionally preceded by a single quote; modify as per your timestamp format.
LINE_BREAKER=([\r\n]+)'?\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}
TRUNCATE=0

With SHOULD_LINEMERGE set to false, Splunk breaks the incoming data into events wherever the LINE_BREAKER pattern matches, i.e. at the start of each new timestamped record. Everything between two timestamps, including a multi-line SQL statement, therefore stays together in a single event.

TRUNCATE=0 ensures that very long events (like big SQL statements) aren't truncated.

Splunk Data Input:

When setting up the data input in Splunk for the pre-processed logs, make sure to specify the custom source type (redshift_processed) so that Splunk knows how to handle the multiline events.
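
As an illustrative sketch only (the monitored directory and index name below are assumptions; if you pull the files from S3 with an add-on such as the Splunk Add-on for AWS, set the source type in that input's configuration instead), a simple file-monitor input in inputs.conf could look like:

[monitor:///opt/redshift_logs/processed]
sourcetype=redshift_processed
index=main
disabled=false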

Regards, Andrii
