Bringing Generative AI to the data warehouse with Amazon Bedrock and Amazon Redshift

11 minute read
Content level: Intermediate

This article provides a serverless framework for using Generative AI on datasets held in an Amazon Redshift data warehouse.

Introduction

Data warehousing platforms have long underpinned a large portion of enterprise analytics. Advances in AI are expanding the types of use cases that can be delivered out of the data warehouse, helping customers detect and respond to change faster, surface trends buried deep in their data, and improve customer experience.

This article walks through a simple yet powerful architectural pattern that brings Generative AI to the data warehouse, via an example scenario that uses fictitious data.

Scenario

Consider a fictitious retail business, AnyCompany, who operate their data warehousing environment using Amazon Redshift Serverless. AnyCompany capture and store transactional data related to product purchases and customer reviews. AnyCompany have a customer experience (CX) team whose role is to analyse customer feedback and interact with customers, with the goal of improving overall customer satisfaction.

Given the number of daily product reviews, it’s not possible for the AnyCompany CX team to make contact with each and every customer. They’re looking for a way to automate personalised customer communications with tailored messaging based on sentiment and level of detail in a customer’s product review.

Prerequisites

To work through the example scenario in this article, please ensure you have:

  • An AWS account with permissions to create AWS Lambda functions and AWS Identity and Access Management (IAM) roles
  • An Amazon Redshift Serverless workgroup that you can connect to with the Redshift Query Editor v2
  • Access granted to the Anthropic Claude v2.1 model in Amazon Bedrock, in a region where Amazon Bedrock is available

The authors of this article performed testing in the us-west-2 (Oregon) region.

Architecture

The solution depicted below operates as follows:

  1. An end user calls the Lambda User-Defined Function (UDF) draft_customer_review_emails in Redshift, passing in details from the reviews table, including the customer name, the product reviewed, and the review itself. The UDF can be invoked for a single review or for multiple reviews, as defined in the calling statement.
  2. The Redshift UDF invokes a Lambda function that collates the review details and constructs a prompt that can be passed to a Large Language Model (LLM).
  3. An LLM hosted in Amazon Bedrock is invoked using the review detail prompt.
  4. The model response is passed back to the end user in Redshift for further action.

It is worth noting that this architectural pattern is highly extensible: the application's behaviour can be changed substantially through simple modifications to the prompt template in the Lambda function and to the input data passed to the function.
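For example, the following is a minimal, illustrative sketch of an alternative prompt-building function, written in the same format as the prompt body used by the Lambda function later in this article. The function name is hypothetical; swapping logic like this into the Lambda function changes what the Redshift UDF returns, from drafted emails to sentiment classifications, without any other architectural change.

import json

# Illustrative only: build a Claude prompt body that classifies review sentiment
# instead of drafting a customer email
def create_sentiment_body(customer, product, review):
    return json.dumps({
        "prompt": f"\n\nHuman: Classify the sentiment of {customer}'s review of the product {product} "
                  f"as POSITIVE, NEGATIVE or MIXED, and explain your classification in one sentence. "
                  f"Here is the review: {review}\n\nAssistant:",
        "max_tokens_to_sample": 200,
        "temperature": 0.1,
        "top_p": 0.9,
    })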

Implementation steps - AWS Lambda

In this section you will create a new AWS Lambda function and deploy the code required to invoke an Amazon Bedrock model.

  1. First, open Lambda from the AWS Management Console and click the Create function button.

  2. Use the Author from scratch option, provide a function name (e.g. redshift-review-comms-lambda), and select Python 3.12 or later as the Runtime.

  3. Under Permissions, expand the Change default execution role section and select the Create a new role with basic Lambda permissions option.

  4. Once the Lambda function has been created, navigate to the automatically created AWS Identity and Access Management (IAM) role and attach a policy that allows it to invoke models in Amazon Bedrock (a scripted example of this step is sketched below).
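This step can be completed in the IAM console by attaching a policy to the role. If you prefer to script it, the following is a minimal sketch using boto3; the role name is a placeholder for the role Lambda created for your function, and in production you may want to scope the Resource down to specific model ARNs rather than using a wildcard.

import json
import boto3

iam = boto3.client('iam')

# Minimal inline policy allowing the Lambda execution role to invoke Bedrock models.
# Consider scoping Resource to specific model ARNs rather than "*".
bedrock_invoke_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel"],
            "Resource": "*"
        }
    ]
}

# The role name below is a placeholder - replace it with the role created for your function
iam.put_role_policy(
    RoleName='redshift-review-comms-lambda-role-example',
    PolicyName='AllowBedrockInvokeModel',
    PolicyDocument=json.dumps(bedrock_invoke_policy)
)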

  5. Navigate to the Lambda function you created in the AWS Management Console. In the Code tab, paste the following Python code. This code invokes the Anthropic Claude v2.1 model in Amazon Bedrock for every record passed to the Lambda function by Amazon Redshift, and passes each model response back to Redshift. It can be extended to use a different LLM hosted in Bedrock, to provide a modified prompt, or to implement a completely different scenario.

import json
import boto3

# Define boto3 client for Amazon Bedrock invocation
client = boto3.client(service_name='bedrock-runtime')

# Define parameters to be used in Claude model invocation
claude_bedrock_model_id = "anthropic.claude-v2:1"
claude_accept = 'application/json'
claude_content_type = 'application/json'

# Define function used to invoke the Bedrock Claude model and parse result back
def call_claude_llm(body):
    # Invoke model
    response = client.invoke_model(body=body, modelId=claude_bedrock_model_id, accept=claude_accept, contentType=claude_content_type)
    
    # Parse response
    response_body = json.loads(response.get('body').read())["completion"]
    
    # Return model response
    return response_body


# Define function used to construct the Claude prompt body from review details
def create_claude_body(customer,product,review):
    body = json.dumps({
        "prompt": f"\n\nHuman: You are drafting emails for the AnyCompany Customer Experience Team, responding to feedback from customer reviews. \
        Draft a follow up email to {customer} based on their recent review of the product {product}. \
        If the review is negative, be apologetic and offer $20 discount if they reply to the email. \
        If the review was positive, paraphrase their review and thank them for taking the time to leave the review. \
        If there's not enough detail to explain why a customer was satisfied or dissatisfied, ask them to provide more details. \
        Limit your response to 100 words and sign off the email as 'AnyCompany Customer Experience Team'. \
        Your response should include only the email, so the first word should be 'Dear' followed by their first name. \
        Here is the review: {review}\n\nAssistant:",
        "max_tokens_to_sample": 500,
        "temperature": 0.1,
        "top_p": 0.9,
    })
    return body

# Define main Lambda handler
def lambda_handler(event, context):
    try:
        # Initialise response for Amazon Redshift
        response = dict()
        
        # Read records passed from Redshift
        records = event["arguments"]
        
        # Initialise results list to be sent to Amazon Redshift
        results = []
        
        # For every record/row provided by Redshift, do the following
        for record in records:
            # Parse input variables
            customer = record[0]
            product = record[1]
            review = record[2]
            
            # Generate the body of the payload to be sent to the model
            body = create_claude_body(customer,product,review)
            
            # Try to generate a response from Bedrock; append None for this row if the call fails
            try:
                llm_response = call_claude_llm(body)
                results.append(llm_response)
            except Exception:
                results.append(None)

        # Create response to Redshift and return
        response['success'] = True   
        response['results'] = results
    except Exception as e:
        response['success'] = False  
        response['error_msg'] = str(e)

    return json.dumps(response)


  6. Click the Deploy button to deploy the changes to the function.

  7. Next, increase the default timeout of the Lambda function. Do this by navigating to the Configuration tab of the function and editing the General configuration.

  8. Update the Timeout setting to at least 1 minute. This gives the Lambda function enough time to execute for every record passed by Redshift. Note that if you are invoking this Lambda function for a large number of records, this value may need to be increased further.

  9. Click the Save button.
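Optionally, before connecting Redshift, you can verify the function from the Test tab in the Lambda console. The event below is a minimal sketch of the payload shape this handler relies on; the real event sent by Redshift includes additional metadata fields (such as the number of records and query details), but this handler only reads the arguments array, where each inner array holds the customer, product, and review for one row.

{
  "arguments": [
    ["Rick Fraser", "40cm freestanding fan", "The fan produces great airflow, but it also produces great noise. I wanted to use this at night but it is keeping me awake."]
  ]
}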

You have now successfully deployed and configured an AWS Lambda function to invoke an Amazon Bedrock model. Next, you will set up the Lambda UDF and the data used for testing in Amazon Redshift.

Implementation steps - Amazon Redshift

  1. Navigate to the Redshift Query Editor v2 (QEv2) through the AWS Management Console.
  2. Connect to your Redshift Serverless workgroup and open a new editor tab.
  3. Execute the following CREATE FUNCTION statement to deploy the Lambda UDF for invoking the Amazon Bedrock model. Make sure to update the AWS Lambda function name if you chose a different name when creating the function. Note that IAM_ROLE DEFAULT uses your workgroup's default IAM role, which must have permission to invoke the Lambda function (for example, via the lambda:InvokeFunction action).
CREATE EXTERNAL FUNCTION draft_customer_review_emails(VARCHAR,VARCHAR,VARCHAR)
RETURNS VARCHAR(4000)
VOLATILE
LAMBDA 'redshift-review-comms-lambda'
IAM_ROLE DEFAULT;
  4. Execute the following CREATE TABLE statement to create the table that will hold the mock review data for AnyCompany.
CREATE TABLE public.reviews (
    CUSTOMER VARCHAR,
    PRODUCT VARCHAR,
    PURCHASE_DATE DATE,
    REVIEW_DATE DATE,
    REVIEW VARCHAR(MAX)
);
  5. Execute the following INSERT statement to create your mock review data.
INSERT INTO public.reviews (CUSTOMER,PRODUCT,PURCHASE_DATE,REVIEW_DATE,REVIEW) VALUES 
('Rick Fraser','40cm freestanding fan','12/23/2023','02/06/2024','The fan produces great airflow, but it also produces great noise. I wanted to use this at night but it is keeping me awake.'),
('Sean Beath','85 inch 4K QLED TV','12/14/2023','02/06/2024','This TV is absolutely incredible. The refresh rate and clarity make me feel like Im back at the Sphere Las Vegas!'),
('Tahlia Spritz','Wireless Bluetooth Speaker','12/24/2023','02/06/2024','Full stars for sound clarity, battery life could be improved. Its not really wireless when it needs to be constantly plugged into power.'),
('Ashlee Colada','Charcoal hood BBQ','11/02/2023','02/06/2024','Love this BBQ - great build quality and sensational heat retention.'),
('Mohan Swatcher','Pod coffee maker','12/18/2023','02/05/2024','The coffee tastes slightly burnt to me. I will keep trying.'),
('Liesel Stewart','Battery powered vacuum','12/09/2023','02/05/2024','Very powerful vacuum.'),
('Ainsley Price','500L Quad Door Fridge','12/19/2023','02/05/2024','Overall great fridge. I had a leaking water reservoir but the staff replaced the part free of charge. Great service! '),
('Pedro Scully','Top load washing machine','10/28/2023','02/05/2024','Best washing machine Ive ever used. Cannot recommend more highly');
  6. Execute the following SELECT statement to invoke your AWS Lambda function and retrieve a generated email for each of the reviews created on the 6th of February 2024. An optional statement that persists the drafts to a table is sketched after the query.
SELECT customer, product, review, draft_customer_review_emails(customer,product,review) AS draft_email_response
FROM reviews
WHERE review_date = '2024-02-06'
;
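
If you want to keep the generated drafts for review or downstream processing rather than only viewing them in the query results, one option is to materialise them into a table. The statement below is a minimal sketch; the table name draft_emails is illustrative.

CREATE TABLE public.draft_emails AS
SELECT customer,
       product,
       review,
       draft_customer_review_emails(customer, product, review) AS draft_email_response
FROM public.reviews
WHERE review_date = '2024-02-06';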


Results

The results of the final query run on Redshift are drafted emails tailored to each individual customer review, as generated by the Amazon Bedrock Large Language Model invocations. Investigate each draft email response and compare it to the review that was input. For example, look at the review and the generated response for Rick Fraser.

Product review

"The fan produces great airflow, but it also produces great noise. I wanted to use this at night but it is keeping me awake."

Response

Dear Rick,

We appreciate you taking the time to leave a review of our 40cm freestanding fan. We are sorry to hear the fan is too noisy for your needs, especially for use at night. Please accept a $20 discount code as an apology if you reply to this email. We would welcome more details on the noise level so we can determine if your experience is typical or if your unit may be defective. Your feedback is valuable for improving our products.

Sincerely,

AnyCompany Customer Experience Team

Interpretation

Even though the review has positive and negative aspects, the model has been able to identify the overall sentiment of Rick’s review as one of dissatisfaction. Then, in accordance with the prompt provided, the model has tailored a response to be apologetic in tone and offer a discount on a future purchase.

Note that the responses that you receive may be different to what is shown in this article. Large Language Models are, by nature, non-deterministic across invocations and therefore some variance in output is expected.

Conclusion

With this simple architectural pattern we have extended the capability of our data warehouse environment by leveraging Large Language Models provided via Amazon Bedrock. Using an external function, the process of reviewing customer feedback and drafting contextualised customer communications has been automated, so that AnyCompany can scale their customer experience improvement efforts.
This pattern is extensible: the Lambda UDF's behaviour can be completely changed through modifications to the prompt template and the input data, and additional steps of automation, such as sending the email, can be added to the Lambda function, as sketched below.
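
As an illustration of that last point, the snippet below is a minimal sketch (not part of the solution above) showing how the Lambda function could be extended to send a drafted email using Amazon SES. It assumes the sender address is a verified SES identity and that a customer email address is available; the addresses and the function name are placeholders.

import boto3

# Sketch only: send a drafted email via Amazon SES.
# Assumes the Source address is a verified SES identity and that the customer's
# email address can be looked up (it is not included in the mock review data).
ses = boto3.client('ses')

def send_draft_email(customer_email, product, draft_email_body):
    ses.send_email(
        Source='cx-team@example.com',  # placeholder verified sender
        Destination={'ToAddresses': [customer_email]},
        Message={
            'Subject': {'Data': f'Thank you for your review of the {product}'},
            'Body': {'Text': {'Data': draft_email_body}}
        }
    )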

We’d love to hear about other use cases you’re exploring in the comments!

Clean up

Note that because all of the services used in this article are serverless, you will not incur compute costs while you are not executing the SQL SELECT statement that invokes the Lambda UDF. To clean up your environment, do the following.

  • Delete the created AWS Lambda function
  • Delete the created AWS Identity and Access Management (IAM) role for the Lambda function
  • Run the following statements in your Amazon Redshift Serverless endpoint
DROP TABLE public.reviews;
DROP FUNCTION public.draft_customer_review_emails(character varying, character varying, character varying);

Authors

Rick Fraser - Specialist Data & AI Solutions Architect

Sean Beath - Specialist Redshift Solutions Architect