Direkt zum Inhalt

How to determine per query cost for Amazon Redshift serverless?

Lesedauer: 3 Minute
Inhaltsebene: Expertenwissen
1

Amazon Redshift serverless data warehouse compute costs are based on Redshift Processing Units (RPUs) consumed. This article will provide you a way to approximate the query cost even though the costs are billed for the entire warehouse.

With Amazon Redshift serverless you pay for the workloads you run in RPU-hours, on a per-second basis, with a 60-second minimum charge for the warehouse. Kindly note that this 60-seconds minimum billing is not per query, but for the entire Redshift serverless warehouse. This includes the cost of queries accessing data stored in open file formats on Amazon S3. There are no fees for the time taken to start up the data warehouse. Refer to Billing usage notes for further details.

NOTE

The query below show the cost for US East (Ohio) / us-east-2 billed at $0.36 per RPU hour. For pricing in your region refer https://aws.amazon.com/redshift/pricing/#Amazon_Redshift_Serverless , and substitute 0.36 with appropriate value.

IMPORTANT

  1. If you are trying to estimate the cost of a single query then use elapsed_time from sys_query_history and calculate the cost as elapsed_time / 3600 x RPU x 0.36
  2. Below query is an approximation which determines the total cost incurred on a day and proportionately distributes it based on the elapsed time amongst all the queries that executed. This query is to be used when you have run your entire workload and want to estimate what portion of total daily cost to attribute to particular query.
  3. Do not add any filters in the CTE's as it shall skew the results.
with
daily_cost as (
    select
        trunc(start_time) "Day",
        max(compute_capacity) max_compute_capacity,
        (sum(charged_seconds) / 3600:: double precision) * 0.36 as daily_cost
    from
        sys_serverless_usage
    group by
        1
),
daily_queries as (
    SELECT
        *,
        elapsed_time / total_time_for_day:: double precision as perc
    from
        (
            SELECT
                
                query_id,
                user_id,
                query_text,
                trunc(start_time) "Day",
                elapsed_time,
                sum(elapsed_time) over (partition by trunc(start_time)) total_time_for_day
            FROM
                sys_query_history
        )
)
select
    q.*,
    c.daily_cost * q.perc as query_cost,
    c.*
from
     daily_cost c
join daily_queries q using ("Day") 
where Day = '2024-05-30'
;
SAMPLE OUTPUT

The below sample output, showing partial results, depicts cost for each query and query text details.

query_id| user_id | day        | elapsed_time | total_time_for_day | perc                   | query_cost             | day        | max_compute_capacity |	daily_cost  | query_text
--------+---------+------------+--------------+--------------------+------------------------+------------------------+------------+----------------------+------------------+--------------------------------------+
5240721 | 101     | 2024-05-30 | 174646       | 22720291           | 0.00768678534971229    | 0.10921384624871222    | 2024-05-30 | 128                  | 1.208	    | select city, sum(qty_sold) from sales group by 1 order by 2 desc limit 3;
				  
5240073 | 101     | 2024-05-30 | 765          | 22720291           | 0.00003367034339480951 | 0.00047838823895345354 | 2024-05-30 | 128                  | 1.208           | select user_id, (first_name ||' '|| last_name) as user_name from tbl_user;
AWS
EXPERTE
veröffentlicht vor einem Jahr4228 Aufrufe
1 Kommentar

It works out of the box!

I have validated it using the cost explorer and the numbers looks the same!

Kudos!

beantwortet vor einem Jahr