スキップしてコンテンツを表示

Athena federated query executing about 40 concurrent lambdas for one query which is exhausting AWS RDS connections

0

I initially saw this error when trying to load up one of our Grafana dashboards, but it looks like Athena is invoking a lot of concurrent lambdas for queries which is exhausting our RDS connections. I tried to run a query directly in Athena, and I am seeing the error. Here is the error I see

GENERIC_USER_ERROR: com.google.common.util.concurrent.UncheckedExecutionException: io.trino.spi.TrinoException: Encountered an exception[com.zaxxer.hikari.pool.HikariPool$PoolInitializationException] from your LambdaFunction[arn:aws:lambda:us-west-1:----:function:athenafederatedcatalog_flight_data_prod] executed in context[retrieving meta-data] with message[Failed to initialize pool: FATAL: remaining connection slots are reserved for non-replication superuser, rds_reserved, and rds_superuser connections]
This query ran against the "public" database, unless qualified by the query. Please post the error message on our forum  or contact customer support  with Query Id: b92cd5ca-e660-4e7c-843e-56f70fc23182

Here is my query

 -- This is a standalone version of the query for testing directly in Athena.
-- All Grafana macros have been replaced with static values.

-- CTEs to get historical flight and warning data
WITH total_flights AS (
    SELECT 
        date_trunc('day', flight_timestamp) AS flight_timestamp, 
        count(distinct flight_uuid) AS total_flights 
    FROM flight 
    -- Replaced $__dateFilter with a static date range
    WHERE date_trunc('day', flight_timestamp) BETWEEN DATE '2025-08-18' AND DATE '2025-08-25'
    GROUP BY 1
),
total_warnings AS (
    SELECT 
        date_trunc('day', flight_warning.start_timestamp) AS warning_timestamp, 
        count(*) AS all_warnings
    FROM (
        SELECT DISTINCT 
            a.flight_uuid, 
            title || ': ' || description AS warning_desc, 
            date_trunc('day', start_timestamp) AS start_timestamp, 
            severity 
        FROM alert a
        JOIN flight f ON from_utf8(to_utf8(a.flight_uuid)) = from_utf8(to_utf8(f.flight_uuid))
        -- Replaced $__dateFilter with a static date range
        WHERE date_trunc('day', start_timestamp) BETWEEN DATE '2025-08-18' AND DATE '2025-08-25'
    ) flight_warning
    GROUP BY 1
),
-- 1. Combine historical data and create our numerical 'x' and 'y' values
historical_data AS (
    SELECT 
        f.flight_timestamp,
        -- Replaced $__timeFrom with a static timestamp
        CAST(date_diff('day', TIMESTAMP '2025-08-18 00:00:00', f.flight_timestamp) AS DOUBLE) AS x,
        COALESCE(round(CAST(w.all_warnings AS DOUBLE) / CAST(f.total_flights AS DOUBLE), 2), 0) AS y -- This is ALL_WARNINGS
    FROM total_flights f
    LEFT JOIN total_warnings w ON f.flight_timestamp = w.warning_timestamp
),
-- 2. Calculate the linear regression parameters (m and b)
regression_params AS (
    SELECT
    regr_intercept(h.y ,h.x) b,
    regr_slope(h.y, h.x) m
    from
    historical_data h
    
),
-- 3. Generate a full series of dates, including future dates
date_series AS (
    SELECT
        CAST(dt AS timestamp) AS full_timestamp
    -- Replaced $__timeFrom and $__timeTo with static timestamps for a week past and a week future
    FROM (SELECT TIMESTAMP '2025-08-18 00:00:00' AS start_date, TIMESTAMP '2025-09-01 00:00:00' AS end_date)
    CROSS JOIN UNNEST(sequence(start_date, end_date, interval '1' day)) AS t(dt)
)
-- 4. Final SELECT: Join historical data and project the trendline
SELECT 
    ds.full_timestamp AS "Timestamp",
    hd.y AS "ALL_WARNINGS", -- Actual historical data
    -- Calculate the trendline value for ALL dates (past and future)
    -- Replaced $__timeFrom with a static timestamp
    rp.m * CAST(date_diff('day', TIMESTAMP '2025-08-18 00:00:00', ds.full_timestamp) AS DOUBLE) + rp.b AS "Trendline"
FROM date_series ds
CROSS JOIN regression_params rp -- Join the calculated m and b to every row
LEFT JOIN historical_data hd ON ds.full_timestamp = hd.flight_timestamp
ORDER BY ds.full_timestamp;

when I was executing this query I am seeing 45 concurrent executions. I am not sure why a single query is generating so many lambdas. Is there something I am doing wrong?

質問済み 2ヶ月前68ビュー
2回答
0
承認された回答

Athena's query processing naturally triggers multiple Lambda functions (around 40) because it splits your query into parallel tasks for faster processing. This high number of Lambda invocations typically occurs when queries need to scan large amounts of data.

But please note Athena is a serverless service, which means that all the underlying computing resources are managed by AWS so there is no infrastructure to set up or manage. It is not possible to force Athena to use just one or two Lambdas (as parallel processing is fundamental to its performance).

AWS
回答済み 2ヶ月前
エキスパート
レビュー済み 2ヶ月前
0

The error you're experiencing is due to your Athena federated query exhausting the available RDS connections. This happens because Athena is invoking multiple Lambda functions concurrently to process your query, and each Lambda function is establishing connections to your RDS database.

Here's what's happening:

  1. Athena federated queries work by invoking Lambda functions to connect to external data sources like your RDS database.

  2. For complex queries with multiple joins and operations, Athena may invoke many Lambda functions concurrently (in your case, around 45).

  3. Each Lambda function creates its own connection to the RDS database, quickly exhausting your available connection slots.

  4. The error message indicates that all remaining connection slots are reserved for special users (non-replication superuser, rds_reserved, and rds_superuser).

To resolve this issue, you have several options:

  1. Implement RDS Proxy between your Lambda functions and RDS database. RDS Proxy manages a pool of shared database connections, which enables your Lambda functions to reach high concurrency levels without exhausting database connections. This is recommended by AWS for production environments with many concurrent connections.

  2. Optimize your query to reduce complexity. Your query contains multiple joins, CTEs, and complex operations that might be causing Athena to split the work across many Lambda functions. Try simplifying the query or breaking it into smaller parts.

  3. Increase the maximum connections allowed in your RDS instance, if possible. However, this is more of a workaround than a solution.

  4. Consider implementing connection pooling in your Lambda function code if you're not using RDS Proxy.

  5. Break down large queries into smaller, more manageable parts that can be executed sequentially rather than all at once.

The most effective solution would likely be implementing RDS Proxy, as it's specifically designed to handle this type of scenario where many Lambda functions need to connect to an RDS database concurrently.
Sources
Athena Federated Query - MySQL - Lambda timeout troubleshooting | AWS re:Post
Which is preferred, Lambda->RDS Proxy->RDS or Lambda->RDS? | AWS re:Post

回答済み 2ヶ月前
  • My question is, why is around 40 lambda invoked for one query. I understand that we can add an RDS proxy but I want to know if Athena can optimize this where maybe one or two lambdas are executed.

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

関連するコンテンツ