Does Athena support ORDER BY in ARRAY_AGG ?

0

It appears that Athena does not support ORDER BY within ARRAY_AGG function. Here is an example.

If I run the following query, it aggregates records into an array, as expected.

    WITH dataset AS (
       SELECT 1 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
       UNION ALL
       SELECT 1 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
       UNION ALL
       SELECT 1 AS breakdown_id, 3 AS history_order, 'CLOSED' AS action
       UNION ALL
       SELECT 2 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
       UNION ALL
       SELECT 2 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
       UNION ALL
       SELECT 3 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
       UNION ALL
       SELECT 3 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
       UNION ALL
       SELECT 3 AS breakdown_id, 3 AS history_order, 'WAITING_FOR_INPUT' AS action
    
    )
    SELECT breakdown_id
         , ARRAY_AGG(action) AS actions
      FROM dataset
     GROUP BY breakdown_id

Result:

#breakdown_idactions
11[IN_PROGRESS, CLOSED, OPEN]
22[WAITING_FOR_INPUT, IN_PROGRESS, OPEN]
32[IN_PROGRESS, OPEN]
\
However the elements of the array need to be in the order specified by the history_order column. Unfortunately adding ORDER BY into the ARRAY_AGG function results in nothing being returned.
    WITH dataset AS (
       SELECT 1 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
       UNION ALL
       SELECT 1 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
       UNION ALL
       SELECT 1 AS breakdown_id, 3 AS history_order, 'CLOSED' AS action
       UNION ALL
       SELECT 2 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
       UNION ALL
       SELECT 2 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
       UNION ALL
       SELECT 3 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
       UNION ALL
       SELECT 3 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
       UNION ALL
       SELECT 3 AS breakdown_id, 3 AS history_order, 'WAITING_FOR_INPUT' AS action
    
    )
    SELECT breakdown_id
         , ARRAY_AGG(action ORDER BY history_order) AS actions
      FROM dataset
     GROUP BY breakdown_id

Result:

#breakdown_idactions

Is it a defect or a feature? And, more importantly, is there another way to get the array ordered by history_order?

AWS
Denis_A
asked 3 years ago1793 views
1 Answer
0
Accepted Answer

Hi,

If you upgrade your Athena engine to v2, the query that was failing on you should work.

https://docs.aws.amazon.com/athena/latest/ug/engine-versions-changing.html

AWS
hamltm
answered 3 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions