It appears that Athena does not support ORDER BY within ARRAY_AGG function. Here is an example.
If I run the following query, it aggregates records into an array, as expected.
WITH dataset AS (
SELECT 1 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
UNION ALL
SELECT 1 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
UNION ALL
SELECT 1 AS breakdown_id, 3 AS history_order, 'CLOSED' AS action
UNION ALL
SELECT 2 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
UNION ALL
SELECT 2 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
UNION ALL
SELECT 3 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
UNION ALL
SELECT 3 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
UNION ALL
SELECT 3 AS breakdown_id, 3 AS history_order, 'WAITING_FOR_INPUT' AS action
)
SELECT breakdown_id
, ARRAY_AGG(action) AS actions
FROM dataset
GROUP BY breakdown_id
Result:
# | breakdown_id | actions |
---|
1 | 1 | [IN_PROGRESS, CLOSED, OPEN] |
2 | 2 | [WAITING_FOR_INPUT, IN_PROGRESS, OPEN] |
3 | 2 | [IN_PROGRESS, OPEN] |
\ | | |
However the elements of the array need to be in the order specified by the history_order column. Unfortunately adding ORDER BY into the ARRAY_AGG function results in nothing being returned. | | |
WITH dataset AS (
SELECT 1 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
UNION ALL
SELECT 1 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
UNION ALL
SELECT 1 AS breakdown_id, 3 AS history_order, 'CLOSED' AS action
UNION ALL
SELECT 2 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
UNION ALL
SELECT 2 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
UNION ALL
SELECT 3 AS breakdown_id, 1 AS history_order, 'OPEN' AS action
UNION ALL
SELECT 3 AS breakdown_id, 2 AS history_order, 'IN_PROGRESS' AS action
UNION ALL
SELECT 3 AS breakdown_id, 3 AS history_order, 'WAITING_FOR_INPUT' AS action
)
SELECT breakdown_id
, ARRAY_AGG(action ORDER BY history_order) AS actions
FROM dataset
GROUP BY breakdown_id
Result:
Is it a defect or a feature? And, more importantly, is there another way to get the array ordered by history_order?