The issue is related to the recently introduced QUALIFY clause.
For some reason QUALIFY does not work in certain instances unless either a table alias or a additional clause prior to it is added. This behavior is quite unexpected and to add to the confusion, In some of our other production queries it just runs fine.
However in the below depicted first Example: The query does not work until an Alias is added which should not be the case. In the second example, A WHERE clause that does not alter the query fixes the issue. Could likely have something to do with the parser. Appreciate if this can be investigated.
For reference I am on Redshift Cluster version:
PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.54052
Firstly load sample data:
create table sample_purchases (
user_id int,
buy_date date
);
insert into sample_purchases values
(1, '2023-06-01'::date),
(1, '2023-06-05'::date),
(1, '2023-06-22'::date),
(2, '2023-07-10'::date),
(2, '2023-08-10'::date),
(3, '2023-08-01'::date),
(4, '2023-08-03'::date),
(4, '2023-08-15'::date);
The following query does not work, Although Ideally it should:
select
*
from
sample_purchases
QUALIFY row_number() over (
partition by
user_id
order by
buy_date
) = 1
order by
user_id;
The above query produces the following error:
SQL Error [42601]: ERROR: syntax error at or near "row_number"
Position: 53
However if you add an alias to the table name sample_purchases
as depicted in the below query, It would work:
select
*
from
sample_purchases sp
QUALIFY row_number() over (
partition by
user_id
order by
buy_date
) = 1
order by
user_id;
In addition if you were to add a simple WHERE
clause that does nothing, It would work:
select
*
from
sample_purchases
where 1=1
QUALIFY row_number() over (
partition by
user_id
order by
buy_date
) = 1
order by
user_id;
Thanks for the clarification Sean!