Possible Bug with Redshift's QUALIFY clause

0

The issue is related to the recently introduced QUALIFY clause.

For some reason QUALIFY does not work in certain instances unless either a table alias or a additional clause prior to it is added. This behavior is quite unexpected and to add to the confusion, In some of our other production queries it just runs fine.

However in the below depicted first Example: The query does not work until an Alias is added which should not be the case. In the second example, A WHERE clause that does not alter the query fixes the issue. Could likely have something to do with the parser. Appreciate if this can be investigated.

For reference I am on Redshift Cluster version:

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.54052

Firstly load sample data:

create table sample_purchases (
    user_id int,
    buy_date date
);

insert into sample_purchases values 
(1, '2023-06-01'::date),
(1, '2023-06-05'::date),
(1, '2023-06-22'::date),
(2, '2023-07-10'::date),
(2, '2023-08-10'::date),
(3, '2023-08-01'::date),
(4, '2023-08-03'::date),
(4, '2023-08-15'::date);

The following query does not work, Although Ideally it should:

select
    *
from 
    sample_purchases
QUALIFY row_number() over (
        partition by
            user_id
        order by
            buy_date
) = 1
order by
    user_id;

The above query produces the following error:

SQL Error [42601]: ERROR: syntax error at or near "row_number"
  Position: 53

However if you add an alias to the table name sample_purchases as depicted in the below query, It would work:

select
    *
from 
    sample_purchases sp
QUALIFY row_number() over (
        partition by
            user_id
        order by
            buy_date
) = 1
order by
    user_id;

In addition if you were to add a simple WHERE clause that does nothing, It would work:

select
    *
from 
    sample_purchases
where 1=1
QUALIFY row_number() over (
        partition by
            user_id
        order by
            buy_date
) = 1
order by
    user_id;
asked 8 months ago576 views
1 Answer
0
Accepted Answer

Hi,

Thanks for the sample reproduction! This behaviour is currently expected. This is because QUALIFY is not a registered keyword in Amazon Redshift as mentioned in this documentation. Therefore, the query parser cannot determine if the QUALIFY keyword is an alias or a clause unless there is an alias or WHERE clause preceding it, as in the examples you have provided.

To ensure that QUALIFY continues to run correctly, I suggest aliasing all tables that use this clause.

profile pictureAWS
EXPERT
answered 8 months ago
  • Thanks for the clarification Sean!

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions