Redshift query returning incorrect results

0

The following query returns 'abcd' instead of an empty result set:

WITH

not_in_cte AS (
  SELECT *
  FROM big_table 
  WHERE id NOT IN (SELECT * FROM small_table)
),

left_join_cte AS (
  SELECT 'abcd'::varchar(4) AS value
  FROM not_in_cte
  LEFT JOIN medium_table
    ON not_in_cte.id = medium_table.id
)

SELECT *
FROM left_join_cte
WHERE value = 'wxyz'::varchar(4)
LIMIT 1;

Where the three tables (big, medium and small) are defined by:

CREATE TABLE big_table 
AS 
SELECT
  ROW_NUMBER() OVER() AS id
FROM stl_scan 
LIMIT 1000;

CREATE TABLE medium_table
AS 
SELECT id
FROM big_table
LIMIT 100;

CREATE TABLE small_table
AS 
SELECT id
FROM big_table
LIMIT 10;

The following changes "fix" the query so that it returns an empty result set:

  • Removing the WHERE clause from the first CTE
  • Removing the LEFT JOIN from the second CTE

I'm not sure about how much of the above is required to reproduce the issue. It's impacted our work a few times recently on seemingly very different queries. In each case, we've been able to distill the issue down to something that looks like the query shared above

Chris B
질문됨 10달 전516회 조회
1개 답변
1
수락된 답변

Hi Chris_B,

Thank you for the details. I have tested it and yes indeed it is not filtering correctly, it seems that it is related to the LEFT OUTER JOIN. The WHERE condition works for the INNER JOIN, RIGHT OUTER JOIN and FULL OUTER JOIN but not the LEFT OUTER JOIN.

I recommend creating a support ticket for a detailed analysis.

Regards, Ziad

AWS
전문가
Ziad
답변함 10달 전
  • Thanks Ziad! Sadly our AWS support plan doesn't include technical support, which is why I posted here. Hoping someone from the redshift dev team checks these periodically

  • Thank you Chris for the details. Yes this will be reviewed.

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인