Weird result on Redshift Left Outer Join

0

We have an SQL producing results with lots of NULL fields. The code contains almost 1800 lines with many subqueries but the outermost query is a left outer join of 2 subqueries (named t_1 and t_531). If we run the SQL for the left side (t_1), we have the following result: Figure 1. The result of running the t_1 subquery.

Running the right side (t_531) results in the following: Figure 2. The result of running the t_531 subquery.

The join operation is performed using the following comparisons:

ON
   t_1."DateCustom[Reporting Year.Key0]" = t_531."DateCustom[Reporting Year.Key0]"
AND
   t_1."DateCustom[Reporting Year.SortKey0]" = t_531."DateCustom[Reporting Year.SortKey0]"
AND
   t_1."DateCustom[Reporting Year]" = t_531."DateCustom[Reporting Year]"
and
   t_1."[IsGrandTotalRowTotal]" = t_531."[IsGrandTotalRowTotal]"

All the 4 lines returned at t_1 contain exactly 2 matches on the results from t_531. So the result should be a total of 8 lines, joining the columns. However, the query output when calculating the left outer join of t_1 and t_531 only contains 4 rows and the columns from t_531 are NULL: Figure 4. The result of a left outer join of  t_1 and t_531.

Some considerations from the team:

  • Does this query represent a valid user case for the application? The original SQL query that contains the bug has more than 10000 lines of SQL. It was really hard to interpret so we decided to simplify the case to the smallest possible SQL, using even some mocked data if possible. The simpler query that presented the bug that we were able to create was the one provided.

  • Is it possible to make this query work? When trying to simplify the query we did the following:

    • Persisted some subqueries into tables or views and replaced the subquery with the persisted data;
    • Removed some unused projections from the SELECT statements;
    • Replaced some subqueries by their results combining some SELECT statements with the** UNION ALL** operation

With any of these changes, the query returned the expected value, which would be the 8 lines of data from the join:Figure 5. The expected result of a left outer join of  t_1 and t_531. It was obtained by doing any of the changes suggested on the SQL.

  • How do you know that it is a Redshift-specific issue? We have copies of the same database on different data sources (BigQuery, Snowflake, MySQL, etc.) and the SQL works on them (with adequate syntax adjustments). Also, running both sides of the left outer join provides the correct result, as well as persisting the subqueries.

The SQL script to create the DB schema and tables and the SQL query are quite large, so I made them available at https://drive.google.com/drive/folders/11FZ7xvflXR-lx4AWfV3QLf9BVc9oOuT9?usp=sharing

  • Thanks for sharing the scripts. I tried using the Redshift-bug-db-schema.sql and notice the first line create schema adventureworks2012Bug; was a minor typo. I corrected that to adventureworks2012 and I have the schema and tables created. However, when I execute the query-simplified.sql I get no rows. Will it be possible for you to include minimal insert statements for tables so the query can execute and provide results?

asked 3 months ago199 views
2 Answers
0

We tried different changes and all of them provided the correct result. However, the weird thing is that, individually, the subqueries work, but the left outer join of them is wrong. This behavior does not seem correct from the DBMS side, right?

answered 3 months ago
  • Since most queries are providing correct results this seems more of a particular query problem to me, rather than a widespread DBMS problem. Also since the subqueries are work fine, but the left outer join of them is providing incorrect results, it could imply come unexpected data type conversion implicitly happening. I think you should try explicitly casting the correct data types for the join columns to be able to get down to the crux of the issue.

    Also, FYI, the current type compatibility and conversion rules are documented at https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html#r_Type_conversion

    And lastly, may I request that you use the comment button to add comments instead of posting comments as answers.

  • Sorry, I didn't pay attention when I responded to your comment. About the data types, they are all the same on both side of the join: date time, date time, string, and boolean. If I create a table saving one of the side's results and use it on the query it gives the right result. It's like if the substitution principle is not valid

0

I tried using a simplified version of the table joins and I am seeing 8 rows in the results. Can you double check if some intermediate table is breaking the joins somewhere.

Here is my setup

create schema adventureworks2012;

create table adventureworks2012.t1
( c1 TIMESTAMP,
  c2 TIMESTAMP,
  c3 VARCHAR(50),
  c4 BOOLEAN
);

create table adventureworks2012.t531
( c1 TIMESTAMP,
  c2 TIMESTAMP,
  c3 VARCHAR(50),
  c4 BOOLEAN,
  sales_amount INTEGER
);

insert into adventureworks2012.t1 values
('1970-01-01','1970-01-01','_AS_CONSTANT_',true),
('2005-01-02','2005-01-02','Reporting Calendar 2005',false),
('2007-01-07','2007-01-07','Reporting Calendar 2007',false),
('2008-01-06','2008-01-06','Reporting Calendar 2008',false)
;

insert into adventureworks2012.t531 values
('1970-01-01','1970-01-01','_AS_CONSTANT_',true,339772),
('1970-01-01','1970-01-01','_AS_CONSTANT_',true,700759),
('2005-01-02','2005-01-02','Reporting Calendar 2005',false,3266373),
('2005-01-02','2005-01-02','Reporting Calendar 2005',false,6485784),
('2007-01-07','2007-01-07','Reporting Calendar 2007',false,142513),
('2007-01-07','2007-01-07','Reporting Calendar 2007',false,302238),
('2008-01-06','2008-01-06','Reporting Calendar 2008',false,197258),
('2008-01-06','2008-01-06','Reporting Calendar 2008',false,398522)
;

select t1.c1, t1.c2, t1.c3, t1.c4, t531.sales_amount
from adventureworks2012.t1 as t1
left outer join adventureworks2012.t531 as t531
on t1.c1 = t531.c1
and t1.c2 = t531.c2
and t1.c3 = t531.c3
and t1.c4 = t531.c4
;

I am seeing the 8 rows as expected query results

profile pictureAWS
answered 3 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions