What was the case.
A colleague approached me telling that he had two similar queries. One of them returning data, the other not.
The "simplified" version of the two queries looked like:
SELECT col1
FROM tab1
WHERE col1 NOT IN (SELECT col1 FROM tab2);
FROM tab1
WHERE col1 NOT IN (SELECT col1 FROM tab2);
This query returned no data, however he -and later on I also- was sure that there was a mismatch in the data, which should have returned rows.
This was also proven/shown by the second query:
SELECT col1
FROM tab1
WHERE NOT EXISTS
(SELECT col1
FROM tab2
WHERE tab1.col1 = tab2.col1);
FROM tab1
WHERE NOT EXISTS
(SELECT col1
FROM tab2
WHERE tab1.col1 = tab2.col1);
This query returned the expected difference. And this query does in fact the same as the first query!!
Even when we hardcoded an extra WHERE clause, the result was the same. No rows for:
SELECT *
FROM tab1
WHERE tab1.col1 NOT IN (SELECT col1 FROM tab2)
AND tab1.col1 = 'car';
FROM tab1
WHERE tab1.col1 NOT IN (SELECT col1 FROM tab2)
AND tab1.col1 = 'car';
and the correct rows for:
SELECT *
FROM tab1
WHERE NOT EXISTS
(SELECT 1
FROM tab2
WHERE tab1.col1 = tab2.col1)
AND tab1.col1 = 'car';
FROM tab1
WHERE NOT EXISTS
(SELECT 1
FROM tab2
WHERE tab1.col1 = tab2.col1)
AND tab1.col1 = 'car';
After an hour searching, trying to reproduce the issue, I almost was about to give up and send it to Oracle Support qualifying it as a bug.
However, there was one difference that I saw, that could be the cause of the problem.
Allthough the statements are almost the same, the execution plan showed a slight difference. The execution plan for the NOT IN query looked like:
Whereas the execution plan of the query with the NOT EXISTS looked like:
See the difference?
Not knowing what a "HASH JOIN ANTI NA" exactly was, I entered it as a search command into the knowledge base of My Oracle Support. Besides a couple of patch-set lists, I also found Document 1082123.1, which explains all about the HASH JOIN ANTI NULL_AWARE.
In this document the behaviour we saw is explained, with the most important remark being:
'If t2.n2 contains NULLs,do not return any t1 rows and terminate'
And then it suddenly hit me as I was unable to reproduce the case using my own created test tables.
In our case, it meant that if tab2.col1 would have contained any rows with a NULL value, the join between those two tables could not be made based on a "NOT IN" clause.
The query would terminate without giving any results !!!
And that is exactly what we saw.
The query with the NOT EXISTS doesn't use a NULL_AWARE ANTI JOIN and therefore does return the results
Also the mentioned workaround:
alter session set "_optimizer_null_aware_antijoin" = false;
seems not to work. Allthought the execution plan changes to:
it still returns no rows !!
And Now??
Since there is a document explaining the behaviour, I'm doubting if we can classify this as a bug. But in my opinion, if developers do not know about this strange behaviour, they will easily call it a bug.
The "problem" is easily solved ( or worked around ) using the NOT EXISTS solution, or using NVL with the JOINed columns. However I would expect the optimizer to sort these things out himself.
For anyone who wants to reproduce/investigate this case, I have listed my test-code. The database version we used was 11.1.0.7 on Windows 2008 R2. I'm sure the OS doesn't matter here.
-- Create two tables, make sure they allow NULL values
CREATE TABLE tab1 (col1 VARCHAR2 (100) NULL);
CREATE TABLE tab2 (col1 VARCHAR2 (100) NULL);
INSERT INTO tab1
VALUES ('bike');
INSERT INTO tab1
VALUES ('car');
INSERT INTO tab1
VALUES (NULL);
INSERT INTO tab2
VALUES ('bike');
INSERT INTO tab2
VALUES (NULL);
COMMIT;
-- This query returns No results
SELECT col1
FROM tab1
WHERE col1 NOT IN (SELECT col1 FROM tab2);
-- This query return results
SELECT col1
FROM tab1
WHERE NOT EXISTS
(SELECT col1
FROM tab2
WHERE tab1.col1 = tab2.col1);
Success!!
Plan
|
||
SELECT STATEMENT ALL_ROWS Cost: 5 Bytes:
808 Cardinality: 2
|
||
3 HASH JOIN ANTI NA Cost: 5 Bytes: 808 Cardinality: 2
|
||
1 TABLE ACCESS FULL TABLE PIM_KRG.TAB1 Cost: 2 Bytes: 606 Cardinality: 3
|
||
2 TABLE ACCESS FULL TABLE PIM_KRG.TAB2 Cost: 2 Bytes: 404 Cardinality: 2
|
Whereas the execution plan of the query with the NOT EXISTS looked like:
Plan
|
||
SELECT STATEMENT ALL_ROWS Cost: 5 Bytes:
808 Cardinality: 2
|
||
3 HASH JOIN ANTI Cost: 5 Bytes: 808 Cardinality: 2
|
||
1 TABLE ACCESS FULL TABLE PIM_KRG.TAB1 Cost: 2 Bytes: 606 Cardinality: 3
|
||
2 TABLE ACCESS FULL TABLE PIM_KRG.TAB2 Cost: 2 Bytes: 404 Cardinality: 2
|
See the difference?
Not knowing what a "HASH JOIN ANTI NA" exactly was, I entered it as a search command into the knowledge base of My Oracle Support. Besides a couple of patch-set lists, I also found Document 1082123.1, which explains all about the HASH JOIN ANTI NULL_AWARE.
In this document the behaviour we saw is explained, with the most important remark being:
'If t2.n2 contains NULLs,do not return any t1 rows and terminate'
And then it suddenly hit me as I was unable to reproduce the case using my own created test tables.
In our case, it meant that if tab2.col1 would have contained any rows with a NULL value, the join between those two tables could not be made based on a "NOT IN" clause.
The query would terminate without giving any results !!!
And that is exactly what we saw.
The query with the NOT EXISTS doesn't use a NULL_AWARE ANTI JOIN and therefore does return the results
Also the mentioned workaround:
alter session set "_optimizer_null_aware_antijoin" = false;
seems not to work. Allthought the execution plan changes to:
Plan
|
||
SELECT STATEMENT ALL_ROWS Cost: 4 Bytes:
202 Cardinality: 1
|
||
3 FILTER
|
||
1 TABLE ACCESS FULL TABLE PIM_KRG.TAB1 Cost: 2 Bytes: 606 Cardinality: 3
|
||
2 TABLE ACCESS FULL TABLE PIM_KRG.TAB2 Cost: 2 Bytes: 404 Cardinality: 2
|
it still returns no rows !!
And Now??
Since there is a document explaining the behaviour, I'm doubting if we can classify this as a bug. But in my opinion, if developers do not know about this strange behaviour, they will easily call it a bug.
The "problem" is easily solved ( or worked around ) using the NOT EXISTS solution, or using NVL with the JOINed columns. However I would expect the optimizer to sort these things out himself.
For anyone who wants to reproduce/investigate this case, I have listed my test-code. The database version we used was 11.1.0.7 on Windows 2008 R2. I'm sure the OS doesn't matter here.
-- Create two tables, make sure they allow NULL values
CREATE TABLE tab1 (col1 VARCHAR2 (100) NULL);
CREATE TABLE tab2 (col1 VARCHAR2 (100) NULL);
INSERT INTO tab1
VALUES ('bike');
INSERT INTO tab1
VALUES ('car');
INSERT INTO tab1
VALUES (NULL);
INSERT INTO tab2
VALUES ('bike');
INSERT INTO tab2
VALUES (NULL);
COMMIT;
-- This query returns No results
SELECT col1
FROM tab1
WHERE col1 NOT IN (SELECT col1 FROM tab2);
-- This query return results
SELECT col1
FROM tab1
WHERE NOT EXISTS
(SELECT col1
FROM tab2
WHERE tab1.col1 = tab2.col1);
Success!!
Through the OTN forum, I received a very nice explanation, written by Jonathan Lewis.
ReplyDeleteIt can be found here: http://jonathanlewis.wordpress.com/2007/02/25/not-in/