EMS logo

Products Navigation

choose your database tool

Our Partnership Status

Microsoft Certified Partner
Oracle Certified Partner
Embarcadero Technology Partner

SQL Articles

All SQL Articles

Hazan Ilan
MySQL Left Join

A reminder about “A LEFT JOIN B ON conditional_expr”

The ON condition (in the expression “A LEFT JOIN B ON conditional_expr”) is used to decide how to retrieve rows from table B (Matching-Stage).
If there is no row in B that matches the ON condition, an extra B row is generated with all columns set to NULL.
In the Matching-Stage any condition in the WHERE clause is not used. Only after the Matching-Stage, the condition in the WHERE clause will be used. It will filter out rows retrieved from the Matching-Stage.

Lets see a LEFT JOIN example:

mysql> CREATE TABLE `product` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `amount` int(10) unsigned default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=5 DEFAULT CHARSET=latin1
 
mysql> CREATE TABLE `product_details` (
  `id` int(10) unsigned NOT NULL,
  `weight` int(10) unsigned default NULL,
  `exist` int(10) unsigned default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
 
mysql> INSERT INTO product (id,amount)
       VALUES (1,100),(2,200),(3,300),(4,400);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0
 
mysql> INSERT INTO product_details (id,weight,exist)
       VALUES (2,22,0),(4,44,1),(5,55,0),(6,66,1);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0
 
mysql> SELECT * FROM product;
+----+--------+
| id | amount |
+----+--------+
|  1 |    100 |
|  2 |    200 |
|  3 |    300 |
|  4 |    400 |
+----+--------+
4 rows in set (0.00 sec)
 
mysql> SELECT * FROM product_details;
+----+--------+-------+
| id | weight | exist |
+----+--------+-------+
|  2 |     22 |     0 |
|  4 |     44 |     1 |
|  5 |     55 |     0 |
|  6 |     66 |     1 |
+----+--------+-------+
4 rows in set (0.00 sec)
 
mysql> SELECT * FROM product LEFT JOIN product_details
       ON (product.id = product_details.id);
+----+--------+------+--------+-------+
| id | amount | id   | weight | exist |
+----+--------+------+--------+-------+
|  1 |    100 | NULL |   NULL |  NULL |
|  2 |    200 |    2 |     22 |     0 |
|  3 |    300 | NULL |   NULL |  NULL |
|  4 |    400 |    4 |     44 |     1 |
+----+--------+------+--------+-------+
4 rows in set (0.00 sec)

Is there a difference between the ON clause and the WHERE clause?

A question: Is there a difference in the result set of the following two queries?

1. SELECT * FROM product LEFT JOIN product_details
         ON (product.id = product_details.id)
         AND   product_details.id=2;
2. SELECT * FROM product LEFT JOIN product_details
         ON (product.id = product_details.id)
         WHERE product_details.id=2;

It is best to understand by example:

mysql> SELECT * FROM product LEFT JOIN product_details
       ON (product.id = product_details.id)
       AND product_details.id=2;
+----+--------+------+--------+-------+
| id | amount | id   | weight | exist |
+----+--------+------+--------+-------+
|  1 |    100 | NULL |   NULL |  NULL |
|  2 |    200 |    2 |     22 |     0 |
|  3 |    300 | NULL |   NULL |  NULL |
|  4 |    400 | NULL |   NULL |  NULL |
+----+--------+------+--------+-------+
4 rows in set (0.00 sec)
 
mysql> SELECT * FROM product LEFT JOIN product_details
       ON (product.id = product_details.id)
       WHERE product_details.id=2;
+----+--------+----+--------+-------+
| id | amount | id | weight | exist |
+----+--------+----+--------+-------+
|  2 |    200 |  2 |     22 |     0 |
+----+--------+----+--------+-------+
1 row in set (0.01 sec)

The first query retrieve all rows from product table while using the ON condition to decide which rows to retrieve from the Left joined product_details table.
The second query is doing simple Left-Join. It filters out rows not matching the WHERE clause conditions.

See more examples:

mysql>
mysql> SELECT * FROM product LEFT JOIN product_details
       ON product.id = product_details.id
       AND product.amount=100;
+----+--------+------+--------+-------+
| id | amount | id   | weight | exist |
+----+--------+------+--------+-------+
|  1 |    100 | NULL |   NULL |  NULL |
|  2 |    200 | NULL |   NULL |  NULL |
|  3 |    300 | NULL |   NULL |  NULL |
|  4 |    400 | NULL |   NULL |  NULL |
+----+--------+------+--------+-------+
4 rows in set (0.00 sec)

All the rows from the product table are retrieved. However, no matching found at product_details table (there is no row that matches the condition product.id = product_details.id AND product.amount=100).

mysql> SELECT * FROM product LEFT JOIN product_details
       ON (product.id = product_details.id)
       AND product.amount=200;
+----+--------+------+--------+-------+
| id | amount | id   | weight | exist |
+----+--------+------+--------+-------+
|  1 |    100 | NULL |   NULL |  NULL |
|  2 |    200 |    2 |     22 |     0 |
|  3 |    300 | NULL |   NULL |  NULL |
|  4 |    400 | NULL |   NULL |  NULL |
+----+--------+------+--------+-------+
4 rows in set (0.01 sec)

All the rows from the product table are retrieved. However, only one matching found at product_details table.

LEFT JOIN with WHERE … IS NULL clause

What is happening if you use the WHERE …. IS NULL clause?
As stated before, the Where-condition stage is happening after the Matching-Stage. This means that the WHERE IS NULL clause will filter out, from the Matching-Stage result, rows that didn’t satisfy the Matching-Condition.
All that is fine on the paper but if you are using more than one condition in the ON clause it is starting to be confusing.

I have developed a simple way to better understand a complex Matching-Condition with WHERE … IS NULL clause by:

  • looking at the IS NULL clause as a negation of the Matching-Condition.
  • using the Logical rule: !(A and B) == !A OR !B

Look at the following example:

mysql> SELECT a.* FROM product a LEFT JOIN product_details b
       ON a.id=b.id AND b.weight!=44 AND b.exist=0
       WHERE b.id IS NULL;
+----+--------+
| id | amount |
+----+--------+
|  1 |    100 |
|  3 |    300 |
|  4 |    400 |
+----+--------+
3 rows in set (0.00 sec)

Lets examine the Matching clause (ON clause):

(a.id=b.id) AND (b.weight!=44) AND (b.exist=0)

Remember that we can think of the IS NULL clause as a negation of the Matching-Condition.
This means we will retrieve the following rows:

!( exist(b.id that equals to a.id) AND b.weight !=44 AND b.exist=0 )
!exist(b.id that equals to a.id) || !(b.weight !=44) || !(b.exist=0)
!exist(b.id that equals to a.id) || b.weight =44 || b.exist=1

Like in a C programing language, the operands of logical-AND and logical-OR expressions are evaluated from left to right. If the value of the first operand is sufficient to determine the result of the operation, the second operand is not evaluated (short-circuit evaluation).
In our case, this means, retrieve all rows in A that don’t have matching id in B PLUS, for the rows that do have matching id in B, retrieve only the ones that have b.weight =44 OR b.exist=1

See another example:

mysql> SELECT a.* FROM product a LEFT JOIN product_details b
       ON a.id=b.id AND b.weight!=44 AND b.exist=1
       WHERE b.id IS NULL;
+----+--------+
| id | amount |
+----+--------+
|  1 |    100 |
|  2 |    200 |
|  3 |    300 |
|  4 |    400 |
+----+--------+
4 rows in set (0.00 sec)

Explanation:

! ( exist(bid that equals to aid) AND b.weight !=44 AND b.exist=1 )
!exist(bid that equals to aid) || !(b.weight !=44) || !(b.exist=1)
!exist(bid that equals to aid) || b.weight =44 || b.exist=0

The battle between the Matching-Conditions and the Where-conditions

You can get the same results (A.*) if you put only the basic matching condition in the ON clause and the negation of the rest in the Where condition clause.
For example,
Instead of writing:

SELECT a.* FROM product a LEFT JOIN product_details b
ON a.id=b.id AND b.weight!=44 AND b.exist=0
WHERE b.id IS NULL;

You can write:

SELECT a.* FROM product a LEFT JOIN product_details b
ON a.id=b.id
WHERE b.id is null OR b.weight=44 OR b.exist=1;
mysql> SELECT a.* FROM product a LEFT JOIN product_details b
       ON a.id=b.id
       WHERE b.id is null OR b.weight=44 OR b.exist=1;
+----+--------+
| id | amount |
+----+--------+
|  1 |    100 |
|  3 |    300 |
|  4 |    400 |
+----+--------+
3 rows in set (0.00 sec)

Instead of writing:

SELECT a.* FROM product a LEFT JOIN product_details b
ON a.id=b.id AND b.weight!=44 AND b.exist!=0
WHERE b.id IS NULL;

You can write:

SELECT a.* FROM product a LEFT JOIN product_details b
ON a.id=b.id
WHERE b.id is null OR b.weight=44 OR b.exist=0;
mysql> SELECT a.* FROM product a LEFT JOIN product_details b
       ON a.id=b.id
       WHERE b.id is null OR b.weight=44 OR b.exist=0;
+----+--------+
| id | amount |
+----+--------+
|  1 |    100 |
|  2 |    200 |
|  3 |    300 |
|  4 |    400 |
+----+--------+
4 rows in set (0.00 sec)

Does these queries really the same?
These queries retrieve the same result set as long as you need only the values from the first table (e.g. A.*). In a case that you are retrieving values from the LEFT JOINed table, the results values are not the same.
As stated before, the condition in the WHERE clause filters out rows retrieved from the Matching-Stage.

For example:

mysql> SELECT * FROM product a LEFT JOIN product_details b
       ON a.id=b.id AND b.weight!=44 AND b.exist=1
       WHERE b.id is null;
+----+--------+------+--------+-------+
| id | amount | id   | weight | exist |
+----+--------+------+--------+-------+
|  1 |    100 | NULL |   NULL |  NULL |
|  2 |    200 | NULL |   NULL |  NULL |
|  3 |    300 | NULL |   NULL |  NULL |
|  4 |    400 | NULL |   NULL |  NULL |
+----+--------+------+--------+-------+
4 rows in set (0.00 sec)
 
mysql> SELECT * FROM product a LEFT JOIN product_details b
       ON a.id=b.id
       WHERE b.id IS NULL OR b.weight=44 OR b.exist=0;
+----+--------+------+--------+-------+
| id | amount | id   | weight | exist |
+----+--------+------+--------+-------+
|  1 |    100 | NULL |   NULL |  NULL |
|  2 |    200 |    2 |     22 |     0 |
|  3 |    300 | NULL |   NULL |  NULL |
|  4 |    400 |    4 |     44 |     1 |
+----+--------+------+--------+-------+
4 rows in set (0.00 sec)

General Note: If you use LEFT JOIN to find rows that do not exist in some table and you have the following test: col_name IS NULL in the WHERE part,
where col_name is a column that is declared as NOT NULL, MySQL stops searching for more rows (for a particular key combination) after
it has found one row that matches the LEFT JOIN condition.

Source: mysqldiary.com

twitterfacebook