Assuming:
table1andtable2both have a date partition with a column name ofdtable2referencestable1by the columnt1_idtable1andtable2are stored in HDFS as RCFile
The following correctly produces NULL (empty) values:
WITH
t1 AS (SELECT t1.id FROM table1 t1 WHERE t1.d='2014-02-24'),
t2 AS (SELECT t2.t1_id, t2.id FROM table2 t2 WHERE t2.d='2014-02-24')
SELECT
t1.id,
t2.t1_id,
t2.id
FROM t1
LEFT OUTER JOIN t2 ON t1.id = t2.t1_idHowever, the following incorrectly produces no NULL (empty) values where there are no table2 with table2.t1_id = table1.id:
SELECT
t1.id,
t2.t1_id,
t2.id
FROM t1
LEFT OUTER JOIN t2 ON t1.id = t2.t1_id
WHERE
t1.d = '2014-02-24' AND
t2.d = '2014-02-24'