Wednesday, June 18, 2008

Sort Merge Joins

Overview of Sort Merge Joins
------------------------------------------------

Sort merge joins can be used to join rows from two independent sources. Hash joins generally perform better than sort merge joins. On the other hand, sort merge joins can perform better than hash joins if both of the following conditions exist:

•The row sources are sorted already.
•A sort operation does not have to be done.

Sort merge joins are useful when the join condition between two tables is an inequality condition (but not a nonequality) like <, <=, >, or >=. Sort merge joins perform better than nested loop joins for large data sets. You cannot use hash joins unless there is an equality condition.
In a merge join, there is no concept of a driving table. The join consists of two steps:

1.Sort join operation: Both the inputs are sorted on the join key.
2.Merge join operation: The sorted lists are merged together.

If the input is already sorted by the join column, then a sort join operation is not performed for that row source.

When the Optimizer Uses Sort Merge Joins
-----------------------------------------------------

The optimizer can choose a sort merge join over a hash join for joining large amounts of data if any of the following conditions are true:

•The join condition between two tables is not an equi-join.
•Because of sorts already required by other operations, the optimizer finds it is cheaper to use a sort merge than a hash join.

Sort Merge Join Hints
--------------------------------------------

To instruct the optimizer to use a sort merge join, apply the USE_MERGE hint. You might also need to give hints to force an access path.

There are situations where it is better to override the optimize with the USE_MERGE hint.

Example of Sort Merge Joins
----------------------------------


Based on the data in Example of using Index by Hints
SQL> select a.n, b.b from table_a a , table_b b where a.k>b.k;

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 3680082791

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 | | 5265 (2)| 00:01:04 |
| 1 | MERGE JOIN | | 1 | 21 | | 5265 (2)| 00:01:04 |
| 2 | SORT JOIN | | 10000 | 90000 | 408K| 50 (2)| 00:00:01 |
| 3 | TABLE ACCESS FULL| TABLE_A | 10000 | 90000 | | 8 (0)| 00:00:01 |
|* 4 | SORT JOIN | | 1007K| 11M| 38M| 5214 (2)| 00:01:03 |
| 5 | TABLE ACCESS FULL| TABLE_B | 1007K| 11M| | 590 (2)| 00:00:08 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access(INTERNAL_FUNCTION("A"."K")>INTERNAL_FUNCTION("B"."K"))
filter(INTERNAL_FUNCTION("A"."K")>INTERNAL_FUNCTION("B"."K"))

No comments:

Post a Comment