What Is Merge Join
Merge Join is generally called Sort Merge Join and is a commonly used join method in multi-table join queries. Specifically, it sorts the associated columns of the associated tables, respectively, then extracts data from each sorted table, and matches the data in a sorted table with the data in another sorted table.
Merge Join requires more sorting operations, and therefore, it consumes many resources. Generally speaking, in scenarios where Merge Join can be used, Hash Join can achieve better performance, that is, the effect of Hash Join is better than that of Merge Join. However, if the join keys are already sorted, there is no need to sort them when Merge Join is performed. In this case, the performance of Merge Join is better than that of Hash Join, and the performance advantage of Merge Join is effectively reflected at this time.
In earlier versions, the LibraDB engine only supports Hash Join. In version 2.2410.1.0 and later, the column storage engine also supports Merge Join based on the primary key.
Use Cases
In scenarios where the join key is the primary key of both tables, Merge Join has obvious effects for equivalent association (A.a = B.a).
Usage
Merge Join can only be used by specifying a hint, as shown in the following command.
select * from orders o,lineitem l where o.o_orderkey=l.l_orderkey;