A typical database may execute an SQL query in multiple ways, depending on the selected operators’ order and algorithms. One crucial decision is the order in which the optimizer should join relations. The difference between optimal and non-optimal join order might be orders of magnitude. Therefore, the optimizer must choose the proper order of joins to ensure good overall performance. In this blog post, we define the join ordering problem and estimate the complexity of join planning.

Example

Consider the TPC-H schema. The customer may have orders. Every order may have several positions defined in the lineitem table. The customer table has 150,000 records, the orders table has 1,500,000 records, and the lineitem table has 6,000,000 records. Intuitively, every customer places approximately 10 orders, and every order contains four positions on average.

Leave a Reply

Your email address will not be published. Required fields are marked *