November 11, 2008

Order of Tables can Influence Access Path

People new to coding SQL often ask me if the order in which you list the table names in the FROM clause matters. In the early days of DB2, the recommendation was to list the tables in the order that you wanted them processed. If anything, that forced programmers to think about the order the tables should be processed in. Back then we'd also recommend that the predicates on the WHERE clause be coded with matching index first, followed by predicates with the highest filter factor-- his was designed to reduce the number of rows that qualify before the next predicate was applied. 

As the DB2 Optimizer improved from release to release, a consensus emerged that the order of the tables wasn't important. And I do believe that, in most cases, the order of the table in the FROM clause doesn't matter. However, it's come to my attention that the order can sometimes influence the Optimizer.

The DB2 9.1 for z/OS Performance Monitoring and Tuning Guide has a chapter titled, "Rearranging the order of tables in a FROM clause." It states that the order of tables or views in a FROM clause can affect the access path that DB2 chooses for a SQL query.

Now, I discussed this with IBM and even though this is documented, it's very rare. In fact, if you have a reproducible example of switching table orders in the FROM clause that dramatically impacts performance, IBM would love to see it, because it's that uncomon. However, if you switch the table order and observe a switch in join sequence with negligible difference in performance or cost estimates, be assured that the Optimizer is working as designed.

It's also said that the order of predicates in the WHERE clause isn't important--and in most cases this is true. However, when fine-tuning the SQL, you may find situations where you need to move predicates around. To learn more about the order of predicate evaluation, read “When DB2 evaluates predicates” in the DB2 9.1 for z/OS Performance Monitoring and Tuning Guide.