Blog
DB2utor

Categories

September 10, 2007

Explaining How DB2 Will Get Data for a Given SQL Statement

When developers are writing SQL statements to retrieve data they aren't always thinking about how DB2 is going about accessing that data. The SQL query is written then executed to verify the correct data is being returned. Unless some kind of change control or standards are put in place, the SQL statement will go into production as is. When the application executing this statement starts slowing down or causes the system to slow down, a DBA is called in to explain what the statement is doing.

During my many years acting as the gatekeeper for SQL going into production, I've come to rely on the EXPLAIN statement to tell me what DB2 is doing. The EXPLAIN statement provides information about access-path selection chosen by DB2 to retrieve the data. This information is written into a PLAN_TABLE. With each release of DB2, this table changes to include new columns that provide more insight into what DB2 is doing based on new features in the latest release. The history of these changes can be found in the new Redbook “DB2 9 for z/OS Performance Topics”.  To get a current definition of what columns are in the PLAN_TABLE, you can always reference the current SQL Reference Guide. 

One of the complaints about looking at the raw data in the plan table is how to understand the meaning of the code values. People want a tool to provide this information in English text. Since I’m usually working without these tools, I like to write my own translation using native SQL using the CASE statement. The sample SQL below shows the use of the CASE statement. 

Remember the values change between releases. Use the “ELSE” to return the actual column value if no translation is provided.

SELECT QUERYNO , QBLOCKNO , APPLNAME      
  , SUBSTR(PROGNAME,1,8) AS PROGNAME      
  , PLANNO                               
  , CASE  METHOD                         
      WHEN 0 THEN 'FIRST TABLE ACCESSED' 
      WHEN 1 THEN 'NESTED LOOP JOIN    ' 
      WHEN 2 THEN 'MERGE SCAN JOIN     ' 
      WHEN 3 THEN 'SORT NEEDED         ' 
      WHEN 4 THEN 'HYBRID JOIN         ' 
      ELSE CHAR(METHOD)                  
    END METHOD                           
FROM PLAN_TABLE
ORDER BY PROGNAME, QUERYNO, QBLOCKNO
WITH UR;

The EXPLAIN statement can also provide some additional information if the following tables are available:

  • STATEMNT_TABLE – Estimated cost of executing SQL statement
  • FUNCTION_TABLE – Information about each function called in a SQLSQ statement.
  • STATEMENT_CACHE_TABLE – Hold output from statement cache trace IFCID 316 and 318
  • Populate using EXPLAIN STMTCACHE ALL

What’s new with EXPLAIN in DB2 9? A new column PARENT_PLANNO has been added to the plan_table. This column is used with PARENT_QBLOCKNO to connect a child query block to a parent miniplan for global-query optimization.

Related to EXPLAIN is the new Optimization Service Center (OSC), a GUI front end to report on SQL performance. To support this product an additional set of tables have been added:

  • DSN_PREDICAT_TABLE
  • DSN_STRUCT_TABLE
  • DSN_PGROUP_TABLE
  • DSN_PTASK_TABLE
  • DSN_FILTER_TABLE
  • DSN_DETCOST_TABLE
  • DSN_SORT_TABLE
  • DSN_SORTKEY_TABLE
  • DSN_PRANGE_TABLE
  • DSN_VIEWREF_TABLE
  • DSN_QUERY_TABLE
  • DSN_VIRTUAL_INDEXES

I’ll provide more information in the future about OSC. One thing that is really interesting is the DSN_VIRTUAL_INDEXES table used to support doing an explain using a virtual index.

If you find this information helpful, please leave a comment.