Blog
DB2utor

Categories

October 16, 2012

Returning Detail and Summary Data

Recently someone on the DB2-L list serve asked if there's a way to return all the detail rows along with a summary row in a single SQL statement.

It's a very good question. When people ask for a better way, they're generally seeking a more efficient option. My SQL sample below utilizies common table expression (CTE). CTE is designed to allow tasks to perform quickly while minimizing CPU usage.

Prior to CTE's introduction in DB2 9, queries would have to be written using GROUP BY and UNION ALL. Here's an example using the DB2 catalog tables. I listed the tables in the DSNDB06 database along with the number of columns on the table. This code also conducts a total count of columns for all tables in DSNDB06.

EXPLAIN ALL SET QUERYNO = 1001 FOR

SELECT  T.DBNAME AS DBNAME

           ,   SUBSTR(T.CREATOR,1,8) AS CREATOR

           ,   SUBSTR(T.NAME,1,18) AS NAME

           ,  T.COLCOUNT AS COLCOUNT

 FROM SYSIBM.SYSTABLES T

 WHERE T.DBNAME = 'DSNDB06'

      AND T.CREATOR = 'SYSIBM'

      AND T.TYPE = 'T'

union all

SELECT   '********' AS DBNAME

             ,  '********' AS CREATOR

             ,  '********** TOTALS:' AS NAME

             ,  SUM(T.COLCOUNT) AS COLCOUNT

FROM SYSIBM.SYSTABLES T

WHERE T.DBNAME = 'DSNDB06'

     AND T.CREATOR = 'SYSIBM'

     AND T.TYPE = 'T'

GROUP BY 1,2,3;

 

Now let's see how this is done with CTE. With CTE, you can write the select to create the list of tables and column counts. So I used CTE and then wrote a query to process this new virtual table so that it lists the detail rows. Finally, I used UNION ALL to create a query that summarizes the number of columns. Both of these queries run against the virtual table, so DB2 passes through the data just once. This greatly reduces the getpages, which reduces processing service units.

EXPLAIN ALL SET QUERYNO = 1002 FOR

WITH CTE_ROWS

 AS (

        SELECT T.DBNAME AS DBNAME

                  ,  SUBSTR(T.CREATOR,1,8) AS CREATOR

                  ,  SUBSTR(T.NAME,1,18) AS NAME

                  ,  T.COLCOUNT AS COLCOUNT

        FROM SYSIBM.SYSTABLES T

        WHERE T.DBNAME = 'DSNDB06'

             AND T.CREATOR = 'SYSIBM'

             AND T.TYPE = 'T'

     )

     SELECT DBNAME

               ,  CREATOR

               ,  NAME

               ,  COLCOUNT

     FROM CTE_ROWS

     UNION ALL

     SELECT  '********' AS DBNAME

                 ,  '********' AS CREATOR

                 ,  '********** TOTALS:' AS NAME

                ,  SUM(COLCOUNT) AS COLCOUNT

     FROM CTE_ROWS;

 

To understand how the second statement is more efficient than the first, use the SQL EXPLAIN function and review the cost of the statement found in the DSN_STATEMNT_TABLE table. To learn more about EXPLAIN and SQL performance in general, see the DB2 10 for z/OS Managing Performance guide.

This query can be used to get the processor cost from DSN_STATEMNT_TABLE:

 

SELECT ST.QUERYNO

          ,  ST.COST_CATEGORY

          ,  ST.PROCSU

          ,  SUBSTR(ST.REASON,1,40) AS REASON

 FROM DSN_STATEMNT_TABLE ST

 WHERE ST.QUERYNO IN (1001, 1002)

 

The output from the above query : 

QUERYNO        COST_CATEGORY      PROCSU    REASON

--------       -------------      ------    ----------------------

    1001             A                23

    1002             B                 3    TABLE CARDINALITY

 

As you can see from these results, the processor cost in service units is greatly reduced when CTE is used to process the query. To understand why the cost category is "B," look at the REASON column and note the Table Cardinality reason code. Cardinality statistics are missing for one or more of the tables that are used in the statement, so you must run the runstats utility and collect statistics for columns used in the where clause. That's table cardinality.

This is just one way to take advantage of CTE. How do you use CTE? Please post in Comments so we can benefit from your experience.