December 10, 2007

Most Consumable DB2 9 Improvements Require Minor Work

Last week I listed some new DB2 9 features that can help you while requiring essentially no effort on your part. These benefits are in utility CPU reduction, logging, larger prefetch sizes, LOB performance and DDF virtual storage constraint relief.

Now let's look at some helpful DB2 9 features that can be implemented with minimal effort.

Optimization Service Center (OSC)--OSC, a much more integrated and robust tool, replaces Visual Explain. The new Redbook, “DB2 9 for z/OS: New tools for Query Optimization,” details the new features, but one that stands out is the profile monitor, which allows you to discover your normal SQL activity and any new problem SQL. You can then analyze the access paths of these SQL statements and use the Statistics Advisor to calculate and gather the runstats that are needed to improve the statement's performance.

Online reorg--The major change here is the removal of the build2 phase. Prior to DB2 9 the build2 phase locked all non-partitioned indexes (NPIs), thus locking out all applications. For very large tables with large NPIs, this could result in a major outage. To take advantage of online reorg, you'll want to REORG contiguous partitions in the same job. By doing this you can take advantage of multiple task processing each partition and, at the same time, the NPI indexes. This greatly reduces the elapse time and CPU cost and greatly improves data availability during the online reorg.

Improved runstats--The runstats utility gathers summary information about the storage usages of table spaces, indexes and partitions. It also gathers data distribution statistics, which are used by the optimizer to calculate the most efficient access path for each SQL statement.

Prior to DB2 9, the information collected was cardinality, or frequency of values in a given column or group of columns. DB2 is great at calculating the best access path when the SQL uses equal predicates (e.g., (WHERE hire_date = '2007-11-30')). However, it's much tougher to calculate how many rows will be retrieved with a range predicate like (WHERE hire_date BETWEEN '2006-12-01' and '2006-12-31'). The reason is that DB2 assumes an even distribution of data. So if you have a large percentage of data falling into a given date range, you may get better performance by accessing with a table-space scan instead of an index scan.

A new option in DB2 9 is to collect histogram statistics. The histogram option collects the frequency statistics of the distinct values of a column cardinality over an entire range, giving the optimizer more information for a potentially better access path.

Optimization improvements--The area of SQL access path processing has seen many performance enhancements. The key is to refresh your current runstats and then rebind your packages to take advantage of these new optimization algorithms. Once you migrate to DB2 9 new function mode, you'll realize performance gains with your existing applications in these areas:

  • DISTINCT and GROUP BY--With these enhancements, you can avoid sorts and reduce get pages on work files.
  • Dynamic prefetch--These enhancements allow regular index access.
  • Global query optimization--This improves SQL using sub query.
  • Complex query optimization--This reduces the number of partitions being accessed by applying a page range screening.
  • Generalized sparse indexes and memory data caching.
  • Dynamic index ANDing for star join query.
  • LOB--Specifically, the new LOB file reference and the new FETCH CONTINUE features. LOB file reference is used for insert into a table or file, while FETCH CONTINUE is designed to improve memory management when retrieving LOB data.

Virtual storage constraing relief (VSCR) and EDMPOOL--DB2 9 moves many DB2 structures above the bar in the DBM1 address space to provide more virtual memory for those structures that must remain below the bar. Structures for EDMPOOL and the DIST address space (DDF) are also moved above the bar. To take advantage of the EDMPOOL changes, use package to improve memory management with smaller objects in the pool. Use bind option RELEASE(COMMIT) when appropriate to cause the object to be removed from the pool sooner. Also, bind option DEGREE(ANY) can add 50-70 percent more storage for the cursor table (CT) and package table (PT). When possible, use DEGREE(1) to reduce the amount of memory required by a package.

REOPT(AUTO)--New to DB2 9, this bind option tells DB2 to examine the values in the host variables at execution time and rebuild an access path. When the host variables change and DB2 determines a new access path is needed to reduce the number of pages to process, the old access path is cached and reused.

Reorder row format--Prior to DB2 V9, each variable-length column in a data row was stored with its length preceding it in a 2-byte field. A field that follows the variable-length column has a variable offset. To access a column with a variable offset, DB2 must scan the row to find the column of interest. This scanning process starts at the first variable-length column and continues till DB2 reaches the column to process. If you update any variable-length column, DB2 logs the changed column and all following columns to the end of the row. For these reasons, the old recommendation was that all variable-length columns and most frequently changed columns be placed at the end of the row. However, a lot of times you don’t have an option to change the physical structure. Now in DB2 9 the physical row is reformatted automatically on insert. Instead of placing the length field in front of each variable-length column, DB2 stores the offset of all columns after the last fixed-length !
column, followed by each of the variable-length columns. For existing table spaces you need to perform a reorg to have the data reformatted.

Index page size--Prior to DB2 9 the page size for an index was limited to 4KB buffer pools. With DB2 9 New Function Mode (NFM), you can specify 4, 8, 16 or 32KB buffer pools. The larger page size can help reduce the need to split index pages, and can boost performance for sequential processing and compression ratios.

You can convert existing indexes to a larger page size by using the ALTER INDEX statement.

Index (on expression)--Prior to DB2 9 you couldn't make a predicate indexable if it had functions or expressions that must derive a value like column_name +1 or SUBSTR(column_name,1,10). Now you can create an index with the expression you want indexed, and DB2 will use it.

Hopefully this quick list--and last week's--get you and your management excited about DB2 9 and help start the upgrading process.