June 01, 2010

Getting a Handle on REORG

I consider the REORG utility to be one of the most important tools in the DBA toolbox. It's also one of the toughest utilities to understand. REORG is used to improve data access and reclaim fragmented space. It can run on the table space as well as the index space.

In the early days of DB2, we feared REORG, because if a REORG went down, it'd mark the table space with a utility pending status. In most cases, the utility would fail due to a space shortage in either the work files or in the volumes on which these files reside. Oftentimes the only way out of this situation would be to terminate the utility -- then you'd have to recover the table space, something that could take hours (and you'd still be left with an incomplete REORG).

The introduction of the high availability option, SHRLEVEL CHANGE, was a huge enhancement, but the catch is you need enough disk space to duplicate the table space. I recall the first time I wanted to implement SHRLEVEL CHANGE; the storage management team wasn't real keen on the idea.

When TEMPLATE arrived on DB2, it helped with the dynamic allocation of datasets. It also meant that DBAs no longer had to hard-code DD statements in jobs or when setting dataset size. However, as table spaces grow you still must remember to go back and change these jobs to increase the size of the files. If you don't specify disk space parameters, DB2 estimates the size for you. To improve the accuracy of the default space estimation, make sure to run the RUNSTATS utility with the UPDATE SPACE or UPDATE ALL options before running any of these utilities: CHECK DATA, CHECK INDEX, CHECK LOB, REBUILD INDEX, REORG INDEX, REORG TABLESPACE.

Some of the TEMPLATE parameters are used to automate the primary and secondary space allocations on the datasets used by the utility. These parameters use values in the catalog that are set by RUNSTATS. So, especially if you have a high volume of growth, it's important to ensure the statistics are up to date before running a REORG. The parameters used are PCTPRIME, MAXPRIME and NBRESCND. PCTPRIME specifies the percentage of the estimated required space that is to be obtained as the primary allocation size. The default is 100 percent. However, if your table space is larger than a disk volume, you'll receive an error. You can use MAXPRIME to set the upper limit to avoid this problem with large objects. Once the primary restriction is applied (PCTPRIME or MAXPRIME), the remaining amount of space is divided by NBRSECND to determine the size of the secondary space allocations.

Two other challenges with REORG involve processing partitioned table spaces and dealing with non-partitioning indexes in a high-availability environment. DB2 version 9 introduced new online SHRLEVEL CHANGE processing that removes the BULID2 phase.  Prior to DB2 9, data that undergoes reorganization with either SHRLEVEL CHANGE or SHRELEVEL REFERENCE is unavailable to applications during the relatively lengthy BUILD2 phase. Due to the changes in how REORG processes the partitions and non-partitioning indexes to improve availability, you may in fact be forced to change the way you currently run your SHRLEVEL CHANGE reorgs.

If you're looking to better understand REORG and other DB2 utilities, I highly recommend the IBM Redbook, "DB2 9 for z/OS: Using the Utilities Suite."

Among many other things, this publication gives a detailed explanation (in Chapter 8.2) of the elimination of the BUILD2 phase in DB2 9. In general, the Redbook offers a complete guide to DB2 utilities.

So what have you found with the new DB2 9 online REORG? What do you think of other features delivered with DB2 9 and the utility suite? Please register your thoughts in Comments.