Blog
DB2utor

Categories

September 04, 2007

What Makes Compressed Indexes Tick?

Some of the top reasons for upgrading to DB2 9 for z/OS are CPU savings and DASD savings. The DASD savings are achieved through index compression.

DB2 has supported the compression of data in a table space for years, either through a table edit proc or through the COMPRESS table space option added in DB2 V3, which takes advantage of z/OS hardware compression. It wasn't until DB2 9 for z/OS that we could compress an index.

If you have worked on large systems, especially data-warehousing systems, then you have probably dreamed of some day having the ability to compress the data in the index just like you do for the data, especially since the space consumed by indexes on some of these systems is much greater than the space consumed by the compressed data in the table space. 

If you’re like me, you'll want to jump in and compress everything to save money on DASD. But like most everything new, you need to play with it to figure out what impact index compression will have on your application. 

Index compression isn't just like data compression. Unlike data compression, index compression doesn't need to build a compression dictionary. Indexes are compressed using a “prefix compression” concept long used by VSAM access method services, which relies on the data being in sorted key order. You can find more information on prefix compression in the DB2 Redpaper "Index Compression with DB2 9 for z/OS."

Some things to remember when you implement index compression: 

Page Size
The page size of an index prior to DB2 9 for z/OS is 4k. In DB2 9 you can have 8k, 16k and 32k page sizes. However, a compressed index will always be stored in a 4k physical page on disk and can never be in a 4k buffer pool. Let me repeat that: “A COMPRESSED INDEX CAN NEVER BE PUT IN A 4K BUFFER POOL.” If you try to put it in a 4k buffer pool with compress yes, you will get the following error:

DSNT408I SQLCODE = -676, ERROR:  THE PHYSICAL CHARACTERISTICS OF THE INDEX ARE INCOMPATIBLE WITH RESPECT TO THE SPECIFIED STATEMENT.

The benefit of the larger page size to your application is fewer page splits due to fewer leaf pages. This will benefit applications that do a lot of inserts. Applications that scan an index from the root page will see a reduction in getpages due to the reduced number of index levels.

What buffer pool should you use?
To figure out the best buffer pool in which to place the index, use the DSN1COMP utility. One thing you'll notice is that you'll probably never want to use a 32k buffer pool for indexes. This is because unlike data compression an index page isn't loaded into the buffer pool in compressed format. The data is uncompressed as it's being loaded into the buffer pool.  Because the data is being compressed to fit into a 4k page and then uncompressed to fit in an 8k, 16k or 32k page, you're going to have some wasted buffer-pool memory. In almost all cases you have a larger than normal amount of wasted space in the 32k buffer pool. The reason IBM doesn't leave the page compressed is because an index page is read repeatedly and you wouldn't want to go through the overhead of decompressing it each time you read the page.

The following is an example of running DSN1COMP utility for two different index space on the same table. The first report is for an index space that contains a char(06) column. Looking at this report, I may decide not to compress this index. If I use the 8k buffer, then I’m going to have a 45-percent reduction in DASD with 9-percent waste of memory in the buffer pool.  The 16K buffer provided a little better DASD compression but wastes 53 percent of the buffer pool.

Notice in the second report that you get a 51-percent reduction using an 8K buffer with no buffer-pool waste and 76-percent reduction with no buffer-pool waste.

              Report 1: Char(06)                                                                            
              8  K Page Buffer Size yields a                                 
             45  % Reduction in Index Leaf Page Space                        
                 The Resulting Index would have approximately               
             55  % of the original index's Leaf Page Space                  
              9  % of Bufferpool Space would be unused to                   
                 ensure keys fit into compressed buffers                     
                 ----------------------------------------------             
             16  K Page Buffer Size yields a                                 
             46  % Reduction in Index Leaf Page Space                        
                 The Resulting Index would have approximately               
             54  % of the original index's Leaf Page Space                  
             53  % of Bufferpool Space would be unused to                   
                 ensure keys fit into compressed buffers                     
                 ----------------------------------------------               
             32  K Page Buffer Size yields a                                  
             46  % Reduction in Index Leaf Page Space                         
                 The Resulting Index would have approximately                  
             54  % of the original index's Leaf Page Space                     
             76  % of Bufferpool Space would be unused to                     
                 ensure keys fit into compressed buffers                      

              Report 2: Char(03)                                                                            
              8  K Page Buffer Size yields a                                  
             51  % Reduction in Index Leaf Page Space                         
                 The Resulting Index would have approximately                  
             49  % of the original index's Leaf Page Space                     
                 No Bufferpool Space would be unused                           
                 ----------------------------------------------               
             16  K Page Buffer Size yields a                                  
             76  % Reduction in Index Leaf Page Space                         
                 The Resulting Index would have approximately                  
             24  % of the original index's Leaf Page Space                     
                 No Bufferpool Space would be unused                           
                 ----------------------------------------------               
             32  K Page Buffer Size yields a                                 
             80  % Reduction in Index Leaf Page Space                        
                 The Resulting Index would have approximately                
             20  % of the original index's Leaf Page Space                   
             37  % of Bufferpool Space would be unused to                     
                 ensure keys fit into compressed buffers                     

You may see some overhead on inserts but you definitely will see CPU savings in reduced I/Os on the SELECT side. So if you are in a large data-warehouse environment, you're going to want to move to compressed indexes as soon as possible. As always I like to learn from other people and their experiences. Please send me a note with your experiences or make a comment on this blog. Thank you!