DB2utor


July 08, 2014

Happy 7th Anniversary to DB2utor

I was recently talking to another DB2 professional who said that when he's looking for DB2 information, he usually ends up at DB2utor. That was a very nice compliment. It's also the sort of impact I hoped this blog would have when I started it.

As it happens, DB2utor launched seven years ago this month. While this site houses a wealth of information on DB2, DB2 itself, of course, is ever-evolving thanks to the efforts of IBM developers. DB2utor will also continue to evolve, and I hope you'll be a part of it. I've always relied on readers to tell me what they're interested in. Check the DB2utor archive and you'll find more than 100 tags covering DB2 and related topics. If there's something you believe should be added to that list, please let me know.

I’m thankful that people take the time to read these posts, and I'm especially appreciative of those who take a moment to post a comment on the blog or contact me directly. In short, I'm grateful for your continued support.

July 01, 2014

Redbook Examines DB2 Analytics Accelerator 4.1

Recently I shared my impressions of the latest version of IBM DB2 Analytics Accelerator. Now I'm here to let you know that there's a new Redbook that details the features found in Version 4.1.

"Reliability and Performance with IBM DB2 Analytics Accelerator 4.1" is currently a draft publication. The formal publishing date is set for the end of July. I'll list just a few highlights:

* Chapter 5 examines new query accelerator features, including static SQL support as well as support for “NOT IN <subquery>” and many other new sub-query predicates. Numerous customers I talk to believe that their current workloads aren't large enough to run cost-effectively on DB2 accelerators. However, with the recent query enhancements, it may be time to reconsider this. Use the monitoring tools and you may discover that SQL statements that previously couldn't run on the accelerator now can.

* Chapter 6 provides a look into the primary factors that affect the throughput of loading data into the accelerators: the number and size of partitions and tables, the number and speed of processors, and the Netezza host. The chapter goes through different scenarios to help you understand your database design and the best way to load the data.

* Chapter 7 covers online data archiving with the High-Performance Storage Saver. When a partition is archived, the table space partition is put into persistent read-only (PRO) status. Archiving and restoring are handled by two separate stored procedures. However, I'm not sure how to remove an archived partition when you no longer want to keep it in either DB2 or the accelerator. Hopefully the process of purging old data from the accelerator will be clarified in the final version of the Redbook.

* Chapter 9 is devoted to monitoring enhancements. Having spent most of my career tuning SQL, I find this information extremely interesting. There are several options for monitoring the accelerator: DB2 Accelerator Studio, DB2 commands, query history tables and DB2 monitoring tools from IBM or third-party vendors.

The DB2 command -DIS ACCEL(name) DETAIL provides information about active and queued queries, wait times and disk storage utilization for the named accelerator, as well as totals for all subsystems connected to it.
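As a sketch, issuing the command for a hypothetical accelerator named PRODACC1 would look like this:

    -DIS ACCEL(PRODACC1) DETAIL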

As for real-time monitors, IBM continues to add information to the standard interfaces that it and third-party vendors use to monitor DB2 activity. The first set of information allowed customers to see what is happening on the Netezza box. The next set provided more insight into which statements are running on the accelerator and how much processing was offloaded from DB2. In addition to monitoring utilization levels, customers need to know that the statements they create are capable of running on the accelerator. Making this information available in real time via the monitoring tools adds considerable value to the product.

Here's the complete table of contents:

Chapter 1. The analytics lifecycle

Chapter 2. Database design considerations

Chapter 3. Overview of IBM DB2 Analytics Accelerator V4.1

Chapter 4. Installation and maintenance procedures

Chapter 5. Query enhancements

Chapter 6. Load and incremental update

Chapter 7. Online data archiving

Chapter 8. High availability, disaster recovery, and workload balancing

Chapter 9. Monitoring enhancements

Chapter 10. The Accelerator Loader

Appendix A. Preparing input data for Accelerator Loader

I feel this publication is well worth your time. If you read it, please share your thoughts in comments.

June 24, 2014

Capturing Dynamic SQL with EXPLAIN

I’ve already written extensively about EXPLAIN, but since IBM continues to develop the feature and enhance the process, there's plenty more to cover.

For instance, I recently started experimenting with the SET CURRENT EXPLAIN MODE statement, which is used to capture EXPLAIN data for dynamically executed statements. The challenge with dynamic SQL is that -- because it's built dynamically based on input during execution -- you may not know how the statement will appear until it's executed. This makes it difficult to prebuild the statement and run it through EXPLAIN to determine the access path DB2 will use. However, starting with DB2 10, applications can be modified to take advantage of the SET CURRENT EXPLAIN MODE statement. The options are:

  • NO specifies that no EXPLAIN information is captured. NO is the initial value of the EXPLAIN MODE special register.
  • YES enables the EXPLAIN facility and causes EXPLAIN information to be inserted into the EXPLAIN tables for eligible dynamic SQL statements after the statement is prepared and executed. All dynamic SQL statements are compiled and executed normally.
  • EXPLAIN enables the EXPLAIN facility and causes EXPLAIN information to be captured for any eligible dynamic SQL statement after the statement is prepared. This setting behaves similarly to YES; however, dynamic statements (except for SET statements) aren't executed.

A program can enable this in one of two ways: hard-code the logic and pass in a parameter to drive it, or simply pass in a value to be used in a host variable on the SET statement.

This is hard-coded logic:

    IF pass-value = 'NO' THEN
       SET CURRENT EXPLAIN MODE = NO
    ELSE
       IF pass-value = 'YES' THEN
          SET CURRENT EXPLAIN MODE = YES
       ELSE
          IF pass-value = 'EXPLAIN' THEN
             SET CURRENT EXPLAIN MODE = EXPLAIN
          END-IF
       END-IF
    END-IF.

 

Here's how to pass in a value for a host variable:

   SET CURRENT EXPLAIN MODE = :pass-value

 

The host variable must be defined as CHAR or VARCHAR, and it must be set to a valid option (NO, YES or EXPLAIN). Leading blanks aren't allowed. All input values must be uppercase. In addition, they must be left-justified within the host variable and padded on the right with blanks if the length of the value is less than the length of the host variable.
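For example, here's a minimal COBOL sketch (the variable name and picture clause are my own illustration, not from any particular program): declare the host variable long enough to hold the longest valid value, EXPLAIN, and COBOL's MOVE will left-justify the value and pad it on the right with blanks:

    01  PASS-VALUE    PIC X(7).

    MOVE 'YES' TO PASS-VALUE
    EXEC SQL
       SET CURRENT EXPLAIN MODE = :PASS-VALUE
    END-EXEC

A PIC X(7) field moved a shorter literal like 'NO' automatically satisfies the left-justified, blank-padded requirement described above.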

The host variable is a simple option for turning EXPLAIN processing on and off during program execution. I used this script in SPUFI to test it out:

 

1 - DELETE PLAN_TABLE;
2 - COMMIT;
3 - SET CURRENT EXPLAIN MODE = YES;
4 - SELECT * FROM SYSIBM.SYSTABLES WHERE NAME = 'SYSTABLES';
5 - SET CURRENT EXPLAIN MODE = NO;
6 - SELECT * FROM PLAN_TABLE;

 

Notice that I turned on the auto EXPLAIN feature with statement 3, then shut it off with statement 5 because I didn't want statement 6 to appear in the EXPLAIN tables. I did find an interesting contradiction to the documentation during my testing in SPUFI. When I set the mode to EXPLAIN at statement 3, I received the message below warning me that the statement wasn't executed. I also found that nothing after statement 4 was executed, so obviously statement 6 wasn't executed, either.

Note: When you use mode EXPLAIN you will receive warning SQLCODE +217:

DSNT404I SQLCODE = 217, WARNING:  THE STATEMENT WAS NOT EXECUTED AS ONLY EXPLAIN INFORMATION REQUESTS ARE BEING PROCESSED

Have you had an opportunity to use this feature to help you EXPLAIN dynamic SQL statements? Please share your experiences in comments.

June 17, 2014

New Stored Procedure Simplifies EXPLAIN Maintenance

Managing EXPLAIN tables can be time-consuming. DB2 developers often create EXPLAIN tables and then let them sit on the system for years without upgrading them to include the latest columns or the newer table types introduced in recent DB2 releases. Because these tables aren't current, they can't take advantage of key features designed to help you understand which access paths DB2 has chosen and why.

In short, if you're a developer and you rely on EXPLAIN tables that are based on older DB2 releases, you're missing out.

A new stored procedure, SYSPROC.ADMIN_EXPLAIN_MAINT, is designed to simplify the process of upgrading existing EXPLAIN tables. It can also be used to create the new tables that are introduced with each release of DB2. (See, for example, DSN_PREDICATE_SELECTIVITY, a new table in DB2 11.) So if you have an old set of tables, this procedure will make sure they're current. And if you're missing any of the latest tables, it will create them for you.

IBM is releasing this new procedure through maintenance APAR PI05200 (NEW STORED PROCEDURE SYSPROC.ADMIN_EXPLAIN_MAINT) for DB2 11 only. It was closed for customer testing on April 25 and will now generate PTF UI17423.

To learn more about this new stored procedure and view a sample program used to call it, see the DB2 11 Managing Performance Guide (updated April 2014).

 

June 10, 2014

Excluding NULL Key Columns

I've never been a fan of setting columns to null, because this introduces complexities into the application code. I'm certainly not the only one who feels this way, either. Check out this well-written screed that explains in detail why NULL values in a database are a programmer's nightmare.

Increasingly though, applications with database designs that incorporate NULL values -- built originally on SQL Server, Oracle, Sybase and other platforms -- are being brought to DB2 for z/OS. Of course, DB2 for z/OS has long supported applications migrated from these distributed databases, but DB2 11 provides a new function that makes it easier to deal with NULL settings.

DB2 11 features the capability to exclude NULL key columns, which is designed to improve performance and reduce index sizes. When NULL key columns are excluded from an index, DB2 only adds an index entry when the key column is not NULL. To specify that an index excludes NULL keys, add the EXCLUDE NULL KEYS clause to the CREATE INDEX statement.

As you can imagine, if your data has a high percentage of NULL values in key columns, then the index will be much smaller and a scan will perform much more quickly since fewer entries will be in the index.
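As a quick sketch, assume a hypothetical ORDERS table in which the SHIP_DATE column is NULL for most rows (the table and column names are made up for illustration):

    CREATE INDEX ORD_SHIPDT_IX
        ON ORDERS (SHIP_DATE)
        EXCLUDE NULL KEYS;

With this index, rows whose SHIP_DATE is NULL get no index entry at all, so the index stays small. Keep in mind that such an index can't be used to satisfy a WHERE SHIP_DATE IS NULL predicate, since those rows aren't in it.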

If you're currently supporting a database with NULL key columns and have had a chance to take advantage of the new DB2 11 feature to exclude NULL keys, please share your experiences in comments.