June 24, 2008

Batch Tuning: Do More in Less Time

A large percentage of business is run with online applications. When you go to the grocery store and make a purchase or go to an ATM to withdraw money, you're using an online application. These applications are designed to complete a transaction in seconds. Once you withdraw money or purchase a product, this activity must be processed and posted to customer accounts through back-end processing. The back-end processing is designed to handle large amounts of data in batch (no online input or human interaction). In contrast to the immediacy of consumer transactions, the back-end applications that support them usually take minutes, if not hours, to complete.

When designing batch systems, keep these goals in mind:

  • Reduce Elapsed Time
  • Reduce CPU Usage
  • Reduce I/O
  • Checkpoint Restart
  • Continuous Availability

Here, we'll focus on the first goal, reducing elapsed time. To do more in less time, you need to organize the data so you can process multiple batch jobs in parallel using key ranges, table space partitions or functional groups (like region, country and department).

You can also take advantage of DB2's capability to initiate multiple parallel operations (I/O, CPU or Sysplex) when accessing data or indexes in a partitioned table space. The following descriptions are taken from the "DB2 for z/OS Performance Monitoring and Tuning Guide."

Query I/O parallelism manages concurrent I/O requests for a single query, fetching pages into the buffer pool in parallel. This processing can significantly improve the performance of I/O-bound queries. I/O parallelism is used only when one of the other parallelism modes cannot be used.

Query CP parallelism enables true multitasking within a query. A large query can be broken into multiple smaller queries. These smaller queries run simultaneously on multiple processors accessing data in parallel. This reduces the elapsed time for a query.

To expand even further the processing capacity available for processor-intensive queries, DB2 can split a large query across different DB2 members in a data sharing group. This is known as Sysplex query parallelism. Information on parallelism can be found in the Performance Monitoring and Tuning Guide.

You can run as many parallel jobs as you want when using multiple batch jobs and passing parameters to control the key range or group of data that needs to be processed. But while this allows you to control the number of parallel tasks that run, you must also maintain more JCL. In addition, changing key ranges and/or groups requires changes to the batch jobs.
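As a rough illustration of the key-range approach, the sketch below divides one range of keys into contiguous, non-overlapping sub-ranges, one per parallel job. It assumes integer keys that are evenly distributed; the function name `split_key_range` and the job names are hypothetical and not part of DB2 or JCL. In practice each pair of bounds would be passed to a batch job as its parameters.

```python
def split_key_range(low, high, jobs):
    """Split the inclusive integer key range [low, high] into at most
    `jobs` contiguous, non-overlapping sub-ranges of nearly equal size."""
    if jobs < 1 or high < low:
        raise ValueError("need jobs >= 1 and high >= low")
    total = high - low + 1
    jobs = min(jobs, total)          # never create an empty range
    size, extra = divmod(total, jobs)
    ranges = []
    start = low
    for i in range(jobs):
        # The first `extra` ranges each absorb one leftover key.
        end = start + size - 1 + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    # Hypothetical example: four parallel jobs over keys 1 to 1,000,000.
    for n, (lo, hi) in enumerate(split_key_range(1, 1_000_000, 4), start=1):
        print(f"JOB{n}: process keys {lo} through {hi}")
```

Equal-sized ranges only balance the work if the rows are spread evenly across the keys; with skewed data, the boundaries would need to follow the actual row counts or the partition limits instead.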

On the other hand, you could let DB2 determine the number of parallel tasks. The benefit here is you're taking advantage of an optimized feature built into DB2, one that reduces the number of batch jobs you must maintain. The downside is that DB2 may not process the query using parallelism. A table in the Performance Monitoring and Tuning Guide spells out which parallelism modes are and aren't used for different types of queries. Some examples follow.

  • Query access via RID list with list prefetch and multiple index access: I/O and CPU parallelism are used, but not Sysplex query parallelism.
  • Queries that return LOB data: I/O, CPU and Sysplex query parallelism are used.
  • EXISTS within a WHERE predicate: no parallelism is used.

You can also help reduce the elapsed time of your entire batch schedule by having no more than one DB2 program executing in a batch job. Say JOB1 is executed in two steps (J1Step1 and J1Step2). J1Step1's elapsed time is 15 minutes, while J1Step2's is 30 minutes. Now you schedule JOB2, and it abends due to a timeout caused by contention with J1Step1. So you reschedule JOB2 to run after JOB1, which adds 15 minutes to the batch schedule. The total time to process JOB1 and JOB2 is 60 minutes (45 for JOB1 and 15 for JOB2). If each step were its own job, you could schedule JOB2 to start as soon as J1Step1 finishes and run alongside J1Step2, and the total elapsed time would be 45 minutes (see Figure 1).
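The arithmetic behind the two schedules can be checked with a few lines of Python. This is only a sketch of the example above: the function names are made up for illustration, and the second schedule assumes JOB2 contends only with J1Step1 and can run alongside J1Step2 without conflict.

```python
# Elapsed times in minutes, taken from the example in the text.
J1STEP1 = 15
J1STEP2 = 30
JOB2 = 15


def elapsed_job2_after_job1(step1, step2, job2):
    """JOB2 waits for all of JOB1: everything runs back to back."""
    return step1 + step2 + job2


def elapsed_job2_after_step1(step1, step2, job2):
    """JOB2 starts when J1Step1 ends and runs in parallel with J1Step2
    (assumes no contention between JOB2 and J1Step2)."""
    return step1 + max(step2, job2)


if __name__ == "__main__":
    serial = elapsed_job2_after_job1(J1STEP1, J1STEP2, JOB2)
    overlapped = elapsed_job2_after_step1(J1STEP1, J1STEP2, JOB2)
    print(f"JOB2 after JOB1:    {serial} minutes")
    print(f"JOB2 after J1Step1: {overlapped} minutes")
    print(f"Saved:              {serial - overlapped} minutes")
```

The overlapped schedule is bounded by the longer of the two jobs running side by side, which is why the saving here equals JOB2's full 15 minutes.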

Over the next few weeks we'll look at more batch tuning topics: ways to reduce CPU usage and I/O, how to use checkpoint restart and how to design for continuous availability.