July 20, 2009

Point-in-Time Recovery Sans Quiesce

One of the DBA's primary jobs is to ensure that data is safe and recoverable from any unexpected disastrous event. DB2 provides several utilities to help backup the data either at a volume level, using the BACKUP SYSTEM utility, or at a table/index space level, using the COPY utility. To support concurrency, COPY is usually run with the SHRLEVEL(CHANGE) command, which allows users to update the data while it's being backed up. DBAs would then follow COPY by using the QUIESCE utility to generate a consistency point for a point-in-time recovery (PITR).

The easiest and best way to recover a group of related DB2 objects is to create a consistent recovery point using QUIESCE. However, this isn't always possible, especially in a busy 24X7 shop. If QUIESCE cannot get a lock on all of the objects being quiesced, it will fail. In these situations getting a consistency point is very difficult. 

The RECOVER utility supports PITR with these options:

DB2 9 for z/OS offers some improvements that help in recovering objects to a consistent state when no consistency point is available. The issue with not having a consistency point when recovering to a point in time is that some transactions may be in one of these statuses: inflight, indoubt, inabort. RECOVER is now capable of analyzing and detecting uncommitted transactions running based on the point-in-time provided on the TORBA or TOLOGPOINT parameters. Any uncommitted transactions are rolled back, thus leaving the objects in a consistent state.

DB2 cannot ensure a consistent point-in-time recovery when using TOCOPY, TOLASTCOPY or TOLASTFULLCOPY. The reason is that multiple objects can be included in the recovery list and DB2 cannot determine which time should be the point-in-time for the recovery. If you must use one of these options, it's highly recommended that the copy be share level reference.