April 01, 2008

The Origins of SYSDUMMYU

Sometimes developers need to retrieve data from a DB2 special register or invoke a DB2 built-in function against a program host variable. However, DB2 requires that a table be referenced in order to process these statements.

One issue developers must deal with is ensuring that only one row of data comes back. If you use a large table and forget to qualify the data, then the value will be returned for each row of data in the table. One way we handled this issue in the early days was by using the DB2 catalog table SYSTABLES with the row filtered based on table name and type.

Here's how you'd use SYSTABLES to retrieve the current date:

WHERE creator = 'SYSIBM'
and name = 'SYSTABLES'
and type = 'T';

However, this didn't sit well with DBAs, because programs that used a catalog table acquired locks on the catalog. This caused contention for DBA jobs that needed to update and modify the catalog. IBM saw the impact this was having on availability and delivered a replacement for SYSTABLES. The new table, sysibm.sysdummy1, contained one column with one row of data. This removed the depency on the catalog, and thus eliminated the availability issue. SYSDUMMY1 became an industry standard.

Another benefit of SYSDUMMY1 is that it greatly simplified the SQL code, as you can see in these examples:

Read the contents of the current date special register:


Retrieve information using a built-in scalar function:


The sysibm.sysdummy1 table met all the requirements for processing DB2 statements that required a table, but the data in the table wasn't important. That is, it wasn't important until the introduction of ASCII and Unicode encoding. Like most tables in z/OS, sysibm.sysdummy1 is encoded as EBCDIC. Most programs are compiled and bound with EBCDIC CCSID.

This became a problem with Java and ODBC programs that needed to process LOB data. LOB data is usually stored in Unicode format. When a Java application processed an LOB locator-based operation that has LOB data in Unicode, the program would perform poorly; in some cases it would even fail, with a character-conversion error going from EBCDIC to Unicode.

To address this issue, IBM released APAR PQ85495. This fix introduced three new dummy tables, one for each of the three different encoding schemes:

ASCII – sysibm.sysdummya
EBCDIC – sysibm.sysdummye
Unicode – sysibm.sysdummyu

A later fix (APAR PK27025) requires these tables when using the type-4 DB2 Universal Driver.

So here's the difference when processing LOB data stored in a Unicode table.

SYSDUMMY1 table:

SELECT LENGTH(:dbcloblocator)

Unicode table:

SELECT LENGTH(:dbcloblocator)

As you can see, this allows developers to recognize that they're dealing with Unicode data. Using sysibm.sysdummyu with the LENGTH function ensures the proper length is given for the Unicode data. When using JDBC type-4 universal drivers, this conversion is handled automatically as long as maintenance APAR PK27025 is applied.