November 25, 2008

How Many Tables?

Depending on the database management system you support, you may have a different understanding as to what a database is. A DB2 for z/OS database is a collection of tables and table and index spaces. When you create a database, an internal representation of that database is maintained in a database descriptor (DBD). The objects that are defined in a DBD are tables, table spaces, index, index spaces, relationships, check constraints and triggers. The DBD also contains information on accessing the tables in the database.


A question DBAs often debate is: How many tables should be defined to one database? It's much easier for a DBA to store all the tables related to a given application in one database. (Or maybe it just seems easier because that's what we're used to doing.) The more tables you define to a database, the larger the DBD.


A large DBD can cause concurrency issues and impact SQL performance due to the large amount of logging and increased I/O. When a change is made to an object--like adding a column to a table or creating an index--a lock is taken. If the DBD hasn't been loading into the EDM DBD cache, then a lock is taken on the table space DBD01; otherwise the lock is taken on the EDM DBD cache. These locks can cause conflict with different types of operations. Static SQL doesn't acquire locks, so it won't cause conflicts. However, dynamic SQL and utilities acquire a shared lock--this conflicts with DDL operations like DROP, CREATE and ALTER. If you have lots of DDL type operations to objects defined to a large DBD, you'll see a significant amount of logging.


Another potential problem with large DBDs is the amount of memory required to support loading the DBD into the EDM DBD cache. DBDs created or altered since DB2 Version 6 can be loaded into memory in 32KB pieces; however, DBDs created in V6 or earlier require contiguous space. As you make changes to a DBD with create, alter and drop operations, the DBD space isn't reclaimed. You must run the MODIFY utility to reclaim the space.


It's always been the standard among DBAs to store one table per table space, especially if you know the table will become large. While my rule of thumb is about 50 tables per database with each table occupying its own table space, some recommend defining one table per database to get the best concurrency and performance. The trade-off of storing 50 tables in one database versus having 50 tables in 50 databases is concurrency over maintenance and security.


If you're considering using just one table per database, you must plan ahead and define a naming convention that will identify a group of databases that belong to the same application. This will help you with security and maintenance.


With DB2 V9 for z/OS, IBM is supporting the one-table-per-database rule by allowing DBAs to implicitly define the database and table space for tables created without a given database name. The maximum number of databases in a DB2 subsystem is 65,271. Now with DB2 V9, up to 10,000 of these databases can be implicitly defined.


IBM introduced the implicit database definition option to support vendors' efforts to port their applications from DB2 for LUW to DB2 for z/OS. Implictly defining the database makes sense in these situations, when the goal is to help with maintenance and security. However, most z/OS DBAs--who already rely on the mainframe's secure architecture--won't want applications built on z/OS to implicitly define a production database.


I like IBM's naming convention for implicitly defined databases. If you create a table without specifying the database name, DB2 generates a database with the name of DSN##### (where ##### is 00001 to 10000 and the table-space name is the first eight characters of the table name). Because the current limit is 10,000 implicit databases, you could see databases with names from DSN00000 to DSN10000. If you take IBM's naming convention and replace DSN with a three-character mnemonic representing your application, you can support individual database names for each table (e.g., Human Resource Service = HRS00001.)


If your company has already implemented the one-table-per-database recommendation, please post a message and tell us of your experience.