Not that long ago, implementing a test machine alongside a "prod" machine was a given. Hardware simply wasn't as reliable back then. So, to protect themselves from hardware failure, companies would install a hot standby backup along with their production machine -- just in case. Since that backup box typically sat idle, many companies opted to run test workloads on it. At least this way, that second machine was doing something worthwhile.
However, with the advent of Live Partition Mobility and PowerHA -- and with more Reliability, Availability and Serviceability (RAS) built into newer hardware -- it's more or less assumed that machines will stay up. And somewhere between then and now, the distinction between prod and test has started to blur.
Almost three years ago I saw my first Live Partition Mobility demo, and I immediately went from skeptic to true believer.
But even now, I find many customers can't quite believe what they're seeing. For instance, a few weeks back I was demonstrating how to move a busy LPAR from one frame to another. The customer had the same skepticism I had back at the beginning: Will it work? Will I drop packets? Is this smoke and mirrors and magic?
Yes, it works. No smoke, no mirrors -- and no dropped packets.
Because you can quickly and easily move workloads around your environment, you're freed from the entire concept of "this frame is production" and "that frame is test." You can concentrate on properly mixing workloads across the environment based on need and available resources. You can create uncapped partitions with proper values for the weights of your partitions. If the machine has free cycles, you can allocate them on a very granular level. If one machine becomes constrained, you can easily shift your workload to another frame that can better handle the load.
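To make the point about weights concrete, here is a rough sketch of how spare cycles might be divided among uncapped partitions in proportion to their weights. The partition names, the weights and the amount of spare capacity are all made up for illustration, and the model is deliberately simplified: it ignores entitlements, virtual processor limits and anything else the hypervisor takes into account.

```python
def share_spare_cycles(spare_units, weights):
    """Split spare processing units among uncapped LPARs by weight.

    Simplified model (an assumption, not the hypervisor's real logic):
    each partition gets spare_units * weight / sum(weights).
    """
    total = sum(weights.values())
    if total == 0:
        # No partition is asking for spare cycles.
        return {name: 0.0 for name in weights}
    return {name: spare_units * w / total for name, w in weights.items()}


# Hypothetical partitions and weights: prod is favored, test still gets a slice.
shares = share_spare_cycles(2.0, {"prod_db": 200, "prod_app": 100, "test": 25})
for name, units in shares.items():
    print(f"{name}: {units:.2f} processing units")
```

The idea is simply that a test partition with a low weight can soak up idle cycles without getting in the way when a heavily weighted production partition needs them.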
When my customer and I were discussing PowerHA and whether they wanted the capability of failing over multiple LPARs, someone made a comment -- and a lightbulb went on in the minds of those present. What if you set things up the "old way," your production frame dies for some reason, and you need to fail over your prod workload? Should the whole environment fail over at once, or would it be preferable to have half of prod fail over while the other half keeps on processing? After all, in a mixed environment with production LPARs running on different physical machines, losing a frame means failing over only a subset of the environment as opposed to the whole thing.
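The arithmetic behind that lightbulb moment can be sketched in a few lines. The frame and LPAR names below are hypothetical, and the layouts are toy examples: one keeps all of prod on a single frame (the "old way"), the other spreads prod and test across both frames.

```python
def prod_fraction_lost(layout, failed_frame):
    """Return the fraction of production LPARs that sit on the failed frame.

    layout maps a frame name to the LPARs it hosts; by convention in this
    sketch, production LPAR names start with "prod".
    """
    prod = [
        (frame, lpar)
        for frame, lpars in layout.items()
        for lpar in lpars
        if lpar.startswith("prod")
    ]
    lost = [lpar for frame, lpar in prod if frame == failed_frame]
    return len(lost) / len(prod)


# Hypothetical layouts for illustration only.
old_way = {
    "frame_a": ["prod1", "prod2", "prod3", "prod4"],
    "frame_b": ["test1", "test2", "test3", "test4"],
}
mixed = {
    "frame_a": ["prod1", "prod2", "test1", "test2"],
    "frame_b": ["prod3", "prod4", "test3", "test4"],
}

print(prod_fraction_lost(old_way, "frame_a"))  # all of prod must fail over
print(prod_fraction_lost(mixed, "frame_a"))    # only half of prod must fail over
```

With the old layout, losing the production frame takes down every prod LPAR at once; with the mixed layout, the same failure touches only the prod LPARs that happened to live on that frame.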
CPU micro-partitioning, PowerVM server virtualization, Live Partition Mobility and PowerHA are all game changers. When we plan for these technologies, we must also rethink the way our systems are implemented. Though it's tempting to still think in terms of standalone systems, alternatives are now possible. Rather than separate prod from test, we may find that mixing production with test on the same frame might make perfect sense.
Note: IBM is hosting a pair of webcasts on future trends relating to Power Systems.





If I recall correctly, the IBM Redpaper "APV Deployment Examples" (http://bit.ly/c6wBDb) back in 2007 recommended mixing Prod and Dev on the same physical server.
It seems smarter to mix vastly different loads on the same managed system than to try to isolate them, even though they often have many mutual dependencies.
Posted by: Anthony English | July 21, 2010 at 07:22 AM