The announcement of the IBM POWER7 systems is really exciting! It's primarily a hardware announcement, though, so You and i covers it from an IBM i perspective. IBM i supports these new POWER7 servers in the 6.1.1 release.
I'll review a few of the key features that POWER7 provides that are supported in the IBM i 6.1.1 release.
4 SMT Threads per Core
POWER7 supports Simultaneous Multithreading (SMT) with four hardware threads, which provides more capacity per core. Power hardware has had SMT support for many years. Initially, it was two hardware threads of execution on a processor core. With POWER7, the number of hardware threads has been increased to four.
In addition, with POWER7 we’ve also added the capability to support “dynamic switching.” Dynamic switching allows the hardware to allocate resources to optimize the processing capability. So while the core may be configured to use four threads, if the work to be done only takes one thread, the other three threads can free up resources, giving that single thread greater performance.
You control the SMT behavior with the Processor multitasking (QPRCMLTTSK) system value. The default for this system value is “system controlled,” which means the operating system determines the optimal setting; for POWER7 systems, this is dynamic switching with four hardware threads. You can explicitly turn off SMT support and always run the cores in single-threaded mode by setting QPRCMLTTSK to *OFF. In general, the default setting will be the best for most applications; in fact, IBM doesn’t recommend the use of single-threaded mode on POWER7, since the system automatically switches each core to single-threaded mode if there’s just one task executing on the core. The article “Simultaneous Multi-Threading on POWER7 Processors” has a lot more information for those of you interested in the details.
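To make the dynamic-switching behavior concrete, here is a minimal sketch in Python. It is a toy model of the description above, not the actual hardware or IBM i dispatching logic: the function name `hardware_threads_in_use` and its `smt_enabled` flag (standing in for the QPRCMLTTSK setting) are made up for illustration.

```python
# Toy model of SMT on a POWER7 core, based only on the description above.
# All names here are hypothetical; this is not how the hardware decides.

MAX_SMT_THREADS = 4  # POWER7 supports four hardware threads per core


def hardware_threads_in_use(runnable_tasks: int, smt_enabled: bool = True) -> int:
    """Return how many hardware threads one core would use in this toy model.

    smt_enabled=False stands in for turning SMT off via QPRCMLTTSK,
    which forces the core to run single-threaded.
    """
    if runnable_tasks < 0:
        raise ValueError("runnable_tasks must be non-negative")
    if runnable_tasks == 0:
        return 0  # idle core
    if not smt_enabled:
        return 1  # SMT off: always single-threaded
    # Dynamic switching: one task runs single-threaded and gets the
    # core's resources; more tasks use up to four hardware threads.
    return min(runnable_tasks, MAX_SMT_THREADS)


for tasks in (1, 2, 4, 6):
    print(tasks,
          hardware_threads_in_use(tasks),
          hardware_threads_in_use(tasks, smt_enabled=False))
```

Note that with a single task the two settings give the same answer in this model, which is the reasoning behind leaving the system value at its default.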
Energy Management Features
Energy management features were introduced with the POWER6 systems. The IBM Systems Director Active Energy Manager plug-in measures, monitors and manages the power components built into IBM systems.
While the energy management features in POWER7 are functionally similar to those in POWER6, both IBM i and the underlying hardware have been enhanced to allow for increased energy efficiency and more intelligent power savings. Release 6.1.1 of IBM i is now more tightly coupled to the EnergyScale functions built into PowerVM, the system hardware and the POWER processor itself. For more information on POWER7 and energy management, see the EnergyScale whitepaper.
What About That Processor Frequency?
Historically, processor frequency has climbed right along with processor performance. However, POWER7's design improves performance per core and delivers a massive improvement in capacity per chip, all with a processor core frequency roughly 30 percent lower than that of POWER6. We've all gotten so used to frequency being the performance touchstone in processors that all the other intriguing things that make a processor fast tend to get overlooked. And some of them, like processor cache, memory bandwidth, and Non-Uniform Memory Access (NUMA), matter. POWER7 has a number of advantages over POWER6, including a much faster L2 cache, a large on-chip L3 cache and twice the memory bandwidth. And POWER7, like POWER5, uses “out of order” execution.
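A quick back-of-the-envelope calculation shows why frequency alone is misleading. The sketch below assumes a deliberately simplified model in which per-core performance is frequency multiplied by the work completed per cycle; the function name and the model itself are illustrative assumptions, not an IBM formula.

```python
# Simplified model (an assumption for illustration only):
#   per-core performance = frequency * work completed per cycle

def required_work_per_cycle_gain(frequency_ratio: float,
                                 performance_ratio: float = 1.0) -> float:
    """Work-per-cycle gain needed to reach performance_ratio at frequency_ratio.

    Both ratios are relative to the older processor (1.0 means "same as before").
    """
    if frequency_ratio <= 0:
        raise ValueError("frequency_ratio must be positive")
    return performance_ratio / frequency_ratio


# Frequency roughly 30 percent lower means a ratio of about 0.70.
gain = required_work_per_cycle_gain(0.70)
print(f"Work per cycle must rise by at least {gain:.2f}x just to break even")
# prints: Work per cycle must rise by at least 1.43x just to break even
```

In other words, under this model each core has to get roughly 43 percent more done per cycle simply to match the older core, and more than that to beat it. That extra work per cycle is where the caches, memory bandwidth and out-of-order execution come in.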
Check out the article “Of Gigahertz and CPWs – P7” for a discussion of what else is going on in these modern processors and what it means to the execution of your programs.
POWER7 has eight cores on a single chip. That's a tremendous amount of processing capacity on a single chip. See “What's This Multi-Core Computing Really?” for some great information on multi-core computing.
“Of POWER7 and NUMA” discusses POWER7's memory access design in much more detail.
A Few Additional Notes
A good page to bookmark is the IBM i Performance Management Web site. This site has a lot of great information on performance. The papers I've referenced in this article can be found under the “Resources” tab on that page.
It's possible some of the links in my blog today may not be live immediately because the information is so new. Please check back again later in the week.
I'd like to thank Mark Funk on the IBM i Systems Performance team, Chris Francois and Darcy Koch from the IBM i LIC Development team, and Michael Hollinger on the IBM Power Firmware Development team for their assistance in providing content and reviewing this blog article.
If we talk about performance, CPW is not fair to new Power Systems equipment running POWER6 or future systems running POWER7.
CPW is a database performance spec, and considering we can attach external storage including SAN Volume Controller and lots of arrays, even SSDs, we can get more performance from a system depending on disk configuration.
Posted by: Diego Kesselman | February 08, 2010 at 11:15 AM
Diego,
Thanks for the great comment, and I apologize for taking so long to reply to it.
The team responsible for publishing CPW has the notion of fair comparisons foremost in their minds as they set up and run the CPW workload. It might be helpful to know that this team does not think of CPW as a benchmark; it is a workload intended solely for relative positioning of the processor and memory subsystem. Certainly the team works to ensure good performance results, making sure that both the processor hardware and the software that supports it are executing efficiently. And, yes, it is not unreasonable for the team to use advanced I/O subsystems, and SSDs may be used in the future. But if we change the I/O subsystem, the same I/O subsystem will be used on the preceding comparison systems as well. That is part of the reason that a CPW rating gets published as opposed to some measure of throughput (transactions/minute). If changes are made to the CPW workload or environment, adjustments are made to the definition of CPW in order to ensure appropriate and consistent bridging between systems. The CPW team knows how CPW results are being used and works hard to ensure their consistency. Our goal with CPW performance reporting is to allow customers and business partners running similar environments to experience performance roughly mirroring the CPW results.
It is fair, though, to question just how well the CPW workload proper represents your use of a system. It is just one database transaction processing workload, after all. Every workload uses the entire system in different ways. But, again, CPW's purpose is to provide an informed judgement of the relative compute capacity of each system. As such, it attempts to aggressively use all aspects of the processor complex. Explaining what that means alone can take quite a while, but may we suggest - if extra detail is useful to you - that you find the paper "Of Gigahertz and CPWs" mentioned in my blog for an overview of the processor complex that the CPW workload is driving.
Posted by: Dawn | March 12, 2010 at 01:27 PM