However, workstation devices are still widely used today; every Telnet session, for example, uses a workstation device.
The subsystem job is central to workstation-device management. It handles putting up the sign-on display, as well as error-recovery processing if a session ends unexpectedly. Prior to 5.4, subsystem jobs were single-threaded and could process only one device at a time. Thus, if many devices were affected at once (a network outage, for example), the subsystem job could become a bottleneck for device error-recovery processing. As a circumvention for this issue, IBM made a general recommendation that no more than 250-300 devices be handled by a single subsystem. To implement this recommendation, you had to define multiple subsystem descriptions and set up the necessary workstation entries to spread the devices across those subsystems. We wrote the Interactive Subsystem Configuration experience report to describe in detail how to perform this configuration.
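Spreading devices across subsystems on those older releases might have looked something like the following CL sketch. The subsystem and library names (MYLIB/INTER2) and the generic device name (DSP2*) are hypothetical; the Interactive Subsystem Configuration experience report covers the full procedure.

```cl
/* Create an additional interactive subsystem (names are examples only). */
CRTSBSD    SBSD(MYLIB/INTER2) POOLS((1 *BASE)) +
           TEXT('Second interactive subsystem')

/* Route a group of devices to it with a workstation entry.       */
/* A generic name lets one entry cover many devices, e.g. DSP2*.  */
ADDWSE     SBSD(MYLIB/INTER2) WRKSTN(DSP2*) AT(*SIGNON)

/* Remove the matching entry from QINTER so those devices are     */
/* not claimed by both subsystems.                                */
RMVWSE     SBSD(QSYS/QINTER) WRKSTN(DSP2*)

/* Start the new subsystem. */
STRSBS     SBSD(MYLIB/INTER2)
```

With entries like these in place, each subsystem job only had to perform error recovery for its own slice of the devices, which is what kept any one subsystem under the 250-300 device recommendation.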
In the 5.4 release, the subsystem job architecture was changed to be multithreaded; now a single subsystem job can handle device-error processing for up to 20 devices at a time, each in its own thread. This architecture change had benefits beyond parallel processing for device error recovery; it also made subsystem start-up faster.
I want to note, though, that the controlling subsystem job isn't multithreaded. If you leave the controlling subsystem system value (QCTLSBSD) at the default of QBASE, your interactive users will run in the controlling subsystem and you won't get the benefits of multithreaded subsystem jobs. If you have many interactive users, you should change QCTLSBSD to QCTL (or equivalent).
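Checking and changing that system value is a small CL exercise; a sketch, assuming the IBM-supplied QCTL subsystem description in QSYS is the one you want:

```cl
/* Display the current controlling subsystem system value. */
DSPSYSVAL  SYSVAL(QCTLSBSD)

/* Switch from QBASE to QCTL. The character value names the    */
/* subsystem description followed by its library; the change   */
/* takes effect at the next IPL.                               */
CHGSYSVAL  SYSVAL(QCTLSBSD) VALUE('QCTL      QSYS')
```

After the IPL, interactive users signing on through QINTER (rather than the controlling subsystem) get the multithreaded device error-recovery behavior described above.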
Although multithreaded subsystem jobs eliminate the need to set up and manage multiple subsystems for interactive users, you may still want to consider doing so, though for different reasons. Multiple subsystems can make it easier to manage the users on your system and offer additional options for performance tuning.
So why write this blog about something so old? Just last week a coworker asked me whether we'd yet removed the restriction of 250-300 devices per subsystem. That took me by surprise, but since this was an internal design change that did not affect externals, we never documented it; we just removed the old recommendation from the Information Center. The only external documentation of the change that I know of was a small update to the support technote describing the circumvention, stating that the recommendation is no longer needed on releases 5.4 and higher.
Finally, I'll share a bit of my history with subsystems and device-recovery processing. In 1995 or so, I was asked to investigate SNA error-recovery processing in general. It was during this job assignment that I learned of the tight integration between SNA communications and IBM i work management, and where I discovered the lack of scalability in subsystem jobs due to their single-threaded design. At that time we didn't have support for threads; they were added to the operating system in the V4R2 release. We looked at various ways to address this scalability limit, but all of the ideas at the time were too expensive and risky to implement. So we published the recommended limit of 250-300 devices per subsystem as the circumvention and lived with the issue. In 2002, my work assignment changed and I joined the work-management team. In that job, I had the opportunity to focus on our subsystem architecture. By this time, threads were old news, and the recommendation to make our subsystem architecture multithreaded was accepted and delivered in the 5.4 release. It was an amazingly talented group of software engineers that worked on this project, and the quality of their work was outstanding!