IBM has made some additional improvements to the handling of QINACTITV.
This time, the changes address two questions:
1. What does it mean for an interactive job to be inactive?
With the change, a job can use CPU time and still be considered inactive.
2. Which jobs can be disconnected and which jobs must be ended?
With the change, a job can be disconnected even if a device name was not
supplied when the session was created. What this means is that jobs for
devices whose names start with QPADEV can be disconnected. This affects
the Disconnect Job (DSCJOB) command processing and the inactivity timer
processing, but it does not affect the handling of device errors.
The change is available only for the 7.1 release with PTF SI50502. This is a
delayed PTF, so an IPL is required to apply it or to remove it.
First, let's look at what it means for a job to
be inactive.
- Is there I/O occurring?
We first look at the session – the connection between the user and the system.
A session is inactive if there is no input or output occurring. A session is
active if the system continues to send data to the user even if the user never
sends input to the system.
- Who is waiting?
A session will be inactive while the system is working on a long-running
command, because there is no I/O occurring. In order for the job to be
considered inactive, the system must be waiting for input from the user. The
Work with Active Jobs (WRKACTJOB) command shows a status of DSPW for a job that
is waiting for input from a workstation.
- Is the job using CPU?
Looking at whether the job uses CPU time can be both helpful and harmful. Let’s
look at some examples and how the new PTF affects these examples.
The job might do work that comes from sources other than the workstation. For
example, the job may have a message queue allocated and a break message
handling program defined. In this case, using CPU indicates that the job is
doing work and is in some sense active. With the new PTF, running a break
message handling program will no longer be a reason to consider the job active.
The new code gives less weight to unpredictable, background activities.
The job might do work that is just a part of being there and interacting with
other jobs. The job uses CPU time, but there is no reason to consider the job
to be active. For example, when some other job looks at the invocation stack of
an interactive job, the interactive job does some of that work and is charged
with the CPU time it uses. This might occur when you have the inactive
job message queue (QINACTMSGQ) system value set to the name of a message
queue. The program that handles the CPI1126 message might want to look at an
invocation stack to decide if a job should be considered inactive. With the new
PTF, looking at an invocation stack will no longer be a reason to consider the
target job to be active.
More and more work is being done in the background, often in secondary
threads. Running Java code in an interactive job is one way to see CPU time used
just for being there and interacting with the environment.
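The three questions above can be sketched as a small decision function. This is only a simplified model for illustration, not the system's actual algorithm: the `JobSnapshot` type, its field names, and the `cpu_counts_as_activity` switch are all hypothetical. The switch stands in loosely for the idea that CPU time can be given more or less weight.

```python
from dataclasses import dataclass


@dataclass
class JobSnapshot:
    """Hypothetical view of one interactive job between two inactivity checks."""
    session_io: bool  # any input or output on the session since the last check
    status: str       # job status as WRKACTJOB would show it, e.g. "DSPW" or "RUN"
    cpu_used: bool    # the job was charged CPU time since the last check


def is_inactive(job: JobSnapshot, cpu_counts_as_activity: bool = False) -> bool:
    """Return True if this simplified model would treat the job as inactive."""
    # Is there I/O occurring? Any session I/O means the session is active.
    if job.session_io:
        return False
    # Who is waiting? The system must be waiting on the user (status DSPW),
    # not the user waiting on a long-running command.
    if job.status != "DSPW":
        return False
    # Is the job using CPU? Assumption: with the flag off, background CPU use
    # (break handlers, stack inspection, secondary threads) is ignored.
    if cpu_counts_as_activity and job.cpu_used:
        return False
    return True
```

With the flag off, a job that sits in DSPW with no session I/O is treated as inactive even if a break message handling program ran in the meantime. With the flag on, the same job would be treated as active.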
There is no perfect way to decide whether a job
should be considered active. A program handling CPI1126 messages can decide
what to do with a job that the system considers inactive, but it is not
notified about a job that the system considers active. The PTF cover letter
describes a way to use an environment variable to change how the system treats
CPU time for determining inactivity, but it cannot solve all the potential
problems. The session will look active when someone sends a message to the
workstation message queue, even if the user for that workstation isn't there.
Now on to the question of which jobs can be
disconnected.
In order for a Disconnect Job (DSCJOB) to make
sense, there has to be the capability to reconnect to the job. In order to
reconnect to the job, the same user must sign back on to the same device
description.
For the case where DSCJOB is being done in
response to a device error, DSCJOB is not allowed for sessions where no
specific virtual device is requested and the system selects which device to
use. The session is gone and it is very unlikely for that user to get the same
device when the user next connects to the system or when the device is next
used.
For inactivity, the session is still intact and
connected to the same user's workstation. The DSCJOB command is used by
programs that handle the CPI1126 message when the QINACTMSGQ system value names
a message queue. That means the command should act the same way the inactivity
timer code acts. Also, if a user wants to use DSCJOB rather than SIGNOFF,
this is probably a good thing to allow.
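The rules in the last few paragraphs can be summarized in a short sketch. It models only the distinction described here (why the disconnect is requested, and whether the system selected the device) and ignores every other condition that can prevent a disconnect. The function name, the reason strings, and the `ptf_applied` flag are hypothetical names chosen for illustration.

```python
def can_disconnect(reason: str, device_name: str, ptf_applied: bool) -> bool:
    """Simplified model of whether a job may be disconnected instead of ended.

    reason is one of "device_error", "inactivity", or "dscjob_command"
    (hypothetical labels, not system values).
    """
    if reason not in ("device_error", "inactivity", "dscjob_command"):
        raise ValueError(f"unknown reason: {reason}")

    # Devices selected by the system have names that start with QPADEV.
    system_selected = device_name.upper().startswith("QPADEV")

    # A specifically requested device can be reconnected to by name.
    if not system_selected:
        return True

    # Device errors: the session is gone and the user is unlikely to get the
    # same QPADEV device again, so the PTF does not change this case.
    if reason == "device_error":
        return False

    # Inactivity timer and DSCJOB command: allowed only with the PTF applied.
    return ptf_applied
```

For example, `can_disconnect("inactivity", "QPADEV0001", ptf_applied=True)` returns `True`, while the same call with `"device_error"` returns `False` whether or not the PTF is applied.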
This PTF allows more jobs to be disconnected. If your system is set up with
QINACTITV(*NONE) or QINACTMSGQ(*ENDJOB), this PTF will probably have very
little effect on the workload. So, for discussion, we'll assume that jobs are
being disconnected for inactivity.
There is a cost to start a new job, to end a job,
to disconnect a job, or to reconnect to a job. Different applications will have
different costs. While the overall amount of work is important, many systems
are more strongly affected by how that work is distributed over time.
Disconnected jobs continue to hold locks and use system resources. This is also
true of inactive jobs.
- When a user signs on and is reconnected to a disconnected job, the system
avoids the cost of ending the old job and creating a new job. When a large
number of users reconnect, the work is generally well spread out over time.
- When a user does not reconnect before the time limit defined by the time
interval before disconnected job ends (QDSCJOBITV) system value, the
disconnected job is ended. By comparison to QINACTMSGQ(*ENDJOB), the
system sees the extra work of a disconnect, but there is no work avoided.
All jobs see the same QDSCJOBITV value, so the work is spread out the same
way the work for QINACTITV is spread out.
- When a user does not reconnect and the subsystem is ended before the
QDSCJOBITV time limit, the work of ending the disconnected jobs gets done
all at once during the ending of the subsystem. This can significantly
increase the stress on the system. Jobs that are inactive and not
disconnected are also ended during the ending of the subsystem.
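The three cases above amount to asking which event happens first after a job is disconnected. Here is a minimal sketch of that timeline, with all times in minutes on a common clock. The function name, parameter names, and outcome labels are hypothetical, and the tie-breaking order is an arbitrary choice made for the sketch rather than documented system behavior.

```python
from typing import Optional, Tuple


def disconnected_job_outcome(
    disconnect_min: int,
    qdscjobitv_min: int,
    reconnect_min: Optional[int] = None,
    subsystem_end_min: Optional[int] = None,
) -> Tuple[int, str]:
    """Return (time, outcome) for whichever event reaches the job first."""
    events = []
    if reconnect_min is not None:
        # The user signs back on: no job end and no new job start.
        events.append((reconnect_min, "reconnected"))
    # The disconnected job is ended when the QDSCJOBITV interval expires.
    events.append((disconnect_min + qdscjobitv_min, "ended by QDSCJOBITV"))
    if subsystem_end_min is not None:
        # Everything still disconnected is ended at once with the subsystem.
        events.append((subsystem_end_min, "ended with subsystem"))
    # Earliest event wins; on a tie the first entry listed above is kept,
    # which is an assumption of this sketch.
    return min(events, key=lambda event: event[0])
```

Running this over a list of jobs and counting how many land on "ended with subsystem" at the same minute gives a rough feel for how much work piles up when the subsystem ends before the QDSCJOBITV limit.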
The QDSCJOBITV system value should be set high
enough to allow users enough time to reconnect to their disconnected jobs, but
low enough that the disconnected jobs are likely to be ended before the
subsystem ends. If users are not going to reconnect, QINACTMSGQ should be
set to *ENDJOB rather than *DSCJOB.
One of the common problems with QINACTITV occurs
when a user returns to a session at the same time the system is checking for
inactivity. The system does not know that the user is there until the system
sees input on that session, and the system only sees input when the user presses
Enter or a function key.
If QINACTMSGQ is set to *DSCJOB and the user is
using a virtual device selected by the system (a QPADEVxxxx device), the job
will now be disconnected rather than ended. The user can sign on and continue.
The PTF should be very helpful, even though the handling of QINACTITV will
never be perfect.
I’d like to thank Dan Tarara for writing this blog. Dan is a member of the
IBM i Work Management development team. Thanks, Dan!