MVS TOOLS AND TRICKS OF THE TRADE
February 1992
Sam Golob
MVS Systems Programmer
sbgolob@cbttape.org

ABOUT DATASET BLOCK SIZES AND I/O EFFICIENCY - PART III.

This is the final part of our three-part discussion about I/O efficiency considerations. I want to thank my friend Rick Fochtman for his help in producing this series.

This month, I'd like to talk about recognizing and avoiding extra CCW key searches in production processing. Key searches are wasteful in that they require the channel to be connected for the entire time the search is occurring. Any process that does repeated key searches will tie up a channel and possibly cause an I/O bottleneck.

Certain "normal" computing processes do repeated key searches. They are wasteful almost by definition, and therefore they should be minimized (or at least studied) in a production environment to improve the throughput of work. Three of these processes are:

1) BLDL processing, which does PDS directory member name searches on disk,
2) ISAM data organization, which is not used much nowadays, but which should be eliminated if it is still in use, and
3) creation and deletion of datasets on volumes with non-indexed VTOCs.

A BIT OF PRACTICAL HELP FOR BLDL INEFFICIENCIES.

BLDL is by far the most prevalent of these three processes. Because of that, I'll digress on the subject of BLDL processing. When a member is to be fetched from a library on disk, BLDL is used to find the member's CCHHR track location. The BLDL process occurs whenever a partitioned dataset member is accessed through the PDS directory. For example, if you are executing a program from a disk library, or if you are looking for an assembler macro, BLDL is called. BLDL processing, and its improvements, are centered in SVC 18, or module IGC018. "Unimproved" BLDL uses key-based searches, which are very slow and I/O-bound. We'll talk later about how these searches work.

BLDL-caused bottlenecks are an old problem in MVS. Practically speaking, however, you can do a lot by eliminating repeated BLDL searches when the same PDS member locations are looked up on disk more than once.

If you have an XA or an ESA system, it is highly recommended to use LLA (Linklist Lookaside) to avoid repeated BLDL directory searches for linklist programs. LLA does BLDL processing once, upon its initialization, and creates a hashed table of all linklist directory entries in the LLA address space. An SVC 18 hook sends linklist BLDL requests to LLA, if the LLA address space is running.

If you have an ESA system, the Virtual Lookaside Facility (VLF) will greatly speed program access and CLIST execution, partially by avoiding key-search I/Os to disk, and also by taking advantage of expanded storage if it is available. On an ESA system, LLA becomes the Library Lookaside facility, which can eliminate traditional BLDL searches on non-linklist program libraries as well as on linklist libraries.

Even for MVS/370 there is help. The Dynamic BLDL program called "DYNABLDL", which is on File 407 of the CBT MVS Utilities Tape, can greatly boost throughput on an MVS/370 system or on an XA system by eliminating repeated BLDL searches. DYNABLDL does only one real BLDL for each found module name. A dynamic table is built in core and is used to satisfy each subsequent request for the same module. The most recently and frequently used linklist programs are kept in the dynamic table. DYNABLDL offers a nice report concerning its own efficiency, the names of all modules hit, and when they were hit.
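To picture the "search once, remember the answer" idea behind a dynamic BLDL table, here is a small sketch in C. It is only an illustration of the caching concept: the table size, the member names, and the fake CCHHR values are invented, and this is not DYNABLDL's (or LLA's) actual logic.

    #include <stdio.h>
    #include <string.h>

    /* A toy model of a "dynamic BLDL table":  do the expensive directory
       search once per member name, then satisfy later requests for the
       same name from an in-core table.  Sketch only -- not real code.   */

    #define TABLE_SIZE 64

    struct entry {
        char member[9];        /* PDS member name (8 chars + NUL)        */
        unsigned long cchhr;   /* pretend track/record location          */
        int used;
    };

    static struct entry table[TABLE_SIZE];

    /* Stand-in for the real, I/O-bound BLDL directory search. */
    static unsigned long slow_directory_search(const char *member)
    {
        printf("  ...doing a real directory search for %s\n", member);
        return (unsigned long)strlen(member) * 1000;   /* fake CCHHR */
    }

    static unsigned long lookup(const char *member)
    {
        int i;

        for (i = 0; i < TABLE_SIZE; i++)
            if (table[i].used && strcmp(table[i].member, member) == 0)
                return table[i].cchhr;             /* hit: no I/O at all */

        for (i = 0; i < TABLE_SIZE; i++)           /* miss: search once  */
            if (!table[i].used) {
                strncpy(table[i].member, member, 8);
                table[i].member[8] = '\0';
                table[i].cchhr = slow_directory_search(member);
                table[i].used = 1;
                return table[i].cchhr;
            }

        return slow_directory_search(member);      /* table full         */
    }

    int main(void)
    {
        lookup("IEFBR14");     /* first reference: real directory search */
        lookup("IEFBR14");     /* second reference: satisfied from table */
        return 0;
    }

The real programs also have to worry about refreshing stale entries when a library changes, about table size limits, and about serialization; but the payoff is the same in every case, namely that the expensive key-search I/O happens once instead of on every fetch.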
With DYNABLDL, you can also "purposely miss" certain linklist libraries if you want to, through a conditional assembly option. You can obtain the CBT Utilities Tape from NaSPA (414) 423-2420 or from SPLA (305) 284-6257. If you use DYNABLDL, you should nullify the IEABLDxx member of SYS1.PARMLIB, and let DYNABLDL handle all of the linklist BLDL requests. I have found that LLA is somewhat more efficient on an XA system than DYNABLDL is, but you may want to use DYNABLDL on XA as a research tool, for the benefit of its reports. DYNABLDL hooks into SVC 18 later than LLA does, so LLA must be down in order for DYNABLDL to function on an XA system.

WHY "SEARCH KEY" CCW'S ARE INEFFICIENT.

Now, let's get back to the main topic of our discussion, which is to explain why "Search Key" CCW's are inefficient. We explained some basics of "channel programs" in the last two installments of this column. In System/370 architecture, I/O to peripheral devices is not done by the CPU, but by hardware devices loosely known as "channels". A channel can be pictured as an "intelligent cable" that carries data between the CPU and the outside device. Channels can function independently of the CPU. One channel is usually connected to a string of several disk or tape devices. Channels can be programmed. The "instructions" in a channel program are called CCW's, or "Channel Command Words".

"Search Key High or Equal" and "Search Key Equal" are types of channel "op codes". A "Search Key" CCW looks for the "key" of "keyed data" in a keyed data file. Keyed data is a type of disk data in a dataset, with up to 255 bytes of "key" followed by possibly many thousands of bytes of "data". The key precedes the data, followed by another key and more data. The purpose of the key is to aid in "directly" locating the data. If the key is searched for and found, the data immediately following the key may be retrieved relatively quickly thereafter. A "Search Key" CCW is a hardware mechanism for comparing against, and finding, a certain key in a keyed data file.

Usually, "Search Key" CCW's are coded in loops. There is a "Search Key", followed by a TIC (Transfer In Channel), which is an unconditional branch back to the "Search Key". The next instruction after the TIC is usually a "Read Data" CCW. This is how the loop is executed. Our channel is connected to the disk media, which is rotating under the read heads. When a "Search Key" CCW is executed, the channel pulls in the next key area that passes under the read heads. (Remember that on a CKD disk device, all disk data is recorded as either "count" areas, "key" areas, or "data" areas, and the hardware knows how to tell the difference.) The channel works very quickly. During the time when the gap between key and data is rotating by, the channel grabs the compare value from main computer storage and compares it to the key value drawn in from the disk. If the compare is satisfied, the status modifier condition is raised, which causes the channel to skip the next instruction (the TIC) and go on to the following "Read Data" CCW in the channel program. The "Read Data" CCW will pull in the data corresponding to the key. If the compare was not satisfied, the TIC is executed to branch back to the "Search Key", and the channel program waits for the next "key" area to rotate under the heads. The crucial point is that during this "Search Key" loop, THE CHANNEL REMAINS TIED TO THE DISK DEVICE.
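To make the shape of that loop concrete, here is a tiny simulation in C. The "track", the keys, and the data are invented, and a real channel program is of course not C; the sketch only models the pattern of "examine a key, fall through to Read Data on an equal compare, branch back otherwise", and counts how many keys the channel must stay connected for.

    #include <stdio.h>
    #include <string.h>

    /* A toy simulation of a "Search Key Equal" loop.  Each record on the
       "track" is a (key, data) pair; the loop examines the next key,
       falls through to "Read Data" if the compare is equal, and branches
       back (the TIC) if it is not.  Keys and data are invented.  The
       counter shows how many keys the channel stays connected for.      */

    struct record {
        char key[9];
        char data[32];
    };

    static const struct record track[] = {
        { "AAAAAAAA", "first record"  },
        { "MMMMMMMM", "middle record" },
        { "ZZZZZZZZ", "last record"   },
    };

    int main(void)
    {
        const char *search_arg = "ZZZZZZZZ";   /* the key we want        */
        int connect_time = 0;                  /* "channel busy" counter */
        size_t i;

        for (i = 0; i < sizeof(track) / sizeof(track[0]); i++) {
            connect_time++;                    /* channel tied up here   */
            if (strcmp(track[i].key, search_arg) == 0) {
                /* compare satisfied: skip the TIC, do the "Read Data"   */
                printf("found data: %s\n", track[i].data);
                break;
            }
            /* compare failed: "TIC" back and wait, still connected,
               for the next key area to rotate under the heads           */
        }
        printf("channel stayed connected for %d key compares\n",
               connect_time);
        return 0;
    }

In real hardware, of course, the cost is not a loop counter but rotational delay: between one key and the next, the channel simply has to wait, connected, for the disk to turn.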
The reason is quite logical: the channel cannot disconnect from the disk device during an unsuccessful search loop, while the keys are being read again and again. The problem is that while successive keys are being read, the channel program must wait for the next key to pass under the read heads. However, since the data area after each key area is of unpredictable size, there is no way of knowing in advance when the next key area will arrive under the heads. Therefore the channel must always remain connected to the rotating device during the "Search Key" loop. Most key searches are unsuccessful for a long while before the correct key is found.

Getting the data, once you're there, is not the problem. That's the same no matter what disk I/O method you use. It's GETTING TO THE DATA that's the problem, and the channel has to be tied up while you're doing that. Any I/O operation that ties up a channel for a long time is a potential bottleneck. As anyone in the tuning game knows, the main idea is to eliminate bottlenecks.

WHY BLDL IS "BAD".

A PDS directory consists of 256-byte blocks of directory entries, each block preceded by an 8-byte key. Each directory entry is headed by its PDS member name, and all directory entries in the directory are sorted in member name order. The 8-byte key of each directory block contains the highest member name in the block. This would be the name of the last directory entry in the block, because of the member sort order. When normal BLDL processing tries to locate a member name within a PDS directory, it executes a loop of "Search Key High or Equal" CCW instructions. Many linklist libraries contain hundreds of directory blocks. And linklist libraries are concatenated. That means that if a member is not found, all of the directory blocks in the library must be searched, each with its own "Search Key" cycle. Then the next library in the concatenation has to be searched. Typical "simple" directory searches might involve thousands of "Search Key" CCW instructions, tying up the channels to the libraries' disk devices for a long time. These channels could be better used for other I/O purposes in the meantime. That's why "BLDL is bad".

WHY ISAM IS "BAD".

ISAM dataset organization consists of Index and Data components. The Index component consists of keys, which tell you how to get to the corresponding Data component records on disk, so their data can be retrieved or written. One problem with ISAM is that all of the keys in an ISAM Index component are accessed using "Search Key Equal" CCW loops. We've already shown that these are tremendous channel hogs. VSAM, which is much improved over ISAM, also has Index and Data components. However, VSAM doesn't access its keys with the inefficient "Search Key Equal" loops; it uses a method that is far more efficient and far less wasteful of channel time. VSAM uses a tree-structured index. Another problem with ISAM is that inserted records are often written to "overflow extents", which can be distant on the disk from the original Index records (or on another disk volume). There will be a lot of seek delays. VSAM doesn't usually have this problem, because its control intervals are usually split in the middle, and the inserted data is near the old data. The moral of the story: ditch ISAM and convert to VSAM.
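To see why a tree-structured index pays off, here is a rough comparison in C of the two search styles. The key list is made up, and a real VSAM index is multi-level with compressed keys rather than a simple binary search; but the comparison counts tell the story, and, just as important, the index probes are done in storage by software rather than by a channel tied to a rotating device.

    #include <stdio.h>
    #include <string.h>

    /* A rough comparison of two ways of finding a key:  scan every key in
       sequence (the flavor of an ISAM-style "Search Key Equal" loop), or
       probe a sorted in-core index with a binary search (a crude stand-in
       for a tree-structured index like VSAM's).  The keys are invented.  */

    static const char *keys[] = {
        "A001", "B002", "C003", "D004", "E005", "F006", "G007", "H008"
    };
    #define NKEYS ((int)(sizeof(keys) / sizeof(keys[0])))

    int main(void)
    {
        const char *want = "H008";
        int scans = 0, probes = 0;
        int i, lo = 0, hi = NKEYS - 1;

        for (i = 0; i < NKEYS; i++) {          /* sequential key search   */
            scans++;
            if (strcmp(keys[i], want) == 0)
                break;
        }

        while (lo <= hi) {                     /* indexed (binary) search */
            int mid = (lo + hi) / 2;
            int c = strcmp(keys[mid], want);
            probes++;
            if (c == 0)
                break;
            if (c < 0)
                lo = mid + 1;
            else
                hi = mid - 1;
        }

        printf("sequential search took %d compares\n", scans);   /* 8 */
        printf("indexed search took    %d compares\n", probes);  /* 4 */
        return 0;
    }

That, in miniature, is the difference between the ISAM and VSAM approaches to finding a key.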
WHY LARGE NON-INDEXED VTOCS ARE BAD.

Non-indexed VTOCs rely heavily on hardware key searches, whether you are trying to obtain an existing dataset or trying to allocate a new dataset.

To obtain an existing dataset, the channel program is as follows (and please remember that the search starts from the beginning of the VTOC, proceeding sequentially record by record): a "Read Count" CCW (which gets the CCHHR of the next data area), then a "Search Key Equal" on the dataset name, then a TIC back to the "Read Count", and finally a "Read Data". Once the data is found, it can be re-accessed easily because the actual CCHHR was obtained by the last "Read Count" instruction. GETTING to the data is obviously wasteful.

The situation is a lot worse if you are allocating a new dataset. The channel program for doing that is very complicated, but here are a few essential parts of it. This channel program has to do two things at once. First, it must search for the first available Format 0 entry. For that, it has to do a "Search Key Equal" on a key of 44 bytes of hex zeros. The search must begin at the beginning of the VTOC, and it can't stop at the previous high-water mark for non-trivial data. Second, the same channel program has to search for the existence of a duplicate dataset name (that is, a duplicate key). One problem here is that for both searches to be accomplished, two turns on the same track are required: you can't do two different key searches on the same record in one turn. This is very wasteful of channel connect time. Finally, if the search on the nonzero key (the dataset name) succeeds, there is a duplicate dataset name, and DADSM has to be informed. But an awful lot of time was taken getting to this point.

Indexed VTOCs don't have these inefficiencies. The VTOC indexes have tree-structured keys accessible by software, and they heavily utilize key compression. Duplicate dataset name conditions are discovered far more efficiently, as are available Format 0 records.

CONCLUSION.

Hardware key searches at the channel program level are very inefficient, and you should avoid them at all costs. The job of the systems programmer here is to recognize which "normal processes of the system" use these wasteful means to accomplish their ends. If there is an alternative, use the alternative. I hope this knowledge benefits your shop, and enhances your personal understanding of system processes. Good luck in everything. See you next month.