MVS TOOLS AND TRICKS OF THE TRADE January 1992 Sam Golob MVS Systems Programmer sbgolob@cbttape.org ABOUT DATASET BLOCK SIZES AND I/O EFFICIENCY - PART II. This issue marks the beginning of this column's fourth year. Time flies, when you enjoy your work! Last month, we began to speak about the need for I/O efficiency to and from disk storage. Although it is important to pack as much data as possible into a track of disk space, it is probably a good deal more important to write and retrieve that data quickly. An analysis of the workload at most shops will probably show that throughput--getting the jobs done quickly--is the highest priority aim. My friend Rick Fochtman has done a lot of study directed at increasing job throughput. Rick contends that I/O bottlenecks are a great part of the performance problems in most places. One outgrowth of Rick's studies, is the rather startling result that certain prevalent block sizing practices can significantly hurt I/O rates to and from disk storage. I hope to explain some of Rick's contentions as we go on. We'll talk about the channel programs associated with disk I/O, and how to reduce various sources of wasted time. Rick has stated, that on a very busy system, datasets with one-fifth track blocking (five blocks per track) could have I/O rates 2 1/2 or 3 times more efficient than with half-track blocking. As we showed last month (see Figure 1), track utilization at 5 blocks per track on 3380's and 3390's is just a bit less than space usage at 2 blocks per track. That slight space loss will be greatly offset by the increased I/O speed to and from the data. Rick adds the caveat that he has not studied DASD with cache controllers, but there is still a lot of DASD without cache controllers out there. For the latter, Rick feels he can vouch for these results. CHANNEL PROGRAMS IN DATA MANAGEMENT AND IN IOS. IOS, the "Input-Output Subsystem", is a part of the base MVS product. IOS does device-specific handling of I/O requests. IOS can be pictured as as the "lowest level" of I/O software in MVS. The more familiar "READS" and "WRITES" to disk that are done by most programmers, are part of the "data management" packages (nowadays called "DFP" or "Data Facility Product"), which are an add-on to the base MVS product. The data management "access methods", which handle the "reads and writes", are a higher-level and more device-independent programmer interface. "Access Methods" are part of DFP. What happens when you do a sequential READ or a WRITE, using an access method (except for BSAM, where the programmer has complete control)? Almost always, the system tries to fill 5 buffers at once. This number 5 was decided upon by the data management architects some years ago as optimal for internal design reasons. When the BUFNO parameter is not stated on a DD card, "BUFNO=5" is usually the default. In order to fill these buffers, the system must get the data. It does so using "channel programs", which instruct those I/O hardware devices loosely known as "channels". The "channels" do the actual I/O to and from the devices, independently of the CPU. Instructions in channel programs are called "CCWs", or "Channel Command Words". We shall assume (for those who are familiar) that our discussion will concern only "Format-0 CCWs", that are essentially unchanged for 25 years. There are various types of these CCWs. We shall discuss what happens when we want to read some records on disk storage using "QSAM" (Queued Sequential Access Method). When we issue a GET for the data, QSAM constructs a channel program consisting of 3 instructions. These are: "Search ID=" (a search for particular CCHHR disk data within boundaries), TIC (an unconditional "goto") back to the "search id=" instruction and finally (if the search was successful) the next sequential CCW (the TIC) is bypassed. This results in the execution of the third instruction, a "Read CKD" (or "Read Count-Key-Data") which gets the data. If both BUFNO greater than 1, and NCP greater than 1 are specified, then the channel program is extended to read that number of records (the minimum of these two numbers) without additional seek overhead. However, this is limited by space allocation unit (TRK or CYL) boundaries. (BLKs are rounded to tracks.) The access method (QSAM) passes this channel program down to IOS, which adds two instructions to the front of it, and sometimes a third. These are: A "standalone SEEK" for the desired CCHHR, to get the heads moving to the right cylinder. This is followed by a "CCW Set File Mask", which sets limits on allowing a change of track or cylinder. You want to remain in the bounds of your dataset allocation during the subsequent reading of the data. The hardware helps you do this. So the program which goes to the device actually consists of five instructions: standalone seek, CCW set file mask, search id=, TIC back to the search, and finally Read CKD. A significant time-consumer here is the standalone seek. Any means of decreasing the number of seeks is probably good for throughput. A second potential time-waster has to do with disk rotation delays. We'll talk about that shortly. Our third time- gobbler is eliminated by 5-block per track blocking. Simply put, it's THE PREVENTION OF BREAKS IN THE CHANNEL PROGRAM. Remember that the access methods will try to fill five buffers at one time. They do this for their own internal efficiency (so it doesn't pay to set BUFNO=2 for every DD card in your shop. The "access method" people have thought of that already.) The five buffers will get filled smoothly, provided that the CCW "set file mask" instruction doesn't detect a bad data boundary, such as the end of a track or the end of a cylinder. If track or cylinder boundary does get hit, IOS will construct a new channel program and start it automatically, but with the penalty of a big time delay and many extra "searches". Now just think. Under half-track blocking, reading 5 blocks crosses two track boundaries. This results in two channel program breaks and restarts, forcing lots of extra "searches". Under "5-blocks per track" blocking, reading 5 blocks results in no channel program break at all. The channel program just starts at the beginning of the track, reads the entire track and fills the buffers. There are no channel program break delays introduced. That's where you get the huge savings, especially while using TRK allocation that forces an program break at every track boundary. This idea is worth investigation and implementation. With that explained, let's conclude this month's installment with a discussion of disk rotation delays. DEVICES EQUIPPED WITH ROTATIONAL POSITION SENSING (RPS). Our discussion of "disk rotation delays" is really a misnomer. The major concern here is the prevention of "channel-busy" tie-ups during disk rotation. If a disk device has to interrupt an I/O because it is waiting for a particular record to rotate under its heads, the least it can do is free the connecting channel to do other work during the rotational delay. Devices having the "RPS" feature do just this. They free the connecting channel while they are rotating toward a record. Both 3380 and 3390 disk devices have the RPS feature. From our point of reference, how does this feature work? If the access method (DFP) knows that it's going to an RPS-equipped device, it issues a "set sector" CCW instruction before the "search id=" CCW instruction. This "set sector" instruction operates in the hardware, between the channel and the device. The idea is that the hardware will determine when the required record is underneath its heads, and only then will it try to connect the channel to the device. If the channel is busy, the device will have to do one more rotation. Otherwise, a free channel will be connected to the device and will retrieve the record. There are many implications to this. A full disk rotation is on the order of 14-17 milliseconds. That's a long time to tie up a channel connected to half a dozen devices, all trying to do their own I/O's. Speaking about channel tie-ups, we'll talk about key searches next month, when we conclude this series. Key searches tie up the channel while they're occurring. ISAM does a lot of key searches. BLDL does a lot of key searches. Program operations that involve key searches can cause I/O bottlenecks, principally because of channel tie-ups. We'll see how, next time. Meanwhile, we've got plenty to chew on. Good luck in all your work. See you next month. * * * * * * * * * * * * * * * * * * * * * * * FIGURE 1. This is a display of track capacities for 3380 and 3390 disk devices. The display was produced by the BLKDISK command from File 296 of the CBT MVS Utilities Tape. The numbers shown are the largest blocksizes possible, to be able to fit "N" blocks into a track. A 3380 device can hold 47476 bytes in one track, but two blocks on the same track would require an "inter-block spacing". Three blocks would require two spacings, and so on. This chart shows the extent of "spacing losses" for each device. For both devices, 5 blocks/track is not very much worse than 2 blocks/track. An additional 3 spacings on a large track isn't too much, percentage wise. 3380 TRACK CAPACITY; KEY LENGTH=0 BLOCKS/TRACK BLKSIZE BYTES/TRACK UTILIZATION ------------ ------- ----------- ----------- 1 47,476 47,476 100.0% (2 blks/trk) 2 23,476 46,952 ---> 98.9% 3 15,476 46,428 97.8% 4 11,476 45,904 96.7% (5 blks/trk) 5 9,076 45,380 ---> 95.6% 6 7,476 44,856 94.5% 7 6,356 44,492 93.7% 8 5,492 43,936 92.5% 9 4,820 43,380 91.4% 10 4,276 42,760 90.1% 11 3,860 42,460 89.4% 12 3,476 41,712 87.9% 13 3,188 41,444 87.3% 14 2,932 41,048 86.5% 15 2,676 40,140 84.5% 16 2,484 39,744 83.7% DEVICE SUMMARY: MAX BLOCKSIZE=47,476 TRACKS=13,275 BYTES=630,243,900 NOCYLS=885 TRKS/CYL=15 TRKSIZE=47,968 DSCB/TRK=53 PDS/TRK=46 3390 TRACK CAPACITY; KEY LENGTH=0 BLOCKS/TRACK BLKSIZE BYTES/TRACK UTILIZATION ------------ ------- ----------- ----------- 1 56,664 56,664 100.0% (2 blks/trk) 2 27,998 55,996 ---> 98.8% 3 18,452 55,356 97.7% 4 13,682 54,728 96.6% (5 blks/trk) 5 10,796 53,980 ---> 95.3% 6 8,906 53,436 94.3% 7 7,548 52,836 93.2% 8 6,518 52,144 92.0% 9 5,726 51,534 90.9% 10 5,064 50,640 89.4% 11 4,566 50,226 88.6% 12 4,136 49,632 87.6% 13 3,768 48,984 86.4% 14 3,440 48,160 85.0% 15 3,174 47,610 84.0% 16 2,942 47,072 83.1% DEVICE SUMMARY: MAX BLOCKSIZE=56,664 TRACKS=16,695 BYTES=946,005,480 NOCYLS=1,113 TRKS/CYL=15 TRKSIZE=58,786 DSCB/TRK=50 PDS/TRK=45