MVS TOOLS AND TRICKS OF THE TRADE December 1991 Sam Golob MVS Systems Programmer sbgolob@cbttape.org ABOUT DATASET BLOCK SIZES AND I/O EFFICIENCY - PART I. I am sure all of you realize that most MVS shops are both the same and different. They are the same, in that our skills, developed at one site, can be largely used at another site without much modification--the type of work is similar at all sites. They are different, in that the actual hardware configurations and workload patterns can vary widely. IBM and "third party" software that exists in one place can be absent at another. Performance situations that are everyday occurrences in one shop, might never have happened in the entire history of a second shop. So, in talking about block sizes of datasets, one can only make rather general remarks, leaving it to the individual system programmers to see what applies in their own situations. A BIT OF BACKGROUND. Dataset management concerns data on disks, but the overriding concern about datasets is usually the efficiency of I/O operations to and from the data. The speed at which the day's work is done, is most often more important than squeezing the last free byte out of available disk storage. System programmers need to know how to plan the use of disk storage for I/O efficiency. Even a "system managed" storage environment can be mis-managed, causing needless expense and work delays. Since we system programmers still set up all disk-space environments, even today, we must become more familiar with the factors affecting their performance. We must also be very aware of things which can go wrong. Block sizing has traditionally been a balancing act. Small blocks waste a lot of disk space. Large blocks can often be used efficiently once they are read into central storage, but waste-causing factors might also make a large blocksize into a poor choice as well. From the point of view of pure disk pack utilization (percent of available track space used by a given block size), a tool such as the BLKDISK utility (also called BLK3350, BLK3380, and BLK3390), can quickly show you a panorama of alternatives. See Figure 1 for an illustration of what BLKDISK will show. Using such a tool, you can quickly weigh the space usage of any block size on a given disk pack type, without having to do laborious calculations. BLKDISK is a public domain tool available on File 296 of the CBT MVS Utilities Tape distributed by NaSPA (414-423-2420) or by SPLA (305-284-6257). We shall go a bit deeper here. My friend Rick Fochtman has done some studies about I/O efficiency that I've found to be both interesting and helpful. I'll attempt to convey some of his findings to you. Please be aware that any errors in "transmission" are mine, and not his. Rick Fochtman's contention is that for any type of sequential dataset access nowadays (since SAM-E), especially when allocating by TRACKS, the most efficient I/O should be attained by blocking 5 blocks to a track. I'll try to explain some of the concepts behind his reasoning. SOME "MAGIC NUMBERS". The Input-Output Subsystem (or "IOS") is a part of the MVS base product which handles the nitty-gritty or "low-level" disk access operations. The higher levels, "READs and WRITEs" which are understandable to most programmers, are the domain of the "access methods" in "DFP" (Data Facility Product), which is the data-management add-on to base MVS. For maximum efficiency of input-output operations to disk files, the designers of the access methods decided some years ago, that every disk-file "read" should best default to five buffers. In JCL, this would be specified by saying "BUFNO=5" in each disk DD card. The higher-level access methods in DFP specify "BUFNO=5" as the default for sequential disk access. That's very important. Simply, it means that every time you do a disk read, the system tries to fill five buffers. It's becoming clear now, why 5 blocks to a track might be especially good. The reason is that when you ask for a logical record, SAM (sequential access) decides when to go out and ask for more buffers of data. SAM buffer management, by default, works in groups of 5 buffers. Everything SAM does is in groups of five, with the exception of BSAM (where the programmer has complete control, but where one could lose much efficiency). You can fill all these 5 buffers in one sweep of a track, without having to skip tracks. Skipping tracks is wasteful. We hope to explain why, further on. You can argue (rightly) that skipping tracks is inefficient only if you allocate your datasets on a TRACK boundary. What if you allocate them on a CYLINDER boundary? The problem with cylinder allocation is that for small datasets, it's wasteful of disk space. Rick's studies and experience have told him that cylinder allocation is usually helpful for datasets of 3 cylinders and larger. Besides, he has found through detailed tracing that dataset access efficiency for track-allocated datasets blocked 5 blocks to a track, is nearly identical to efficiency for datasets allocated on cylinder boundaries. So this "5 blocks to a track" blocking seems to be a good "magic number" because of the design of IOS and the access methods. If you do some calculations of disk-space usage, (or you look at the BLKDISK output in Figure 1) you'll see that five blocks to a track on a 3380 device isn't significantly more wasteful than 2 blocks to a track. That's because modern disk devices have big tracks. TO GO ON. I'll have to continue this discussion next month, but we have some big plans. I want to talk more about access methods, and about how sizable performance gains can be obtained through maximizing channel usage. (If you still use ISAM, get rid of it and use VSAM.) I'll try to really get into I/O issues. Meanwhile, I hope this has been helpful. Good luck. See you next month. * * * * * * * * * * * * * * * * * * * * * * * FIGURE 1. Illustration of the BLK3380 TSO Command, invoked to show disk space usage on a 3380 device. Various block size scenarios are displayed in the upper table, enabling the user to graphically see the disk space utilization by the different block sizes. The Track Capacity Table for 3380 devices is shown as the lower table of the illustration. Please note that we do not "recommend" half-track blocking here. The "RECOMMENDED" remark is part of the program's normal output. blk3380 80 trackcap 3380 BLOCKSIZE SUMMARY; LRECL=80 KEY LENGTH=0 BLOCKSIZE BLOCKS/TRACK LRECLS/TRACK UTILIZATION --------- ------------ ------------ ----------- 80 83 83 14.0% 2,480 16 496 83.6% 2,640 15 495 83.4% 2,880 14 504 84.9% 3,120 13 507 85.4% 3,440 12 516 86.9% 3,840 11 528 89.0% 4,240 10 530 89.3% 4,800 9 540 91.0% 5,440 8 544 91.7% 6,320 7 553 93.2% 7,440 6 558 94.0% 5 BLOCKS/TRK 9,040 5 565 ---> 95.2% 11,440 4 572 96.4% 15,440 3 579 97.6% RECOMMENDED-->23,440 2 586 ---> 98.7% 32,720 1 409 68.9% FOR BLKSIZE 23,440 AND 100,000 RECORDS, ALLOCATE: 342 BLOCKS, 171 TRACKS, OR 12 CYLINDERS 3380 TRACK CAPACITY; KEY LENGTH=0 BLOCKS/TRACK BLKSIZE BYTES/TRACK UTILIZATION ------------ ------- ----------- ----------- 1 47,476 47,476 100.0% -----> 2 23,476 46,952 ---> 98.9% 3 15,476 46,428 97.8% 4 11,476 45,904 96.7% -----> 5 9,076 45,380 ---> 95.6% 6 7,476 44,856 94.5% 7 6,356 44,492 93.7% 8 5,492 43,936 92.5% 9 4,820 43,380 91.4% 10 4,276 42,760 90.1% 11 3,860 42,460 89.4% 12 3,476 41,712 87.9% 13 3,188 41,444 87.3% 14 2,932 41,048 86.5% 15 2,676 40,140 84.5% 16 2,484 39,744 83.7% DEVICE SUMMARY: MAX BLOCKSIZE=47,476 TRACKS=13,275 BYTES=630,243,900 NOCYLS=885 TRKS/CYL=15 TRKSIZE=47,968 DSCB/TRK=53 PDS/TRK=46 * * * * * * * * * * * * * * * * * * * * * * * FIGURE 1A. Illustration of the BLK3390 TSO Command, invoked to show disk space usage on a 3390 device. Various block size scenarios are displayed in the upper table, enabling the user to graphically see the disk space utilization by the different block sizes. The Track Capacity Table for 3390 devices is shown as the lower table of the illustration. blk3390 80 trackcap 3390 BLOCKSIZE SUMMARY; LRECL=80 KEY LENGTH=0 BLOCKSIZE BLOCKS/TRACK LRECLS/TRACK UTILIZATION --------- ------------ ------------ ----------- 80 78 78 11.0% 2,880 16 576 81.3% 3,120 15 585 82.6% 3,440 14 602 85.0% 3,760 13 611 86.3% 4,080 12 612 86.4% 4,560 11 627 88.5% 5,040 10 630 88.9% 5,680 9 639 90.2% 6,480 8 648 91.5% 7,520 7 658 92.9% 8,880 6 666 94.0% 5 BLOCKS/TRK 10,720 5 670 ---> 94.6% 13,680 4 684 96.6% 18,400 3 690 97.4% RECOMMENDED-->27,920 2 698 ---> 98.5% 32,720 1 409 57.7% FOR BLKSIZE 27,920 AND 100,000 RECORDS, ALLOCATE: 287 BLOCKS, 144 TRACKS, OR 10 CYLINDERS 3390 TRACK CAPACITY; KEY LENGTH=0 BLOCKS/TRACK BLKSIZE BYTES/TRACK UTILIZATION ------------ ------- ----------- ----------- 1 56,664 56,664 100.0% -----> 2 27,998 55,996 ---> 98.8% 3 18,452 55,356 97.7% 4 13,682 54,728 96.6% -----> 5 10,796 53,980 ---> 95.3% 6 8,906 53,436 94.3% 7 7,548 52,836 93.2% 8 6,518 52,144 92.0% 9 5,726 51,534 90.9% 10 5,064 50,640 89.4% 11 4,566 50,226 88.6% 12 4,136 49,632 87.6% 13 3,768 48,984 86.4% 14 3,440 48,160 85.0% 15 3,174 47,610 84.0% 16 2,942 47,072 83.1% DEVICE SUMMARY: MAX BLOCKSIZE=56,664 TRACKS=16,695 BYTES=946,005,480 NOCYLS=1,113 TRKS/CYL=15 TRKSIZE=58,786 DSCB/TRK=50 PDS/TRK=45