MVS TOOLS AND TRICKS OF THE TRADE NOVEMBER 2004 Sam Golob MVS Systems Programmer P.O. Box 906 Tallman, New York 10982 Sam Golob is a Senior Systems Programmer. He also participates in library tours and book signings with his wife, author Courtney Taylor. Sam can be contacted at sbgolob@cbttape.org. Information about the CBT MVS Tapes can be found on the web, at http://www.cbttape.org. PDS and PDSE Today I want to talk about the partitioned dataset (called a "pds" for short) in MVS, which is one of the principal ways of storing data in an MVS system besides VSAM. Both source code and executable load modules, as well as most "flat data", can be stored in pds'es. In the course of many years of MVS's existence, people found things "wrong" with the pds structure, and in 1988 or so, someone wrote a white paper for SHARE, outlining all of these "defects" that the pds supposedly (and actually) had. IBM then addressed the "problems" outlined in the SHARE pds white paper and came up with a new structure, known as the PDSE, which supposedly solved all those problems. The PDSE actually has a lot of its own problems, which IBM has been spending the last 17 or so MVS releases, working out. Let's begin our discussion by looking at some of the nitty gritty details of the partitioned dataset's structure. MVS has just about always had this "library" dataset known as the "partitioned dataset" or pds. The pds structure consists of a "directory" at the beginning, which is a keyed fixed unblocked structure having a key length of 8, and a data length of 256 bytes in addition. The pds directory points to units of sequential data, starting after the end of the directory and known as "members". Members follow each other and have end-of-file markers in between. DCB attributes of the data portion of a pds are exactly like the DCB attributes of sequential datasets. This is the "data" portion of the partitioned dataset. And the overall DCB attributes of the pds are those of the data portion, not those of the directory portion. As we said, the data is organized into "members", and the beginning TTR of each member is pointed to by a directory entry. Thus we have a library-type structure with a lot of data members grouped into one dataset structure. Locations within the pds are referred to by relative TTR (track-record) locations starting from the beginning of the dataset. Now that we've got the basic idea of what's happening in a pds, (you may have to re-read the previous paragraph several times), we can talk about what's good and what's bad. I'm assuming that all of you have worked in this field and have some practical experience with pds'es, so you'll understand what I'm saying. The first fact we need to mention is that when a pds is updated with a modified "member", the space taken up by the previous version of that member is not reused, but the modified member is written after the end of the last member in the pds, and the directory entry referring to its member name, is re-pointed from the previous member location TTR to the modified member location TTR. So that leaves the space occupied by the previous version of the member as "dead space" in the pds, which is not pointed to by any directory member at all. This dead space stays around, until a process called "compression" is done, usually using the IBM utility IEBCOPY. So one of the "defects" of a pds is that it is usually bigger than it has to be, due to the need for periodic compression. The SHARE white paper referred to the need for compression in a pds as a "defect". However, with the proper utilities, it can be an advantage. For example, when you "delete" a member of a pds, all you are really doing is wiping out the directory entry that points to that member name. The member's data actually remains there, though, as we mentioned. So it's logical that if you write a utility which will construct a new directory entry, and point it to the old data, you can "restore" the original pds member after it has been deleted. Of course, you wouldn't be able to do this after the pds has been compressed and the old member data was wiped out by the compression process. Now since IBM itself hasn't supplied MVS users with a utility that restores a deleted pds member, all this sounds theoretical. But MVS sysprogs have written such utilities, which are freely available, and using these utilities, pds members can be easily restored. IBM, not wishing to "officially" recognize the existence of any of these free utilities, considered the need for pds compression a "fault", but to a user who has the free PDS 8.5 package installed (CBT Tape File 182), and who can use its RESTORE subcommand to restore deleted members, it isn't a fault, but it's actually an advantage and a safety net if you make a mistake with a member and you need the old version again. So it all depends on your point of view, whether the need for pds compression is or is not a fault. Another pds "defect" that's worthy of mention, is what happens when the directory is too small, and is full, and you want to add more members to the pds. According to IBM, who hasn't supplied a utility which expands a pds directory, you have to make a new pds with a larger directory, copy the data members from the old pds to the new pds, and delete the old pds, all of which requires ALTER RACF authority and which is a great inconvenience. But if you are using the formerly mentioned free PDS 8.5 utility from the CBT Tape, all you have to do is to use its FIXPDS EXPANDDIR(nnn) subcommand, where nnn is the number of directory blocks you want to add. Then, PDS 8.5 will move some members from the beginning of the data part of the pds to the end, and will reformat the acquired space into the required number of new empty directory blocks. The pds does not have to be re-created by this process, and all you need from RACF, is UPDATE authority. It's much easier, and under those circumstances, I wouldn't call the need to add directory blocks a real "defect" (although IBM would). IBM's IMPLEMENTATION OF THE PDSE IBM went to a lot of trouble to create a new structure, called a PDSE, that supposedly solved all the pds "problems" (to be taken with a grain of salt, as we mentioned). Yes, the new PDSE doesn't have to be compressed; deleted space is automatically reused. Yes, the directory in the PDSE expands as more directory blocks are needed. Yes, the fixed unblocked keyed directory structure has been changed, and access time is not so slow. (Although I have noticed that IEBCOPY member copying is much slower.) All of this has been done, but at a very big price. The price is that PDSE's have to be managed by external services. If the PDSE is not on a SMS-managed disk pack, or if two extra address spaces don't happen to be up, PDSE services are dead in the water. The PDSE datasets are simply not usable. That's a VERY LARGE defect. On the other hand, normal pds'es don't require anything extra, other than basic MVS capabilities, to be accessible. Only the data itself needs to be accessed. No "extra services" are involved. So the "old-style" pds'es are "always there", while the PDSE's may not be. That's what IBM has been spending 17 MVS releases (so far) trying to fix up. My personal solution is to use old pds'es everywhere, unless I simply can't avoid a PDSE that someone else has created. And then I try to convert the data back to a pds if I can. I think that's a safer policy. And I can also get access to the data in that pds, even when I need to run an older release of MVS for some reason. Sometimes I'm at a site where I have to do that. IBM's current level of PDSE support is really good, only from z/OS Release 1.2 and later. So I think (even at this late point in time) that if I put my data in a pds instead of a PDSE, it's simply safer. The exceptions are only if a PDSE is unavoidable. That's my own personal (but flexible) opinion for now. But to be truthful and balanced, I have to add another statement. And that is, when IBM does something, you have to pay close attention. Remember when they set their mind to make PC's in the early 1980's? IBM's plan eventually took over and dominated the entire world of personal computing, and since they used Microsoft to develop their operating system for their PC's, they singlehandedly made Microsoft into the enormous giant it is today. So you can never take IBM's plans lightly, and they have put PDSEs into their plans. So we are forced to consider PDSEs to be "the wave of the future", and we must spend time learning as much as possible about them. HOW TO LEARN ABOUT PDSEs Now that we know that PDSEs are here to stay, and we have to learn about them, I am pleased to mention that the IBM Redbook people have created a wonderful piece of "one-stop shopping" for learning very much about PDSEs. This is the IBM Redbook, number SG24-6106, called "Partitioned Data Set Extended (PDSE) Usage Guide". You can download this book in PDF format from www.ibm.com/redbooks. When you point to z/OS books and do a search on this book number, this manual should come up for viewing and download. I've included Figure 1 to show you just a bit about the immense coverage of PDSEs in this Redbook. Space considerations have prevented me from printing the chapter details here also. I encourage you very much to download, print, and read this book from IBM. I don't think I can do a better job telling you about PDSEs than this book does. EXTRA TOOLS YOU CAN USE But here's where I can give you a lot of information that IBM can't (and of course won't) tell you. That is, I can show you how to tap into the large pool of user-written tools which will make the old pds'es easier to use, and which will eliminate 80 to 90 percent of the downside of the old pds'es, as mentioned in the SHARE white paper. These tools can be found in the free CBT MVS Utilities Tape collection that is accessible online. Just do a www.google.com search on CBT Tape. There are thousands of free MVS software tools in the CBT Tape collection. The first place to look, to obtain first-class partitioned dataset utility service, is CBT Tape File 182, which houses the multi-purpose "PDS 8.5 command utility package". The PDS command package is currently being maintained by John Kalinich, but it was largely developed by Bruce Leland and Steve Smith in former years. The PDS command package is capable of doing over 1000 separate utility functions with partitioned datasets, but among many other things, the PDS command package can expand the number of directory blocks in a pds "on the fly" as I've mentioned before. Also, the PDS command was the first package to offer multi-purpose (partial and) complete pds member lists under ISPF, when it runs in "ISPF mode" as opposed to "raw TSO mode". The PDS command also does character scans and replacements for subgroups of pds members, with "write in place". In other words, when PDS scans members for strings and replaces them with other strings, it does not create deleted members, but it rewrites the affected data blocks "in place". The PDS command does so many other useful functions to the old-style pds'es and a few for PDSEs, that if you know how to use it well, you won't miss having PDSEs at all, in all likelihood. The PDS package can RESTORE deleted members too. I think that the PDS command package from File 182 of the CBT Tape is the first place to look for extra leverage in dealing with old-style pds'es. But also check out File 036 from Bob Weinstein, which can restore deleted pds members by looking at them backwards, from the highest TTR that points to a deleted member, and back. Each deleted member, when found, is ISPF browsed for your examination. Then you decide whether to restore it or not. CBT File 316 contains a large number of batch utility tools, but the PDSGAS program is a batch utility which will also restore deleted pds members. However, the PDS command package (which can also be run under TSO-in-batch), I think, does a better job, because PDS can also restore deleted load modules and allow the restoration of their old attributes, if you know what they had been. Anyway, I'd highly recommend installing the PDS 8.5 command package, and I'd also recommend looking at the CBT Tape doc file, File 001, to see what other tools you can discover. The CDSCB authorized TSO command from CBT File 300, can (zap the VTOC to) change dataset attributes in general, in a very foolproof way. For example, if the secondary allocation amount is too small, and a dataset is in danger of running up to 16 extents, you use CDSCB to change the secondary allocation amount, so enough anticipated space is made available in the extents remaining. The PDS command also is good for stopping a dataset from overflowing 16 extents. Using the FIXPDS ADDTRK(nnn) or FIXPDS ADDCYL(nnn) subcommands, you can preallocate additional single extents to the dataset, of any specified size, one at a time. I think that by now, you're getting the idea that there are a lot of good pds utilites out there for the taking, in the CBT collection. So, to conclude our discussion for this month, the idea is that from IBM's point of view, PDSEs are the way to go. That's because IBM can't officially recognize the extra user-written pds utilities which answer the SHARE white paper's objections to the old pds structure. But I say that if you have the proper extra tools, even vendor tools like FileAid or the equivalent, then the old pds'es can last you a much longer time. And I personally prefer the old pds'es to the PDSEs. Have a nice month. I hope to see all of you here again next time. * * * * * * * * * * * * * * * * * * * * * * * * Figure 1. Chapter Headings of the PDSE Usage Guide Redbook SG24-6106. This is one-stop shopping for PDSE knowledge. The following outline of chapter headings illustrates the immense value of this IBM Redbook in learning about how and when to use PDSEs. The coverage of PDSE "knowhow" in this book is very thorough and detailed. Chapter 1. Introduction Chapter 2. Using PDSEs Chapter 3. Program management and PDSEs Chapter 4. PDSE sharing and serialization Chapter 5. Performance considerations Chapter 6. Migration Chapter 7. Managing PDSEs Chapter 8. PDSE diagnosis Appendix A. Software requirements A.1 Base PDSE support A.2 Extended sharing support A.3 Program object support A.4 Non-SMS PDSE support