DISC SPACE: HOW MUCH IS ENOUGH?
by Vladimir Volokh, VESOFT
Presented at 1993 INTEREX Conference, San Francisco, CA, USA
Published by INTERACT Magazine, June 1993.

ABSTRACT. In spite of many new developments in disc technology,
the good old Winchester drive, with its mechanically movable arms,
is still the primary medium for all our files -- be it programs,
sources or Data Bases.

It seems that many simple questions related to disc usage do not
have easy answers:

    How do we measure disc space -- what's the relation between megs
    and sectors? How can an MPE user find out the disc capacity? How
    much of it is free and how usable is the free space? And is 'used
    space' really used by us? What are reblocking, squeezing,
    trimming, condensing and other space transformations?

    The paper presents some observations on HP3000 file structure,
    similarities and differences between "Classic" and "Spectrum".
    This author's hope is that such knowledge will help users to
    better control their computing environment.

BIOGRAPHY OF THE AUTHOR:

    Vladimir Volokh is the president of VESOFT, Inc., a software house
    based in Los Angeles, CA, USA which was founded in 1980 by him and
    his son, Eugene.
    They are the creators of MPEX/3000, a productivity and system
    control tool, SECURITY/3000, a log-on access control package, and
    VEAUDIT/3000, an auditing tool that reports loopholes on the
    HP3000 system, of which there are 10,000+ packages installed
    worldwide.
    Vladimir Volokh is a computer scientist with more than 10 years of
    HP3000 experience as system analyst, consultant, and technical
    manager; he is a frequent speaker at users groups around the
    world.

In spite of many new developments in disc technology, the good old Winchester drive, with its mechanically movable arms, is still the primary medium for all of our files.

In this article I will try to present some observations on HP3000 file structure -- both for "Classic" (MPE/V) and "Spectrum" (MPE/iX) computers -- in the hope that they might help HP3000 users manage disc space better. It seems that many simple questions have answers that are not so simple.

HOW DO WE MEASURE DISC SPACE?

In various discussions about disc space, you've seen terms like "sectors", "kilobytes", "megabytes", and "gigabytes". What do these words mean? Well, nothing is simple. A sector, by HP's definition, is 256 bytes; "kilo" (K) is 1000, or 1024 for memory devices; "mega" is 1000000, or 1024K for memory devices; and "giga" is a prefix denoting a billion, or roughly 1.074 billion for memory devices. My dictionary tells me that "tera" means one trillion (10^12) of a given unit. HP's Glossary of Terms (1989) mentions only "kilo" and "mega". So, considering all this mathematics, how many megabytes does your disc have if after :DISCFREE C on your XL machine you see the following:

ALL MEASUREMENTS ARE IN SECTORS. ALL PERCENTAGES ARE RELATIVE TO THE DEVICE SIZE.

                |    Configured     |      In Use       |     Available
     -----------+-------------------+-------------------+---------------
     LDEV :     1 -- (MPEXL_SYSTEM_VOLUME_SET:MEMBER1)
      Device    |    2232192        |    1708144 ( 77%) |     524048
      Permanent |    1852720 ( 83%) |    1605344 ( 72%) |     247376
      Transient |    1852720 ( 83%) |     102800 (  5%) |     524048

Considering that 4*256 is close enough to 1000 you can do it easily -- just divide the number of sectors by 4000 and you will have it in megs. In this case, it'll be 2232192/4000 = 558, close enough to the real answer, 545 megs (counting a meg as 1024K bytes).
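
The rule of thumb is easy to check with a few lines of Python. This is only a sketch: the function names are mine, and "meg" is taken here to mean 1024K bytes, which is the reading that gives 545.

```python
SECTOR_BYTES = 256  # HP's definition of a sector


def megs_quick(sectors):
    """Rule of thumb from the text: divide the sector count by 4000."""
    return sectors / 4000


def megs_exact(sectors, meg=1024 * 1024):
    """Exact conversion; 'meg' defaults to 1024K bytes (an assumption)."""
    return sectors * SECTOR_BYTES / meg


configured = 2232192  # "Device / Configured" in the :DISCFREE listing above

print(round(megs_quick(configured)))          # 558
print(round(megs_exact(configured)))          # 545
print(round(megs_exact(configured, 10**6)))   # 571 if a meg is 1000000 bytes
```

So the same disc is 545, 558 or 571 "megs" depending on which definition you pick; the quick division lands in the middle.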

HOW BIG IS THE DISC?

As you've seen above, MPE/iX gives you an answer via :DISCFREE. In MPE/V the utility FREE5.PUB.SYS -- true to its name -- shows only free space. But if it shows you that X sectors are free, that's X sectors out of how many? This information is hidden deep inside the VINIT utility (as if it were unimportant). Try this:

      :VINIT >pfspace 1;addr

You will see a lot of information about addresses and sizes of free space (and you don't care much about that). But at the end of this listing you will see:

      TOTAL VOLUME CAPACITY:   216832 SECTORS
      TOTAL FREE SPACE AVAILABLE: 16490 SPACE
      MAXIMUM CONTIGUOUS AREA: 5505 SECTORS

By the way, it's not my typo (if you're wondering about "16490 SPACE") -- it's an unknown MPE designer's mistake, frozen in time ....

HOW MUCH OF IT IS FREE?

MPE/iX gives a pretty straightforward answer to this question: look at its output in the example above -- this time not on the first line but on the second:

    Permanent |    1852720 ( 83%) |    1605344 ( 72%) |     247376

As you see, 83% of the whole space is configured to be used as permanent, 72% is used, so only 11% (which is 83-72) is available for permanent files. But why doesn't this simple calculation work for the third line (transient space)?

    Transient |    1852720 ( 83%) |     102800 (  5%) |     524048

Even though transient space can also take up to 83% of the space on LDEV 1, in this case only 28% is left for that: 17% can't be permanent and 11% is unused by permanent files; because 5% is actually used by transient space, 23% is available.
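
The same arithmetic can be done in sectors rather than in rounded percentages. The sketch below assumes that each class is limited both by its configured maximum and by whatever is physically unused on the device; that assumption reproduces the "Available" column of this listing, but it is my reading of the numbers, not a documented MPE/iX formula.

```python
def available(device, perm_max, perm_used, trans_max, trans_used):
    """Return (permanent, transient) sectors still available.

    Assumption: a class can grow up to its configured maximum, but never
    beyond the space that nobody is using on the device.
    """
    unused = device - perm_used - trans_used
    perm_free = min(perm_max - perm_used, unused)
    trans_free = min(trans_max - trans_used, unused)
    return perm_free, trans_free


device = 2232192  # figures taken from the LDEV 1 lines of the listing
perm_free, trans_free = available(device, 1852720, 1605344, 1852720, 102800)

print(perm_free, round(100 * perm_free / device))     # 247376 11
print(trans_free, round(100 * trans_free / device))   # 524048 23
```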

On MPE/V machines available space is supposed to be shown by the FREE5.PUB.SYS utility or via the PFSPACE command of :VINIT (16490 sectors in the PFSPACE example above). But what about virtual (transient) space? This information, again, is hidden -- this time inside the :SYSDUMP output:

     :SYSDUMP $NULL
     ANY CHANGES? YES
     ...
     DISC ALLOCATION CHANGES? YES
     VIRTUAL MEMORY CHANGES? YES
     LIST VIRTUAL MEMORY DEVICE ALLOCATION? YES
     VOLUME NAME   LDEV #   VM ALLOCATION
       LDEV1       1         25
     ...
     ENTER VOLUME NAME , SIZE IN KILOSECTORS (MAX = 255 )?

This shows only the configured allocation: MPE/V tells us nothing about how much of the virtual space is actually in use at the moment; some space is also taken (possibly) by spool files and by temporary files. Note also that even though total, free and virtual space are reported per device, used space is not. (The :REPORT command gives used file space in sectors by group and account.) One way to see this distribution is to use the MPEX command %LISTF @.@.@,DISCUSE.

IS FREE SPACE REALLY AVAILABLE TO US?

When looking at the FREE5 output on a "Classic", one should pay attention not only to the "TOTAL FREE SPACE" line but also to the preceding ones:

:RUN FREE5.PUB.SYS
VOLUME MH7945U1            LDEV 1
LARGEST FREE AREA= 25530
  SIZE  COUNT  SPACE   AVERAGE
>100000 0      0       0
>10000  2      42796   21398
>1000   0      0       0
>100    4      540     135
>10     29     1065    36
>1      107    217     2
TOTAL FREE SPACE=44618

If you have a lot of small pieces, they might not be usable at all because none of your files may have small enough extents (more on this later). What you need is not just free space but CONTIGUOUS space. On "Classics", disc space can be condensed to some degree by the >COND command in :VINIT; on "Spectrum" machines the disc fragmentation shouldn't be a problem (or so HP tells us).
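
To see how much of the free space is certain to fit an extent of a given size, the FREE5 size classes can be tallied as below. This is a deliberately conservative sketch (the function names are mine): the listing gives only totals per class, so the class that straddles the extent size is left out entirely.

```python
# (lower bound, count, space) for each size class in the FREE5 listing above
FREE5 = [
    (100000, 0, 0),
    (10000, 2, 42796),
    (1000, 0, 0),
    (100, 4, 540),
    (10, 29, 1065),
    (1, 107, 217),
]


def total_free(rows):
    """Sum of the SPACE column; should match TOTAL FREE SPACE."""
    return sum(space for _, _, space in rows)


def surely_usable(rows, extent_sectors):
    """Free sectors in classes where every area is bigger than the extent."""
    return sum(space for low, _, space in rows if low >= extent_sectors)


print(total_free(FREE5))             # 44618
print(surely_usable(FREE5, 1000))    # 42796
print(surely_usable(FREE5, 100))     # 43336

# Number of free areas of 100 sectors or fewer
print(sum(count for low, count, _ in FREE5 if low < 100))   # 136
```

On this particular disc almost all of the free space sits in two big areas; the 136 small areas together hold fewer than 1300 sectors.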

IS THE "USED" SPACE REALLY USED BY US?

OK, by subtracting "free" space from the "total" space or just looking at the :DISCFREE output we might get an idea of how many sectors are "used" -- physically, that is. Keep in mind, however, that probably about half of those files which you see on the full backup listing have not been used (either modified or accessed) for a long time -- 6 months or more. But which half? Some answers to this question can be found in the :STORE command of MPE or, better yet, using selection by ACCDATE and/or MODDATE in MPEX (with totals of files and space). Archiving and purging seldom-used files saves a lot of disc space, directory space, and backup time.

TO BLOCK OR NOT TO BLOCK?

Another question is: how is the space used inside "active" files? One factor -- relevant on MPE/V machines but not on MPE/iX machines -- is blocking. MPE/V does all disc I/Os in multiples of one sector (256 bytes). The blocking factor is the number of records that we choose to fit into a certain number of sectors (block). But very often we don't choose -- we simply rely on MPE/V defaults, which can range from good to very bad (see [1] for more details). A bad blocking factor wastes not only disc space, but also I/O time -- the more records per one I/O we read/write, the better. Consider some examples:

ACCOUNT=  SYS         GROUP=  OPERATOR

FILENAME  CODE  ------------LOGICAL RECORD-----------  ----SPACE----
                  SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX

REPORT1           132B  FA          26      10000   1     1251  1  8
REPORT2           132B  FA          26      10000  60      651  1  8
REPORT3           132B  FA          26         26  60       62  1  1
REPORT4           132B  FA          26         26   9       20  1  1

Here the file REPORT1 is built with the default blocking factor 1 (256 bytes / 132 bytes = 1, rounded down); the remainder (256 - 132 = 124 bytes) is simply wasted, though it's almost 50% of the space; this file is like a piece of Swiss cheese -- with many big holes inside. The second file is the result of changing the blocking factor to 60, thus achieving the BEST space utilization for this file -- now 60 records take 60*132=7920 bytes which is close to the size of a block of 31 sectors (256*31=7936). However, we can get an even bigger saving by SQUEEZEing this file (setting FLIMIT down to EOF) -- that's how we got the file REPORT3. By reblocking it again we save more space; as a result, the difference in size between REPORT1 and REPORT4 files is quite significant. Things like this can be done using our very own MPEX (the %ALTFILE command with options SQUEEZE and BLKFACT=BEST).
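
The waste inside a block is simple arithmetic, sketched here in Python. The helper names are mine, and the sketch only measures the padding at the end of each block for a fixed-length record; it does not try to reproduce how MPEX chooses BLKFACT=BEST.

```python
SECTOR_BYTES = 256


def block_sectors(rec_bytes, blkfact):
    """Whole sectors needed to hold one block of blkfact records."""
    return (rec_bytes * blkfact + SECTOR_BYTES - 1) // SECTOR_BYTES


def waste_pct(rec_bytes, blkfact):
    """Percentage of each block that holds no data."""
    block_bytes = block_sectors(rec_bytes, blkfact) * SECTOR_BYTES
    return 100 * (block_bytes - rec_bytes * blkfact) / block_bytes


# Blocking factors of the REPORT files above (132-byte records)
for blkfact in (1, 60, 9):
    print(blkfact, block_sectors(132, blkfact), round(waste_pct(132, blkfact), 1))
# 1 1 48.4
# 60 31 0.2
# 9 5 7.2
```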

And what about XL (or should we say iX) computers? The blocking factor does not mean much there; all the records are tightly packed, except for the last extent which can (for very big files) be up to 2048 sectors. The good news is that the FCLOSE intrinsic (on the XL) has a new option called "XLTRIM", which allows the system to reuse free space beyond the end of file without decreasing the file limit. Look at the following before-and-after example:

ACCOUNT=  SYS         GROUP=  PUB

FILENAME  CODE  ------------LOGICAL RECORD-----------  ----SPACE----
                  SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX

REPORT1           132B  FA          26      10000   1      256  1  *
REPORT2           132B  FA          26      10000   1       16  1  8

Quite a savings (MPEX's %ALTFILE ;XLTRIM does it) -- and we can append to the file!

THE EXTENT QUESTION, OR WHERE IS THE FILE?

The extent is MPE's compromise between two extremes in file size management: assigning all file space requested to the file immediately or giving space one record (or sector) at a time. In MPE/V a file can consist of anywhere from 1 to 32 extents (the default number is 8). Each extent resides wholly on one disc, but different extents may be located on different discs. So where is any given file? You have to know the full extent map of the file and only then can you think about improving system performance through disc balancing. If you use the LISTDIR5 >LISTF you might see the DISC DEV # line, but this of course is only the device of the first extent (the same goes for :STORE listings). >LISTF ...;MAP, however, gives you a map (the first digit is the "volume table index", which is not necessarily the device number, and is hard to convert to the device number):

LISTDIR5 G.06.00 (C) HEWLETT-PACKARD CO., 1983
>LISTF VESOFT.PUB.SYS

FCODE: 0              FOPTIONS: STD,ASCII,VARIABLE
BLK FACTOR: 1         CREATOR: **
REC SIZE: 1276(B)     LOCKWORD: **
BLK SIZE: 640(W)      SECURITY--READ:    ANY
EXT SIZE: 10(S)                 WRITE:   ANY
# REC: 482                      APPEND:  ANY
# SEC: 70                       LOCK:    ANY
# EXT: 7                        EXECUTE: ANY
MAX REC: 13                   **SECURITY IS ON
MAX EXT: 7            COLD LOAD ID: %24025
# LABELS: 0           CREATED: THU,  9 APR 1992
MAX LABELS: 0         MODIFIED: THU,  9 APR 1992
DISC DEV #: 3         ACCESSED: THU,  9 APR 1992
DISC TYPE: 3          LABEL ADR: **
DISC SUBTYPE: 4       SEC OFFSET: %5
CLASS: DISC           FLAGS: NO ACCESSORS

>LISTF VESOFT.PUB.SYS;MAP

FCODE: 0              FOPTIONS: STD,ASCII,VARIABLE
BLK FACTOR: 1         CREATOR: **
REC SIZE: 1276(B)     LOCKWORD: **
BLK SIZE: 640(W)      SECURITY--READ:    ANY
EXT SIZE: 10(S)                 WRITE:   ANY
# REC: 482                      APPEND:  ANY
# SEC: 70                       LOCK:    ANY
# EXT: 7                        EXECUTE: ANY
MAX REC: 13                   **SECURITY IS ON
MAX EXT: 7            COLD LOAD ID: %24025
# LABELS: 0           CREATED: THU,  9 APR 1992
MAX LABELS: 0         MODIFIED: THU,  9 APR 1992
DISC DEV #: 3         ACCESSED: THU,  9 APR 1992
DISC TYPE: 3          LABEL ADR: **
DISC SUBTYPE: 4       SEC OFFSET: %5
CLASS: DISC           FLAGS: NO ACCESSORS
EXT MAP: %300161067   %200233735   %300162124   %200240307
         %300162207   %100211521   %200240326
>
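
To get a rough picture of how the seven extents above are spread out, the EXT MAP entries can be split as the text describes. A minimal sketch, taking that description literally: the first octal digit is treated as the volume table index, and treating the remaining digits as the sector address is my assumption. Remember that the index still has to be translated to a device number by other means.

```python
from collections import Counter

# EXT MAP entries copied from the >LISTF ...;MAP listing above
EXT_MAP = [
    "%300161067", "%200233735", "%300162124", "%200240307",
    "%300162207", "%100211521", "%200240326",
]


def decode(entry):
    """Split an octal entry into (volume table index, sector address)."""
    digits = entry.lstrip("%")
    return int(digits[0], 8), int(digits[1:], 8)


per_volume = Counter(decode(entry)[0] for entry in EXT_MAP)
print(sorted(per_volume.items()))   # [(1, 1), (2, 3), (3, 3)]
```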
In MPE/XL file labels are kept separately from the data, and yet :LISTF ,3 still shows the file label address, which might have no relevance to the location of the data at all. Here is an example of :LISTF ,3 and MPEX's %LISTF ,4 showing the full extent map of the file:

:LISTF LOG3320,3

FILE CODE : 0                   FOPTIONS: BINARY,VARIABLE,NOCCTL,STD
BLK FACTOR: 1                   CREATOR : **
REC SIZE: 2044(BYTES)           LOCKWORD: **
BLK SIZE: 2048(BYTES)           SECURITY--READ    : CR
EXT SIZE: 0(SECT)                         WRITE   : CR
NUM REC: 2720                             APPEND  : CR
NUM SEC: 2304                             LOCK    : CR
NUM EXT: 9                                EXECUTE : CR
MAX REC: 1024                           **SECURITY IS ON
                                FLAGS   : 1 ACCESSORS,SHARED,1 R,1 W
NUM LABELS: 0                   CREATED : THU, APR  9, 1992,  2:01 PM
MAX LABELS: 0                   MODIFIED: THU, APR  9, 1992,  2:01 PM
DISC DEV #: 1                   ACCESSED: THU, APR  9, 1992,  2:01 PM
SEC OFFSET: 0                   LABEL ADDR: **
VOLCLASS  : MPEXL_SYSTEM_VOLUME_SET:DISC

               MPEX %LISTF log3320   PAGE 1
       MANAGER.SYS,PUB   THU, APR  9, 1992,  4:01 PM

ACCOUNT=  SYS         GROUP=  PUB

-----FILE------  EXTENTS  -----SECTORS-----  DEVICE
NAME       CODE  NUM MAX  USED NOW  SAVABLE  CLASS

LOG3320            10   *      2560      208  DISC
   Dev/Sector:    2/%0000004516700   2/%0000000444440   2/%0000000072040
   Dev/Sector:    3/%0000001407620   3/%0000007737620   1/%0000003207300
   Dev/Sector:    1/%0000002675520   2/%0000006572620   3/%0000000536200
   Dev/Sector:    2/%0000006577760
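
With device numbers shown directly, counting extents per disc becomes a one-liner. The sketch below parses the Dev/Sector lines above with a regular expression of my own; it counts extents, not sectors, because the listing does not show the size of each extent.

```python
import re
from collections import Counter

# Dev/Sector lines copied from the MPEX listing above
LISTING = """\
   Dev/Sector:    2/%0000004516700   2/%0000000444440   2/%0000000072040
   Dev/Sector:    3/%0000001407620   3/%0000007737620   1/%0000003207300
   Dev/Sector:    1/%0000002675520   2/%0000006572620   3/%0000000536200
   Dev/Sector:    2/%0000006577760
"""

# Each extent appears as <device>/%<octal sector address>
pairs = re.findall(r"(\d+)/%([0-7]+)", LISTING)
per_device = Counter(int(dev) for dev, _ in pairs)

print(len(pairs))                   # 10
print(sorted(per_device.items()))   # [(1, 2), (2, 5), (3, 3)]
```

Half of this file's extents sit on device 2, which is the kind of fact you need before you start thinking about disc balancing.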

To finish this little essay I propose a puzzle to MPE/iX users: what do the two "*" mean in the following :LISTF ,2 output?

ACCOUNT=  SYS         GROUP=  PUB

FILENAME  CODE  ------------LOGICAL RECORD-----------  ----SPACE----
                  SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX

PUZZLE            128W  FB        1608       2222   1     1616  *  *

The answer is in one of the recommended reading items:

1. Eugene Volokh, "The Truth About Disc Files", Presented at 1982 HPIUG Conference, San Antonio, TX, USA

2. Andy Tauber, "Disc Balancing", INTERACT Magazine, Jan. 1986

3. Greg Englestad, "HP3000 Disc Management", SUPERGROUP Magazine, Sep.-Nov. 1987

4. Eugene Volokh, "The Truth About MPE/XL Disc Files", Presented at 1989 INTEREX Conference, San Francisco, CA, USA

5. S. Gordon, V. Volokh, "The Art And Science Of Disc Space Management", INTERACT Magazine, July 1991
