[Dirvish] The Options hash (tutorial, long)

Keith Lofstrom keithl at kl-ic.com
Wed Feb 9 09:12:06 PST 2005


The whole Options hash is starting to make sense to me.  Or in
some cases, nonsense.  I want to refactor the behavior for
Options in 1.3.  Eric M. is looking at Options also, and
we will coordinate.  Are any of the rest of you YAPHs, and
willing to contribute to the code refactoring?

What follows is a description of what the Options hash does,
and how I think it should change.

Let's talk Perl.

The Options hash is a "referenced, anonymous hash".  The scalar
$Options contains a "reference", which you can think of as a
memory address.  In this case, $Options contains a reference
to an anonymous hash, which is a hash that has no name of its own.
We can do something like $AnotherReference = $Options, and we
will have 2 references to the hash, or we can do $Options = undef,
and the hash will lose a reference.  When the count drops to zero
references, the perl interpreter will discard the hash.
You refer to scalar elements of the hash with expressions like
$$Options{ summary }, which is the scalar value of the summary
string.  An array stored in the hash is itself held by reference,
so @{ $$Options{ bank } } is the array value that lists the banks.
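A minimal sketch of the reference behavior described above.  The
keys and values here are made up for illustration; the real dirvish
hash has many more entries.  Note that an array stored under a hash
key needs its own dereference.

```perl
use strict;
use warnings;

# Hypothetical contents, for illustration only.
my $Options = {
    summary => 'short',
    bank    => [ '/backup/bank1', '/backup/bank2' ],
};

# A second reference to the same anonymous hash.
my $AnotherReference = $Options;

# Scalar element; $Options->{summary} is an equivalent spelling.
my $summary = $$Options{summary};

# The value under "bank" is an array reference, so dereference it.
my @banks = @{ $$Options{bank} };

# A change made through one reference is visible through the other.
$AnotherReference->{summary} = 'long';
print "summary is now $$Options{summary}\n";
print "banks: @banks\n";

# Dropping one reference does not discard the hash; one remains.
$Options = undef;
print "still reachable: $AnotherReference->{summary}\n";
```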

One big problem, both for readability and behavior, is that the
Options hash contains 7 anonymous subroutines: config, client,
branch, vault, reset, version, and help.  That makes the original
definition for Options hard to read.  Not only that, but it makes
some of the behavior disappear for improperly constructed Config
files or Option lines.

What is an "anonymous subroutine", you may be asking?  In Perl,
you can also make a "reference" to a subroutine.  When you
"dereference" that reference, you call the subroutine it points to.
An anonymous subroutine is one that is reachable only through such
a reference.  This allows you to store subroutine references in
scalars, or as elements of arrays or hashes, and pass them around
as arguments to other subroutines.
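As a small illustration of the idea (the names below are invented
for the example and are not taken from dirvish):

```perl
use strict;
use warnings;

# An anonymous subroutine, held in a scalar.
my $greet = sub {
    my ($name) = @_;
    return "hello, $name";
};

# Subroutine references stored as hash values.
my %dispatch = (
    greet => $greet,
    shout => sub { return uc $_[0] },
);

# A subroutine that receives another subroutine as an argument.
sub apply {
    my ($code, @args) = @_;
    return $code->(@args);
}

print $greet->('dirvish'), "\n";              # call through the scalar
print $dispatch{shout}->('backup'), "\n";     # call through the hash
print apply($dispatch{greet}, 'world'), "\n"; # pass it to another sub
```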

This trick is used by the 1.2 version of dirvish to pass subroutine
references to the GetOptions subroutine (from Getopt::Long, part of
the standard Perl library).
So when we call dirvish with the command line option --version, 
for example, GetOptions processes the command line, sees --version,
and finds that in the Options hash.  The value associated with the
hash key "version" is an anonymous subroutine, a little piece of 
code that prints out the version and exits.  
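Here is a rough sketch of that mechanism, not the actual dirvish
code.  It assumes Getopt::Long's hash-storage mode, where an entry
that already holds a code reference is called when its option is
seen.  The version string, the @log array, and the use of
GetOptionsFromArray (so the example does not touch @ARGV) are
choices made for this example only; the real handler prints the
version and exits.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my $VERSION = '0.0-example';   # hypothetical version string
my @log;                       # records what the handlers did

my $Options = {
    # Code reference: called when --version is seen.
    # (The real dirvish handler prints the version and exits.)
    version => sub { push @log, "version $VERSION" },

    # Plain scalar: simply overwritten by --summary.
    summary => 'short',
};

my @args = ('--version', '--summary', 'long');
GetOptionsFromArray(\@args, $Options, 'version', 'summary=s')
    or die "bad options\n";

print "handler ran: @log\n";
print "summary: $$Options{summary}\n";
```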

Well, that is well and good, and the options --config, --reset,
--version, and --help all do some bit of behavior then return
or exit.  However, the --branch, --vault, and --client options
are more clever than they ought to be.  When these options are
called, they actually replace the anonymous subroutine with a
string.  This wipes out the anonymous subroutine that was stored
there when dirvish was started; the code is discarded.
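The self-replacing behavior can be reproduced in a few lines.  This
is a simplified stand-in for the real handler, which also loads a
config file; here the handler is called directly instead of through
GetOptions, and the values 'home' and 'work' are made up.

```perl
use strict;
use warnings;

my $Options;
$Options = {
    vault => sub {
        my ($name, $value) = @_;
        # Overwrites this very hash slot: the code ref is replaced
        # by a plain string.
        $$Options{vault} = $value;
    },
};

my $before = ref $$Options{vault};        # 'CODE'
$$Options{vault}->('vault', 'home');      # first --vault
my $after  = ref $$Options{vault};        # '' because it is now a string

# A second --vault can no longer run the handler.
my $second_ok = eval { $$Options{vault}->('vault', 'work'); 1 };

print "before: $before\n";
print "after:  ", ($after || 'plain string'), "\n";
print "second call ", ($second_ok ? 'worked' : 'failed'), "\n";
```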

Further, if you do something odd like --reset=config, that will 
replace the anonymous subroutine code for config with the Perl value
"undef".  So after that, you are no longer able to read in another
config file, though you can set the value of $$Options{config} to
some string or number.  The same for the other 6 values.  That is
just weird.  I would expect --reset to set a configuration parameter
to the initial value (which may be an anonymous sub), not just clean
it out.  However, there may be some legacy config files that make
use of this weird behavior.

There are other odd things that happen with the Options hash.  We
can feed hash key and value pairs in through GetOptions;  however,
we can also feed them in through a config file.  Before we run
GetOptions, we load the master configuration file with a call to
the subroutine loadconfig().  You can put almost anything in a
configuration file;  if you have a line in it saying:

foo: fum

... you will end up with a new hash key of foo, with a scalar value
of fum.  What do you think happens if you have lines like:

help: wipeout

or

vault:
   string1
   string2

... in your master config file?  That's right, these will wipe out
the ability to interpret the --help or --vault options properly
from the command line, and cause dirvish to fail in a weird way.
loadconfig(), which is called in a number of places to load
accumulating configuration files, does not know whether a
particular configuration option should be valid,  or what type it
should be.  I will talk more about loadconfig() in another email;
I want to refactor that too, so it does error checking and knows
what type of variable it should be reading in.  By improving
loadconfig(), we can make the config files a LOT more forgiving.
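One possible shape for that checking, as a sketch only.  The type
table, the check_option name, and the choice of which keys are
allowed are all assumptions for illustration; the real list would
have to come from dirvish itself.

```perl
use strict;
use warnings;

# Hypothetical table of keys a config file may set, with their types.
# Command-line-only options such as "help" are deliberately absent.
my %OptionType = (
    summary => 'scalar',
    bank    => 'list',
    vault   => 'scalar',
);

# Return an error string, or undef if the key/value pair is acceptable.
sub check_option {
    my ($key, $value) = @_;
    my $type = $OptionType{$key};
    return "unknown option '$key'" unless defined $type;
    my $got = ref($value) eq 'ARRAY' ? 'list' : 'scalar';
    return "option '$key' should be a $type, not a $got"
        if $got ne $type;
    return;
}

# The problem cases from above, plus one valid line.
my @samples = (
    [ foo     => 'fum' ],
    [ help    => 'wipeout' ],
    [ vault   => [ 'string1', 'string2' ] ],
    [ summary => 'long' ],
);

for my $sample (@samples) {
    my ($key, $value) = @$sample;
    my $err = check_option($key, $value);
    print defined $err ? "rejected: $err\n" : "accepted: $key\n";
}
```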


Proposed CHANGES to the Options hash: -------------------------------

First, the Options hash is not fully defined at initialization.  
About half the hash values are defined later in the program, making
it very difficult to tell what it will eventually contain.  I propose
that we initialize the Options hash in a subroutine "options-initialize",
which sets *every* hash key currently used with the initial value. 
That does not prevent some future code hacker from adding a key,
but at least we can define and describe the keys we know about now.  

Second, we add an "if" clause to the --reset option, so that when we
enter the option --reset=default, it executes options-initialize() 
and removes the effect of any previous configuration.  This will be
very useful for testing.
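The first two proposals might look something like this.  It is a
sketch with made-up defaults; the sub is spelled options_initialize
because a hyphen is not legal in a Perl subroutine name, and
option_reset is a hypothetical stand-in for the --reset handler.

```perl
use strict;
use warnings;

my $Options = {};

# Set every known key to its initial value (defaults here are invented).
sub options_initialize {
    %$Options = (
        summary => 'short',
        bank    => [],
        vault   => undef,
    );
    return $Options;
}

# Stand-in for the --reset handler.
sub option_reset {
    my ($key) = @_;
    if ($key eq 'default') {
        options_initialize();       # proposed: back to initial values
    }
    else {
        $$Options{$key} = undef;    # the existing behavior described earlier
    }
    return;
}

options_initialize();
$$Options{summary} = 'long';
push @{ $$Options{bank} }, '/backup/bank1';

option_reset('default');
print "summary: $$Options{summary}, banks: ",
      scalar @{ $$Options{bank} }, "\n";
```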

Third, rather than using anonymous subroutines for config, client,
branch, vault, reset, and version, we should use references to named
subroutines.  That way, the behavior does not disappear when we set
the values of these hash keys to a scalar rather than a reference.
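A sketch of the third proposal, under the same caveats: option_vault
and @loaded are invented names, and the push stands in for whatever
config loading the real handler does.

```perl
use strict;
use warnings;

my $Options;
my @loaded;    # stand-in for the config loading the real handler does

# A named subroutine: it keeps existing whatever the hash slot holds.
sub option_vault {
    my ($name, $value) = @_;
    $$Options{vault} = $value;
    push @loaded, $value;
    return;
}

$Options = { vault => \&option_vault };

# The first call replaces the hash slot with a string, as before ...
$$Options{vault}->('vault', 'home');

# ... but the behavior is still reachable by name,
option_vault('vault', 'work');

# and the reference can be put back, for example by a reset.
$$Options{vault} = \&option_vault;

print "loaded: @loaded\n";
print "slot holds a ", ref($$Options{vault}), " reference again\n";
```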


Eventually, I would like to put "options-initialize" and the 
dirvish.pl code for reading config files into a dirvish library
file (dirvishlib.pl ?), along with loadconfig and other subroutines.
That way, if we want to build another program, say dirvish-restore-file,
we can use the same routines to make that work.  I want to append the
"pod" documentation to each of the programs, and the pod info that
constructs dirvish.conf will be appended to dirvishlib.pl.

Keith

-- 
Keith Lofstrom          keithl at keithl.com         Voice (503)-520-1993
KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs