ISO/IEC JTC1/SC22/WG5-N1433

To:         WG5
Subject:    Proposal to support separating module specification and body
From:       Van Snyder
Date:       23 July 2001
References: WG5/N1434

J3 paper 98-104 details a proposal to support the separation of module
specifications and bodies into separately compilable units.

This paper summarizes that proposal, and provides a proposed work plan.
The full proposal is in WG5/N1434, formatted as a TR.

1. What is the problem?

In large programs, composed of many modules, a change in a low-level
module can cause a "compilation cascade":  Everything that depends on the
low-level module is re-compiled, even if there are no changes in its
interface, and everything that depends on those modules is re-compiled,
etc.  This occurs because "make" notices that the object files (or the
module information files, depending on how one writes the makefile) are
older than the source files.

Some argue that this is not a problem, because compilers are fast. This
is true for moderately large systems, but not very large ones -- and such
exist.  For example, the suite of spacecraft navigation software at JPL
is six million lines.  It requires roughly one day to compile everything,
depending on the platform.  It is still in Fortran 77, and will not use
the module facilities of Fortran 95 when it is converted, in part because
of this problem.

A more serious problem, however, is re-certification of a system.

Many organizations have a not-unreasonable policy that any module that is
re-compiled shall be re-certified.  The reason for this policy is the
same as the reason for the compilation cascade:  The "make" processor
can't know the reason that an object file is out-of-date with respect to
another file on which it depends -- it could be that an interface is
changed, or it could be that only an implementation has changed -- and
managers of a large software project (frequently justifiably) don't trust
their programmers to know the reason, either.

Re-certification of the entire six million line software suite mentioned
above requires roughly twenty-five work weeks.  This is the primary
reason that module features will not be used when the system is converted
to Fortran 95.

2. How would the proposed TR solve the problem?

The TR proposes to allow (but not require) that a module be specified by
an interface part that is accessible by use association, and one or more
implementation parts that are not directly accessible by use association.
The proposed modification is compatible with Fortran 95 and with the
current draft of Fortran 2000.

As proposed in the TR, the implementation parts of a module would depend
on its interface part, not vice-versa.  Since the implementation parts
are not directly accessible by use association, changes therein could
neither affect the interface part, nor any program units that access the
interface part by use association.  A "make" file would reflect this
structure.  Therefore, changes in an implementation part of a module
would not cause a compilation cascade, or a re-certification cascade.
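
The dependency structure described above can be sketched as a small
model.  This is only an illustration, not the mechanism of the TR:  the
unit names ("low", "mid", "app") and the function rebuild_set are
hypothetical, and the model assumes a timestamp-based "make" that
recompiles every unit reachable from the one that changed.

```python
from collections import defaultdict


def rebuild_set(depends_on, changed):
    """Return every unit a timestamp-based build would recompile
    after the unit `changed` is edited (hypothetical model)."""
    # Invert the graph: for each unit, which units depend on it directly?
    dependents = defaultdict(set)
    for unit, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(unit)
    # Walk outward from the changed unit, collecting everything reachable.
    to_build, stack = {changed}, [changed]
    while stack:
        for user in dependents[stack.pop()]:
            if user not in to_build:
                to_build.add(user)
                stack.append(user)
    return to_build


# Fortran 95 style: each module is a single unit that holds both its
# interface and its implementation.
monolithic = {
    "low": [],
    "mid": ["low"],
    "app": ["mid"],
}

# Proposed style: implementation parts depend on interface parts, and
# users of a module depend only on its interface part.
split = {
    "low_interface": [],
    "low_implementation": ["low_interface"],
    "mid_interface": ["low_interface"],
    "mid_implementation": ["mid_interface", "low_interface"],
    "app": ["mid_interface"],
}

print(sorted(rebuild_set(monolithic, "low")))            # ['app', 'low', 'mid']
print(sorted(rebuild_set(split, "low_implementation")))  # ['low_implementation']
```

In this model, a change to the interface part of "low" still rebuilds
everything, as it should; only implementation changes are contained.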

3. Is this untried technology?

The ability to separate a module into interface and implementation parts
has been incorporated at least into Modula-2 and Ada.  The first edition
of "Programming in Modula-2" was published in 1982.  The first Ada
standard was published in 1983.  Separating a module into interface and
implementation parts is not new technology.

4. Which standard would a TR modify?

When I first proposed it, I had hoped it would modify the 1995 standard.
It now looks more like it ought to modify the 2002 standard.  (Actually,
I proposed it during the public comment period for Fortran 90.  I
received no answer to my comment concerning Fortran 90.  The first formal
presentation to J3 of the proposal in its current form was in February of
1998.  See J3 paper 98-104.)

5. What are the organizational aspects?  Who will do the work?

I will do the work.  I believe N1434 is a reasonably complete paper.  Out
of experience, Malcolm Cohen is certain there are holes in it.  He's
probably right, but neither he nor anybody else has reviewed it in
detail.  The proposal has a very well-contained effect; it touches only
small parts of Section 11 and Annex C.

6. What is the time scale?

I had hoped to launch the project last year, but the agenda at Oulu was
too crowded with corrigendum work, and I had to leave early.  If the
proposal is as complete as I hope it is, we should be able to finish it
in two more WG5 meetings. In addition to an hour or two at each of the
next two WG5 meetings, I don't anticipate it taking more than four or
five hours of J3 time.

There has been some discussion of simpler alternatives, but so far none
of them fulfills the goals that I set for the project.  Disposing of
these (that is, convincing those who have given only a little thought to
the matter, but are sure they have a better solution than I, that they
don't have a viable solution at all) may take more time than I have
anticipated.

7. Does this have the approval of J3?

This proposal is not being advanced as a formal J3 proposal.  It is, in
some sense, a personal proposal.

At meeting 156 (or maybe 155) J3 took some straw votes concerning TRs in
general and this one in particular.  J3 was not opposed to TRs, so long
as the developer is a regular contributor to Fortran standards work, or a
veteran (but no longer regular) contributor.  This resulted from the
observation that the "IEEE" and "Allocatable" TRs were successful, but
the "C interoperability" TR was not.  The result of another vote was that
J3 was not opposed in principle to the particular TR proposed in N1434,
but that it (or any other TR) takes a secondary priority to getting the
2002 standard ready.

8. Why not wait for the public comment period for Fortran 2000?

I plan to remark about this facility during the public comment period,
but I am not optimistic that it will result in a change to the 2002
standard.  Starting a TR may be viewed as my backup plan to get this work
progressed, in case action during the public comment period has no
effect.  It is difficult to say which is "Plan A" and which is "Plan B".

The reason for the urgency is that the problem addressed by the TR has a
deleterious effect on my work, and the work of colleagues at JPL whom I
represent by my membership in J3 and WG5.

During my development work, changing a low-level module results in its
recompilation, which takes two or three seconds.  Then the rest of the
program is compiled, which takes five or ten minutes.  If I am in an
edit-compile-debug cycle, my work is slowed substantially -- depending on
the amount of time I spend studying the code, changing the code, and
wandering around in the debugger.  Sometimes, the latter three activities
take a relatively small fraction of the time, as compared to waiting for
the admittedly quick, but frequent, compilations.  The compilations are
quick, but not nearly as quick as they would be if there were no cascade,
and they add up to a significant fraction of some work days.

Also, my work concerns a library of software that is used by five other
developers who are working on the same program on which I am working; it
is also used in two other programs, involving at least four more
developers.  Although I don't check every experiment back into the
project archive, when I ultimately commit a change, it affects the work
of nine other people in the same way that I am affected.

One of my colleagues is considering the conversion of six million lines
of Fortran 77 to Fortran 95.  One of the obstacles to that conversion (or
at least to using the module facilities of Fortran 95 in that conversion)
is the prospect of re-certifying the entire system if a change is needed
in the implementation of a low-level routine, even if that change has no
effect on the interface.

This is not automatically a problem in Fortran 77, but as Fortran 95 now
stands, the commonly used conspiracy of compilers and "make" does not
know whether a low-level module was compiled solely because of a change
in its implementation, or because a change affects its interface.
Therefore, a compilation cascade results.

There have been proposals that address this, having to do with comparing
module information files.  This is only a half-measure, because (1) if a
module is changed and the methodology notices that the module information
is unchanged, its (old, retained) module information file is forever
after out-of-date with respect to the source text.  Therefore, although a
compilation cascade doesn't result, the module is forever after
recompiled.

Also (2), the method is processor-dependent: Some vendors write module
information files with names all in lower case, some with names all in
upper case, some with names in the same case as the file name of the
module.  Some use a .mod extension, some use a .kmd extension and some
use a .d extension.  Some put a time-and-date stamp in the module
information file (making it difficult to tell whether the rest of the
file has changed).  Some use a database that has no obvious relation to
the file names or module names.  The result of this overly long tale is
that the "compare the module information" method of suppressing
compilation cascades is a half-measure that has significant portability
problems.

The most serious problem for my six-million-line colleague is not that it
takes more than a day to compile his entire program suite (that's bad
enough).  Rather, it is that the not-unreasonable certification protocol
requires that every re-compiled module be re-certified.  This involves
running and checking the results of thousands of tests, at a cost
exceeding $50,000.  In the case of using the method of comparing module
information files to limit compilation cascades, the total amount of work
is quadratic in the number of changed modules.  (First time, one module
needs to be re-certified.  Second time, the newly changed module needs to
be re-certified, and the old one needs to be re-certified too -- although
it hasn't been changed, it has been re-compiled, because its module
information file is out-of-date with respect to its source.)
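
The quadratic growth can be checked with a small simulation.  This is a
sketch under stated assumptions, not a description of any real build
system:  it assumes that each of n distinct modules receives one
implementation-only change, that a build follows each change, that an
unchanged module information file is retained with its old timestamp,
and that every re-compiled module is re-certified.  The function name
certifications and the module names are hypothetical.

```python
def certifications(n_changes):
    """Count re-certifications when n distinct modules each receive one
    implementation-only change, with one build after each change."""
    mod_file_time = {}  # module -> timestamp of its retained module info file
    source_time = {}    # module -> timestamp of its source text
    total = 0
    for step in range(1, n_changes + 1):
        module = f"m{step}"
        mod_file_time.setdefault(module, 0)  # info file predates every edit
        source_time[module] = step           # edit the implementation only
        # Build: recompile each module whose source is newer than its
        # module information file.
        recompiled = [m for m in source_time
                      if source_time[m] > mod_file_time[m]]
        # The interface is unchanged, so the old module information file is
        # retained and its timestamp never advances; the module therefore
        # stays out-of-date for every later build.
        total += len(recompiled)             # each recompile is re-certified
    return total


print([certifications(n) for n in (1, 2, 3, 10)])  # [1, 3, 6, 55]
```

Under these assumptions the total is n(n+1)/2 re-certifications for n
changed modules, where a scheme without stale files would need only n.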

There are other problems, having to do with "programming in the large,"
that are addressed by the proposal but that aren't discussed above. They
are real problems that were reported to me by my colleagues, and that
impede their work, not hypothetical ones that I dreamed up to support the
proposal.

For all of these reasons, I would like to move this proposal forward,
even if only by a few inches.