ISO/IEC JTC1/SC22/WG5 N2126

       A Strategy for Reckoning the Content of the Next Revision
	   
	                         Dan Nagle

I. Introduction
---------------

This document argues for a process for determining the content of the
next revision that considers the overall cost to implement the next
revision.  It also argues that the best return on that investment
comes from treating the use cases motivating each proposed feature as
an integral part of its justification.


II. The View from the Mesa
--------------------------

The experiences of publication and ongoing implementation of the
Fortran 2003 and Fortran 2008 revisions lead to some thoughts
regarding the process of defining the contents of a future revision of
Fortran.  Specifically, the latency between publication of those
revisions and the general availability of fully-compliant compilers is
seen by many as far too long.  Many believe that the long latency is
largely attributable to two circumstances: These revisions simply had
too many new features; and, with hindsight, some included new features
whose cost to implement significantly outweighs their incremental
benefit to applications programmers.

The rate of new feature implementation in compilers is set by the
availability of compiler engineers; only they can implement a new
revision. The ill effects of the oversubscription of compiler
engineers' time include the general observations that bug resolutions
are taking longer than in the past, and that optimizations to take
advantage of new-generation hardware seem to be fewer, or at least
delayed.  Some of this is due, no doubt, to the support of modern
software engineering techniques in Fortran.  Some may reflect nothing
more than recent greater complexity of the language.  Edge cases in
implementation, and even applications' use of a new feature, may not
be fully anticipated. Some new features were specifically designed to
promote highly efficient execution on a wide variety of hardware. The
expected efficient execution is not appearing as quickly as hoped.
Whatever the cause, overall, these effects allow other languages to
appear relatively more attractive than previously for the tasks for
which Fortran has traditionally been well-suited.

Perhaps worse, portability suffers when the time from publication to
complete implementation is too long.  Each successive compiler release
from different suppliers implements different items from the long
lists of new features.  Therefore, for too long a time, the latest
compilers from different suppliers do not all support the same subset
of the new features.  So use of any new feature is limited, and bugs
in compilers, and perhaps even in the feature itself, are not detected
in a timely manner.

These factors discourage applications programmers from learning new
features, discourage textbook authors from producing explanatory
texts, discourage development of training materials for want of tested
examples, and limit applications programmers' ability to clearly
imagine what further potential new features would provide additional
benefit in conjunction with the published but as-yet-unimplemented
ones.  So when the previous revisions are too large, we may have some
difficulty clarifying the next revision.

Over the last several revisions of the standard, only one new proposal
in five or six formally registered with J3 has actually
resulted in a new feature in Fortran.  Some have argued that adding
every possible new feature to a revision speeds progress in the
development of Fortran.  The observations above argue strongly that
this is not so.

And indeed, WG5 recognized that compiler engineers are having
difficulty keeping pace with successive revisions when it limited the
new feature work list for Fortran 2015 to the repair of simple
deficiencies.  Fortran 2015 also includes two substantial Technical
Specifications which, in principle, should give implementors a
head start on Fortran 2015.  That was mainly the case for only the
interoperability TS.  The coarray TS implementation was mostly deferred
while vendors were still not finished with Fortran 2008 features.
However much the phasing helps implementors, the aggregate remains
more substantial than hoped.

The issue facing WG5 now is what to do going forward.

III. A Funny Thing Happened on the Way to the Revision
------------------------------------------------------

Historically, when WG5 has declared a time during which new feature
requests will be considered, there have been few requirements or
guidelines on the proposals made, beyond the aging N1349, and there is
no recently-agreed process in place for WG5 to treat new proposals
now.  The resulting default strategy might be
described as casting a wide net and filtering the results.  We have
proven ourselves to be far better at casting the wide net than we are
at filtering the results.  Surely, this explains, in great part at
least, the over-sized revisions.  Both WG5 and J3 have
limited resources, and on occasion, features have been accepted that
consumed large amounts of these limited resources to be completely
defined.  We should examine examples of individual new features, large
and small, to see some of the mechanisms causing the bloat.

Among large features, consider the length type parameter of
parameterized derived types.  Compiler engineers have uniformly told
me that implementation of length type parameters in derived types is
very costly.  The corresponding increment of
benefit to applications programmers is relatively modest.  This should
not be taken as a claim that derived type length parameters are
unworthy, merely that the high implementation cost appears to
challenge whether the modest benefit is worthwhile on overall
balance. Compiler suppliers face a large cost to implement, and do not
hear from their customers that the feature is critically needed.  So
the decision to lower the priority of this feature's implementation is
understandable.  Relatively low priority of a difficult task is a
sufficient explanation for the delay of the appearance of length type
parameters in compilers.
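
For readers less familiar with the feature, a minimal sketch of a
length type parameter follows.  The type name matrix, its components,
and the procedure describe are hypothetical and chosen only for
illustration.  Each object carries its own parameter values, which
the processor must track at run time.

```fortran
module pdt_sketch
  implicit none

  ! Hypothetical type: rows and cols are length type parameters, so
  ! the component bounds are fixed per object, possibly at run time.
  type :: matrix(rows, cols)
    integer, len :: rows, cols
    real :: elements(rows, cols)
  end type matrix

contains

  ! An assumed length parameter (*) takes its value from the actual
  ! argument, much as character(len=*) does.
  subroutine describe(m)
    type(matrix(*, *)), intent(in) :: m
    print '(a,i0,a,i0)', 'matrix of shape ', m%rows, ' by ', m%cols
  end subroutine describe

end module pdt_sketch

program pdt_demo
  use pdt_sketch
  implicit none
  integer :: n
  type(matrix(:, :)), allocatable :: m   ! deferred length parameters

  n = 3
  allocate (matrix(n, 2*n) :: m)         ! lengths chosen at run time
  m%elements = 0.0
  call describe(m)
end program pdt_demo
```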

Among small features, consider the NEW_LINE intrinsic procedure.  It
returns a character result that has a known value for an
implementation. No computation is involved.  Supposing the
implementation cost to be completely trivial, the incremental benefit
to applications programmers is a standard name for a constant.
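
A short sketch of typical use, assuming default character kind; the
names nl and text are illustrative only.  The character code reported
is whatever the processor uses, commonly 10 on ASCII systems.

```fortran
program new_line_sketch
  implicit none
  ! NEW_LINE may appear in a constant expression, so the result can
  ! simply be given a name.
  character(len=1), parameter :: nl = new_line('a')
  character(len=:), allocatable :: text

  ! Embed the newline character in a string.
  text = 'first line' // nl // 'second line'

  print '(a,i0)', 'length including the newline: ', len(text)
  print '(a,i0)', 'character code on this processor: ', iachar(nl)
end program new_line_sketch
```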

An example of a feature that consumed large amounts of WG5's and J3's
limited resources is the contiguous attribute.  Appearing conceptually
simple, this one attribute required so much time in subgroup that a
second work item was added to the list just to complete it.  The
expected gain of this feature is superior execution efficiency that
may be available when arrays are known to be stored in sequential
storage locations.  So while the potential benefits are substantial,
so was the cost in WG5 and J3 time and effort.
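
As an illustration only, the sketch below shows the attribute on an
assumed-shape dummy argument, together with the IS_CONTIGUOUS inquiry
intrinsic.  The module and procedure names are hypothetical.

```fortran
module contiguous_sketch
  implicit none
contains

  ! The dummy is declared contiguous, so the processor may assume
  ! unit stride when generating code for the loop over x.
  subroutine scale_in_place(x, factor)
    real, contiguous, intent(inout) :: x(:)
    real, intent(in) :: factor
    x = factor * x
  end subroutine scale_in_place

end module contiguous_sketch

program contiguous_demo
  use contiguous_sketch
  implicit none
  real, target :: a(10)
  real, pointer :: p(:)

  a = 1.0
  p => a(1:10:2)                 ! strided section: not contiguous

  print *, is_contiguous(a)      ! the whole array is contiguous
  print *, is_contiguous(p)      ! the strided pointer is not

  call scale_in_place(a, 2.0)
  print *, a(1), a(10)
end program contiguous_demo
```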

In each of the cases above, an argument may be made to support
inclusion of the feature in the standard.  I am not attempting to
debate the
particulars, pro or con.  I am stating that the lack of a cost-benefit
analysis for these features has weakened whatever case there may be
for their inclusion in the standard.  WG5's focus has long
been on the feature proposed, but not the underlying problem to be
solved.  So alternative solutions, perhaps with better cost-benefit
ratios than the feature proposed, were simply foreclosed from the
beginning.  Indeed, without focusing on the problem to be solved,
WG5 cannot have complete confidence that the underlying problem
is actually solved by the feature, no matter how much effort
WG5, J3, and compiler engineers spend on it.  Yet many argue
that the revisions where these features first appeared were too big!

I believe there is no evidence that shows a formal evaluation of the
problems to be solved, nor of the costs and the benefits of new
feature proposals.  The "JKR" scale used during processing proposals
for Fortran 2008 included difficulty to implement and difficulty to
standardize, but not benefit to applications.  It completely ignored
any mention of the problem to be solved.  Nor were the scores ever
summed to gauge the scale of the whole revision.  There is nothing in
the record to show a measure of the size of the whole revision, let
alone any comparison made against any particular target size, either
as a preset or as a running total.

The poor cost-benefit of some new features cited above, whether the
feature is large or small, appears to derive exactly from a lack of a
clearly stated estimation either of cost to implement or of benefit to
applications.  Very clearly, cost and benefit were not formally
considered together.  And no quantitative appreciation appears to have
been given to the resulting overall magnitude of the whole revision.

Thus, a different approach may be profitably considered.

IV. Winter Witter about Whither
-------------------------------

We should address both the overall magnitude of a new revision and
the cost to implement a solution of any particular problem to be
solved.  We should also keep in mind our own limited resources
as WG5 and J3.  That is, we should clearly know why any new feature is
to be included.  In short, we should know enough to improve our
estimates of the cost-benefit ratio of any particular proposal.  Doing
so for each proposed new feature is the only way we can improve the
cost-benefit ratio of the whole revision.

Whether the choice is made to measure the costs in delay of
fully-compliant compilers in calendar units, or in some other,
non-time-related, or even arbitrary work unit, it is essential to have
a budget and to respect it during the whole schedule of publishing a
revision.

Placing a limit on the size of the whole revision is not as unfair to
individual proposals as it might seem at first sight.  Sticking to a
budget also helps to stick to a schedule, and leaves suppliers in a
better position to implement the whole revision more rapidly.
(Indeed, that is the motive.)  Thus, if a proposal must await a
subsequent revision, it is not as great a delay as it might otherwise
be.  And a new feature proposal may benefit from experience using
those portions of the revision undergoing implementation while the new
feature is being considered for inclusion in the subsequent revision.

Recognizing that compiler engineers' time is the throttle, the
question becomes how to maximize the return on the investment of this
limited resource.  This represents a budget to be spent, albeit an
inexact one, and with inexact expenses.  So it is a challenge, but a
challenge that must be met if WG5 is to do its job as a
standardization committee.  A new feature proposal must contain a
credible estimate of the level of effort required for implementation.
This estimate must be set by compiler engineers only, as only they
have the necessary experience.  Absent alternative measures of the
cost of a proposal, the argument that any such estimates are
uncertain, and even subject to sandbagging, is unpersuasive.  The only
alternative to an imperfect estimate is no estimate at all.
Experience with recent revisions shows too clearly that the lack of an
explicitly stated estimate of cost simply will not do, and results in
too large a revision of the standard.

To have a better understanding of new feature proposals, I suggest
that, rather than accepting more-or-less fully formed new feature
proposals, use cases where Fortran is currently wanting should be
considered.  That is, we should ask that problems to be solved be
brought to WG5, rather than proposed solutions, as was done
previously.  This should not be taken to imply that a new feature
proposal cannot contain any description at all of a desired new
feature, only that the problem to be solved is the focus, and any
proposed solution is merely an explanatory example.

The appropriate use case is a difficulty encountered writing an
application using Fortran where the cause of the difficulty is related
to Fortran, rather than the problem to be solved.  Before vetting a
new feature proposal, the use cases that motivate the proposed new
feature should be stated.  This allows subgroups to find common, or
similar, problems.  The number of similar use cases gives a measure of
the breadth and severity of the issue presented.  So equipped,
subgroup has at least as much information as, if not more than, any
single proposer.  Thus, subgroup
may address the motivating issues at least as effectively as any
single proposal.  WG5 also has a better basis for believing
that the feature actually proposed for the revision addresses
the motivating problem.  As with compiler engineers and cost estimates,
the information at hand may not be perfect, but proceeding with more
information is clearly better than proceeding with less.

Subgroups, including their compiler engineers, can then distill,
combine, split, enhance or limit, and rank use cases.  Where consensus
is gained in subgroup that a set of use cases has sufficient and
sufficiently widespread weight, then subgroup can design the feature.
Compiler engineers will be involved in the design of each new feature
from the beginning.  Thus, compiler engineers can review and critique
proposals, so features, or portions of features, may be avoided where
there is a high risk of incurring excessive costs for the benefit
delivered.  Of course, the idea is not to let compiler engineers
simply gut proposals, but to actually solve the issues identified by
applications programmers.  A balance is sought at the point of best
cost-benefit ratio.  Consensus judgement is required.  The cumulative
workload of the entire revision is better justified than is the
cumulative workload where such estimation is absent.

The notion of cumulative workload raises an important, if perhaps
subtle, point.  Compiler suppliers have a wide variety of sizes of
engineering teams, from several tens to one.  The
professionals-versus-volunteers divide is another major source of
variety among engineering teams.  Suppliers with larger engineering
teams may, perforce, apply more resources to new feature
implementation than can suppliers with smaller teams.  So the
incremental delay towards fully-compliant compilers may scale
differently, considering a given new feature, among suppliers with
various sizes of engineering teams.  But however many resources a
compiler supplier has, the needed resources are taken from other
tasks, such as bug fixing, optimization, and efficient use of new
hardware.  One simply must assume that no compiler supplier has
resources lying idly around.  Any supplier may have at play other
factors, perhaps managerial or budget, such as a willingness or
otherwise to commit resources to Fortran, that limit implementation
rate.  The free compilers, so important in academia, face their own
challenges regarding difficult-to-implement new features, especially
large ones.

We must have a means of determining when the cumulative workload has,
in fact, reached the preset target amount.  One approach that may be
agreeable to all is to let compiler suppliers each state their own
estimate of their own time to implement.  When the proportion of
suppliers who pass the target time passes a preset proportion of all
suppliers, then the target is declared to be met.  The views of every
compiler supplier are included in the declaration, yet no one supplier
dominates WG5's decision.
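
The rule just described can be sketched as follows.  This is one
reading of the rule, not an agreed procedure: the function name
budget_spent, the use of years as the unit, and the sample estimates
are all assumptions made for illustration.

```fortran
module budget_poll_sketch
  implicit none
contains

  ! True when the proportion of suppliers whose own estimate exceeds
  ! the target time is itself greater than the preset proportion.
  pure logical function budget_spent(estimates, target_time, proportion)
    real, intent(in) :: estimates(:)  ! one estimate per supplier
    real, intent(in) :: target_time   ! preset target, same units
    real, intent(in) :: proportion    ! preset proportion, 0.0 to 1.0
    integer :: over

    over = count(estimates > target_time)
    budget_spent = real(over) > proportion * real(size(estimates))
  end function budget_spent

end module budget_poll_sketch

program budget_poll_demo
  use budget_poll_sketch
  implicit none
  ! Invented figures: five suppliers' estimates, in years.
  real :: estimates(5) = [3.0, 4.5, 6.0, 5.5, 4.0]

  ! Two of five exceed 5.0 years; that is not more than half.
  print *, budget_spent(estimates, 5.0, 0.5)   ! F

  ! Three of five exceed 4.0 years; that is more than half.
  print *, budget_spent(estimates, 4.0, 0.5)   ! T
end program budget_poll_demo
```

Each supplier contributes one estimate and one vote, so a single very
large or very small estimate cannot by itself change the outcome.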

No matter what one considers of either compiler engineers' ability to
estimate implementation costs, or of applications programmers' ability
to state the weight of use cases, having the estimates at hand is
clearly better than not having them.  The problem to be solved is
better analyzed by having the focus on the problem rather than a
single proposed solution; having various approaches to a problem's
resolution under discussion is better than having a single proposed
new feature.

Applications programmers may well find that compiler suppliers are
less reluctant to treat a proposal for a new feature when they more
fully understand the motivation for it, and have some confidence that
implementation has low risk of surprising difficulties.  Likewise,
applications programmers may be more willing to accept that a feature
is very expensive to implement after discussion of the issues involved
with compiler engineers.  And the interaction between compiler
engineers and applications programmers may well result in a solution
that neither had considered before.

In sum, a more detailed process may be expected to help WG5
understand better each step of the process.  The resulting revision
may be expected to be fully implemented in less time while delivering
more benefit to applications programmers.

This represents a positive response to the difficulties evident during
the development and ongoing implementation of Fortran 2003 and Fortran
2008, and recognizes that Fortran 2015 will shortly be available.

V. Thou Shalt Sharpen Thine Pencil
----------------------------------

To make a specific proposal, I suggest a process with the following
steps be adopted for the handling of new feature requests for the next
revision of Fortran.

1. Plenary will discuss and adopt a budget prior to acceptance of any
   new feature proposals.  It shall not subsequently be changed during
   the drafting of the revision.  In whatever units preferred, and
   with whatever means of acceptance preferred, the budget is a
   statement of how soon after publication of the new revision
   fully-compliant compilers are desired.  My personal view is that a
   latency of around three to five years is best.

2. Any new feature request shall be stated as a set of use cases where
   applications programmers currently have undue difficulties.  These
   use cases will be assigned to a subgroup, or subgroups, by topic.
   When plenary assigns a set of use cases to a subgroup, plenary may
   state its preferences for the kind of solution sought.

3. Subgroup will consider use cases, distilling, combining, splitting,
   enhancing or limiting, and ranking, as appropriate.  The designated
   goal is to get the best return on investment, where investment is
   defined as the compiler engineers' time, and return is defined as
   the greatest benefit to the greatest number of applications
   programmers.

4. When a set of use cases has been accepted by subgroup as motivating
   a new feature, subgroup will design a set of requirements, together
   with a cost estimate from the compiler engineers in subgroup.  This
   will be reported to plenary.  Plenary may choose to constrain the
   resulting feature, or to accept it as proposed.

5. If the requirements are acceptable to WG5 in plenary,
   subgroup will make a set of specifications.  This will be reported
   to plenary.  All the compiler engineers in plenary will make an
   estimate of cost of implementation.

6. WG5 in plenary will then accept or decline the proposed
   new feature.

7. At the end of a meeting, a poll of suppliers will be made.  If more
   than the preset proportion of suppliers give an estimate of longer
   than the preset overall time to implement, the budget is spent.

8. When the budget is spent, starting with the next meeting and
   beyond, no new feature requests will be accepted in plenary, nor
   assigned to subgroup.  The worklist is fixed.

Whether this proposal is accepted or not, WG5 should have
some process in place prior to accepting any new feature proposals.
Discussion of how to proceed will be more productive if a set of
proposals is not already on the table.