ISO/IEC JTC1/SC22/WG5 N2126

A Strategy for Reckoning the Content of the Next Revision

Dan Nagle

I. Introduction
---------------

This document argues for a process for determining the content of the next revision that considers the overall cost to implement the next revision. It also argues that the best return on that investment in added features comes from considering the use cases motivating each proposed feature as an integral part of its justification.

II. The View from the Mesa
--------------------------

The experiences of publication and ongoing implementation of the Fortran 2003 and Fortran 2008 revisions lead to some thoughts regarding the process of defining the contents of a future revision of Fortran. Specifically, the latency between publication of those revisions and the general availability of fully-compliant compilers is seen by many as far too long.

Many believe that the long latency is largely attributable to two circumstances: these revisions simply had too many new features; and, with hindsight, some included new features whose cost to implement significantly outweighs their incremental benefit to applications programmers.

The rate of new feature implementation in compilers is set by the availability of compiler engineers; only they can implement a new revision. The ill effects of the oversubscription of compiler engineers' time include the general observations that bug resolutions are taking longer than in the past, and that optimizations to take advantage of new-generation hardware seem to be fewer, or at least delayed. Some of this is due, no doubt, to the support of modern software engineering techniques in Fortran. Some may reflect nothing more than recent greater complexity of the language. Edge cases in implementation, and even applications' use of a new feature, may not be fully anticipated. Some new features were specifically designed to promote highly efficient execution on a wide variety of hardware.
The expected efficient execution is not appearing as quickly as hoped. Whatever the cause, overall, these effects allow other languages to appear relatively more attractive than previously for the tasks for which Fortran has traditionally been well-suited.

Perhaps worse, portability suffers when the time from publication to complete implementation is too long. Each successive compiler release from different suppliers implements different items from the long lists of new features. Therefore, for too long a time, the latest compilers from different suppliers do not all support the same subset of the new features. So use of any new feature is limited, and bugs in compilers, and perhaps even in the feature itself, are not detected in a timely manner.

These factors discourage applications programmers from learning new features, discourage textbook authors from producing explanatory texts, discourage development of training materials for want of tested examples, and limit applications programmers' ability to clearly imagine what further potential new features would provide additional benefit in conjunction with the published but as-yet-unimplemented ones. So when the previous revisions are too large, we may have some difficulty clarifying the next revision.

Over the last several revisions of the standard, only one new proposal in five or six formally registered with J3 has actually resulted in a new feature in Fortran. Some have argued that adding every possible new feature to a revision speeds progress in the development of Fortran. The observations above argue strongly that this is not so. And indeed, WG5 recognized that compiler engineers are having difficulty keeping pace with successive revisions when it set the new feature work list for Fortran 2015 to be a repair of simple deficiencies only.

Fortran 2015 also includes two substantial Technical Specifications which, in principle, should give implementors a head start on Fortran 2015.
That was mainly the case only for the interoperability TS. The coarray TS implementation was mostly deferred while vendors were still not finished with Fortran 2008 features. However much the phasing helps implementors, the aggregate remains more substantial than hoped.

The issue facing WG5 now is what to do going forward.

III. A Funny Thing Happened on the Way to the Revision
------------------------------------------------------

When WG5 declares a time during which new feature requests will be considered, historically, beyond the aging N1349, there have been few requirements or guidelines on proposals made, and there is no recently-agreed process in place for WG5 to treat new proposals now. The resulting default strategy might be described as casting a wide net and filtering the results. We have proven ourselves to be far better at casting the wide net than we are at filtering the results. Surely, this explains, in great part at least, the over-sized revisions. Both WG5 and J3 have limited resources, and on occasion, features have been accepted that consumed large amounts of these limited resources to be completely defined.

We should examine examples of individual new features, large and small, to see some of the mechanisms causing the bloat.

Among large features, consider the length type parameter of parameterized derived types. The opinion expressed to me by compiler engineers is uniform: implementation of length type parameters in derived types is very costly. The corresponding increment of benefit to applications programmers is relatively modest. This should not be taken as a claim that derived type length parameters are unworthy, merely that the high implementation cost calls into question whether the modest benefit is worthwhile on overall balance. Compiler suppliers face a large cost to implement, and do not hear from their customers that the feature is critically needed.
So the decision to lower the priority of this feature's implementation is understandable. Relatively low priority of a difficult task is a sufficient explanation for the delay of the appearance of length type parameters in compilers.

Among small features, consider the NEW_LINE intrinsic procedure. It returns a character result that has a known value for an implementation. No computation is involved. Supposing the implementation cost to be completely trivial, the incremental benefit to applications programmers is a standard name for a constant.

An example of a feature that consumed large amounts of WG5's and J3's limited resources is the contiguous attribute. Appearing conceptually simple, this one attribute required so much time in subgroup that a second work item was added to the list just to complete it. The expected gain of this feature is superior execution efficiency that may be available when arrays are known to be stored in sequential storage locations. So while the potential benefits are substantial, so was the cost in WG5 and J3 time and effort.

In each of the cases above, an argument may be made to support inclusion of the feature in the standard. I am not attempting to debate the particulars, pro or con. I am stating that the lack of a cost-benefit analysis for these features has weakened whatever case there may be for their inclusion in the standard.

WG5's focus has long been on the feature proposed, but not the underlying problem to be solved. So alternative solutions, perhaps with better cost-benefit ratios than the feature proposed, were simply foreclosed from the beginning. Indeed, without focusing on the problem to be solved, WG5 cannot have complete confidence that the underlying problem is actually solved by the feature, no matter how much effort WG5, J3, and compiler engineers spend on it. Yet many argue that the revisions where these features first appeared were too big!
I believe there is no evidence that shows a formal evaluation of the problems to be solved, nor of the costs and the benefits of new feature proposals. The "JKR" scale used during processing proposals for Fortran 2008 included difficulty to implement and difficulty to standardize, but not benefit to applications. It completely ignored any mention of the problem to be solved. And it was never treated in sum, to scale the whole revision. There is nothing in the record to show a measure of the size of the whole revision, let alone any comparison made against any particular target size, either as a preset or as a running total.

The poor cost-benefit of some new features cited above, whether the feature is large or small, appears to derive exactly from a lack of a clearly stated estimation either of cost to implement or of benefit to applications. Very clearly, cost and benefit were not formally considered together. And no quantitative appreciation appears to have been given to the resulting overall magnitude of the whole revision. Thus, a different approach may be profitably considered.

IV. Winter Witter about Whither
-------------------------------

We should address both the overall magnitude of a new revision and the cost to implement a solution of any particular problem to be solved. We should also keep in mind our own limited resources as WG5 and J3. That is, we should clearly know why any new feature is to be included. In short, we should know enough to improve our estimates of the cost-benefit ratio of any particular proposal. Doing so for each proposed new feature is the only way we can improve the cost-benefit ratio of the whole revision.

Whether the choice is made to measure the costs in delay of fully-compliant compilers in calendar units, or in some other, non-time-related, or even arbitrary work unit, it is essential to have a budget and to respect it during the whole schedule of publishing a revision.
Placing a limit on the size of the whole revision is not as unfair to individual proposals as it might seem at first sight. Sticking to a budget also helps to stick to a schedule, and leaves suppliers in a better position to implement the whole revision more rapidly. (Indeed, that is the motive.) Thus, if a proposal must await a subsequent revision, it is not as great a delay as it might otherwise be. And a new feature proposal may benefit from experience using those portions of the revision undergoing implementation while the new feature is being considered for inclusion in the subsequent revision.

Recognizing that compiler engineers' time is the throttle, the question becomes how to maximize the return on the investment of this limited resource. This represents a budget to be spent, albeit an inexact one, and with inexact expenses. So it is a challenge, but a challenge that must be met if WG5 is to do its job as a standardization committee.

A new feature proposal must contain a credible estimate of the level of effort required for implementation. This estimate must be set by compiler engineers only, as only they have the necessary experience. Absent alternative measures of the cost of a proposal, the argument that any such estimates are uncertain, and even subject to sandbagging, is unpersuasive. The only alternative to an imperfect estimate is no estimate at all. Experience with recent revisions shows all too clearly that the lack of an explicitly stated estimate of cost simply will not do, and results in too large a revision of the standard.

To have a better understanding of new feature proposals, I suggest that, rather than accepting more-or-less fully formed new feature proposals, use cases where Fortran is currently wanting should be considered. That is, we should ask that problems to be solved be brought to WG5, rather than proposed solutions, as was done previously.
This should not be taken to imply that a new feature proposal cannot contain any description at all of a desired new feature, only that the problem to be solved is the focus, and any proposed solution is merely an explanatory example. The appropriate use case is a difficulty encountered writing an application using Fortran where the cause of the difficulty is related to Fortran, rather than to the problem the application is meant to solve.

Before vetting a new feature proposal, the use cases that motivate the proposed new feature should be stated. This allows subgroups to find common, or similar, problems. The number of similar use cases gives a measure of the breadth and severity of the issue presented. So equipped, subgroup has at least as much information as, if not more than, any single proposer. Thus, subgroup may address the motivating issues at least as effectively as any single proposal. WG5 also has a better basis for believing that the feature actually proposed for the revision addresses the motivating problem. As with compiler engineers and cost estimates, the information at hand may not be perfect, but proceeding with more information is clearly better than proceeding with less.

Subgroups, including their compiler engineers, can then distill, combine, split, enhance or limit, and rank use cases. Where consensus is gained in subgroup that a set of use cases has sufficient and sufficiently widespread weight, then subgroup can design the feature. Compiler engineers will be involved in the design of each new feature from the beginning. Thus, compiler engineers can review and critique proposals, so features, or portions of features, may be avoided where there is a high risk of incurring excessive costs for the benefit delivered. Of course, the idea is not to let compiler engineers simply gut proposals, but to actually solve the issues identified by applications programmers. A balance is sought at the point of best cost-benefit ratio.
Consensus judgement is required. The cumulative workload of the entire revision is better justified than is the cumulative workload where such estimation is absent.

The notion of cumulative workload raises an important, if perhaps subtle, point. Compiler suppliers have a wide variety of sizes of engineering teams, from several tens to one. The professionals-versus-volunteers divide is another major source of variety among engineering teams. Suppliers with larger engineering teams may, perforce, apply more resources to new feature implementation than can suppliers with smaller teams. So the incremental delay towards fully-compliant compilers may scale differently, considering a given new feature, among suppliers with various sizes of engineering teams. But however many resources a compiler supplier has, the needed resources are taken from other tasks, such as bug fixing, optimization, and efficient use of new hardware. One simply must assume that no compiler supplier has resources lying idle. Any supplier may have at play other factors, perhaps managerial or budgetary, such as a willingness or otherwise to commit resources to Fortran, that limit implementation rate. The free compilers, so important in academia, face their own challenges regarding difficult-to-implement new features, especially large ones.

We must have a means of determining when the cumulative workload has, in fact, reached the preset target amount. One approach that may be agreeable to all is to let compiler suppliers each state their own estimate of their own time to implement. When the proportion of suppliers whose estimate exceeds the target time exceeds a preset proportion of all suppliers, then the target is declared to be met. The views of every compiler supplier are included in the declaration, yet no one supplier dominates WG5's decision.
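
The polling rule just described can be illustrated with a minimal sketch in Python. It is only an illustration under stated assumptions: the names SupplierEstimate and budget_is_spent, the use of years as the unit, and the strict "greater than" comparisons are choices made here for the example, not part of the proposal.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SupplierEstimate:
    """One supplier's own estimate (hypothetical record, for illustration)."""
    supplier: str               # illustrative label for a compiler supplier
    years_to_implement: float   # supplier's estimate for the worklist so far


def budget_is_spent(estimates: Sequence[SupplierEstimate],
                    target_years: float,
                    trigger_proportion: float) -> bool:
    """Return True when the share of suppliers whose estimate exceeds
    target_years is itself greater than trigger_proportion.

    Assumption: both comparisons are strict; an empty poll never
    declares the budget spent.
    """
    if not estimates:
        return False
    over_target = sum(1 for e in estimates
                      if e.years_to_implement > target_years)
    # Compare counts rather than dividing, to keep the test simple.
    return over_target > trigger_proportion * len(estimates)
```

With four suppliers estimating 3.0, 4.5, 6.0, and 5.5 years against a five-year target, two of four exceed the target. A trigger proportion of 0.5 is therefore not passed, while a trigger proportion of 0.4 is. Every supplier's estimate counts once, so no single supplier decides the outcome.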
No matter what one thinks of either compiler engineers' ability to estimate implementation costs, or of applications programmers' ability to state the weight of use cases, having the estimates at hand is clearly better than not having them. The problem to be solved is better analyzed by having the focus on the problem rather than a single proposed solution; having various approaches to a problem's resolution under discussion is better than having a single proposed new feature.

Applications programmers may well find that compiler suppliers are less reluctant to treat a proposal for a new feature when they more fully understand the motivation for it, and have some confidence that implementation has low risk of surprising difficulties. Likewise, applications programmers may be more willing to accept that a feature is very expensive to implement after discussion of the issues involved with compiler engineers. And the interaction between compiler engineers and applications programmers may well result in a solution that neither had considered before.

In sum, a more detailed process may be expected to help WG5 understand better each step of the process. The resulting revision may be expected to be fully implemented in less time while delivering more benefit to applications programmers. This represents a positive response to the difficulties evident during the development and ongoing implementation of Fortran 2003 and Fortran 2008, and recognizes that Fortran 2015 will shortly be available.

V. Thou Shalt Sharpen Thine Pencil
----------------------------------

To make a specific proposal, I suggest a process with the following steps be adopted for the handling of new feature requests for the next revision of Fortran.

1. Plenary will discuss and adopt a budget prior to acceptance of any new feature proposals. It shall not subsequently be changed during the drafting of the revision.
   In whatever units preferred, and with whatever means of acceptance preferred, the budget is a statement of how long after publication of the new revision fully-compliant compilers are desired. My personal view is that a latency of around three to five years is best.

2. Any new feature request shall be stated as a set of use cases where applications programmers currently have undue difficulties. These use cases will be assigned to a subgroup, or subgroups, by topic. When plenary assigns a set of use cases to a subgroup, plenary may state its preferences for the kind of solution sought.

3. Subgroup will consider use cases, distilling, combining, splitting, enhancing or limiting, and ranking, as appropriate. The designated goal is to get the best return on investment, where investment is defined as the compiler engineers' time, and return is defined as the greatest benefit to the greatest number of applications programmers.

4. When a set of use cases has been accepted by subgroup as motivating a new feature, subgroup will design a set of requirements, together with a cost estimate from the compiler engineers in subgroup. This will be reported to plenary. Plenary may choose to constrain the resulting feature, or to accept it as proposed.

5. If the requirements are acceptable to WG5 in plenary, subgroup will make a set of specifications. This will be reported to plenary. All the compiler engineers in plenary will make an estimate of cost of implementation.

6. WG5 in plenary will then accept or decline the proposed new feature.

7. At the end of a meeting, a poll of suppliers will be made. If more than the preset proportion of suppliers give an estimate of longer than the preset overall time to implement, the budget is spent.

8. When the budget is spent, starting with the next meeting and beyond, no new feature requests will be accepted in plenary, nor assigned to subgroup. The worklist is fixed.
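
As a rough illustration of how steps 1, 6, 7, and 8 interact over successive meetings, the following Python sketch tracks a worklist against a fixed budget. The class and method names (Budget, Worklist, accept, end_of_meeting_poll) and the use of years as the unit are assumptions made for this example only; the steps above do not prescribe any particular bookkeeping.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass(frozen=True)
class Budget:
    """Step 1: adopted once by plenary; frozen so it cannot be changed."""
    target_years: float         # desired latency to fully-compliant compilers
    trigger_proportion: float   # preset proportion of suppliers


@dataclass
class Worklist:
    """Hypothetical bookkeeping for the features accepted so far."""
    budget: Budget
    features: List[str] = field(default_factory=list)
    fixed: bool = False         # becomes True once the budget is spent

    def accept(self, feature: str) -> bool:
        """Step 6: plenary accepts a feature, unless the worklist
        is already fixed (step 8)."""
        if self.fixed:
            return False
        self.features.append(feature)
        return True

    def end_of_meeting_poll(self, supplier_years: Sequence[float]) -> bool:
        """Step 7: poll each supplier's own estimate of time to implement.
        The budget is spent when more than the preset proportion of
        suppliers estimate longer than the target. Returns the new state."""
        over = sum(1 for y in supplier_years
                   if y > self.budget.target_years)
        if supplier_years and over > (self.budget.trigger_proportion
                                      * len(supplier_years)):
            self.fixed = True   # takes effect from the next meeting onward
        return self.fixed
```

For example, with a four-year target and a trigger proportion of one half, a poll of three suppliers answering 2.0, 3.0, and 3.5 years leaves the worklist open. A later poll answering 3.5, 4.5, and 5.0 years has two of three suppliers over target, so the worklist becomes fixed and further calls to accept are declined.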
Whether this proposal is accepted or not, WG5 should have some process in place prior to accepting any new feature proposals. Discussion of how to proceed will be more productive if a set of proposals is not already on the table.