ISO/IEC JTC1/SC22/WG5 N1532

	                    NaN I/O
                         
                         Richard Maine

As currently specified, I/O of NaNs does not achieve the stated
goal of portability.  I briefly mentioned one aspect of this on
the floor of the last meeting.  My comment not being accepted by
the mover, I didn't push it at the time because the paper in
question had much other material that seemed worth getting done
and this issue seemed localized enough to be suitable for fixing
in a next iteration.

The time for that next iteration is now here.  I also now see
some aspects of the problem beyond those that had occurred to
me before.  They are still localized and should be easy to fix,
but I think the fixing needs doing.

The problem is that the current draft does very little to
guarantee that a NaN written by one processor will be readable as
a NaN by another, which was supposed to be one of the main points
of this feature.  I am not talking about the fine point of
reading exactly the same NaN, but rather the more fundamental
point of having a valid input field at all.

I see it as reasonable to allow processor dependence in the exact
output forms.  However, we need to specify things in such a way
that an output written by one processor can be read by another.
We haven't done that.  We haven't actually even guaranteed it for
the same processor, but I agree we can count on the vendor to get
that right; we can't count on different vendors to get each other's
forms right though.

PROBLEMS

1. The one I mentioned at the last meeting - we need to specify more
   about the allowed "processor-dependent" characters.  We
   haven't ruled out things like delimiters that would cause
   ambiguities in list-directed I/O.  We haven't ruled out close
   parens, making the end possibly ambiguous.  We haven't even
   ruled out characters that are not in the Fortran character set
   and thus might not be representable on all processors.
   Although my floor comment about this was "brushed off" as
   something the vendors would just get right, I don't think
   that's a good enough answer for portability.

   I'm not sure whether it is best to take a very restrictive
   approach (perhaps limit it to alphanumeric characters - that
   would have the advantage of being easy to describe) or a more
   liberal one (disallowing only specific characters likely to
   cause problems).  I certainly think that we should at least
   restrict it to the Fortran character set.

2. It is fine to have some things processor-dependent on output, but
   too many of the words about output were "copied" to the
   requirements for input, where they have a very different effect.
   For input, it isn't the processor that supplies the
   characters; the processor interprets the characters supplied
   to it.  Thus saying that a form is processor-dependent for input
   is a killer for portability; it means that the user is
   responsible for making the data match whatever requirements
   the processor defines.  What we want to do here is require the
   processor to accept anything that meets the form specified by
   the standard.  What is processor dependent is the
   *INTERPRETATION* rather than the form; the distinction between
   interpretation and form is really important here.

3. Mostly just a wording problem, but one that causes possible
   technical confusion.  Sometimes we talk about nonblank
   characters enclosed in parens.  Other times we talk about
   nonblank characters following the NaN string.  Do we mean this
   distinction literally, or is the phrasing just sloppy?  I'm
   guessing that the wording is just sloppy, but there is always
   the danger that someone might read them literally.  Or perhaps
   we did mean them literally, in which case some people will no
   doubt make the same guess as I did (and they'll be wrong).

   Note that the parens themselves would be nonblank characters
   following the NaN string, even if there were no characters
   between the parens.  I am assuming that "any number" includes
   zero; if it isn't intended to, then we need to say so.

4. We specified that input of NaN with nothing following it
   gives a quiet NaN, but we didn't say anything corresponding
   for output.  Did we just forget that or did we really mean
   for it to be unspecified?  If it is unspecified on output,
   then I don't see much benefit to specifying it on input.
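
The first three problems can be made concrete with a small sketch.
It is written in Python purely for illustration and is not proposed
wording.  Several things in it are assumptions of mine rather than
text from the draft: the separator set is a simplified model of
list-directed input, NAN_FIELD models the "alphanumeric" choice
made in the edits below (sign handling is omitted and the
case-insensitive match is assumed), and processor_a and
processor_b are hypothetical processors.

```python
import re

# Problem 1: simplified model of list-directed value separators
# (comma, slash, blanks).  Real list-directed input has more rules.
SEPARATORS = re.compile(r"[,/\s]+")

def split_record(record):
    """Split a record into fields the way a naive reader might."""
    return [f for f in SEPARATORS.split(record) if f]

# Problem 2: the FORM is fixed for every processor (assumed here to be
# "NaN", optionally followed by zero or more alphanumeric characters
# enclosed in parentheses); only the INTERPRETATION may vary.
NAN_FIELD = re.compile(r"nan(?:\(([A-Za-z0-9]*)\))?", re.IGNORECASE)

def read_nan(field, interpret):
    """Accept any field of the fixed form; delegate its meaning."""
    m = NAN_FIELD.fullmatch(field.strip(" "))
    if m is None:
        raise ValueError(f"not a valid NaN input field: {field!r}")
    payload = m.group(1) or ""   # NaN and NaN() both give ""
    return interpret(payload)

def processor_a(payload):
    """Hypothetical processor: any payload means a signaling NaN."""
    return "quiet" if payload == "" else "signaling"

def processor_b(payload):
    """Hypothetical processor: ignores the payload entirely."""
    return "quiet"

# Problem 3: two literal readings of the current wording.  Both helpers
# assume the field starts with the three characters "NaN".
def trailing_chars(field):
    """Nonblank characters FOLLOWING the NaN string."""
    return field.strip()[3:].replace(" ", "")

def enclosed_chars(field):
    """Characters ENCLOSED in the parentheses, if there are any."""
    rest = field.strip()[3:]
    if rest.startswith("(") and rest.endswith(")"):
        return rest[1:-1]
    return ""
```

Under this model, split_record("1.0 NaN(a,b) 2.0") yields four
fields, two of them fragments of the NaN, whereas an alphanumeric
payload such as NaN(ab) survives intact.  Both hypothetical
processors accept NaN(q1) but are free to disagree about what it
means.  For the field NaN(), trailing_chars returns "()" while
enclosed_chars returns the empty string, which is exactly the
difference between the two phrasings in problem 3.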

EDITS (relative to J3/03-007)

  These edits make some choices for solutions to the above
  problems.  They are not the only plausible choices.  If we
  make different ones, then the edits can be revised, but I
  wanted to have a draft of edits to start from.  (I chose
  alphanumeric out of pure laziness - it was the easiest choice
  to express.)

  {This is input, so don't say it is processor-dependent.}

  [230:19] "a processor-dependent" -> "any"

  {My arbitrary choice}

  [230:19] "nonblank" -> "alphanumeric"

  {We don't allow blanks in the first place, so the nonblank
   qualifier here is just confusing.}

  [230:24] delete "nonblank".

  {I am interpreting NaN() to be the same as NaN.  If that
   wasn't the intent, then change this.}

  [230:24] After "characters" add "in the optional parentheses".

  {My arbitrary choice again}

  [231:31] "nonblank" -> "alphanumeric"

  {Make outputs correspond to what we require for inputs}

  [231:32] After the "." insert
      "If the NaN is a quiet NaN, there shall be no characters
       within the optional parentheses."
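
As a rough illustration of what the output edits would require of a
processor, here is a minimal Python sketch of a hypothetical
write_nan routine.  It is not proposed wording.  The requirement
that a non-quiet NaN carry a nonempty payload is my own assumption,
made so that the output cannot be read back as a quiet NaN; the
edits above do not say so.

```python
def write_nan(is_quiet, payload=""):
    """Format a NaN output field under the edits sketched above."""
    if is_quiet:
        # Proposed [231:32] rule: a quiet NaN has no characters within
        # the optional parentheses, so any payload is ignored here.
        return "NaN"
    # Proposed [231:31] rule: only alphanumeric characters are allowed.
    # Assumption (not in the edits): the payload must also be nonempty.
    if not (payload.isascii() and payload.isalnum()):
        raise ValueError(f"payload must be nonempty alphanumeric: {payload!r}")
    return f"NaN({payload})"
```

With these rules a quiet NaN is always written as plain NaN, and a
payload containing a comma or a close paren is rejected before it
can reach the output record.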