                                        ISO/IEC JTC1/SC22/WG5-N1749

            VOLATILE Coarrays Break Existing Code
            -------------------------------------

                Nick Maclaren, 31st October 2008

In my view, one of the most serious faults of VOLATILE coarrays is that
they will cause existing compiled code to break - i.e. code that uses
neither VOLATILE nor coarrays.  There are two reasons for this, which
are explained in N1745, but this paper provides more detail on one of
the points (and a lot less context).

    1) It is permitted to define a coarray in one image without it
having the VOLATILE attribute, and to reference it in another with the
VOLATILE attribute, without the segments being ordered.  This does not
work, because many compilers on many architectures access non-VOLATILE
objects in ways that are not safe together with cross-processor access.
An example is given below.

    The only solution is to require coarrays to have the VOLATILE
attribute in all scopes or none.

    2) Note 12.51 states that constraints C1274 to C1285 are designed
to guarantee that a PURE procedure is free from side effects, and
therefore may be called safely where there is no explicit order of
evaluation, and need not be called if its value is not needed.  The
introduction of VOLATILE coarrays makes those constraints inadequate.
No example is given here, because one is given in section 3 of N1745.

    The only solution is to forbid any reference to a VOLATILE coarray
inside PURE procedures.

N1745 makes similar remarks about functions, but that has been said by
one person to be incorrect.  The semantics of impure function calls has
always been a source of heated and unproductive debate, and the point is
not critical anyway, so I propose to concede it.

Note that this paper is NOT arguing for the preservation of VOLATILE
coarrays, as resolving the specification problems is a much harder task,
but that is not covered here.

Example of First Issue
----------------------

In the following, I shall use the Intel 64 Architecture as an example,
but similar remarks apply to several other architectures.  Consider the
following program:

    PROGRAM Main
        INTEGER :: x(100) = 123456789, y(100)[*] = 0
        IF (THIS_IMAGE() == 1) THEN
            CALL Fred(x,y)
        ELSE IF (THIS_IMAGE() == 2) THEN
            CALL Joe(y)
        END IF
    CONTAINS
        SUBROUTINE Fred (in, out)
            INTEGER, INTENT(IN) :: in(100)
            INTEGER, INTENT(OUT), TARGET :: out(100)
            out = in+in
        END SUBROUTINE Fred
        SUBROUTINE Joe (data)
            INTEGER, VOLATILE :: data(100)[*]
            PRINT *, data[1]
        END SUBROUTINE Joe
    END PROGRAM Main

Note that SUBROUTINE Fred contains no use of either VOLATILE or
coarrays.  Compiling it on its own (or in the whole program with
cosubscripts removed) using the Intel 10.1 Fortran compiler with the
-fast option generates the following instructions:

    movdqa    (%rdx,%rdi), %xmm0
    paddd     %xmm0, %xmm0
    movdqu    %xmm0, (%rdx,%rsi)

The following statements are taken from the Intel 64 Architecture Memory
Ordering White Paper:

    http://www.intel.com/products/processor/manuals/318147.pdf

Section 1 states that aligned loads and stores of 1, 2, 4 and 8 bytes
are implemented atomically, and then includes the following paragraph:

    Other instructions may be implemented with multiple memory accesses.
    From a memory-ordering point of view, there are no guarantees
    regarding the relative order in which the constituent memory
    accesses are made.  There is also no guarantee that the constituent
    operations of a store are executed in the same order as the
    constituent operations of a load.

The above program is therefore undefined behaviour, because the store is
of 16 bytes, and therefore may store the bytes in any order.

Aside: Using a VOLATILE coarray with two different base types (especially INTEGER and REAL) would cause similar problems for a few architectures, because some architectures require special action to ensure that the integer and floating-point memory pipelines are synchronised across processors. However, that seems to be already excluded in the current draft.