                                            ISO/IEC JTC1/SC22/WG5-N1751

                        An Analysis of NOTE 8.38
                        ------------------------

                    Nick Maclaren, 24th October 2008


This paper arises out of Email discussions, and is an attempt to explain
why defining the semantics (i.e. memory model) of VOLATILE coarrays is
an intractable task.  It should be noted that few other interfaces even
attempt it, and the use of volatile in POSIX threads, Unified Parallel C
and most C-derived parallel languages is implicitly undefined behaviour.
SC22WG21 (C++) is attempting it, and it has caused a completely
disproportionate amount of trouble.


The Example
-----------

    LOGICAL, VOLATILE :: LOCKED[*] = .TRUE.
    INTEGER :: IAM, P, Q
    IAM = THIS_IMAGE()
    IF (IAM == P) THEN
        ! Preceding segment
        SYNC MEMORY                     ! A
        LOCKED[Q] = .FALSE.             ! segment Pi
        SYNC MEMORY                     ! B
    ELSE IF (IAM == Q) THEN
        DO WHILE (LOCKED); END DO       ! segment Qj
        SYNC MEMORY                     ! C
        ! Subsequent segment
    END IF

On the face of it, this looks fine.  But where, in the normative text of
the standard, is it specified that segment Qj must be executed in
parallel with segment Pi?  Segments Pi and Qj are unordered, and it is
therefore permitted for a processor to execute segment Qj to completion
before starting segment Pi.  That, of course, means that this example
will not work.

Unfortunately, that is not only possible, but actually likely, in some
important environments.  Consider a shared-use multi-core system, or
cluster built out of them, such as is ubiquitous in scientific research
organisations for program development and moderate-scale calculations.
These are configured for shared use and not for gang-scheduled HPC work,
and the programmer has no control over the scheduling policy.  At the
point of synchronisation, image P may well be 'swapped out' in favour of
image Q, because the latter is actively executing and the former is in a
wait state.
I have observed this phenomenon, and had to do horrible things to the
system configuration (as superuser) to resolve the problem, which were
incompatible with supporting shared use.


Adding SYNC MEMORY
------------------

Let us change the loop to include a SYNC MEMORY statement, which has
been claimed to resolve the problem at considerable cost in efficiency.
It would be possible to add an extra SYNC MEMORY after the IF statement,
but that would change nothing.

    LOGICAL, VOLATILE :: LOCKED[*] = .TRUE.
    INTEGER :: IAM, P, Q
    IAM = THIS_IMAGE()
    IF (IAM == P) THEN
        ! Preceding segment
        SYNC MEMORY                     ! A
        LOCKED[Q] = .FALSE.             ! segment Pi
        SYNC MEMORY                     ! B
    ELSE IF (IAM == Q) THEN
        DO
            SYNC MEMORY
            IF (.NOT. LOCKED) EXIT      ! segment Qj(k), k=1...
        END DO
        SYNC MEMORY                     ! C
        ! Subsequent segment
    END IF

It would seem that at least one of the segments Qj(k) would be ordered
after segment Pi, but that actually depends on a circular argument, and
is therefore fallacious.

Segment Qj(k) is clearly ordered after segment Pi if and only if LOCKED
was .FALSE. during the execution of segment Qj(k-1).  But LOCKED is
required to be .FALSE. in segment Qj(k-1) if and only if segment Qj(k-1)
is ordered after segment Pi.  That obviously recurses back to the
origin, and we have a reductio ad absurdum.  So introducing the SYNC
MEMORY statement does not help.


An Attempted Fix
----------------

It is attractive to regard a VOLATILE coarray reference as a
pseudo-segment with the property that the ordering and access are
simultaneous; that could be said to fix the above example, but it
introduces an even more intractable problem.

VOLATILE coarray references may occur in expressions, which need not
always be evaluated in full (7.1.7) nor in syntactic order (7.1.5.2), so
we now have the concept of an undefined number of pseudo-segments that
may be evaluated in an undefined order, nested within a real segment.
While it might be possible to define ordering in terms of those
pseudo-segments, I cannot think of how.

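As a sketch of why the number and order of such pseudo-segments are not
fixed, consider the following program.  The names SKIP and FREE, and the
choice of P = 1 and Q = NUM_IMAGES(), are assumptions made purely for
illustration; they do not appear in NOTE 8.38.  No image defines LOCKED
here, so the only question is how many references are made, and in what
order.

```fortran
PROGRAM PSEUDO_SEGMENT_SKETCH
    ! Illustrative only: SKIP, FREE and the values of P and Q are
    ! hypothetical and are not taken from NOTE 8.38.
    IMPLICIT NONE
    LOGICAL, VOLATILE :: LOCKED[*] = .TRUE.
    LOGICAL :: SKIP, FREE
    INTEGER :: P, Q

    P = 1                           ! assumed image numbers
    Q = NUM_IMAGES()
    SKIP = (THIS_IMAGE() == P)

    ! 7.1.7: if SKIP is true, the processor need not reference LOCKED
    ! at all, so this statement holds zero or one pseudo-segments.
    FREE = SKIP .OR. .NOT. LOCKED
    PRINT *, 'Image', THIS_IMAGE(), 'first expression: ', FREE

    ! 7.1.5.2: the two references may be made in either order and,
    ! by 7.1.7, whichever is made second need not be made at all if
    ! the first already decides the result.  So this statement holds
    ! one or two pseudo-segments, in an unspecified order.
    FREE = (.NOT. LOCKED[P]) .AND. (.NOT. LOCKED[Q])
    PRINT *, 'Image', THIS_IMAGE(), 'second expression:', FREE
END PROGRAM PSEUDO_SEGMENT_SKETCH
```

Any ordering rule based on pseudo-segments would have to give a meaning
to both assignments to FREE without knowing which of the references to
LOCKED the processor actually makes.
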
Conclusion
----------

The example program given in NOTE 8.38 relies on undefined behaviour,
and is effectively unfixable.  Because it is probably the simplest
plausible use of VOLATILE coarrays, that implies that all realistic uses
of them are, too.  Therefore there are three options:

    1) To accept a standard with known inconsistencies, and hope that it
    does not cause too much trouble.  Experience with C89, C99 and POSIX
    shows that this is a forlorn hope.

    2) To copy SC22WG21 and put the effort necessary into defining a
    proper memory model that includes VOLATILE coarrays.  I know that I
    am not capable of that task.

    3) To exclude VOLATILE coarrays from the standard, not least because
    there is no prior art (neither Co-Array Fortran nor the current Cray
    compiler documentation includes them).