                                        ISO/IEC JTC1/SC22/WG5 N1844

Result of WG5 informal ballot on the schedule and draft of TR 29113

John Reid

1. FIRST QUESTION

The first question asked whether this proposed schedule for TR 29113 is acceptable.

   N1838 reviewed by WG5              2010-11
   Third draft                        2011-02
   WG5 ballot                         2011-02
   PDTR forwarded to SC22             2011-03
   PDTR ballot initiated              2011-03
   PDTR ballot comments available     2011-05
   DTR constructed                    2011-06
   DTR ballot initiated               2011-07
   DTR ballot results available       2011-10
   TR published                       2011-11

ANSWERS:

Bader, Donev, Long, Rasmussen, Reid, Takata

Yes.

Cohen

It is certainly better than the previous one. Some technical aspects of the TR continue to be challenging, so I am not completely confident that we have relaxed the deadlines enough.

Corbett

I do not believe that the revised schedule proposed above is achievable. I do not think agreement on the technical issues by February 2011 is likely.

Maclaren (late comment)

While I believe that the timescale is feasible, in theory, I believe that it is undesirable. Almost certainly, the specification will be settled with major problems unresolved, and these may include actual inconsistencies in the basic model.

Muxworthy

Yes but it is extremely optimistic. It depends (a) on all the UTIs being resolved at the February J3 meeting, (b) the revised document being distributed immediately afterwards, and (c) WG5 then finding no faults in it.

Whitlock

Yes, although I am concerned it still does not give us enough time to resolve technical issues.

CONCLUSION: We will use the revised schedule.

2. SECOND QUESTION

The second question was:

Do you have any comments on N1838? Please give special attention to the UTIs. For each significant change, please provide text for a new UTI.

The answers are grouped into topics. In some cases, replies in emails are included.

..............................................................
Not specific

Cohen

(0) The way that assumed-type is supposed to work when not in conjunction with assumed-rank/shape seems philosophically inconsistent with the way that it works in conjunction with assumed-rank/shape. This seems to lead to either technical problems or usage problems.

(1) For assumed-type assumed-rank, passing the type information to the C program does not seem to work when the actual argument is assumed-type. Having the type information for TYPE(*) dummies not working on TYPE(*) actuals is just bizarre. Inclusion of the type field in the descriptor should probably be reconsidered. A reasonable alternative would be to forbid passing TYPE(*) actuals to TYPE(*) dummies. A third alternative would be to require the type (and size, see 2) information to be passed around for all TYPE(*) things, but that might not meet the interoperability goals.

(2) For assumed-type assumed-rank, passing the element size to the C program does not seem to work when the actual argument is assumed-type. If we really want this field, then either we need to forbid passing TYPE(*) actuals to TYPE(*) dummies or to require the compiler to pass the size information around. If we go for the "forbid" route, I hope that we can pass CLASS(*) actuals to TYPE(*) dummies - and that if this ends up in a C descriptor that the descriptor fields "type" (if we keep that) and "elem_size" (and "length" if we get that) are set appropriately. In fact I think we can do that anyway as the TR stands.

(3) I don't understand what the second sentence of 5.2.7p1 is trying to say.

Donev

Let me write my proposed solution, but I won't try to get the language perfect:

   An assumed-type dummy argument that is of assumed-shape or assumed-rank shall not correspond to an explicit-shape or assumed-size actual argument that is itself an assumed-type dummy argument.

[I think allocatables and pointers are OK. Also note that a CLASS(*) actual is allowed.]
If this is done, the "unknown type" should be removed from the list of type enumerators in the C header file.

Note for Rationale: This ensures that a caller can always pass type information for an assumed-shape or assumed-rank to the callee, even though there is no means to inquire it within Fortran.

..............................................................

Not specific

Cohen

(4) In the past we have always required the TR to give actual edits to the standard to implement the extension. That could well be necessary this time around to get the required clarity as to which provisions/rules of F2008 will continue to apply and which ones will not.

Snyder

This is extremely valuable. I found many of what would otherwise have been hidden corners with subtle problems lurking in them when I developed edits for the submodule TR.

Takata

I agree with Malcolm.

..............................................................

Not specific

Maclaren (late comment)

The worst problems relate to the issues raised in 10-235, which are that the semantic model of descriptor use is incompatible with both Fortran's and C's. Furthermore, certain critical operations (like passing an allocatable array as held by C as an assumed-shape dummy argument to Fortran) can be done only by allowing C to create descriptors directly (i.e. not using provided calls), but that also permits many actions that are incompatible with Fortran semantics. There is no specification of what actions the C code may perform on descriptors, except in a few cases.

If these issues are not addressed properly, they will lead to variant de facto constraints, a consequent loss of portability, and extremely problematic requests for interpretation. They are NOT the sort of issue that can be resolved by wording later, as they involve the basic specification of the semantics of descriptors.

..............................................................
Page 3, UTI 1

Donev

Re UTI 1: I do not like "unlimited polymorphic", and in fact strongly prefer that it be made very clear that assumed type has nothing to do with unlimited polymorphic. But the standardese may need some more work than I have time for.

Cohen

But it *is* unlimited polymorphic. It's certainly not "CLASS(*)", but it would be wrong to say that it has "nothing to do with [it]". And it is certainly polymorphic (that is what "assumed type" actually means) and if it is not limited as to what type it can assume, then it is also unlimited. Not being able to do type enquiry, and not being able to copy it, does not make it not unlimited polymorphic.

Cohen

In any case, adding TYPE(*) is a big deal as far as the standard goes and careful scrutiny of the wording and technical interactions will be required. Whether we choose to reuse the "unlimited polymorphic" term or not, nearly every occurrence of unlimited polymorphic will bear examination to see whether TYPE(*) is relevant, or indeed whether it should be relevant.

Furthermore, there are also a whole bunch of places where we currently say unlimited polymorphic and we do want that to apply to TYPE(*), so if we call it something different then we need to edit all of *those* places! (This is why I suggested using the term in the first place.)

At the end of the day, what we have is

   CLASS(*): no declared type, any dynamic type, few restrictions on use, not very interoperable
   TYPE(*):  no declared type, any dynamic type, many restrictions on use, interoperable

IMO there is no better description for the declared/dynamic type parts of that than "unlimited polymorphic". Unless the wording problems that using the term would introduce are greater than the wording simplicity it allows elsewhere, we should use it. Not being able to do type enquiry, and not being able to copy it, does not make it not unlimited polymorphic.

Bader

Aleks has already pointed out some shortcomings here.
Given the example in A.2 I'd particularly like to know what a_desc->type will be if the calling Fortran code does

   CLASS(*), ALLOCATABLE :: A(:,:), ...
   ALLOCATE(INTEGER(C_INT) :: A(10, 10), ...)
   CALL elemental_mult(A, B, C)

as opposed to

   INTEGER(C_INT) :: A(10, 10), ...
   CALL elemental_mult(A, B, C)

Quite generally, the "type" member in CFI_cdesc_t appears to cause more confusion than it provides actual type safety. For statically typed dummy arguments, the C programmer will not need the information in the descriptor, because it is already provided by the signature. For TYPE(*) there either is no access to type information at all (in the descriptorless case), or complete type information is limited to intrinsic types (i.e., two different derived types with same elem_len cannot be disambiguated at all via the information available in the descriptor). These limitations should be pointed out in a NOTE. The recommended usage pattern for non-dynamic TYPE(*) entities should be that the type information is explicitly provided by the programmer; an appropriate comment should also be added to the elemental_mult() example A.2. (TYPE(*) arguments with the POINTER or ALLOCATABLE attribute will need to be associated with ultimate CLASS(*) actual arguments given the type compatibility rules).

Furthermore, examples with object management inside C should be added to A.2 (using CFI_associate as well as CFI_allocate).

..............................................................

Page 5

Donev

3.3 paragraph 2 rules out certain actual arguments: derived types that have type parameters, type-bound procedures, or final procedures. It should be made clear whether these restrictions apply to the dynamic type only. Specifically, if the actual is CLASS(*), it seems to me there is no way to do any safety checking until runtime. Typically this means people will be using CLASS(*) as a way to bypass the restriction and no one will notice or complain.
I would suggest that polymorphic actuals corresponding to assumed-type dummies be forbidden entirely.

..............................................................

Page 9

Donev

Consider the example:

   interface
      subroutine sub_c(x) bind(c)
         type(*),dimension(:) :: x
      end subroutine
   end interface

   subroutine sub(x)
      type(*),dimension(*) :: x
      call sub_c(x(1:10)) ! What type does sub_c get?
   end subroutine

Is the intention that the CFI_type_t field that sub_c gets in its descriptor be "CFI_type_unspecified"? If so, I suggest that this be made explicit, that is, there is a rule that explains how the type field in the descriptor gets filled when the actual is assumed type or, even worse, polymorphic (see my point re 3.3 paragraph 2 above).

..............................................................

Page 10, 5.2.2:

Bader

The definition of the member dim[] of CFI_cdesc_t mentions a "corresponding dimension" of the object without actually defining how this correspondence is set up. Presumably the zeroth element of dim corresponds to the first dimension in Fortran. Or is it the last? At the very least, a NOTE should be provided so C programmers do not get confused.

..............................................................

Page 11, UTI TR3

Donev

I personally hate the proposed "solution" for characters. When it comes to voting again, count me for alternative (1), making elem_len equal to the character length (and if we do add wchar_t we will have another type identifier, different from char, so I do not see what the problem would be).

Reid

I favour option (2): add an additional character length member. Reason: this is the most straightforward way to solve the problem.

Bader

I agree with the argument that the rank field should match the return value of the RANK intrinsic.
Of the contenders, I favor "(2) add an additional character length member" (this avoids contradictions with the definition of the elem_len member in 5.2.2, a change to which would be needed if (1) is adopted).

Long

Regarding UTI TR3, I definitely do not like the current rank-expansion scheme in the TR. The main alternate options seem to be 1 (use elem_len) and 2 (add a new element length member). I would argue that option 1 is better because:

1) From the Fortran program point of view, the length of the element of the array IS the len() of an element of the array for a character array. This is the value proposed to be used as the elem_len member in the C descriptor.

2) A character length member in the descriptor is dead weight for all types except char. And for the specific type of char, the elem_len for a single char is defined to be 1 (since the value is in units of size of char), so sticking 1 into the elem_len field provides no added information.

3) In many cases (especially in the MPI usage context) the objective is to just move a block of memory from one place to another. With option 1, there is nothing special about the character len()>1 case as the appropriate number of bytes to move is the elem_len field value. With option 2, the user would have to check for type char and, for that case only, use the new field value to get the number of bytes to move per array element. (Nominally, it would be new_member*elem_len, but elem_len=1 for option 2.) This seems like an unnecessary coding complication.

Rasmussen

Regarding UTI TR3: I mildly prefer option 1 (use elem_len) over option 2. But either of the two options is much preferable to rank expansion.

Whitlock

UTI TR2 (page 10). I prefer option (2): add an additional character length field.

..............................................................

Page 11, macro CFI_DESC_T

Donev

Some more specification is needed to clarify what this macro does, or an explicit form for it provided as an example.
Otherwise one cannot evaluate whether it will be useful in practice. Consider

   CFI_DESC_T(5) object; // Does object.base_addr work?

Does CFI_DESC_T(5) expand to some opaque block-of-enough-bytes or is it an actual typed struct? If so, one can only use the object of type CFI_DESC_T(5) via pointers and casts, as done in the example in Note 5.2, but one could not in fact do something like object.base_addr.

Maclaren

10-232r2 was written in haste, but the example in that does permit such use (and initialisation using designators). That is fairly clear from the C standard. However:

There is a trivial bug in the example, because I shouldn't have used rank as a macro argument (it was a last-minute edit). Any name that is NOT a type or field name will do - e.g. 'arg'.

Upon deeper study, C99's 'definitions' of compatibility and completeness of types are far more confused than I had remembered, and some more precise wording is needed to avoid ambiguity. I have written a paper and will upload it, but it is PURELY wordsmithing.

..............................................................

Page 15, 5.2.6.5 CFI_associate: Possible UTI

Bader

The applicability to assumed-shape entities appears spurious, at least in the case the ultimate actual argument was created in Fortran, possibly as a non-dynamic entity. I suggest limiting the use of this function to associating an existing C memory area with a Fortran entity with the POINTER attribute, analogous to C_F_POINTER, but with the additional feature that the resulting Fortran pointer can be non-contiguous. Dynamic creation of memory for Fortran objects should always be done using CFI_allocate, or by explicitly invoking malloc previous to a call to CFI_associate. Furthermore, discussion of memory requirements should either be removed or quite generally refer to the information stored in the descriptor.

..............................................................
Page 16, 5.2.6.9

Donev

A routine such as CFI_cdesc_to_bounds cannot be implemented in general. It only works for contiguous objects, since strides do not have to be integer multiples of the elem_len. Consider for example an array of derived type and a data-ref such as array_of_dt%integer_component. Depending on what the other components of the derived type and the compiler alignment choices are, one cannot reverse-engineer the Fortran triplets from the C strides. The routine CFI_cdesc_to_bounds should be removed.

Maclaren (late comment)

I agree that CFI_cdesc_to_bounds should be removed, as it currently specifies an impossibility.

..............................................................

Page 16, 5.2.8

Bader

Para 3 item (2)(a) implies that functions with results which are arrays of interoperable type with or without the POINTER or ALLOCATABLE attributes are not interoperable. N1820 does not mention this case at all (there are of course some other cases which are not covered, but this seems the most glaring omission). Was this intended or is it an oversight?

...........................................................

Minor comments

Reid

3:16-17. On line 16, add "one of" before "the intrinsic"; on line 17, change "or" to "and".

5:21+. Reword NOTE to "Because the type and type parameters of an assumed-type dummy argument are assumed from its effective argument, neither can be used for generic resolution. Similarly, the rank of an assumed-rank dummy argument cannot be used for generic resolution."

9:30. Add period at line end.

11:3. Change "is equal to (CFI_index_t)-2" to "shall have the value -2".

14:7-8. Replace "Pointer objects" by "A pointer object".

15:34. Add comma after "pointer".

15:44-45. Move the sentence "Since ..." to a note, since it is just explanation.
Bader

5.2.3:
   para 3: Replace "the member attribute shall be" by "the member attribute shall have the value"
   para 3: Replace "member dim is equal to" by "member dim shall be equal to"

5.2.6.1: Suggested wording fix: In para 1, replace "for use in C functions" by "for invocation from C".

5.2.6.x: For the functions returning an integer value, the sentence "The result is an error indicator" is a general statement and should be moved to before specific instances of errors are listed wherever that is the case (given the text in 5.2.6.1 para 3 the sentence might even be removed).

5.2.6.5 CFI_associate: remove the comma in the prototype at: "void *, base_addr"

5.2.7: The first sentence of para 1 appears to be outdated. I suggest replacing "The base address ... assignment." by "The base address in the C descriptor for a Fortran pointer shall be only modified by execution of either the CFI_associate or CFI_allocate functions, by pointer association or nullification inside a Fortran procedure, or by deallocation inside a Fortran procedure if this is permissible as stated in 6.7.3.3 of ISO/IEC 1539-1:2010." Similarly, in para 3, the sentence "It is possible ... in a C descriptor." should be replaced by "It is possible to associate a memory area defined within C with a Fortran pointer in a C descriptor (5.2.6.5)." (this memory area need not have been generated by a call to malloc()).

5.3: Is this really needed? The contents of 5.2.6.7 and 5.2.7 appear to me to be sufficient to ensure the desired semantics.

A.1.1: In para 1, remove ", as a solution to the "-i8" compiler switch problem."