ISO/IEC JTC1/SC22/WG5 N1913 The module for Part 2, varying length character strings John Reid 16 April 2012 I promised at Garching to look into the conversion to Fortran 2003 of the procedures of subclauses 3.6 and 3.7 of Part 2 of the Standard. These are the I/O procedures GET, PUT, and PUT_LINE and the elemental procedures EXTRACT, INSERT, REMOVE, REPLACE, and INSERT. The other procedures of Part 2, except for CHAR with a character argument and VAR_STR, are already in Part 1. Lawrie Schonfelder wrote a sample module to implement Part 2. This is available from ftp://ftp.nag.co.uk/sc22wg5/ISO_VARYING_STRING/ He also wrote two sample programs ilustrating the use of the facilites. These are included as an annex to Part 2 and are available from the same ftp site. Bill Long has pointed out that the sample programs are not portable because they rely on the IOSTAT value -1 indicating that the end of the file has been reached. I have made a new module and new versions of the two sample programs. These are appended to this paper. They run successfully under the Nag compiler. Each occurrence of type(VARYING_STRING),INTENT(IN) was replaced by CHARACTER(LEN=*),INTENT(IN) Each other occurrence of type(VARYING_STRING) is replaced by CHARACTER(LEN=:),ALLOCATABLE This resulted in arguments that were previously documented as either of type CHARACTER or VARYING_STRING becoming only of type CHARACTER. As well as the procedures subclauses 3.6 and 3.7, I included CHAR with a character argument and VAR_STR. The number of specific procedures reduced from 43 to 18. There were obvious mechanical changes needed within the procedures. The functions EXTRACT, INSERT, REMOVE, and REPLACE cannot be elemental because each function result is allocatable with deferred character length. The subroutine SPLIT cannot be elemental because the arguments string, word, and separator are allocatable with deferred character length. This is because the elements of a character array have to have the same character length. A similar problem occurs in the vocabulary sample program, where an array holds words of varying length. Following a suggestion from Malcolm Cohen, I declared my own derived type with a single component of deferred character length and worked with arrays of this type. In this example, none of the functions was called elementally, but a user that wishes to do this could write impure elemental wrappers to get the effect. It might be thought that non-elemental invocations of EXTRACT, INSERT, REMOVE, REPLACE, and SPLIT might be replaced by simple use of substrings and concatenation. For example, REMOVE(string,start,finish) can usually be replaced by string(1:start-1)//string(finish+1:), but there are end cases where it cannot. The reader can see the details in the code that is appended. In both sample programs, I access ISO_FORTRAN_ENV to get the IOSTAT value that indicates that the end of the file has been reached. Lawrie Schonfelder wrote a test suite that is available from: http://www.fortran.com/vst_95a.zip I have adapted codes from here to test the new module, but I am not confident that these adapted codes are adequate to fully test the revised code. In particular, they rely on interactive input from the keyboard and I was unsure of what is needed. If this work is to be taken further, new tests are needed. Two questions were asked at the meeting in Garching: 1. Can programs that use the varying length strings module be so easily converted that Part 2 of the Standard should be withdrawn? 2. Should extra intrinsic procedures be added to part 1? The user with arrays of type VARYING_STRING may declare a new derived type with a single component of deferred character length and write impure elemental wrappers to support elemental calls. Apart from this, the module given here (after full testing) should allow simple conversion. Changes are needed for the extraction of a single character; for example, set%chars(j) needs to be replaced by set(j:j). They are also needed if array features are employed for the character array component. For example IF( ALL(string%chars(ipos:ipos+lt-1) == target%chars) )THEN becomes IF( string(ipos:ipos+lt-1) == target )THEN I prefer not to comment on whether these functions should be added to Part 1. The case needs to be made by those who need them. We might consider providing an open source and fully tested version of the appended module. ............................................................................... MODULE ISO_VARYING_STRING USE,INTRINSIC :: ISO_FORTRAN_ENV IMPLICIT NONE ! Adapted by John Reid from the module written by J.L.Schonfelder. ! Version produced (12 April 2012) !----------------------------------------------------------------------------- ! This module provides revised versions of the generic procedures GET, PUT, ! PUT_LINE, EXTRACT, INSERT, REMOVE, REPLACE, and INSERT of ! ISO/IEC 1539-2:2000(E). !----------------------------------------------------------------------------- ! NB This module has not been fully tested. PRIVATE !-----------------------------------------------------------------------------! ! By default all entities declared or defined in this module are private to ! ! the module. Only those entities declared explicitly as being public are ! ! accessible to programs using the module. In particular, the procedures ! ! defined herein are made accessible via their generic identifiers only; ! ! their specific names are private. ! !-----------------------------------------------------------------------------! !----- GENERIC PROCEDURE INTERFACE DEFINITIONS -------------------------------! !----- Conversion procedure interfaces ---------------------------------------! INTERFACE VAR_STR MODULE PROCEDURE c_to_s ! character to string ENDINTERFACE INTERFACE CHAR MODULE PROCEDURE s_to_c, & ! string to character s_to_fix_c ! string to specified length character ENDINTERFACE !----- Input procedure interfaces --------------------------------------------! INTERFACE GET MODULE PROCEDURE get_d_eor, & ! default unit, EoR termination get_u_eor, & ! specified unit, EoR termination get_d_tset_c, & ! default unit, char set termination get_u_tset_c ! specified unit, char set termination ENDINTERFACE !----- Output procedure interfaces -------------------------------------------! INTERFACE PUT MODULE PROCEDURE put_d_c, & ! char to default unit put_u_c ! char to specified unit ENDINTERFACE INTERFACE PUT_LINE MODULE PROCEDURE putline_d_c, & ! char to default unit putline_u_c ! char to specified unit ENDINTERFACE !----- Insert procedure interfaces -------------------------------------------! INTERFACE INSERT MODULE PROCEDURE insert_cc ! string in string ENDINTERFACE !----- Replace procedure interfaces ------------------------------------------! INTERFACE REPLACE MODULE PROCEDURE replace_ss, & ! string by string, at specified replace_ss_sf,& ! string by string, between replace_sss ! in string replace string by string ENDINTERFACE !----- Remove procedure interface --------------------------------------------! INTERFACE REMOVE MODULE PROCEDURE remove_c ! characters from char , and finish ENDINTERFACE !----- Extract procedure interface -------------------------------------------! INTERFACE EXTRACT MODULE PROCEDURE extract_c ! from char extract string, and finish ENDINTERFACE !----- Split procedure interface ---------------------------------------------! INTERFACE SPLIT MODULE PROCEDURE split_c ! character in set ENDINTERFACE !----- specification of publicly accessible entities -------------------------! PUBLIC :: CHAR,GET,PUT,PUT_LINE,EXTRACT,INSERT,REMOVE,REPLACE,SPLIT,VAR_STR CONTAINS !----- Conversion Procedures ------------------------------------------------! FUNCTION c_to_s(chr) ! generic VAR_STR CHARACTER(LEN=:),ALLOCATABLE :: c_to_s CHARACTER(LEN=*),INTENT(IN) :: chr c_to_s = chr ENDFUNCTION c_to_s PURE FUNCTION s_to_c(string) ! generic CHAR CHARACTER(LEN=*),INTENT(IN) :: string CHARACTER(LEN=:),ALLOCATABLE :: s_to_c ! returns the characters of string as an automatically sized character s_to_c = string ENDFUNCTION s_to_c PURE FUNCTION s_to_fix_c(string,length) ! generic CHAR CHARACTER(LEN=*),INTENT(IN) :: string INTEGER,INTENT(IN) :: length CHARACTER(LEN=length) :: s_to_fix_c ! returns the character of fixed length, length, containing the characters ! of string either padded with blanks or truncated on the right to fit s_to_fix_c = string ENDFUNCTION s_to_fix_c !----- Input string procedure -----------------------------------------------! SUBROUTINE get_d_eor(string,maxlen,iostat) CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string ! the variable to be filled with ! characters read from the ! file connected to the default unit INTEGER,INTENT(IN),OPTIONAL :: maxlen ! if present indicates the maximum ! number of characters that will be ! read from the file INTEGER,INTENT(OUT),OPTIONAL :: iostat ! if present used to return the status ! of the data transfer ! if absent errors cause termination ! reads string from the default unit starting at next character in the file ! and terminating at the end of record or after maxlen characters. call get_u_eor (input_unit, string, maxlen, iostat) ENDSUBROUTINE get_d_eor SUBROUTINE get_u_eor(unit,string,maxlen,iostat) INTEGER,INTENT(IN) :: unit ! identifies the input unit which must be ! connected for sequential formatted read CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string ! the variable to be filled with ! characters read from the ! file connected to the unit INTEGER,INTENT(IN),OPTIONAL :: maxlen ! if present indicates the maximum ! number of characters that will be ! read from the file INTEGER,INTENT(OUT),OPTIONAL :: iostat ! if present used to return the status ! of the data transfer ! if absent errors cause termination ! reads string from unit starting at next character in the file and ! terminating at the end of record or after maxlen characters. CHARACTER(LEN=80) :: buffer INTEGER :: ist,nch,toread,nb IF(PRESENT(maxlen))THEN toread=maxlen ELSE toread=HUGE(1) ENDIF string="" ! Start with an empty string DO ! repeatedly read buffer and add to string until EoR ! or maxlen reached IF(toread <= 0)EXIT nb=MIN(80,toread) READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,SIZE=nch,IOSTAT=ist) buffer(1:nb) IF( ist /= 0 )THEN IF(PRESENT(iostat)) THEN iostat=ist RETURN ELSE WRITE(*,*) " Error No.",ist, & " during READ_STRING of varying string on UNIT ",unit STOP ENDIF ENDIF string = string //buffer(1:nb) toread = toread - nb ENDDO IF(PRESENT(iostat)) iostat = 0 RETURN 9999 string = string //buffer(1:nch) IF(PRESENT(iostat)) iostat = ist ENDSUBROUTINE get_u_eor SUBROUTINE get_d_tset_c(string,set,separator,maxlen,iostat) CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string ! the variable to be filled with ! characters read from the ! file connected to the default unit CHARACTER(LEN=*),INTENT(IN) :: set ! the set of characters which if found in ! the input terminate the read CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator ! the actual separator character from set ! found as the input string terminator ! returned as zero length if termination ! by maxlen or EOR INTEGER,INTENT(IN),OPTIONAL :: maxlen ! if present indicates the maximum ! number of characters that will be ! read from the file INTEGER,INTENT(OUT),OPTIONAL :: iostat ! if present used to return the status ! of the data transfer ! if absent errors cause termination ! reads string from the default unit starting at next character in the file and ! terminating at the end of record, occurance of a character in set, ! or after reading maxlen characters. CALL get_u_tset_c(input_unit,string,set,separator,maxlen,iostat) ENDSUBROUTINE get_d_tset_c SUBROUTINE get_u_tset_c(unit,string,set,separator,maxlen,iostat) INTEGER,INTENT(IN) :: unit ! identifies the input unit which must be ! connected for sequential formatted read CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string ! the variable to be filled with ! characters read from the ! file connected to the unit CHARACTER(LEN=*),INTENT(IN) :: set ! the set of characters which if found in ! the input terminate the read CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator ! the actual separator character from set ! found as the input string terminator ! returned as zero length if termination ! by maxlen or EOR INTEGER,INTENT(IN),OPTIONAL :: maxlen ! if present indicates the maximum ! number of characters that will be ! read from the file INTEGER,INTENT(OUT),OPTIONAL :: iostat ! if present used to return the status ! of the data transfer ! if absent errors cause termination ! reads string from unit starting at next character in the file and ! terminating at the end of record, occurance of a character in set, ! or after reading maxlen characters. CHARACTER :: buffer ! characters must be read one at a time to detect ! first terminator character in set INTEGER :: ist,j,toread,lenset lenset = LEN(set) IF(PRESENT(maxlen))THEN toread=maxlen ELSE toread=HUGE(1) ENDIF string = "" ! clears return string N.B. will also deallocate string via the ! assignment operation IF(PRESENT(separator)) separator="" ! clear separator readchar:DO ! repeatedly read buffer and add to string IF(toread <= 0)EXIT readchar ! maxlen reached READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,IOSTAT=ist) buffer IF( ist /= 0 )THEN IF(PRESENT(iostat)) THEN iostat=ist RETURN ELSE WRITE(*,*) " Error No.",ist, & " during GET of varying string on unit ",unit STOP ENDIF ENDIF ! check for occurance of set character in buffer DO j = 1,lenset IF(buffer == set(j:j))THEN IF(PRESENT(separator)) separator=buffer EXIT readchar ! separator terminator found ENDIF ENDDO string = string//buffer toread = toread - 1 ENDDO readchar IF(PRESENT(iostat)) iostat = 0 RETURN 9999 CONTINUE ! EOR terminator read IF(PRESENT(iostat)) iostat = ist ENDSUBROUTINE get_u_tset_c !----- Output string procedures ----------------------------------------------! SUBROUTINE put_d_c(string,iostat) CHARACTER(LEN=*),INTENT(IN) :: string ! the character variable to be appended to ! the current record or to the start of ! the next record if there is no ! current record ! uses the default unit INTEGER,INTENT(OUT),OPTIONAL :: iostat ! if present used to return the status ! of the data transfer ! if absent errors cause termination CALL put_u_c(output_unit,string,iostat) ENDSUBROUTINE put_d_c SUBROUTINE put_u_c(unit,string,iostat) INTEGER,INTENT(IN) :: unit ! identifies the output unit which must ! be connected for sequential formatted ! write CHARACTER(LEN=*),INTENT(IN) :: string ! the character variable to be appended to ! the current record or to the start of ! the next record if there is no ! current record INTEGER,INTENT(OUT),OPTIONAL :: iostat ! if present used to return the status ! of the data transfer ! if absent errors cause termination INTEGER :: ist WRITE(unit,FMT='(A)',ADVANCE='NO',IOSTAT=ist) string IF( ist /= 0 )THEN IF(PRESENT(iostat))THEN iostat = ist RETURN ELSE WRITE(*,*) " Error No.",ist," during PUT of character on UNIT ",unit STOP ENDIF ENDIF IF(PRESENT(iostat)) iostat=0 ENDSUBROUTINE put_u_c SUBROUTINE putline_d_c(string,iostat) CHARACTER(LEN=*),INTENT(IN) :: string ! the character variable to be appended to ! the current record or to the start of ! the next record if there is no ! current record ! uses the default unit INTEGER,INTENT(OUT),OPTIONAL :: iostat ! if present used to return the status ! of the data transfer ! if absent errors cause termination ! appends the string to the current record and then ends the record ! leaves the file positioned after the record just completed which then ! becomes the previous and last record in the file. CALL putline_u_c(output_unit,string,iostat) ENDSUBROUTINE putline_d_c SUBROUTINE putline_u_c(unit,string,iostat) INTEGER,INTENT(IN) :: unit ! identifies the output unit which must ! be connected for sequential formatted ! write CHARACTER(LEN=*),INTENT(IN) :: string ! the character variable to be appended to ! the current record or to the start of ! the next record if there is no ! current record INTEGER,INTENT(OUT),OPTIONAL :: iostat ! if present used to return the status ! of the data transfer ! if absent errors cause termination ! appends the string to the current record and then ends the record ! leaves the file positioned after the record just completed which then ! becomes the previous and last record in the file. INTEGER :: ist WRITE(unit,FMT='(A,/)',ADVANCE='NO',IOSTAT=ist) string IF(PRESENT(iostat))THEN iostat = ist RETURN ELSEIF( ist /= 0 )THEN WRITE(*,*) " Error No.",ist, & " during WRITE_LINE of character on UNIT",unit STOP ENDIF ENDSUBROUTINE putline_u_c !----- Insert procedure ----------------------------------------------------! FUNCTION insert_cc(string,start,substring) CHARACTER(LEN=:),ALLOCATABLE :: insert_cc CHARACTER(LEN=*),INTENT(IN) :: string INTEGER,INTENT(IN) :: start CHARACTER(LEN=*),INTENT(IN) :: substring ! calculates result string by inserting the substring into string ! beginning at position start pushing the remainder of the string ! to the right and enlarging it accordingly, ! if start is greater than LEN(string) the substring is simply appended ! to string by concatenation. if start is less than 1 ! substring is inserted before string, ie. start is treated as if it were 1 INTEGER :: ip,is,lsub,ls lsub = LEN(substring); ls = LEN(string) is = MAX(start,1) ip = MIN(ls+1,is) insert_cc = string(1:ip-1)//substring//string(ip:ls) ENDFUNCTION insert_cc !----- Replace procedures ---------------------------------------------------! FUNCTION replace_ss(string,start,substring) CHARACTER(LEN=*),INTENT(IN) :: string INTEGER,INTENT(IN) :: start CHARACTER(LEN=*),INTENT(IN) :: substring CHARACTER(LEN=:),ALLOCATABLE :: replace_ss ! calculates the result string by the following actions: ! inserts the substring into string beginning at position ! start replacing the following LEN(substring) characters of the string ! and enlarging string if necessary. if start is greater than LEN(string) ! substring is simply appended to string by concatenation. If start is less ! than 1, substring replaces characters in string starting at 1 INTEGER :: ip,is,lsub,ls lsub = LEN(substring); ls = LEN(string) is = MAX(start,1) ip = MIN(ls+1,is) replace_ss = string(1:ip-1)//substring//string(ip+lsub:) ENDFUNCTION replace_ss FUNCTION replace_ss_sf(string,start,finish,substring) CHARACTER(LEN=*),INTENT(IN) :: string INTEGER,INTENT(IN) :: start,finish CHARACTER(LEN=*),INTENT(IN) :: substring CHARACTER(LEN=:),ALLOCATABLE :: replace_ss_sf ! calculates the result string by the following actions: ! inserts the substring into string beginning at position ! start replacing the following finish-start+1 characters of the string ! and enlarging or shrinking the string if necessary. ! If start is greater than LEN(string) substring is simply appended to string ! by concatenation. If start is less than 1, start = 1 is used ! If finish is greater than LEN(string), finish = LEN(string) is used ! If finish is less than start, substring is inserted before start INTEGER :: ip,is,if,nw,lsub,ls lsub = LEN(substring); ls = LEN(string) is = MAX(start,1) ip = MIN(ls+1,is) if = MAX(ip-1,MIN(finish,ls)) nw = lsub + ls - if+ip-1 replace_ss_sf = string(1:ip-1)//substring//string(if+1:ls) ENDFUNCTION replace_ss_sf FUNCTION replace_sss(string,target,substring,every,back) CHARACTER(LEN=*),INTENT(IN) :: string,target,substring LOGICAL,INTENT(IN),OPTIONAL :: every,back ! calculates the result string by the following actions: ! searches for occurences of target in string, and replaces these with ! substring. if back present with value true search is backward otherwise ! search is done forward. if every present with value true all occurences ! of target in string are replaced, otherwise only the first found is ! replaced. if target is not found the result is the same as string. LOGICAL :: dir_switch, rep_search CHARACTER(LEN=:),ALLOCATABLE :: work INTEGER :: ls,lt,lsub,ipos,ipow CHARACTER(LEN=:),ALLOCATABLE :: replace_sss ls = LEN(string); lt = LEN(target); lsub = LEN(substring) IF(lt==0)THEN IF(ls==0)THEN replace_sss = substring ELSE replace_sss = string ENDIF RETURN ENDIF work = string IF( PRESENT(back) )THEN dir_switch = back ELSE dir_switch = .FALSE. ENDIF IF( PRESENT(every) )THEN rep_search = every ELSE rep_search = .FALSE. ENDIF IF( dir_switch )THEN ! backwards search ipos = ls-lt+1 DO IF( ipos < 1 )EXIT ! search past start of string ! test for occurance of target in string at this position IF( string(ipos:ipos+lt-1) == target )THEN ! match found allocate space for string with this occurance of ! target replaced by substring ! copy work into temp replacing this occurance of target by ! substring work = work(1:ipos-1)//substring//work(ipos+lt:) IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted ! move search and replacement positions over the effected positions ipos = ipos-lt+1 ENDIF ipos=ipos-1 ENDDO ELSE ! forward search ipos = 1; ipow = 1 DO IF( ipos > ls-lt+1 )EXIT ! search past end of string ! test for occurance of target in string at this position IF( string(ipos:ipos+lt-1) == target )THEN ! match found allocate space for string with this occurance of ! target replaced by substring work = work(1:ipow-1)//substring//work(ipow+lt:) IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted ! move search and replacement positions over the effected positions ipos = ipos+lt-1; ipow = ipow+lsub-1 ENDIF ipos=ipos+1; ipow=ipow+1 ENDDO ENDIF replace_sss = work ENDFUNCTION replace_sss !----- Remove procedure ----------------------------------------------------! FUNCTION remove_c(string,start,finish) CHARACTER(LEN=:),ALLOCATABLE :: remove_c CHARACTER(LEN=*),INTENT(IN) :: string INTEGER,INTENT(IN),OPTIONAL :: start INTEGER,INTENT(IN),OPTIONAL :: finish ! returns as result the string produced by the actions ! removes the characters between start and finish from string reducing it in ! size by MAX(0,ABS(finish-start+1)) ! if start < 1 or is missing then assumes start=1 ! if finish > LEN(string) or is missing then assumes finish=LEN(string) INTEGER :: is,if,ls ls = LEN(string) IF (PRESENT(start)) THEN is = MAX(1,start) ELSE is = 1 ENDIF IF (PRESENT(finish)) THEN if = MIN(ls,finish) ELSE if = ls ENDIF IF( if < is ) THEN ! zero characters to be removed, string is unchanged remove_c = string ELSE remove_c = string(1:is-1)//string(if+1:) ENDIF ENDFUNCTION remove_c !----- Extract procedure ---------------------------------------------------! FUNCTION extract_c(string,start,finish) CHARACTER(LEN=*),INTENT(IN) :: string INTEGER,INTENT(IN),OPTIONAL :: start INTEGER,INTENT(IN),OPTIONAL :: finish CHARACTER(LEN=:),ALLOCATABLE :: extract_c ! extracts the characters between start and finish from character string and ! delivers these as the result of the function, string is unchanged ! if start < 1 or is missing then it is treated as 1 ! if finish > LEN(string) or is missing then it is treated as LEN(string) INTEGER :: is,if IF (PRESENT(start)) THEN is = MAX(1,start) ELSE is = 1 ENDIF IF (PRESENT(finish)) THEN if = MIN(LEN(string),finish) ELSE if = LEN(string) ENDIF extract_c = string(is:if) ENDFUNCTION extract_c !----- Split procedure ------------------------------------------------------! SUBROUTINE split_c(string,word,set,separator,back) CHARACTER(LEN=:),ALLOCATABLE,INTENT(INOUT) :: string CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: word CHARACTER(LEN=*),INTENT(IN) :: set CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator LOGICAL,INTENT(IN),OPTIONAL :: back ! splits the input string at the first(last) character in set ! returns the leading(trailing) substring in word and the trailing(leading) ! substring in string. The search is done in the forward or backward ! direction depending on back. If separator is present, the actual separator ! character found is returned in separator. ! If no character in set is found string and separator are returned as ! zero length and the whole input string is returned in word. LOGICAL :: dir_switch INTEGER :: i,ls,tpos,lset CHARACTER(LEN=:),ALLOCATABLE :: wst ! working copy of string ls = LEN(string); lset = LEN(set) wst=string IF( PRESENT(back) )THEN dir_switch = back ELSE dir_switch = .FALSE. ENDIF IF(dir_switch)THEN ! backwards search BSEARCH:DO tpos = ls,1,-1 DO i=1,lset IF(wst(tpos:tpos) == set(i:i))EXIT BSEARCH ENDDO ENDDO BSEARCH word = wst(tpos+1:ls) IF(PRESENT(separator))THEN IF(tpos==0)THEN separator = "" ELSE separator = wst(tpos:tpos) ENDIF ENDIF string = wst(1:tpos-1) ELSE ! forwards search FSEARCH:DO tpos =1,ls DO i=1,lset IF(wst(tpos:tpos) == set(i:i))EXIT FSEARCH ENDDO ENDDO FSEARCH word = wst(1:tpos-1) IF(PRESENT(separator))THEN IF(tpos==ls+1)THEN separator = "" ELSE separator = wst(tpos:tpos) ENDIF ENDIF string = wst(tpos+1:ls) ENDIF ENDSUBROUTINE split_c ENDMODULE ISO_VARYING_STRING ............................................................................... PROGRAM word_count !-----------------------------------------------------------------------------! ! Counts the number of "words" contained in a file. The words are assumed to ! ! be terminated by any one of: ! ! space,comma,period,!,?, or the EoR ! ! The file may have records of any length and the file may contain any number ! ! of records. ! ! The program prompts for the name of the file to be subject to a word count ! ! and the result is written to the default output unit ! !-----------------------------------------------------------------------------! USE ISO_VARYING_STRING USE,INTRINSIC :: ISO_FORTRAN_ENV IMPLICIT NONE CHARACTER(len=:),ALLOCATABLE :: line,fname INTEGER :: ierr,nd,wcount=0 WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?" CALL GET(STRING=fname) ! read the required filename from the default ! input unit assumed to be the whole of the record read OPEN(UNIT=10,FILE=fname) file_read: DO ! until EoF reached CALL GET(10,line,IOSTAT=ierr) ! read next line of file IF(ierr == IOSTAT_END .OR. ierr > 0 )EXIT file_read word_scan: DO ! until end of line nd=SCAN(line," ,.!?") ! scan to find end of word IF(nd == 0)THEN ! EoR is end of word nd = LEN(line) EXIT word_scan ENDIF IF(nd > 1) wcount=wcount+1 ! at least one non-terminator character ! in the word line = REMOVE(line,1,nd) ! strips the counted word and its terminator ! from the line reducing its length before ! rescanning for the next word ENDDO word_scan IF(nd > 0) wcount=wcount+1 ENDDO file_read IF(ierr < 0)THEN WRITE(*,*) "No. of words in file =",wcount ELSEIF(ierr > 0)THEN WRITE(*,*) "Error in GET file in word_count, No. ",ierr ENDIF ENDPROGRAM word_count ............................................................................... PROGRAM vocabulary_word_count !-----------------------------------------------------------------------------! ! Counts the number of "words" contained in a file. The words are assumed to ! ! be terminated by any one of: ! ! space,comma,period,!,?, or the EoR ! ! The file may have records of any length and the file may contain any number ! ! of records. ! ! The program prompts for the name of the file to be subject to a word count ! ! and the result is written to the default output unit ! ! Also builds a list of the vocabulary found and the frequency of occurence ! ! of each different word. ! !-----------------------------------------------------------------------------! USE ISO_VARYING_STRING USE,INTRINSIC :: ISO_FORTRAN_ENV IMPLICIT NONE type VARYING_STRING CHARACTER(LEN=:),ALLOCATABLE :: chars end type VARYING_STRING CHARACTER(LEN=:),ALLOCATABLE :: line,word,fname INTEGER :: ierr,nd,wcount=0 !-----------------------------------------------------------------------------! ! Vocabulary list and frequency count arrays. The size of these arrays will ! ! be extended dynamically in steps of 100 as the used vocabulary grows ! !-----------------------------------------------------------------------------! type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab INTEGER,ALLOCATABLE,DIMENSION(:) :: freq INTEGER :: list_size=200,list_top=0 INTEGER :: i ! loop index !-----------------------------------------------------------------------------! ! Initialise the lists and determine the file to be processed ! !-----------------------------------------------------------------------------! ALLOCATE(vocab(1:list_size),freq(1:list_size)) WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?" CALL GET(STRING=fname) ! read the required filename from the default ! input unit assumed to be the whole of the record read OPEN(UNIT=1,FILE=CHAR(fname)) ! CHAR(fname) converts to the type ! required by FILE= specifier file_read: DO ! until EoF reached CALL GET(1,line,IOSTAT=ierr) ! read next line of file IF(ierr == IOSTAT_END .OR. ierr > 0)EXIT file_read word_scan: DO ! until end of line nd=SCAN(line," ,.!?") ! scan to find end of word IF(nd == 0)THEN ! EoR is end of word nd = LEN(line)+1 EXIT word_scan ENDIF IF(nd > 1)THEN ! at least one non-terminator character in the word wcount=wcount+1 word = EXTRACT(line,1,nd-1) CALL update_vocab_lists ENDIF line = REMOVE(line,1,nd) ! strips the counted word and its terminator ! from the line reducing its length before ! rescanning for the next word ENDDO word_scan IF(nd > 1)THEN ! at least one character in the word wcount=wcount+1 word = EXTRACT(line,1,nd-1) CALL update_vocab_lists ENDIF ENDDO file_read IF(ierr < 0)THEN WRITE(*,*) "No. of words in file =",wcount WRITE(*,*) "There are ",list_top," distinct words" WRITE(*,*) "with the following frequencies of occurence" print_loop: DO i=1,list_top WRITE(*,FMT='(1X,I6,2X)',ADVANCE='NO') freq(i) CALL PUT_LINE(STRING=vocab(i)%chars) ENDDO print_loop ELSEIF(ierr > 0)THEN WRITE(*,*) "Error in GET in vocabulary_word_count, No.",ierr ENDIF CONTAINS SUBROUTINE extend_lists !-----------------------------------------------------------------------------! ! Accesses the host variables: ! ! type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab ! ! INTEGER,ALLOCATABLE,DIMENSION(:) :: freq ! ! INTEGER :: list_size ! ! so as to extend the size of the lists preserving the existing vocabulary ! ! and frequency information in the new extended lists ! !-----------------------------------------------------------------------------! type(VARYING_STRING),DIMENSION(list_size) :: vocab_swap INTEGER,DIMENSION(list_size) :: freq_swap INTEGER,PARAMETER :: list_increment=100 INTEGER :: new_list_size,alerr vocab_swap = vocab ! copy old list into temporary space freq_swap =freq new_list_size = list_size + list_increment DEALLOCATE(vocab,freq) ALLOCATE(vocab(1:new_list_size),freq(1:new_list_size),STAT=alerr) IF(alerr /= 0)THEN WRITE(*,*) "Unable to extend vocabulary list" STOP ENDIF vocab(1:list_size) = vocab_swap ! copy old list back into bottom freq(1:list_size) = freq_swap ! of new extended list list_size = new_list_size ENDSUBROUTINE extend_lists SUBROUTINE update_vocab_lists !-----------------------------------------------------------------------------! ! Accesses the host variables: ! ! type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab ! ! INTEGER,ALLOCATABLE,DIMENSION(:) :: freq ! ! INTEGER :: list_size,list_top ! ! type(VARYING_STRING) :: word ! ! searches the existing words in vocab to find a match for word ! ! if found increments the freq if not found adds word to ! ! list_top + 1 vocab list and sets corresponding freq to 1 ! ! if list_size exceeded extend the list size before updating ! !-----------------------------------------------------------------------------! INTEGER :: i ! loop index list_search: DO i=1,list_top IF(word == vocab(i)%chars)THEN freq(i) = freq(i) + 1 RETURN ENDIF ENDDO list_search IF(list_top == list_size)THEN CALL extend_lists ENDIF list_top = list_top + 1 vocab(list_top)%chars = word freq(list_top) = 1 ENDSUBROUTINE update_vocab_lists ENDPROGRAM vocabulary_word_count ...............................................................................