25 Strings and Characters

  • IsChar( obj ) C

    A character is simply an object in GAP that represents an arbitrary character from the character set of the operating system. Character literals can be entered in GAP by enclosing the character in singlequotes '.

    gap> x:= 'a';  IsChar( x );
    'a'
    true
    gap> '*';
    '*'
    

  • IsString( obj ) C

    A string is simply a dense list of characters (see IsList, IsDenseList). Strings are used mainly in filenames and error messages. A string literal can either be entered simply as the list of characters or by writing the characters between doublequotes ". GAP will always output strings in the latter format.

    gap> s1 := ['H','e','l','l','o',' ','w','o','r','l','d','.'];
    "Hello world."
    gap> IsString( s1 );
    true
    gap> s2 := "Hello world.";
    "Hello world."
    gap> s1 = s2;
    true
    gap> s3 := "";
    ""           # the empty string
    gap> s3 = [];
    true
    gap> IsString( [] );
    true
    gap> IsString( "123" );  IsString( 123 );
    true
    false
    gap> IsString( [ '1', '2', '3' ] );
    true
    gap> IsString( [ '1', '2', , '4' ] );  IsString( [ '1', '2', 3 ] );
    false        # strings must be dense
    false        # strings must only contain characters
    

    Note that a string is just a special case of a list. So everything that is possible for lists (see Lists) is also possible for strings. Thus you can access the characters in such a string (see List Elements), test for membership (see Membership Test for Collections), ask for the length, concatenate strings (see Concatenation), form substrings, etc. You can even assign to a mutable string (see List Assignment). Of course, unless you assign a character in such a way that the list stays dense, the resulting list will no longer be a string.
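
    The remark about density can be checked directly. The following lines are only a sketch; the variable s5 is introduced here purely for illustration. Assigning at position 5 of a string of length 3 leaves a hole at position 4, so the list is no longer a string until the hole is filled with a character.

    gap> s5 := "abc";;
    gap> s5[5] := 'e';;
    gap> IsDenseList( s5 );  IsString( s5 );
    false
    false
    gap> s5[4] := 'd';;  IsString( s5 );
    true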

    gap> Length( s2 );
    12
    gap> s2[2];
    'e'
    gap> 'a' in s2;
    false
    gap> s2[2] := 'a';;  s2;
    "Hallo world."
    gap> s1{ [1..4] };
    "Hell"
    gap> Concatenation( s1{ [ 1 .. 6 ] }, s1{ [ 1 .. 4 ] } );
    "Hello Hell"
    

    If a string is displayed by View, for example as result of an evaluation (see Main Loop), or by ViewObj and PrintObj, it is displayed with enclosing doublequotes. However, if a string is displayed by Print, PrintTo, or AppendTo (see View and Print, PrintTo, AppendTo) the enclosing doublequotes are dropped. So strings behave differently from other GAP objects w.r.t. printing in the sense that the output of Print for a string is not equal to the output of PrintObj.

    gap> s4:= "abc\"def\nghi";;
    gap> View( s4 );  Print( "\n" );
    "abc\"def\nghi"
    gap> ViewObj( s4 );  Print( "\n" );
    "abc\"def\nghi"
    gap> PrintObj( s4 );  Print( "\n" );
    "abc"def\nghi"
    gap> Print( s4 );  Print( "\n" );
    abc"def
    ghi
    

    Note that only those line breaks are printed by Print that are contained in the string (\n characters, see Special Characters), as is shown in the example below.

    gap> s1;
    "Hello world."
    gap> Print( s1 );
    Hello world.gap> Print( s1, "\nnext line\n" );
    Hello world.
    next line
    gap>
    

    Sections

    1. Special Characters
    2. Recognizing Characters
    3. Comparisons of Strings
    4. Operations to Produce Strings
    5. Operations to Evaluate Strings
    6. Calendar Arithmetic
    7. Internally Represented Strings

    25.1 Special Characters

    There are a number of special character sequences that can be used between the singlequotes of a character literal or between the doublequotes of a string literal to specify characters that may otherwise be inaccessible. They consist of two characters. The first is a backslash \. The second may be any character. The meaning is given in the following list.

    \n
    newline character. This is the character that, at least on UNIX systems, separates lines in a text file. Printing of this character in a string has the effect of moving the cursor down one line and back to the beginning of the line.

    \"
    doublequote character. Inside a string a doublequote must be escaped by the backslash, because it is otherwise interpreted as end of the string.

    \'
    singlequote character. Inside a character a singlequote must be escaped by the backslash, because it is otherwise interpreted as end of the character.

    \\
    backslash character. Inside a string a backslash must be escaped by another backslash, because it is otherwise interpreted as first character of an escape sequence.

    \b
    backspace character. Printing this character should have the effect of moving the cursor back one character. Whether it works or not is system dependent and should not be relied upon.

    \r
    carriage return character. Printing this character should have the effect of moving the cursor back to the beginning of the same line. Whether this works or not is again system dependent.

    \c
    flush character. This character is not printed. Its purpose is to flush the output queue. Usually GAP waits until it sees a newline before it prints a string. If you want a string that does not end with a newline to be displayed immediately, use \c.

    other
    For any other character the backslash is simply ignored.

    Again, if the line is displayed as result of an evaluation, those escape sequences are displayed in the same way that they are input. They are displayed in their special way only by Print, PrintTo, or AppendTo.

    gap> "This is one line.\nThis is another line.\n";
    "This is one line.\nThis is another line.\n"
    gap> Print( last );
    This is one line.
    This is another line.
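
    As a further small illustration (a sketch with made-up strings, not part of the original examples), each escape sequence counts as a single character of the string, and Print shows the character it stands for.

    gap> Length( "a\\b" );      # the sequence \\ denotes one backslash
    3
    gap> IsChar( '\n' );
    true
    gap> Print( "a\\b \"quoted\"\n" );
    a\b "quoted"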
    

    It is not allowed to enclose a newline inside the string. You can use the special character sequence \n to write strings that include newline characters. If, however, an input string is too long to fit on a single line, it is possible to continue it over several lines. In this case the last character of each input line, except the last line, must be a backslash. Both backslash and newline are thrown away. Note that the same continuation mechanism is available for identifiers and integers.

    gap> "This is a very long string that does not fit on a line \
    > and is therefore continued on the next line.";
    "This is a very long string that does not fit on a line and is therefo\
    re continued on the next line."
    

    Note that the output is also continued, but at a different place that is determined by the value of SizeScreen (see SizeScreen).

    25.2 Recognizing Characters

  • IsDigitChar( c ) F

    checks whether the character c is a digit, i.e., occurs in the string "0123456789".

  • IsLowerAlphaChar( c ) F

    checks whether the character c is a lowercase alphabet letter, i.e., occurs in the string "abcdefghijklmnopqrstuvwxyz".

  • IsUpperAlphaChar( c ) F

    checks whether the character c is an uppercase alphabet letter, i.e., occurs in the string "ABCDEFGHIJKLMNOPQRSTUVWXYZ".

  • IsAlphaChar( c ) F

    checks whether the character c is either a lowercase or an uppercase alphabet letter.
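
    The following lines sketch how these four functions behave on a few sample characters; the inputs are chosen only for illustration. Since a string is a list of characters, the functions can also be combined with list functions such as ForAll to test a whole string.

    gap> IsDigitChar( '7' );  IsDigitChar( 'a' );
    true
    false
    gap> IsLowerAlphaChar( 'a' );  IsUpperAlphaChar( 'a' );
    true
    false
    gap> IsAlphaChar( 'Z' );  IsAlphaChar( '7' );
    true
    false
    gap> ForAll( "12345", IsDigitChar );  ForAll( "123a5", IsDigitChar );
    true
    false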

    25.3 Comparisons of Strings

  • string1 = string2
  • string1 <> string2

    The equality operator = returns true if the two strings string1 and string2 are equal and false otherwise. The inequality operator <> returns true if the two strings string1 and string2 are not equal and false otherwise.

    gap> "Hello world.\n" = "Hello world.\n";
    true
    gap> "Hello World.\n" = "Hello world.\n";
    false # string comparison is case sensitive
    gap> "Hello world." = "Hello world.\n";
    false # the first string has no <newline>
    gap> "Goodbye world.\n" = "Hello world.\n";
    false
    gap> [ 'a', 'b' ] = "ab";
    true
    

  • string1 < string2

    The ordering of strings is lexicographic, according to the order implied by the underlying, system dependent, character set. The operator < returns true if string1 is smaller than string2 in this ordering and false otherwise.

    gap> "Hello world.\n" < "Hello world.\n";
    false # the strings are equal
    gap> "Hello World.\n" < "Hello world.\n";
    true # in ASCII uppercase letters come before lowercase letters
    gap> "Hello world." < "Hello world.\n";
    true # prefixes are always smaller
    gap> "Goodbye world.\n" < "Hello world.\n";
    true # 'G' comes before 'H', in ASCII at least
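
    Because < is defined for strings, functions that rely on this ordering can be applied to lists of strings. The following sketch assumes an ASCII character set, in which all uppercase letters come before all lowercase letters; the sample words are made up for illustration.

    gap> Set( [ "pear", "Pear", "apple" ] );
    [ "Pear", "apple", "pear" ]
    gap> "apple" < "apples";  "Zebra" < "apple";
    true
    true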
    

    Strings can be compared via < with certain GAP objects that are not strings, see Comparisons for the details.

    25.4 Operations to Produce Strings

  • String( obj ) A
  • String( obj, length ) O

    String returns a representation of obj, which may be an object of arbitrary type, as a string. This string should approximate as closely as possible the character sequence you see if you print obj.

    If length is given it must be an integer. The absolute value gives the minimal length of the result. If the string representation of obj takes less than that many characters it is filled with blanks. If length is positive it is filled on the left, if length is negative it is filled on the right.

    In the two argument case, the string returned is a new mutable string (in particular not a part of any other object); it can be modified safely, and MakeImmutable may be safely applied to it.
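
    The effect of the length argument can be sketched as follows; the values are chosen only for illustration. A positive length pads with blanks on the left, a negative length pads on the right, and a string that is already long enough is returned without being cut off.

    gap> String( 123, 10 );  String( 123, -10 );
    "       123"
    "123       "
    gap> String( 123, 2 );      # length is only a minimum
    "123"
    gap> Length( String( 123, 10 ) );
    10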

    gap> String(123);String([1,2,3]);
    "123"
    "[ 1, 2, 3 ]"
    

  • StringPP( int ) F

    returns a string representing the prime factor decomposition of the integer int.

    gap> StringPP(40320);
    "2^7*3^2*5*7"
    

  • WordAlp( alpha, nr ) F

    returns a string that is the nr-th word over the alphabet list alpha, w.r.t. word length and lexicographical order. The empty word is WordAlp( alpha, 0 ).

    gap> List([0..5],i->WordAlp("abc",i));
    [ "", "a", "b", "c", "aa", "ab" ]
    

  • LowercaseString( string ) F

    returns a lowercase version of the string string, that is, a string in which each uppercase alphabet character is replaced by the corresponding lowercase character.

    gap> LowercaseString("This Is UpperCase");
    "this is uppercase"
    

  • SplitString( string, seps[, wspace] ) O

    This function accepts a string string and lists seps and, optionally, wspace of characters. The string string is split into substrings at each occurrence of a character in seps or wspace. The characters in wspace are interpreted as white space characters: a run of consecutive characters from wspace is treated as one white space character, and such runs are ignored at the beginning and end of string.

    Both arguments seps and wspace can be single characters.

    No string in the resulting list of substrings contains any character in seps or wspace.

    A character that occurs both in seps and wspace is treated as a white space character.

    A separator at the end of a string is interpreted as a terminator; in this case, the separator does not produce a trailing empty string.

    gap> SplitString( "substr1:substr2::substr4", ":" );
    [ "substr1", "substr2", "", "substr4" ]
    gap> SplitString( "a;b;c;d;", ";" );
    [ "a", "b", "c", "d" ]
    gap> SplitString( "/home//user//dir/", "", "/" );
    [ "home", "user", "dir" ]
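
    Two further calls, given here only as a sketch with made-up input strings, show the same rules at work: a separator splits at every occurrence, whereas runs of white space characters are collapsed and ignored at both ends.

    gap> SplitString( "1998-03-11", "-" );
    [ "1998", "03", "11" ]
    gap> SplitString( "  several   words here ", "", " " );
    [ "several", "words", "here" ]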
    

    For the possibility to print GAP objects to strings, see String Streams.

    25.5 Operations to Evaluate Strings


  • Int( str )
  • Rat( str )

    return an integer, respectively a rational, as represented by the string str. Int returns fail if non-digit characters occur in str, apart from a leading sign character '-' (see the example below). For Rat, the argument string may start with the sign character '-', followed by either a sequence of digits or by two sequences of digits that are separated by one of the characters '/' or '.', where the latter stands for a decimal dot. (The methods only evaluate numbers but do not perform arithmetic!)

    gap> Int("12345")+1;
    12346
    gap> Int("123/45");
    fail
    gap> Int("1+2");
    fail
    gap> Int("-12");
    -12
    gap> Rat("123/45");
    41/15
    gap> Rat( "123.45" );
    2469/20
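
    A few more calls may help to see the rules stated above; they are a sketch with made-up input strings. The last line combines Int with SplitString (see Operations to Produce Strings) to evaluate the numeric fields of a string.

    gap> Int( "abc" );  Rat( "-1.5" );
    fail
    -3/2
    gap> List( SplitString( "1998-03-11", "-" ), Int );
    [ 1998, 3, 11 ]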
    

  • Ordinal( n ) F

    returns the ordinal of the integer n as a string.

    gap> Ordinal(2);  Ordinal(21);  Ordinal(33);  Ordinal(-33);
    "2nd"
    "21st"
    "33rd"
    "-33th"
    

    25.6 Calendar Arithmetic

    All calendar functions use the Gregorian calendar.

  • DaysInYear( year ) F

    returns the number of days in the year year.

  • DaysInMonth( month, year ) F

    returns the number of days in month number month of year.

    gap> DaysInYear(1998);
    365
    gap> DaysInMonth(3,1998);
    31
    

  • DMYDay( day ) F

    converts a number of days, counted from 1-Jan-1970, to a list [day,month,year].

  • DayDMY( dmy ) F

    returns the number of days from 01-Jan-1970 to the day given by dmy. dmy must be a list of the form [day,month,year].

  • WeekDay( date ) F

    returns the weekday of a day given by date. date can be a number of days since 1-Jan-1970 or a list [day,month,year].

  • StringDate( date ) F

    converts date to a readable string. date can be a number of days since 1-Jan-1970 or a list [day,month,year].
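
    The following lines sketch some typical calendar computations; the dates are chosen only for illustration. The year 2000 is a leap year in the Gregorian calendar, the difference of two DayDMY values gives the number of days between two dates, and DMYDay undoes DayDMY. Day number 0, that is 1-Jan-1970, was a Thursday.

    gap> DaysInYear( 2000 );  DaysInMonth( 2, 2000 );
    366
    29
    gap> DayDMY( [ 11, 3, 1998 ] ) - DayDMY( [ 1, 1, 1998 ] );
    69
    gap> DMYDay( DayDMY( [ 11, 3, 1998 ] ) );
    [ 11, 3, 1998 ]
    gap> WeekDay( 0 );
    "Thu"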

    gap> DayDMY([1,1,1970]);DayDMY([2,1,1970]);
    0
    1
    gap> DMYDay(12345);
    [ 20, 10, 2003 ]
    gap> WeekDay([11,3,1998]);
    "Wed"
    gap> StringDate([11,3,1998]);
    "11-Mar-1998"
    

  • HMSMSec( msec ) F

    converts a number msec of milliseconds into a list [hour,min,sec,milli].

  • SecHMSM( hmsm ) F

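    is the reverse of HMSMSec: it converts a list hmsm of the form [hour,min,sec,milli] into the corresponding number of milliseconds.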

  • StringTime( time ) F

    converts time (given as a number of milliseconds or a list [hour,min,sec,milli]) to a readable string.
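
    As a brief sketch with made-up values, HMSMSec and SecHMSM undo each other, and StringTime accepts either form of its argument.

    gap> HMSMSec( SecHMSM( [ 1, 10, 5, 13 ] ) );
    [ 1, 10, 5, 13 ]
    gap> StringTime( 4205013 );
    " 1:10:05.013"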

    gap> HMSMSec(Factorial(10));
    [ 1, 0, 28, 800 ]
    gap> SecHMSM([1,10,5,13]);
    4205013
    gap> StringTime([1,10,5,13]);
    " 1:10:05.013"
    

    25.7 Internally Represented Strings

  • IsStringRep( obj ) R

    IsStringRep is a special (internal) representation of dense lists of characters. Dense lists of characters can be converted into this representation using ConvertToStringRep. Note that calling IsString does not change the representation.

  • ConvertToStringRep( obj ) F

    If obj is a dense internally represented list of characters then ConvertToStringRep changes the representation to IsStringRep. This is useful in particular for converting the empty list [], which usually is in IsPlistRep, to IsStringRep. If obj is not a string then ConvertToStringRep signals an error.

  • IsEmptyString( str ) F

    IsEmptyString returns true if str is the empty string in the representation IsStringRep, and false otherwise. Note that the empty list [] and the empty string "" have the same type; the recommended way to distinguish them is via IsEmptyString. For formatted printing, this distinction is sometimes necessary.

    gap> l:= [];;  IsString( l );  IsEmptyString( l );  IsEmpty( l );
    true
    false
    true
    gap> l;  ConvertToStringRep( l );  l;
    [  ]
    ""
    gap> IsEmptyString( l );  IsEmptyString( "" );  IsEmptyString( "abc" );
    true
    true
    false
    gap> ll:= [ 'a', 'b' ];  IsStringRep( ll );  ConvertToStringRep( ll );
    "ab"
    false
    gap> ll;  IsStringRep( ll );
    "ab"
    true
    

  • CharsFamily V

    Each character lies in the family CharsFamily, and each nonempty string lies in the collections family of this family. Note the subtle differences between the empty list [] and the empty string "" when both are printed.


    GAP 4 manual
    July 1999