Class std::Str

Implements Sequence<Str>, Iterable<Str>, Comparable<Str>, Multipliable<Int, Str>

Instances of the Str class are immutable strings. A string is a potentially empty sequence of characters. A character is logically represented by an integer code between 0 and 65535, inclusive. Alore has no special data type for characters — characters are represented as strings of length 1. Character codes can be queried using the Ord function, and strings with specific character codes can be created using the Chr function.
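
As a brief illustration of the character functions mentioned above, the following sketch assumes that Ord takes a string of length 1 and Chr takes an integer character code, as the description implies. The sample values are ordinary ASCII codes chosen for this example.

```alore
Ord("A")            -- Result: 65
Chr(65)             -- Result: "A"
Chr(Ord("a") + 1)   -- Result: "b"
```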

String objects may contain data in various encodings. String methods that interpret string contents (e.g. upper) assume that strings are encoded in the 16-bit Unicode encoding or any subset of Unicode, such as ASCII or Latin 1. Some other modules can work with arbitrary narrow strings, i.e. string objects with only 8-bit characters (character codes between 0 and 255, inclusive).

Characters in a string can be accessed using integer indices starting from 0 (the first character). Alternatively, negative indices can be used to refer to characters from the end of the string: -1 refers to the last character, -2 to the second-to-last character, etc.
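
The indexing rules above can be sketched as follows; the string "hello" is only an illustrative value.

```alore
"hello"[0]     -- Result: "h" (first character)
"hello"[-1]    -- Result: "o" (last character)
"hello"[-2]    -- Result: "l" (second-to-last character)
```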

See also: The string and re modules contain useful functions for dealing with strings. The encodings module provides conversions between different character encodings.

class Str(x)
Construct an object of the Str type. Call the _str() method of the argument and return a value equal to the result, provided that it is a string. Objects of all primitive types (except Str) and the standard collection types provide a _str method.

See also: The function string::IntToStr and the Str format method are alternative ways of converting objects to strings.

Methods

length() as Int
Return the length of the string.
lower() as Str
Return a copy of the string with upper case characters converted into lower case.
upper() as Str
Return a copy of the string with lower case characters converted into upper case.
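A short sketch of the two case conversion methods, assuming plain ASCII input (the sample string is made up for this example):

```alore
"Alore".upper()   -- Result: "ALORE"
"Alore".lower()   -- Result: "alore"
```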
find(substring as Str[, start as Int]) as Int
Return the index of the first occurrence of a substring in the string, or -1 if the substring cannot be found. The returned index is the index of the start of the match. If the argument start is present, only occurrences at index start or higher are considered.
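For illustration, a sketch of find with and without the start argument, following the description above. In the sample string "banana", the substring "an" occurs at indices 1 and 3.

```alore
"banana".find("an")      -- Result: 1
"banana".find("an", 2)   -- Result: 3 (only occurrences at index 2 or higher)
"banana".find("x")       -- Result: -1 (not found)
```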
index(substring as Str) as Int
Return the index of the first occurrence of a substring in the string, or raise ValueError if the substring cannot be found. The returned index is the index of the start of the match.
replace(old as Str, new as Str[, max as Int]) as Str
Return a copy of the string with occurrences of old replaced with new, starting from the beginning of the string. If the max argument is present, only replace up to max instances of old. Examples:
"x..x.".replace("x", "yy")    -- Result: "yy..yy."
"x..x.".replace("x", "yy", 1) -- Result: "yy..x."
split([separator as Str[, max as Int]]) as Array<Str>
Split the string into fields separated by the separator, or by runs of whitespace characters if the separator is not specified or is nil. In the latter case, whitespace characters at the start and the end of the string are not included in the fields.

Return an array containing the fields. If the separator is given, the result always contains at least one field, which may be empty. The optional max parameter specifies the maximum number of splits; the rest of the string is returned as the last element of the array. Examples:

" a    black cat  ".split()  -- Result: ["a", "black", "cat"]
"a,black, cat".split(",")    -- Result: ["a", "black", " cat"]
"a,b,c".split(",", 1)        -- Result: ["a", "b,c"]
join(sequence as Sequence<Str>) as Str
Concatenate the strings in a sequence, using this string as the separator. Examples:
" ".join(["a", "black", "cat"])  -- Result: "a black cat"
"".join(["a", "b", "cd"])        -- Result: "abcd"
", ".join(["cat"])               -- Result: "cat"
count(substring as Str) as Int
Return the number of non-overlapping occurrences of a substring in the string.
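The effect of counting without overlap can be sketched as follows, assuming matches are consumed from left to right. The sample strings are illustrative only.

```alore
"banana".count("an")   -- Result: 2
"aaaa".count("aa")     -- Result: 2 (not 3, since matches may not overlap)
"banana".count("x")    -- Result: 0
```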
iterator() as Iterator<Str>
Return an iterator object that can be used to sequentially iterate over the characters in the string, starting from the first character.
strip() as Str
Return a copy of the string with leading and trailing whitespace characters removed. Only ASCII space, tab, CR and LF characters are removed.
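A small sketch of strip, using only the space and tab characters named above. Tab is assumed to be the standard tab constant, as used elsewhere in this reference; whitespace inside the string is expected to be left alone.

```alore
"  cat  ".strip()               -- Result: "cat"
(Tab + " a b " + Tab).strip()   -- Result: "a b"
```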
format(...) as Str
When this method is called on a format string object, return a string constructed according to the format string and the optional arguments. Most characters in the format string are returned unmodified:
"foo bar".format()  -- Result: "foo bar"
Empty brace expressions are replaced with method arguments converted to strings:
"{} and {}".format(1, "2")  -- Result: "1 and 2"
Brace characters can be added to the result by duplicating them in the format:
"{{ and }}".format() -- Result: "{ and }"
The contents of brace expressions may optionally be prefixed with a field width specifier: an integer followed by a colon. The replacement is padded with spaces to have at least as many characters as the absolute value of the width. If the width is positive, the result is aligned to the right; otherwise, it is aligned to the left:
"{4:}/{-3:}".format("ab", "c")  -- Result: "  ab/c  "
Brace expressions for numeric arguments may contain an additional format template that specifies the format of the result. A fractional format template contains one or more zeroes, optionally followed by a dot, a (potentially empty) run of zeroes and a (potentially empty) run of hash (#) characters. The zeroes specify the minimum number of digits in the integer part and the fraction, and the hash characters specify optional fraction digits that are only included if they are non-zero:
"{0000}".format(12)      -- Result: "0012"
"{0.00}".format(12.345)  -- Result: "12.35"
"{0.0####}".format(1.23) -- Result: "1.23"
"{5:0.0}".format(1.2)    -- Result: "  1.2"
A scientific format template contains a zero, optionally followed by a dot and a run of zeroes and a run of hash characters; and an exponent template. The exponent template contains 'e' or 'E', an optional '+' and a non-empty run of zeroes. The dot and the following zeroes and hash characters specify the number of decimals shown in the coefficient; the exponent template specifies the minimum number of digits and the type of the exponent:
"{0.0e0}".format(1234)     -- Result: "1.2e3"
"{0.0E0}".format(1234)     -- Result: "1.2E3"
"{0.###e+00}".format(1000) -- Result: "1e+03"
"{0.###e+00}".format(1200) -- Result: "1.2e+03"
"{0.00##e0}".format(0.1)   -- Result: "1.00e-1"
"{0e0}".format(15)         -- Result: "2e1"
startsWith(prefix as Str) as Boolean
Return a boolean indicating whether the string starts with the prefix.
endsWith(suffix as Str) as Boolean
Return a boolean indicating whether the string ends with the suffix.
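For illustration, a sketch of the two prefix and suffix tests; the file name is a made-up value.

```alore
"readme.txt".startsWith("read")   -- Result: True
"readme.txt".endsWith(".txt")     -- Result: True
"readme.txt".startsWith("txt")    -- Result: False
```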
decode(encoding as Encoding[, mode as Constant]) as Str
Decode the string to 16-bit Unicode using the given character encoding. The mode argument may be encodings::Strict (the default) or encodings::Unstrict. Use this to convert strings in 8-bit binary encodings to Unicode so that you can use them with operations such as lower() that expect Unicode strings.

Example:

"\u00c3\u00a4".decode(encodings::Utf8)  -- Decode "ä" in UTF-8 to 16-bit Unicode

See also: Module encodings

encode(encoding as Encoding[, mode as Constant]) as Str
Encode the string (interpreted as 16-bit Unicode) using the given character encoding. The mode argument may be encodings::Strict (the default) or encodings::Unstrict. Calling this method is equivalent to encoding.encoder([mode]).encode(str).

Example:

"\u20ac".encode(encodings::Utf8)        -- Encode the Euro sign using UTF-8

See also: Module encodings

Operations

Str objects support the following operations:

str[n] (Str[Int] ⇒ Str; Str[Pair<Int, Int>] ⇒ Str)
If the index n is an integer, return the character at the specified index as a string of length 1. If the index value is out of bounds, raise an IndexError exception.

If the index is a pair x : y, return a slice containing the characters at indices x, x + 1, ..., y - 1. If the left value of the pair is omitted or nil, it is assumed to be 0. If the right value is omitted or nil, the result is a substring extending to the end of the string. Invalid indices in range bounds are clipped to lie within the string.

"hello"[2]       -- "l"
"hello"[1:3]     -- "el"
"hello"[3:]      -- "lo"
"hello"[:-1]     -- "hell"
substr in str (Str in Str ⇒ Boolean)
Test whether a string contains a substring. Return a boolean value.
for ch in str (for Str in Str)
The characters in a string can be iterated with a for loop, starting from the first character.
x + y (Str + Str ⇒ Str)
Return the concatenation of two strings.
str * n (Str * Int ⇒ Str)
n * str (Int * Str ⇒ Str)
A string can be repeated any number of times by multiplying it by an integer. The integer must not be negative. Multiplying a string by zero results in an empty string.
"foo" * 3    -- "foofoofoo"
"x" * 0      -- ""
x == y (Str == Object ⇒ Boolean)
Strings can be compared for equality.
x < y (Str < Str ⇒ Boolean)
x > y (Str > Str ⇒ Boolean)
Strings can be compared in lexicographic order. Order comparisons are based on the numeric values of characters.
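Because ordering follows character codes rather than dictionary conventions, upper case ASCII letters sort before lower case ones. A brief sketch, with illustrative sample strings:

```alore
"abc" < "abd"   -- Result: True (the strings first differ at "c" < "d")
"Z" < "a"       -- Result: True (character code 90 is less than 97)
"b" > "a"       -- Result: True
```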
Repr(str)
Return a string representing the string using Alore string literal syntax, and only using printable ASCII characters. Characters other than printable ASCII characters are represented using \uNNNN escape sequences.
WriteLn(Repr("""foo" + Tab + "\uffff"))   -- Print """foo\u0009\uffff"
Int(str)
Convert a string to an integer.
Float(str)
Convert a string to a float.
Hash(str)
Return the hash value of a string.