string.ss: String Utilities
To load: (require (lib "string.ss"))
(eval-string str [err-handler])  PROCEDURE
Reads and evaluates S-expressions from the string str, returning
results for all of the expressions in the string. Note that if str
contains only whitespace and comments, zero values are returned, and
if str contains multiple expressions, the result contains multiple
values, collected from all of the expressions. str can also be a byte
string.
err-handler can be:
  #f (the default), which means that errors are not caught;
  a one-argument procedure, which is called with the exception when an
  error occurs, and whose result is returned; or
  a thunk, which is called to produce a result.
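The following is a minimal sketch of the three err-handler cases, assuming it is entered at the MzScheme REPL so that the current namespace has the usual bindings. The names sum, all, caught, and fallback are chosen only for this illustration.

```scheme
(require (lib "string.ss"))

;; One expression in the string: one result.
(define sum (eval-string "(+ 1 2)"))                  ; 3

;; Several expressions: several values, collected here into a list.
(define all
  (call-with-values
   (lambda () (eval-string "1 (+ 1 1) \"x\""))
   list))                                             ; (1 2 "x")

;; One-argument handler: called with the exception.
(define caught
  (eval-string "(car '())" (lambda (exn) (exn-message exn))))

;; Thunk handler: called with no arguments.
(define fallback
  (eval-string "(car '())" (lambda () 'failed)))      ; failed
```

With the default #f handler, the error from (car '()) would propagate to the caller instead.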
(expr->string expr)  PROCEDURE
Prints expr into a string and returns the string.
(real->decimal-string n [digits-after-decimal-k])  PROCEDURE
Prints n into a string and returns the string. The printed form of n
shows exactly digits-after-decimal-k digits after the decimal point,
where digits-after-decimal-k defaults to 2.
Before printing, n is converted to an exact number, multiplied by
(expt 10 digits-after-decimal-k), rounded, and then divided again by
(expt 10 digits-after-decimal-k). The result of this process is an
exact number whose decimal representation has no more than
digits-after-decimal-k digits after the decimal point (and it is
padded with trailing zeros if necessary). The printed form uses a
minus sign if n is negative, and it does not use a plus sign if n is
positive.
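As a rough illustration of the arithmetic described above, the sketch below defines rounded-exact, a hypothetical helper that is not part of string.ss and only mirrors the documented convert, scale, round, and unscale steps. The calls after it show the library procedure itself.

```scheme
(require (lib "string.ss"))

;; Hypothetical helper (not part of string.ss): convert to exact,
;; scale by 10^k, round, and scale back.
(define (rounded-exact n k)
  (let ([scale (expt 10 k)])
    (/ (round (* (inexact->exact n) scale)) scale)))

(define r (rounded-exact 3.14159 2))        ; 157/50, that is, 3.14

(define a (real->decimal-string 3.14159))   ; "3.14" (two digits by default)
(define b (real->decimal-string 2 3))       ; "2.000" (padded with zeros)
(define c (real->decimal-string -0.5 1))    ; "-0.5"
```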
(read-from-string str [err-handler])  PROCEDURE
Reads the first S-expression from the string (or byte string) str and
returns it. The err-handler is as in eval-string.
(read-from-string-all str [err-handler])  PROCEDURE
Reads all S-expressions from the string (or byte string) str and
returns them in a list. The err-handler is as in eval-string.
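A short sketch contrasting the two readers; the input strings and the names first-datum, all-data, and bad are made up for this example. The last call uses a thunk as err-handler to turn a read error into #f.

```scheme
(require (lib "string.ss"))

;; Only the first datum is read; the rest of the string is ignored.
(define first-datum (read-from-string "(a b) c 42"))    ; (a b)

;; Every datum is read and collected into a list.
(define all-data (read-from-string-all "(a b) c 42"))   ; ((a b) c 42)

;; An unterminated list is a read error; the thunk supplies the result.
(define bad (read-from-string "(unbalanced" (lambda () #f)))   ; #f
```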
(regexp-match* pattern string [start-k end-k])  PROCEDURE
(regexp-match* pattern bytes [start-k end-k])  PROCEDURE
(regexp-match* pattern input-port [start-k end-k])  PROCEDURE
Like regexp-match (see section 10 in PLT MzScheme: Language Manual),
but the result is a list of strings or byte strings corresponding to a
sequence of matches of pattern in string, bytes, or input-port.
(Unlike regexp-match, results for parenthesized sub-patterns in
pattern are not returned.) If pattern matches a zero-length string or
byte sequence along the way, the exn:fail exception is raised.
If string, bytes, or input-port contains no matches (in the range
start-k to end-k), null is returned. Otherwise, each item in the
resulting list is a distinct substring or byte sequence from string,
bytes, or input-port that matches pattern. The end-k argument can be
#f to match to the end of string or bytes or to an end-of-file in
input-port.
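A minimal sketch of typical calls, with an input string invented for the example. The byte-pattern call at the end assumes the #rx#"..." literal syntax for byte regexps.

```scheme
(require (lib "string.ss"))

;; Every non-overlapping match, in order.
(define nums (regexp-match* "[0-9]+" "a1b22c333"))       ; ("1" "22" "333")

;; No match at all: the empty list.
(define none (regexp-match* "x" "abc"))                  ; ()

;; start-k = 2 skips the leading "a1".
(define tail (regexp-match* "[0-9]+" "a1b22c333" 2))     ; ("22" "333")

;; Byte pattern on a byte string: the results are byte strings.
(define raw (regexp-match* #rx#"[0-9]+" #"a1b22"))       ; (#"1" #"22")
```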
(regexp-match/fail-without-reading pattern input-port [start-k end-k output-port])  PROCEDURE
Like regexp-match on input ports (see section 10 in PLT MzScheme:
Language Manual), except that if the match fails, no characters are
read and discarded from input-port.
This procedure is especially useful with a pattern that begins with a
start-of-string caret (``^'') or with a non-#f end-k, since each
limits the amount of peeking into the port.
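The sketch below tries two anchored patterns against a string port; the port contents and the names in, miss, next, hit, and after are illustrative only. It assumes, as with regexp-match on ports, that a successful match is returned as byte strings and that the matched bytes are consumed.

```scheme
(require (lib "string.ss"))

(define in (open-input-string "hello world"))

;; The match fails, so nothing is consumed from the port.
(define miss (regexp-match/fail-without-reading "^xyz" in))    ; #f
(define next (read-char in))                                   ; #\h

;; The match succeeds at the current position and is consumed.
(define hit (regexp-match/fail-without-reading "^ello" in))    ; (#"ello")
(define after (read-char in))                                  ; #\space
```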
(regexp-match-exact? pattern string)  PROCEDURE
(regexp-match-exact? pattern bytes)  PROCEDURE
(regexp-match-exact? pattern input-port)  PROCEDURE
This procedure is like MzScheme's built-in regexp-match (see section
10 in PLT MzScheme: Language Manual), but the result is always #t or
#f; #t is only returned when the entire content of string, bytes, or
input-port matches pattern.
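A small sketch of the difference from regexp-match, using a made-up pattern and inputs:

```scheme
(require (lib "string.ss"))

;; The whole string matches the pattern.
(define whole (regexp-match-exact? "a*b" "aaab"))      ; #t

;; Only a prefix matches, so the exact test fails ...
(define partial (regexp-match-exact? "a*b" "aaabc"))   ; #f

;; ... although plain regexp-match still finds that prefix.
(define loose (regexp-match "a*b" "aaabc"))            ; ("aaab")
```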
(regexp-match-peek-positions* pattern input-port [start-k end-k])  PROCEDURE
Like regexp-match-positions*, but it works only on input ports, and
the port is peeked instead of read for matches.
(regexp-match-positions* pattern string [start-k end-k])  PROCEDURE
(regexp-match-positions* pattern bytes [start-k end-k])  PROCEDURE
(regexp-match-positions* pattern input-port [start-k end-k])  PROCEDURE
Like regexp-match-positions (see section 10 in PLT MzScheme: Language
Manual), but the result is a list of integer pairs corresponding to a
sequence of matches of pattern in string, bytes, or input-port.
(Unlike regexp-match-positions, results for parenthesized sub-patterns
in pattern are not returned.) If pattern matches a zero-length string
or byte sequence along the way, the exn:fail exception is raised.
If string, bytes, or input-port contains no matches (in the range
start-k to end-k), null is returned. Otherwise, each position pair in
the resulting list corresponds to a distinct substring in string, or
to a distinct byte sequence in bytes, input-port, or string (UTF-8
encoded, when pattern is a byte pattern), that matches pattern. The
end-k argument can be #f to match to the end of string or bytes or to
an end-of-file in input-port.
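A minimal sketch showing how the position pairs relate to the input. The names text, spans, and pieces are illustrative; substring recovers the matched text from each pair.

```scheme
(require (lib "string.ss"))

(define text "a1b22c333")

;; Each pair is (start . end), with end exclusive.
(define spans (regexp-match-positions* "[0-9]+" text))
;; ((1 . 2) (3 . 5) (6 . 9))

;; Turning the pairs back into the matched substrings.
(define pieces
  (map (lambda (p) (substring text (car p) (cdr p))) spans))
;; ("1" "22" "333")
```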
(regexp-quote str [case-sensitive?])  PROCEDURE
(regexp-quote bytes [case-sensitive?])  PROCEDURE
Produces a string or byte string suitable for use with regexp (see
section 10 in PLT MzScheme: Language Manual) to match the literal
sequence of characters in str or sequence of bytes in bytes. If
case-sensitive? is true (the default), the resulting regexp matches
letters in str or bytes case-sensitively, otherwise it matches
case-insensitively.
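A short sketch of why quoting matters, with an invented pattern and input: an unquoted "." matches any character, while the quoted form matches only a literal period.

```scheme
(require (lib "string.ss"))

;; The quoted pattern treats "." literally.
(define literal (regexp-quote "a.b"))
(define quoted-hit (regexp-match literal "axb a.b"))   ; ("a.b")

;; Unquoted, "." matches any character, so "axb" is found first.
(define plain-hit (regexp-match "a.b" "axb a.b"))      ; ("axb")
```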
(regexp-replace-quote str)  PROCEDURE
(regexp-replace-quote bytes)  PROCEDURE
Produces a string suitable for use as the third argument to
regexp-replace (see section 10 in PLT MzScheme: Language Manual) to
insert the literal sequence of characters in str or bytes in bytes as
a replacement. Concretely, every backslash and ampersand in str or
bytes is protected by a quoting backslash.
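The sketch below, using a made-up replacement string, shows the effect of the quoting: in a regexp-replace insert string an unquoted ampersand stands for the matched text, while the quoted form inserts a literal ampersand.

```scheme
(require (lib "string.ss"))

;; Unquoted: "&" is replaced by the match "x".
(define plain (regexp-replace "x" "axb" "<&>"))          ; "a<x>b"

;; Quoted: the ampersand is inserted literally.
(define quoted
  (regexp-replace "x" "axb" (regexp-replace-quote "<&>")))   ; "a<&>b"
```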
(glob->regexp str [hide-dots? case-sensitive? simple?])  PROCEDURE
Produces a regexp for an input ``glob pattern'' in str. A glob pattern
is one that matches ``*'' with any string, ``?'' with a single
character, and character ranges are the same as in regexps. In
addition, the resulting regexp does not match strings that begin with
a period, unless the glob string begins with a literal period. The
resulting regexp can be used with string file names to check the glob
pattern. If the glob pattern is provided as a byte string, the result
is a byte regexp.
If hide-dots? is true (the default), the resulting regexp will not
match names that begin with a dot.
If case-sensitive? is given, it determines whether the resulting
regexp is case-sensitive; otherwise the default case sensitivity
depends on the system-type.
Finally, if simple? is provided as #t, then the glob is not expected
to contain ranges (if it does, they will be regexp-quoted).
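A minimal sketch of checking file names against a glob. The helper glob-match? is hypothetical (not part of string.ss) and simply turns the regexp-match result into a boolean. Going by the description above, the first name should match and the dot-file should not; the third call is included to try a name with a different extension.

```scheme
(require (lib "string.ss"))

;; Compile the glob once, with the default hide-dots? behavior.
(define rx (glob->regexp "*.ss"))

;; Hypothetical helper: #t if name matches the compiled glob.
(define (glob-match? name)
  (and (regexp-match rx name) #t))

(define visible (glob-match? "string.ss"))
(define hidden (glob-match? ".hidden.ss"))
(define other (glob-match? "string.scm"))
```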
(regexp-split pattern string [start-k end-k])  PROCEDURE
(regexp-split pattern bytes [start-k end-k])  PROCEDURE
(regexp-split pattern input-port [start-k end-k])  PROCEDURE
The complement of regexp-match* (see above): the result is a list of
strings or byte strings from string, bytes, or input-port that are
separated by matches to pattern; adjacent matches are separated with
"" or #"". If pattern matches a zero-length string or byte sequence
along the way, the exn:fail exception is raised.
If string, bytes, or input-port contains no matches (in the range
start-k to end-k), the result is a list containing string (UTF-8
encoded if pattern is a byte pattern), bytes, or the content of
input-port, from start-k to end-k. If a match occurs at the beginning
of string, bytes, or input-port (at start-k), the resulting list will
start with an empty string or empty byte string, and if a match occurs
at the end (at end-k), the list will end with an empty string or empty
byte string. The end-k argument can be #f, in which case splitting
goes to the end of string or bytes or to an end-of-file in input-port.
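The three cases described above (adjacent matches, matches at the edges, and no match) can be sketched with a comma separator; the input strings are invented for the example.

```scheme
(require (lib "string.ss"))

;; Adjacent separators produce an empty string between them.
(define fields (regexp-split "," "a,b,,c"))   ; ("a" "b" "" "c")

;; Matches at the start and end produce leading and trailing "".
(define edges (regexp-split "," ",a,"))       ; ("" "a" "")

;; No match: a one-element list holding the whole input.
(define whole (regexp-split "," "abc"))       ; ("abc")
```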
(string-lowercase! str)  PROCEDURE
Destructively changes str to contain only lowercase characters.
(string-uppercase! str)  PROCEDURE
Destructively changes str to contain only uppercase characters.
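Because both procedures modify their argument in place, the sketch below works on a fresh copy made with string-copy, on the assumption that a string literal may not be mutable. The names s, lower, and upper are illustrative.

```scheme
(require (lib "string.ss"))

;; Work on a mutable copy rather than on a literal.
(define s (string-copy "Hello, World"))

(string-lowercase! s)
(define lower (string-copy s))    ; "hello, world"

(string-uppercase! s)
(define upper (string-copy s))    ; "HELLO, WORLD"
```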