Basic Data Extensions
3.1 Void and Undefined
MzScheme returns the unique void value -- printed as #<void> -- for expressions that have unspecified results in R5RS. The procedure void takes any number of arguments and returns void.
Variables bound by letrec-values that are accessible but not yet initialized are bound to the unique undefined value, printed as #<undefined>.
3.2 Booleans
Unless otherwise specified, two instances of a particular MzScheme data type are equal? only when they are eq?. Two values are eqv? only when they are either eq?, both +nan.0, or both = and have the same exactness and sign. (The inexact numbers 0.0 and -0.0 are not eqv?, although they are =.)
The ormap and andmap procedures apply a test procedure to the elements of a list, returning immediately when the result for testing the entire list is determined. The arguments to ormap and andmap are the same as for map, but a single boolean value is returned as the result, rather than a list:
(andmap proc list ···1) applies proc to elements of the lists from the first elements to the last, returning #f as soon as any application returns #f. If no application of proc returns #f, then the result of the last application of proc is returned. If the lists are empty, then #t is returned.
(ormap proc list ···1) applies proc to elements of the lists from the first elements to the last. If any application returns a value other than #f, that value is immediately returned as the result of the ormap application. If all applications of proc return #f, then the result is #f. If the lists are empty, then #f is returned.
Examples:
(andmap positive? '(1 2 3)) ; => #t
(ormap eq? '(a b c) '(a b c)) ; => #t
(andmap positive? '(1 2 a)) ; => raises exn:fail:contract
(ormap positive? '(1 2 a)) ; => #t
(andmap positive? '(1 -2 a)) ; => #f
(andmap + '(1 2 3) '(4 5 6)) ; => 9
(ormap + '(1 2 3) '(4 5 6)) ; => 5
3.3 Numbers
A number in MzScheme is one of the following:
a fixnum exact integer (30 bits plus a sign bit)
a bignum exact integer (cannot be represented in a fixnum)
a fraction exact rational (represented by two exact integers)
a flonum inexact rational (double-precision floating-point number)
a complex number; either the real and imaginary parts are both exact or inexact, or the number has an exact zero real part and an inexact imaginary part; a complex number with an inexact zero imaginary part is a real number
MzScheme extends the number syntax of R5RS in three ways:
All input radixes (#b, #o, #d, and #x) allow ``decimal'' numbers that contain a period or exponent marker. For example, #b1.1 is equivalent to 1.5. In hexadecimal numbers, e and d always stand for a hexadecimal digit, not an exponent marker.
The mantissa of a number with an exponent marker can be expressed as a fraction. For example, 1/2e3 is equivalent to 500.0, and 1/2e2+1/2e4i is equivalent to 50.0+5000.0i.
The following are inexact numerical constants: +inf.0 (infinity), -inf.0 (negative infinity), +nan.0 (not a number), and -nan.0 (same as +nan.0). These names can also be used within complex constants, as in -inf.0+inf.0i. These names are case-insensitive.
The special inexact numbers +inf.0, -inf.0, and +nan.0 have no exact form. Dividing by an inexact zero returns +inf.0 or -inf.0, depending on the sign of the dividend. The infinities are integers, and they answer #t for both even? and odd?. The +nan.0 value is not an integer and is not = to itself, but +nan.0 is eqv? to itself. Similarly, (= 0.0 -0.0) is #t, but (eqv? 0.0 -0.0) is #f.
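The rules above for extended number syntax and the special inexact values can be checked at the REPL. The following is a minimal sketch that uses only standard procedures; the names div-pos and div-neg are illustrative and not part of MzScheme.

```scheme
;; Extended number syntax: binary "decimal" and fractional mantissa.
(= #b1.1 1.5)         ; => #t
(= 1/2e3 500.0)       ; => #t

;; Dividing by an inexact zero yields an infinity whose sign follows the dividend.
(define div-pos (/ 1.0 0.0))    ; +inf.0
(define div-neg (/ -1.0 0.0))   ; -inf.0

;; = and eqv? disagree on signed zeros and on +nan.0.
(= 0.0 -0.0)          ; => #t
(eqv? 0.0 -0.0)       ; => #f
(= +nan.0 +nan.0)     ; => #f
(eqv? +nan.0 +nan.0)  ; => #t
```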
All multi-argument arithmetic procedures operate pairwise on arguments from left to right.
The string->number procedure works on all number representations and exact integer radix values in the range 2 to 16 (inclusive). The number->string procedure accepts all number types and the radix values 2, 8, 10, and 16; however, if an inexact number is provided with a radix other than 10, the exn:fail:contract exception is raised.
The add1 and sub1 procedures work on any number.
The following procedures work on integers:
(quotient/remainder n1 n2) returns two values: (quotient n1 n2) and (remainder n1 n2).
(integer-sqrt n) returns the integer square-root of n. For positive n, the result is the largest positive integer bounded by the (sqrt n). For negative n, the result is (* (integer-sqrt (- n)) 0+i).
(integer-sqrt/remainder n) returns two values: (integer-sqrt n) and (- n (expt (integer-sqrt n) 2)).
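As an illustration of the multiple-value results described above, the following sketch collects the returned values into lists with call-with-values. The helper name values->list is hypothetical and introduced only for this example.

```scheme
;; Hypothetical helper: run a thunk and collect all of its result values in a list.
(define (values->list thunk)
  (call-with-values thunk list))

(values->list (lambda () (quotient/remainder 17 5)))    ; => (3 2)
(integer-sqrt 17)                                       ; => 4
(integer-sqrt -16)                                      ; => 0+4i
(values->list (lambda () (integer-sqrt/remainder 17)))  ; => (4 1)
```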
The following procedures work on exact integers in their (semi-infinite) two's complement representation:
(bitwise-ior n ···1) returns the bitwise ``inclusive or'' of the ns.
(bitwise-and n ···1) returns the bitwise ``and'' of the ns.
(bitwise-xor n ···1) returns the bitwise ``exclusive or'' of the ns.
(bitwise-not n) returns the bitwise ``not'' of n.
(arithmetic-shift n m) returns the bitwise ``shift'' of n. The integer n is shifted left by m bits; i.e., m new zeros are introduced as rightmost digits. If m is negative, n is shifted right by -m bits; i.e., the rightmost -m digits are dropped.
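A few worked cases may make the two's complement behavior concrete. This is a small sketch with arbitrarily chosen inputs; the binary forms in the comments are annotations, not output.

```scheme
(bitwise-ior 12 10)        ; => 14   (#b1100 or  #b1010 = #b1110)
(bitwise-and 12 10)        ; => 8    (#b1100 and #b1010 = #b1000)
(bitwise-xor 12 10)        ; => 6    (#b1100 xor #b1010 = #b0110)
(bitwise-not 5)            ; => -6   (two's complement: -n - 1)
(arithmetic-shift 1 10)    ; => 1024 (shift left by 10 bits)
(arithmetic-shift 1024 -3) ; => 128  (shift right by 3 bits)
(arithmetic-shift -8 -1)   ; => -4   (the sign is preserved)
```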
The random procedure generates pseudo-random numbers:
(random k) returns a random exact integer in the range 0 to k - 1 where k is an exact integer between 1 and 2^31 - 1, inclusive. The number is provided by the current pseudo-random number generator, which maintains an internal state for generating numbers.
(random) returns a random inexact number between 0 and 1, exclusive, using the current pseudo-random number generator.
(random-seed k) seeds the current pseudo-random number generator with k, an exact integer between 0 and 2^31 - 1, inclusive. Seeding a generator sets its internal state deterministically; seeding a generator with a particular number forces it to produce a sequence of pseudo-random numbers that is the same across runs and across platforms.
(pseudo-random-generator->vector generator) produces a vector that represents the complete internal state of generator. The vector is suitable as an argument to vector->pseudo-random-generator to recreate the generator in its current state (across runs and across platforms).
(vector->pseudo-random-generator vec) produces a pseudo-random number generator whose internal state corresponds to vec. The vector vec must contain six exact integers; the first three integers must be in the range 0 to 4294967086, inclusive; the last three integers must be in the range 0 to 4294944442, inclusive; at least one of the first three integers must be non-zero; and at least one of the last three integers must be non-zero.
(current-pseudo-random-generator) returns the current pseudo-random number generator, and (current-pseudo-random-generator generator) sets the current generator to generator. See also section 7.9.1.10.
(make-pseudo-random-generator) returns a new pseudo-random number generator. The new generator is seeded with a number derived from (current-milliseconds).
(pseudo-random-generator? v) returns #t if v is a pseudo-random number generator, #f otherwise.
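The following sketch shows one way to get a reproducible sequence without disturbing the generator used by the rest of a program. It assumes current-pseudo-random-generator can be used with parameterize, as the reference to the parameters section suggests; the procedure name draw-three is hypothetical.

```scheme
;; Hypothetical helper: draw three numbers from a fresh generator seeded with `seed`.
(define (draw-three seed)
  (parameterize ([current-pseudo-random-generator (make-pseudo-random-generator)])
    (random-seed seed)   ; seeds only the fresh generator installed above
    (list (random 100) (random 100) (random 100))))

;; The same seed forces the same sequence.
(equal? (draw-three 42) (draw-three 42)) ; => #t
```

Installing a fresh generator inside the helper keeps the seeding local, so callers that rely on the ambient generator are not affected.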
The following procedures convert between Scheme numbers and common machine byte representations:
(integer-bytes->integer string signed? [big-endian?]) converts the machine-format number encoded in string to an exact integer. The string must contain either 2, 4, or 8 characters. If signed? is true, then the string is decoded as a two's-complement number, otherwise it is decoded as an unsigned integer. If big-endian? is true, then the first character's ASCII value provides the most significant eight bits of the number, otherwise the first character provides the least-significant eight bits, and so on. The default value of big-endian? is the result of system-big-endian?.
(integer->integer-bytes n size-n signed? [big-endian? to-string]) converts the exact integer n to a machine-format number encoded in a string of length size-n, which must be 2, 4, or 8. If signed? is true, then the number is encoded with two's complement, otherwise it is encoded as an unsigned bit stream. If big-endian? is true, then the most significant eight bits of the number are encoded in the first character of the resulting string, otherwise the least-significant bits are encoded in the first character, and so on. The default value of big-endian? is the result of system-big-endian?. If to-string is provided, it must be a mutable string of length size-n; in that case, the encoding of n is written into to-string, and to-string is returned as the result. If to-string is not provided, the result is a newly allocated string. If n cannot be encoded in a string of the requested size and format, the exn:fail:contract exception is raised. If to-string is provided and it is not of length size-n, the exn:fail:contract exception is raised.
(floating-point-bytes->real string [big-endian?]) converts the IEEE floating-point number encoded in string to an inexact real number. The string must contain either 4 or 8 characters. If big-endian? is true, then the first character's ASCII value provides the most significant eight bits of the IEEE representation, otherwise the first character provides the least-significant eight bits, and so on. The default value of big-endian? is the result of system-big-endian?.
(real->floating-point-bytes x size-n [big-endian? to-string]) converts the real number x to its IEEE representation in a string of length size-n, which must be 4 or 8. If big-endian? is true, then the most significant eight bits of the number are encoded in the first character of the resulting string, otherwise the least-significant bits are encoded in the first character, and so on. The default value of big-endian? is the result of system-big-endian?. If to-string is provided, it must be a mutable string of length size-n; in that case, the encoding of x is written into to-string, and to-string is returned as the result. If to-string is not provided, the result is a newly allocated string. If to-string is provided and it is not of length size-n, the exn:fail:contract exception is raised.
(system-big-endian?) returns #t if the native encoding of numbers is big-endian for the machine running MzScheme, #f if the native encoding is little-endian.
3.4 Characters
MzScheme characters range over Unicode scalar values (see section 1.2.1), which includes characters whose values range from #x0 to #x10FFFF, but not including #xD800 to #xDFFF. The procedure char->integer returns a character's code-point number, and integer->char converts a code-point number to a character. If integer->char is given an integer that is either outside #x0 to #x10FFFF or in the excluded range #xD800 to #xDFFF, the exn:fail:contract exception is raised.
Character constants include special named characters, such as #\newline, plus octal representations (e.g., #\251), and Unicode-style hexadecimal representations (e.g., #\u03BB). See section 11.2.4 for more information on character constants.
The character comparison procedures char=?, char<?, char-ci=?, etc. take two or more character arguments and check the arguments pairwise (like the numerical comparison procedures). Two characters are eq? whenever they are char=?. The expression (char<? char1 char2) produces the same result as (< (char->integer char1) (char->integer char2)), etc. The case-independent -ci procedures compare characters after case-folding with char-foldcase (described below).
The character predicates produce results consistent with the Unicode database and (usually) SRFI-14. These procedures are fully portable; their results do not depend on the current platform or locale.
(char-alphabetic? char) -- returns #t if char's Unicode general category is Lu, Ll, Lt, Lm, or Lo, #f otherwise.
(char-lower-case? char) -- returns #t if char has the Unicode ``Lowercase'' property.
(char-upper-case? char) -- returns #t if char has the Unicode ``Uppercase'' property.
(char-title-case? char) -- returns #t if char's Unicode general category is Lt, #f otherwise.
(char-numeric? char) -- returns #t if char's Unicode general category is Nd, #f otherwise.
(char-symbolic? char) -- returns #t if char's Unicode general category is Sm, Sc, Sk, or So, #f otherwise.
(char-punctuation? char) -- returns #t if char's Unicode general category is Pc, Pd, Ps, Pe, Pi, Pf, or Po, #f otherwise.
(char-graphic? char) -- returns #t if char's Unicode general category is Mn, Mc, Me, or if one of the following produces #t when applied to char: char-alphabetic?, char-numeric?, char-symbolic?, or char-punctuation?.
(char-whitespace? char) -- returns #t if char's Unicode general category is Zs, Zl, or Zp, or if char is one of the following: #\tab, #\newline, #\vtab, #\page, or #\return.
(char-blank? char) -- returns #t if char's Unicode general category is Zs or if char is #\tab. (These correspond to horizontal whitespace.)
(char-iso-control? char) -- returns #t if char is between #\u0000 and #\u001F inclusive or #\u007F and #\u009F inclusive.
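The predicate definitions above can be spot-checked on a handful of ASCII characters. This is a small sketch; the expected results follow from the general categories listed above (for example, #\! is in Po and #\+ is in Sm).

```scheme
(char-alphabetic? #\a)     ; => #t
(char-numeric? #\7)        ; => #t
(char-punctuation? #\!)    ; => #t
(char-symbolic? #\+)       ; => #t
(char-whitespace? #\newline) ; => #t
(char-blank? #\newline)    ; => #f  (not horizontal whitespace)
(char-blank? #\tab)        ; => #t
(char-graphic? #\space)    ; => #f  (category Zs is not graphic)
(char-iso-control? #\tab)  ; => #t  (code point 9)
```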
Character conversions are also consistent with the 1-to-1 code point mapping defined by Unicode. String procedures (see section 3.5) handle the case where Unicode defines a locale-independent mapping from the code point to a code-point sequence (in addition to the 1-1 mapping on scalar values).
(char-upcase char) produces a character according to the upcase mapping provided by the Unicode database for char; if char has no upcase mapping, char-upcase produces char.
(char-downcase char) produces a character according to the downcase mapping provided by the Unicode database for char; if char has no downcase mapping, char-downcase produces char.
(char-titlecase char) produces a character according to the titlecase mapping provided by the Unicode database for char; if char has no titlecase mapping, char-titlecase produces char.
(char-foldcase char) produces a character according to the case-folding mapping provided by the Unicode database for char.
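The difference between the 1-to-1 character mappings and the sequence-producing string conversions can be seen with the German sharp s (code point #xDF), which has no single-character upcase mapping. A minimal sketch; the name sharp-s is illustrative only.

```scheme
(define sharp-s (integer->char #xDF))

(char-upcase #\a)                     ; => #\A
(char-foldcase #\A)                   ; => #\a
(char-upcase #\1)                     ; => #\1  (no upcase mapping)
(char=? (char-upcase sharp-s) sharp-s) ; => #t  (no 1-to-1 upcase mapping)
(string-upcase (string sharp-s))      ; => "SS" (code-point sequence mapping)
```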
(make-known-char-range-list) produces a list of three-element lists, where each three-element list represents a set of consecutive code points for which the Unicode standard specifies character properties. Each three-element list contains two integers and a boolean; the first integer is a starting code-point value (inclusive), the second integer is an ending code-point value (inclusive), and the boolean is #t when all characters in the code-point range have identical results for all of the character predicates above. The three-element lists are ordered in the overall result list such that later lists represent larger code-point values, and all three-element lists are separated from every other by at least one code-point value that is not specified by Unicode.
(char-utf-8-length char) produces the same result as (bytes-length (string->bytes/utf-8 (string char))).
3.5 Strings
Since a string consists of a sequence of characters, a string in MzScheme is a Unicode code-point sequence. MzScheme also provides byte strings, as well as functions to convert between byte strings and strings with respect to various encodings, including UTF-8 and the current locale's encoding. See section 1.2 for an overview of Unicode, locales, and encodings, and see section 3.6 for more specific information on byte-string conversions.
A string can be mutable or immutable. When an immutable string is provided to a procedure like string-set!, the exn:fail:contract exception is raised. String constants generated by read are immutable.
(string->immutable-string string) returns an immutable string with the same content as string, and it returns string itself if string is immutable. (See also immutable? in section 3.10.)
(substring string start-k [end-k]) returns a mutable string, even if the string argument is immutable. The end-k argument defaults to (string-length string).
(string-copy! dest-string dest-start-k src-string [src-start-k src-end-k]) changes the characters of dest-string from positions dest-start-k (inclusive) to dest-end-k (exclusive) to match the characters in src-string from src-start-k (inclusive) to src-end-k (exclusive), where dest-end-k is dest-start-k plus the length of the source region. If src-start-k is not provided, it defaults to 0. If src-end-k is not provided, it defaults to (string-length src-string). The strings dest-string and src-string can be the same string, and in that case the destination region can overlap with the source region; the destination characters after the copy match the source characters from before the copy. If any of dest-start-k, src-start-k, or src-end-k are out of range (taking into account the sizes of the strings and the source and destination regions), the exn:fail:contract exception is raised.
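The overlapping-copy guarantee can be illustrated by copying within a single string. In this sketch the variable name s is arbitrary, and string-copy is used to obtain a mutable string.

```scheme
(define s (string-copy "abcdef"))  ; a fresh, mutable string

;; Copy positions 0..4 ("abcd") of s onto s starting at position 2.
;; The regions overlap; the result matches the source as it was before the copy.
(string-copy! s 2 s 0 4)
s ; => "ababcd"
```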
When a string is created with make-string without a fill value, it is initialized with the null character (#\nul) in all positions.
The string comparison procedures string=?, string<?, string-ci=?, etc. take two or more string arguments and check the arguments pairwise (like the numerical comparison procedures). String comparisons are performed through pairwise comparison of characters; for the -ci operations, the two strings are first case-folded using string-foldcase (described below). Comparisons using all of these functions are fully portable; the results do not depend on the current platform or locale.
Four string-conversion procedures take into account Unicode's locale-independent conversion rules that map code-point sequences to code-point sequences (instead of simply mapping a 1-to-1 function on code points over the string). In each case, the string produced by the conversion can be longer than the input string.
(string-upcase string) returns a string whose characters are the upcase conversion of the characters in string.
(string-downcase string) returns a string whose characters are the downcase conversion of the characters in string.
(string-titlecase string) returns a string where the first character in each sequence of cased characters in string (ignoring case-ignorable characters) is converted to titlecase, and all other cased characters are downcased.
(string-foldcase string) returns a string whose characters are the case-fold conversion of the characters in string.
Examples:
(string-upcase "abc!") ; => "ABC!"
(string-upcase "Stra\xDFe") ; => "STRASSE"
(string-downcase "aBC!") ; => "abc!"
(string-downcase "Stra\xDFe") ; => "stra\xDFe"
(string-downcase "\u039A\u0391\u039F\u03A3") ; => "\u03BA\u03b1\u03BF\u03C2"
(string-downcase "\u03A3") ; => "\u03C3"
(string-titlecase "aBC twO") ; => "Abc Two"
(string-titlecase "y2k") ; => "Y2K"
(string-titlecase "main stra\xDFe") ; => "Main Stra\xDFe"
(string-titlecase "stra \xDFe") ; => "Stra Sse"
(string-foldcase "aBC!") ; => "abc!"
(string-foldcase "Stra\xDFe") ; => "strasse"
(string-foldcase "\u039A\u0391\u039F\u03A3") ; => "\u03BA\u03b1\u03BF\u03C3"
In addition to the character-based string procedures, MzScheme provides the following locale-sensitive procedures (see also section 1.2.2 and section 7.9.1.11):
(string-locale-upcase string) -- may produce a string that is longer or shorter than string if the current locale has complex case-folding rules.
(string-locale-downcase string) -- like string-locale-upcase, may produce a string that is longer or shorter than string.
These procedures depend only on the current locale's case-conversion and collation rules, and not on its encoding rules.
3.6 Byte Strings
A byte string is like a string, but it is a sequence of bytes instead of characters. A byte is an exact integer between 0 and 255 inclusive; (byte? v) produces #t if v is such an exact integer, #f otherwise. Two byte strings are equal? if they are bytewise equal, and two byte strings are eqv? only if they are eq?.
MzScheme provides byte-string operations in parallel to the character-string operations:
A byte-string constant is written like a string, but prefixed with # (with no space between # and the opening double-quote). A byte-string constant can contain escape sequences, as in #"\n", just like strings; an exn:fail:read exception is raised if a ``\u'' sequence appears within a byte string and the given hexadecimal value is larger than 255.
Like character strings, byte strings generated by read are immutable, and when an immutable byte string is provided to a procedure like bytes-set!, the exn:fail:contract exception is raised.
The following procedures convert between byte strings and character strings:
(bytes->string/utf-8 bytes [err-char start-k end-k]) -- produces a string by decoding the start-k to end-k substring of bytes as a UTF-8 encoding of Unicode code points. If err-char is provided and not #f, then it is used for bytes that fall in the range #o200 to #o377 but are not part of a valid encoding sequence. (This is consistent with reading characters from a port; see section 11.1 for more details.) If err-char is #f or not provided, and if the start-k to end-k substring of bytes is not a valid UTF-8 encoding overall, then the exn:fail:contract exception is raised. If start-k or end-k are not provided, they default to 0 and (bytes-length bytes), respectively.
(bytes->string/locale bytes [err-char start-k end-k]) -- produces a string by decoding the start-k to end-k substring of bytes using the current locale's encoding (see also section 1.2.2). If err-char is provided and not #f, it is used for each byte in bytes that is not part of a valid encoding; if err-char is #f or not provided, and if the start-k to end-k substring of bytes is not a valid encoding overall, then the exn:fail:contract exception is raised. If start-k or end-k are not provided, they default to 0 and (bytes-length bytes), respectively.
(bytes->string/latin-1 bytes [err-char start-k end-k]) -- produces a string by decoding the start-k to end-k substring of bytes as a Latin-1 encoding of Unicode code points; i.e., each byte is translated directly to a character using integer->char, so the decoding always succeeds. The err-char argument is ignored, but for consistency with the other operations, it must be a character or #f if provided. If start-k or end-k are not provided, they default to 0 and (bytes-length bytes), respectively.
(string->bytes/utf-8 string [err-byte start-k end-k]) -- produces a byte string by encoding the start-k to end-k substring of string via UTF-8 (always succeeding). The err-byte argument is ignored, but for consistency with the other operations, it must be a byte or #f if provided. If start-k or end-k are not provided, they default to 0 and (string-length string), respectively.
(string->bytes/locale string [err-byte start-k end-k]) -- produces a byte string by encoding the start-k to end-k substring of string using the current locale's encoding (see also section 1.2.2). If err-byte is provided and not #f, it is used for each character in string that cannot be encoded for the current locale; if err-byte is #f or not provided, and if the start-k to end-k substring of string cannot be encoded, then the exn:fail:contract exception is raised. If start-k or end-k are not provided, they default to 0 and (string-length string), respectively.
(string->bytes/latin-1 string [err-byte start-k end-k]) -- produces a byte string by encoding the start-k to end-k substring of string using Latin-1; i.e., each character is translated directly to a byte using char->integer. If err-byte is provided and not #f, it is used for each character in string whose value is greater than 255; if err-byte is #f or not provided, and if the start-k to end-k substring of string has a character with a value greater than 255, then the exn:fail:contract exception is raised. If start-k or end-k are not provided, they default to 0 and (string-length string), respectively.
(string-utf-8-length string [start-k end-k]) returns the length in bytes of the UTF-8 encoding of string's substring from start-k to end-k, but without actually generating the encoded bytes. If start-k is not provided, it defaults to 0, and end-k defaults to (string-length string).
(bytes-utf-8-length bytes [err-char start-k end-k]) returns the length in characters of the UTF-8 decoding of bytes's substring from start-k to end-k, but without actually generating the decoded characters. If start-k is not provided, it defaults to 0, and end-k defaults to (bytes-length bytes). If err-char is #f and the substring is not a UTF-8 encoding overall, the result is #f. Otherwise, err-char is used to resolve decoding errors as in bytes->string/utf-8.
(bytes-utf-8-ref bytes [skip-k err-char start-k end-k]) returns the skip-kth character in the UTF-8 decoding of bytes's substring from start-k to end-k, but without actually generating the other decoded characters. If start-k is not provided, it defaults to 0, and end-k defaults to (bytes-length bytes). If the substring is not a UTF-8 encoding up to the skip-kth character (when err-char is #f), or if the substring decoding produces fewer than skip-k characters, the result is #f. If err-char is not #f, it is used to resolve decoding errors as in bytes->string/utf-8.
(bytes-utf-8-index bytes [skip-k err-char start-k end-k]) returns the offset in bytes into bytes at which the skip-kth character's encoding starts in the UTF-8 decoding of bytes's substring from start-k to end-k (but without actually generating the other decoded characters). If start-k is not provided, it defaults to 0, and end-k defaults to (bytes-length bytes). The result is relative to the start of bytes, not to start-k. If the substring is not a UTF-8 encoding up to the skip-kth character (when err-char is #f), or if the substring decoding produces fewer than skip-k characters, the result is #f. If err-char is not #f, it is used to resolve decoding errors as in bytes->string/utf-8.
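The UTF-8 procedures above can be exercised with a three-character string whose middle character (Greek lambda, code point #x3BB) needs two bytes. This is a sketch; the names lam and abc are illustrative, and the byte string is built with the bytes constructor, which is assumed to be among the byte-string operations that parallel the string operations.

```scheme
(define lam (string->bytes/utf-8 "\u03BB"))  ; two bytes: 206 187
(define abc (bytes 97 206 187 98))           ; "a", lambda, "b" in UTF-8

(bytes-length lam)                    ; => 2
(string-utf-8-length "a\u03BBb")      ; => 4
(bytes-utf-8-length abc)              ; => 3
(bytes-utf-8-ref abc 1)               ; => #\u03BB
(bytes-utf-8-index abc 2)             ; => 3  (offset where "b" starts)

;; An invalid sequence: #f without err-char, one replacement character with it.
(bytes-utf-8-length (bytes 255))      ; => #f
(bytes-utf-8-length (bytes 255) #\?)  ; => 1
(bytes->string/utf-8 (bytes 255) #\?) ; => "?"

;; Latin-1 decoding maps each byte directly to a code point.
(bytes->string/latin-1 (bytes 233))   ; => "\u00E9"
```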
A string converter can be used to convert directly from one byte-string encoding of characters to another byte-string encoding.
(bytes-open-converter from-name-string to-name-string) -- produces a string converter to go from the encoding named by from-name-string to the encoding named by to-name-string. If the requested conversion pair is not available, #f is returned instead of a converter.
Three encodings are always available in certain positions:
"UTF-8" as either from or to -- the UTF-8 encoding.
"UTF-8-permissive" as from with "UTF-8" as to -- the UTF-8 encoding where encoding errors are tolerated, producing the same result as (char->integer #\?) for bytes that are not part of a valid encoding sequence. (This handling of invalid sequences is consistent with the interpretation of port byte streams into characters; see section 11.1.)
"" as either from or to -- the current locale's default encoding (see section 1.2.2).
A newly opened byte converter is registered with the current custodian (see section 9.2), so that the converter is closed when the custodian is shut down. A converter is not registered with a custodian (and does not need to be closed) if it is one of the guaranteed combinations involving only "UTF-8" and "UTF-8-permissive" under Unix, or if it is any of the guaranteed combinations (including "") under Windows and Mac OS X.
The set of available encodings and combinations varies by platform, depending on the iconv library that is installed. Under Windows, iconv.dll or libiconv.dll must be in the user's path or the current executable's directory at run time, and iconv.dll or libiconv.dll must link to msvcrt.dll for _errno; otherwise, only the guaranteed combinations are available.
(bytes-close-converter bytes-converter) -- closes the given converter, so that it can no longer be used with bytes-convert or bytes-convert-end.
(bytes-convert bytes-converter src-bytes [src-start-k src-end-k dest-bytes dest-start-k dest-end-k]) converts the bytes from src-start-k to src-end-k in src-bytes. If dest-bytes is supplied and not #f, the converted bytes are written into dest-bytes from dest-start-k to dest-end-k. If dest-bytes is not supplied or is #f, then a newly allocated byte string holds the conversion results, and the size of the result byte string is no more than (- dest-end-k dest-start-k).
If src-start-k or dest-start-k is not provided, it defaults to 0. If src-end-k is not provided, it defaults to (bytes-length src-bytes). If dest-end-k is not provided or is #f, then it defaults to (bytes-length dest-bytes) when dest-bytes is a byte string or to an arbitrarily large integer otherwise.
The result of bytes-convert is three values:
result-bytes or dest-wrote-k -- a byte string if dest-bytes is #f or not provided, or the number of bytes written into dest-bytes otherwise.
src-read-k -- the number of bytes successfully converted from src-bytes.
'complete, 'continues, 'aborts, or 'error -- indicates how conversion terminated.
'complete: The entire input was processed, and src-read-k will be equal to (- src-end-k src-start-k).
'continues: Conversion stopped due to the limit on the result size or the space in dest-bytes; in this case, fewer than (- dest-end-k dest-start-k) bytes may be returned if more space is needed to process the next complete encoding sequence in src-bytes.
'aborts: The input stopped part-way through an encoding sequence, and more input bytes are necessary to continue. For example, if the last byte of input is #o303 for a "UTF-8-permissive" decoding, the result is 'aborts, because another byte is needed to determine how to use the #o303 byte.
'error: The bytes starting at (+ src-start-k src-read-k) bytes in src-bytes do not form a legal encoding sequence. This result is never produced for some encodings, where all byte sequences are valid encodings. For example, since "UTF-8-permissive" handles an invalid UTF-8 sequence by dropping characters or generating ``?'', every byte sequence is effectively valid.
Applying a converter accumulates state in the converter (even when the third result of bytes-convert is 'complete). This state can affect both further processing of input and further generation of output, but only for conversions that involve ``shift sequences'' to change modes within a stream. To terminate an input sequence and reset the converter, use bytes-convert-end.
(bytes-convert-end bytes-converter [dest-bytes dest-start-k dest-end-k]) -- like bytes-convert, but instead of converting bytes, this procedure generates an ending sequence for the conversion (sometimes called a ``shift sequence''), if any. Few encodings use shift sequences, so this function will succeed with no output for most encodings. In any case, successful output of a (possibly empty) shift sequence resets the converter to its initial state.
The result of bytes-convert-end is two values:
result-bytes or dest-wrote-k -- a byte string if dest-bytes is #f or not provided, or the number of bytes written into dest-bytes otherwise.
'complete or 'continues -- indicates whether conversion completed. If 'complete, then an entire ending sequence was produced. If 'continues, then the conversion could not complete due to the limit on the result size or the space in dest-bytes, and the first result is either an empty byte string or 0.
(bytes-converter? v) returns #t if v is a byte converter produced by bytes-open-converter, #f otherwise.
(locale-string-encoding) returns a string for the current locale's encoding (i.e., the encoding normally identified by ""). See also system-language+country in section 15.5.
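The converter protocol described in this section can be sketched with the always-available "UTF-8-permissive" to "UTF-8" pair. This is a minimal example under the assumption that the results behave as the text states; the variable names (conv, out, consumed, status, and so on) are illustrative.

```scheme
;; Open one of the guaranteed converters.
(define conv (bytes-open-converter "UTF-8-permissive" "UTF-8"))

;; Valid input is passed through; bytes-convert returns three values.
(define-values (out consumed status)
  (bytes-convert conv #"abc"))
;; out      => #"abc"
;; consumed => 3
;; status   => 'complete

;; A lone #o303 byte starts a two-byte sequence, so more input is needed.
(define-values (out2 consumed2 status2)
  (bytes-convert conv (bytes #o303)))
;; status2  => 'aborts, per the description of 'aborts above

(bytes-close-converter conv)
```

A caller that streams input would typically keep the unconsumed tail (everything from consumed2 onward) and retry once more bytes arrive.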
3.7 Symbols
For information about symbol parsing and printing, see section 11.2.4 and section 11.2.5, respectively.
MzScheme provides two ways of generating an uninterned symbol, i.e., a symbol that is not eq?, eqv?, or equal? to any other symbol, although it may print the same as another symbol:
(string->uninterned-symbol string) is like (string->symbol string), but the resulting symbol is a new uninterned symbol. Calling string->uninterned-symbol twice with the same string returns two distinct symbols.
(gensym [symbol/string]) creates an uninterned symbol with an automatically-generated name. The optional symbol/string argument is a prefix symbol or string.
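The distinction between interned and uninterned symbols can be checked directly. A small sketch; the names a1 and a2 are illustrative.

```scheme
(define a1 (string->uninterned-symbol "apple"))
(define a2 (string->uninterned-symbol "apple"))

(eq? (string->symbol "apple") 'apple) ; => #t  (interned symbols are shared)
(eq? a1 'apple)                       ; => #f  (uninterned, despite the same name)
(eq? a1 a2)                           ; => #f  (each call makes a new symbol)
(symbol->string a1)                   ; => "apple"
(eq? (gensym "tmp") (gensym "tmp"))   ; => #f
```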
Regular (interned) symbols are only weakly held by the internal symbol table. This weakness can never affect the result of an eq?, eqv?, or equal? test, but a symbol may disappear when placed into a weak box (see section 13.1), used as the key in a weak hash table (see section 3.14), or used as an ephemeron key (see section 13.2).
3.8 Keywords
A symbol-like datum that starts with a hash and colon (``#:'') is parsed as a keyword constant. Keywords behave like symbols -- two keywords are eq? if and only if they print the same -- but they are a distinct set of values.
(keyword? v) returns #t if v is a keyword, #f otherwise.
(keyword->string keyword) returns a string for the displayed form of keyword, not including the leading #:.
(string->keyword string) returns a keyword whose displayed form is the same as that of string, but with a leading #:.
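The conversions round-trip, as this small sketch shows. The keyword is built with string->keyword rather than written literally; the name k is illustrative:

```scheme
(define k (string->keyword "apple"))   ; the keyword that prints as #:apple

(keyword? k)                        ; => #t
(keyword? 'apple)                   ; => #f
(symbol? k)                         ; => #f, keywords are not symbols
(keyword->string k)                 ; => "apple"
(eq? k (string->keyword "apple"))   ; => #t, same printed form
```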
Like symbols, keywords are only weakly held by the internal keyword table; see section 3.7 for more information.
3.9 Vectors
When a vector is created with make-vector without a fill value, it is initialized with 0 in all positions. A vector can be immutable, such as a vector returned by syntax-e, but vectors generated by read are mutable. (See also immutable? in section 3.10.)
(vector->immutable-vector vec) returns an immutable vector with the same content as vec, and it returns vec itself if vec is immutable. (See also immutable? in section 3.10.)
(vector-immutable v ···1) is like (vector v ···1) except that the resulting vector is immutable. (See also immutable? in section 3.10.)
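A brief sketch of the vector procedures above, with illustrative names v and iv:

```scheme
(define v (vector 1 2 3))
(define iv (vector->immutable-vector v))

(immutable? v)                           ; => #f
(immutable? iv)                          ; => #t
(vector-ref iv 0)                        ; => 1, same content as v
(eq? iv (vector->immutable-vector iv))   ; => #t, already immutable
(immutable? (vector-immutable 1 2 3))    ; => #t
(vector-ref (make-vector 2) 0)           ; => 0, the default fill
```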
3.10 Lists
A cons cell can be mutable or immutable. When an immutable cons cell is provided to a procedure like set-cdr!, the exn:fail:contract exception is raised. Cons cells generated by read are always mutable.
The global variable null
is bound to the empty list.
(reverse! list) is the same as (reverse list), but list is destructively reversed using set-cdr! (i.e., each cons cell in list is mutated).
(append! list ···1) is like (append list), but it destructively appends the lists (i.e., except for the last list, the last cons cell of each list is mutated to append the lists; empty lists are essentially dropped).
(list* v ···1) is similar to (list v ···1) but the last argument is used directly as the cdr of the last pair constructed for the list:
(list* 1 2 3 4) ; => '(1 2 3 . 4)
(cons-immutable v1 v2) returns an immutable pair whose car is v1 and cdr is v2.
(list-immutable v ···1) is like (list v ···1), but using immutable pairs.
(list*-immutable v ···1) is like (list* v ···1), but using immutable pairs.
(immutable?
v
)
returns #t
if v
is an immutable
cons cell, string, vector, box, or hash table, #f
otherwise.
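The following sketch exercises the list constructors and immutable?. It relies on with-handlers and the exn:fail:contract? predicate from the exceptions chapter of this manual; the names p and outcome are illustrative:

```scheme
(list* 1 2 '(3 4))                     ; => '(1 2 3 4)

(define p (cons-immutable 1 2))
(car p)                                ; => 1
(cdr p)                                ; => 2
(immutable? p)                         ; => #t
(immutable? (cons 1 2))                ; => #f
(immutable? (list-immutable 1 2 3))    ; => #t

;; Mutating an immutable pair raises exn:fail:contract; the handler
;; turns the exception into a symbol for this example.
(define outcome
  (with-handlers ([exn:fail:contract? (lambda (e) 'rejected)])
    (set-cdr! p 3)
    'mutated))
outcome                                ; => 'rejected
```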
The list-ref and list-tail procedures accept an improper list as a first argument. If either procedure is applied to an improper list and an index that would require taking the car or cdr of a non-cons-cell, the exn:fail:contract exception is raised.
The member
, memv
, and memq
procedures
accept an improper list as a second argument. If the membership
search reaches the improper tail, the
exn:fail:contract
exception is raised.
The assoc
, assv
, and assq
procedures
accept an improperly formed association list as a second argument.
If the association search reaches an improper list tail or a list
element that is not a pair, the exn:fail:contract
exception is raised.
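A sketch of how the list procedures treat improper lists, following the rules in the preceding paragraphs. The helper try is hypothetical and exists only to turn an exn:fail:contract exception into a symbol:

```scheme
(define improper '(a b . c))

(list-ref improper 1)     ; => 'b
(list-tail improper 2)    ; => 'c, the improper tail itself
(memq 'b improper)        ; => '(b . c), found before the tail is reached

;; Hypothetical helper: run thunk, report 'contract-error on exn:fail:contract.
(define (try thunk)
  (with-handlers ([exn:fail:contract? (lambda (e) 'contract-error)])
    (thunk)))

(try (lambda () (list-ref improper 2)))    ; => 'contract-error, needs the car of 'c
(try (lambda () (memq 'z improper)))       ; => 'contract-error, search hits the tail
(try (lambda () (assq 'b '((a . 1) 5))))   ; => 'contract-error, 5 is not a pair
```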
3.11 Boxes
MzScheme provides boxes, which are records that have a single field:
(box v) returns a new mutable box that contains v.
(box-immutable v) returns a new immutable box that contains v.
(unbox box) returns the content of box. For any v, (unbox (box v)) returns v.
(set-box! mutable-box v) sets the content of mutable-box to v.
(box? v) returns #t if v is a box, #f otherwise.
Two boxes are equal? if the contents of the boxes are equal?.
A box returned by syntax-e (see section 12.2.2) is immutable; if set-box! is applied to such a box, the exn:fail:contract exception is raised. A box produced by read (via #&) is mutable. (See also immutable? in section 3.10.)
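The box operations can be sketched as follows. The immutable box here comes from box-immutable rather than syntax-e, and with-handlers is taken from the exceptions chapter; all names are illustrative:

```scheme
(define b (box 10))
(unbox b)                        ; => 10
(set-box! b 20)
(unbox b)                        ; => 20

(equal? (box "a") (box "a"))     ; => #t, the contents are equal?
(eq? (box "a") (box "a"))        ; => #f, two distinct boxes

(define ib (box-immutable 5))
(immutable? ib)                  ; => #t
(define box-outcome
  (with-handlers ([exn:fail:contract? (lambda (e) 'rejected)])
    (set-box! ib 6)
    'mutated))
box-outcome                      ; => 'rejected
```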
3.12 Procedures
See section 4.6 for information on defining new procedure types.
3.12.1 Arity
MzScheme's procedure-arity procedure returns the input arity of a procedure:
(procedure-arity proc) returns information about the number of arguments accepted by the procedure proc. The result a is either:
an exact non-negative integer ==> the procedure always takes exactly a arguments;
an arity-at-least instance (see footnote 10) ==> the procedure takes (arity-at-least-value a) or more arguments; or
a list containing integers and arity-at-least instances ==> the procedure takes any number of arguments that can match one of the arities in the list.
(procedure-arity-includes? proc k) returns #t if the procedure can accept k arguments (where k is an exact, non-negative integer), #f otherwise.
Examples:
(procedure-arity cons) ; => 2
(procedure-arity list) ; => #<struct:arity-at-least>
(arity-at-least? (procedure-arity list)) ; => #t
(arity-at-least-value (procedure-arity list)) ; => 0
(arity-at-least-value (procedure-arity (lambda (x . y) x))) ; => 1
(procedure-arity (case-lambda [(x) 0] [(x y) 1])) ; => '(1 2)
(procedure-arity-includes? cons 2) ; => #t
(procedure-arity-includes? display 3) ; => #f
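One possible use of procedure-arity-includes? is to check a call before making it. The helper apply-if-accepts below is hypothetical, not part of MzScheme:

```scheme
;; Hypothetical helper: apply proc to args only if proc accepts that
;; many arguments; otherwise return default.
(define (apply-if-accepts proc args default)
  (if (procedure-arity-includes? proc (length args))
      (apply proc args)
      default))

(apply-if-accepts cons '(1 2) 'wrong-arity)   ; => '(1 . 2)
(apply-if-accepts cons '(1) 'wrong-arity)     ; => 'wrong-arity
(apply-if-accepts list '() 'wrong-arity)      ; => '()
```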
When compiling a lambda or case-lambda expression, MzScheme looks for a 'method-arity-error property attached to the expression (see section 12.6.2). If it is present with a true value, and if no case of the procedure accepts zero arguments, then the procedure is marked so that an exn:fail:contract:arity exception involving the procedure will hide the first argument, if one was provided. (Hiding the first argument is useful when the procedure implements a method, where the first argument is implicit in the original source.) The property affects only the format of exn:fail:contract:arity exceptions, not the result of procedure-arity.
3.12.2 Primitives
A primitive procedure is a built-in procedure that is implemented in a low-level language. Not all built-in procedures are primitives, but almost all R5RS procedures are primitives, as are most of the procedures described in this manual.
(primitive? v) returns #t if v is a primitive procedure or #f otherwise.
(primitive-result-arity prim-proc) returns the arity of the result of the primitive procedure prim-proc (as opposed to the procedure's input arity as returned by procedure-arity; see section 3.12.1). For most primitives, this procedure returns 1, since most primitives return a single value when applied. For information about arity values, see section 3.12.1.
(primitive-closure? v) returns #t if v is internally implemented as a primitive closure rather than a simple primitive procedure, #f otherwise. This information is intended for use by the mzc compiler.
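A small sketch, assuming that car is one of the R5RS procedures implemented as a primitive (the text above says almost all are):

```scheme
(primitive? car)                ; => #t, assuming car is a primitive
(primitive? (lambda (x) x))     ; => #f, a closure created by lambda
(primitive? 'car)               ; => #f, a symbol is not a procedure
(primitive-result-arity car)    ; => 1, car returns a single value
```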
3.12.3 Procedure Names
See section 6.2.4 for information about the names of primitives,
and the names inferred for lambda
and case-lambda
procedures.
3.13 Promises
The force procedure can only be applied to values returned by delay, and promises are never implicitly forced.
(promise?
v
)
returns #t
if v
is a promise
created by delay
, #f
otherwise.
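The following sketch shows that a promise runs its expression only when forced, and only once (standard R5RS behavior for delay and force). The counter evaluations is illustrative:

```scheme
(define evaluations 0)
(define p
  (delay (begin
           (set! evaluations (+ evaluations 1))
           (* 6 7))))

(promise? p)     ; => #t
(promise? 42)    ; => #f
evaluations      ; => 0, nothing has run yet
(force p)        ; => 42
(force p)        ; => 42, the cached result
evaluations      ; => 1
```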
3.14 Hash Tables
(make-hash-table [flag-symbol flag-symbol]) creates and returns a new hash table. If provided, each flag-symbol must be one of the following:
'weak -- creates a hash table with weakly-held keys (see section 13.1).
'equal -- creates a hash table that compares keys using equal? instead of eq? (needed, for example, when using strings as keys).
By default, key comparisons use eq?. If the second flag-symbol is redundant, the exn:fail:contract exception is raised.
Two hash tables are equal? if they are created with the same flags, and if they map the same keys to equal? values (where ``same key'' means either eq? or equal?, depending on the way the hash table compares keys).
(make-immutable-hash-table assoc-list [flag-symbol]) creates an immutable hash table. (See also immutable? in section 3.10.) The assoc-list must be a list of pairs, where the car of each pair is a key, and the cdr is the corresponding value. The mappings are added to the table in the order that they appear in assoc-list, so later mappings can hide earlier mappings. If the optional flag-symbol argument is provided, it must be 'equal, and the created hash table compares keys with equal?; otherwise, the created table compares keys with eq?.
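A sketch of an immutable table in which a later mapping hides an earlier one. It uses hash-table-get and hash-table-put! from this section and with-handlers from the exceptions chapter; the names are illustrative:

```scheme
(define t (make-immutable-hash-table '((a . 1) (b . 2) (a . 3))))

(hash-table-get t 'a)    ; => 3, the later mapping for a hides the earlier one
(hash-table-get t 'b)    ; => 2
(immutable? t)           ; => #t

;; An immutable table rejects updates with exn:fail:contract.
(define put-outcome
  (with-handlers ([exn:fail:contract? (lambda (e) 'rejected)])
    (hash-table-put! t 'c 4)
    'stored))
put-outcome              ; => 'rejected

;; With 'equal, a fresh copy of a string key still finds the mapping.
(define st (make-immutable-hash-table (list (cons "k" 1)) 'equal))
(hash-table-get st (string-copy "k"))   ; => 1
```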
(hash-table? v [flag-symbol flag-symbol]) returns #t if v was created by make-hash-table or make-immutable-hash-table with the given flag-symbols (or more), #f otherwise. Each provided flag-symbol must be a distinct flag supported by make-hash-table; if the second flag-symbol is redundant, the exn:fail:contract exception is raised.
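The ``(or more)'' rule means a table created with extra flags still satisfies a query for a subset of them, as this sketch illustrates:

```scheme
(define plain (make-hash-table))
(define weak-equal (make-hash-table 'weak 'equal))

(hash-table? plain)                      ; => #t
(hash-table? plain 'weak)                ; => #f, plain was not created with 'weak
(hash-table? weak-equal 'equal)          ; => #t, created with 'equal and more
(hash-table? weak-equal 'weak 'equal)    ; => #t
(hash-table? '((a . 1)))                 ; => #f, an association list is not a table
```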
(hash-table-put!
hash-table key-v v
)
maps key-v
to v
in hash-table
, overwriting any existing mapping for
key-v
. If hash-table
is immutable, the
exn:fail:contract
exception is raised.
(hash-table-get
hash-table key-v
[failure-thunk
])
returns the
value for key-v
in hash-table
. If no value is found for
key-v
, then the result of invoking failure-thunk
(a
procedure of no arguments) is returned. If failure-thunk
is not
provided, the exn:fail:contract
exception is raised when no value is
found for key-v
.
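A sketch contrasting eq?-based and equal?-based lookup with string keys, and showing both ways of handling a missing key. The table names are illustrative, and with-handlers comes from the exceptions chapter:

```scheme
(define t (make-hash-table 'equal))
(hash-table-put! t "one" 1)
(hash-table-put! t "one" 100)                 ; overwrites the mapping
(hash-table-get t (string-copy "one"))        ; => 100
(hash-table-get t "two" (lambda () 'none))    ; => 'none

;; Without a failure thunk, a missing key raises exn:fail:contract.
(define lookup
  (with-handlers ([exn:fail:contract? (lambda (e) 'missing)])
    (hash-table-get t "two")))
lookup                                        ; => 'missing

;; An eq?-based table does not match a fresh copy of a string key.
(define et (make-hash-table))
(hash-table-put! et (string-copy "one") 1)
(hash-table-get et (string-copy "one") (lambda () 'none))   ; => 'none
```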
(hash-table-remove!
hash-table key-v
)
removes the value mapping
for key-v
if it exists in hash-table
. If
hash-table
is immutable, the exn:fail:contract
exception is raised.
(hash-table-map
hash-table proc
)
applies the procedure proc
to each element in hash-table
, accumulating the results into a
list. The procedure proc
must take two arguments: a key and its
value. See the caveat below about concurrent modification.
(hash-table-for-each
hash-table proc
)
applies the procedure
proc
to each element in hash-table
(for the side-effects
of proc
) and returns void. The procedure proc
must
take two arguments: a key and its value. See the caveat below about
concurrent modification.
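A sketch of both traversal procedures on a small table. The order in which mappings are visited is not specified, so the example relies only on order-independent results:

```scheme
(define t (make-hash-table))
(hash-table-put! t 'a 1)
(hash-table-put! t 'b 2)

;; hash-table-map collects one result per mapping, in no particular order.
(define pairs (hash-table-map t (lambda (key value) (cons key value))))
(length pairs)          ; => 2
(cdr (assq 'b pairs))   ; => 2

;; hash-table-for-each is used for side effects only.
(define total 0)
(hash-table-for-each t (lambda (key value) (set! total (+ total value))))
total                   ; => 3
```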
(hash-table-count
hash-table
)
returns the number of keys mapped
by hash-table
. If hash-table
is not created with
'weak
, then the result is computed in constant time and
atomically. If hash-table
is created with 'weak
, see
the caveat below about concurrent modification.
(hash-table-copy
hash-table
)
returns a mutable hash table with
the same mappings, same key-comparison mode, and same key-holding
strength as hash-table
.
(eq-hash-code v) returns an exact integer; for any two eq? values, the returned integer is the same. Furthermore, for the result integer k and any other exact integer j, (= k j) implies (eq? k j).
(equal-hash-code v) returns an exact integer; for any two equal? values, the returned integer is the same. Furthermore, for the result integer k and any other exact integer j, (= k j) implies (eq? k j). If v contains a cycle through pairs, vectors, boxes, and inspectable structure fields, then equal-hash-code applied to v will loop indefinitely.
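The guarantees above can be checked directly. Only equality of codes is shown, since the actual integer values are not specified:

```scheme
(define s1 "abc")
(define s2 (string-copy "abc"))   ; equal? to s1 but not eq?

(= (equal-hash-code s1) (equal-hash-code s2))   ; => #t
(= (eq-hash-code 'a) (eq-hash-code 'a))         ; => #t, interned symbols are eq?
(= (eq-hash-code s1) (eq-hash-code s1))         ; => #t
(exact? (equal-hash-code '(1 2 3)))             ; => #t
```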
Caveat concerning concurrent modification: A hash table can be manipulated with hash-table-get, hash-table-put!, and hash-table-remove! concurrently by multiple threads, and the operations are protected by a table-specific semaphore as needed. A few caveats apply, however:
If a thread is terminated while applying hash-table-get, hash-table-put!, or hash-table-remove! to a hash table that uses equal? comparisons, all current and future operations on the hash table block indefinitely.
The hash-table-map, hash-table-for-each, and hash-table-count procedures do not use the table's semaphore. Consequently, if a hash table is extended with new keys by another thread while a map, for-each, or count is in process, arbitrary key-value pairs can be dropped or duplicated in the map or for-each. Similarly, if a map or for-each procedure itself extends the table, arbitrary key-value pairs can be dropped or duplicated. However, key mappings can be deleted or remapped by any thread with no adverse effects (i.e., the change does not affect a traversal if the key has been seen already, otherwise the traversal skips a deleted key or uses the remapped key's new value).
Caveat concerning mutable keys: If a key into an equal?-based hash table is mutated (e.g., a key string is modified with string-set!), then the hash table's behavior for put and get operations becomes unpredictable.
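One way to sidestep the mutable-key hazard, offered as a suggestion rather than a rule from this manual, is to store a private copy of the key so that later mutation of the caller's string cannot reach the table:

```scheme
(define t (make-hash-table 'equal))
(define key (string-copy "apple"))

;; Store a copy, so the table's key is never shared with the caller.
(hash-table-put! t (string-copy key) 1)

;; Mutating the caller's string leaves the stored key untouched.
(string-set! key 0 #\A)
key                                            ; => "Apple"
(hash-table-get t "apple" (lambda () #f))      ; => 1
```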
4 30 bits for a 32-bit architecture, 62 bits for a 64-bit architecture.
5 This definition of eqv? technically contradicts R5RS, but R5RS does not address strange ``numbers'' like +nan.0.
6 The random number generator uses a 54-bit version of L'Ecuyer's MRG32k3a algorithm.
7 The current version of MzScheme uses Unicode version 4.1.
8 See also the Latin-1 footnote of section 1.2.3.
9 See also the Latin-1 footnote of section 1.2.3.
10 The arity-at-least structure type is transparent to all inspectors (see section 4.5).