Basic Data Extensions

3.1  Void and Undefined

MzScheme returns the unique void value -- printed as #<void> -- for expressions that have unspecified results in R5RS. The procedure void takes any number of arguments and returns void:
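A short illustration of the behavior described above. The predicate void? does not appear in this text and is assumed here from MzScheme's standard bindings.

```scheme
;; void accepts any number of arguments and always returns the void value.
(define v0 (void))
(define v3 (void 1 'two "three"))

;; The void value is unique, so both results are the same object.
(eq? v0 v3)   ; => #t
(void? v0)    ; => #t  (void? is assumed to be available)
```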

Variables bound by letrec-values that are accessible but not yet initialized are bound to the unique undefined value, printed as #<undefined>.

3.2  Booleans

Unless otherwise specified, two instances of a particular MzScheme data type are equal? only when they are eq?. Two values are eqv? only when they are either eq?, both +nan.0, or both = and have the same exactness and sign. (The inexact numbers 0.0 and -0.0 are not eqv?, although they are =.)
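The difference between = and eqv? on numbers can be sketched with a few expressions; the results follow directly from the rule above.

```scheme
;; = compares numeric value; eqv? also requires the same exactness and sign.
(= 2 2.0)             ; => #t
(eqv? 2 2.0)          ; => #f, exactness differs
(= 0.0 -0.0)          ; => #t
(eqv? 0.0 -0.0)       ; => #f, sign differs
(eqv? +nan.0 +nan.0)  ; => #t
```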

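The andmap and ormap procedures apply a test procedure to the elements of a list, returning immediately when the result for testing the entire list is determined. The arguments to andmap and ormap are the same as for map, but a single value is returned as the result, rather than a list. That value is normally a boolean; when the test procedure returns non-boolean values, andmap returns the result of the last application and ormap returns the first true result, as the final two examples show.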

Examples:

(andmap positive? '(1 2 3)) ; => #t
(ormap eq? '(a b c) '(a b c)) ; => #t
(andmap positive? '(1 2 a)) ; => raises exn:fail:contract
(ormap positive? '(1 2 a)) ; => #t
(andmap positive? '(1 -2 a)) ; => #f
(andmap + '(1 2 3) '(4 5 6)) ; => 9
(ormap + '(1 2 3) '(4 5 6)) ; => 5
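The short-circuit behavior can be sketched with single-list versions of the two procedures. The names my-andmap1 and my-ormap1 are hypothetical, and the sketch covers only the one-list case; it is not MzScheme's implementation.

```scheme
;; Hypothetical single-list versions, for illustration only.
(define (my-andmap1 proc lst)
  (cond
    [(null? lst) #t]                        ; nothing to test
    [(null? (cdr lst)) (proc (car lst))]    ; last result is returned as-is
    [(proc (car lst)) (my-andmap1 proc (cdr lst))]
    [else #f]))                             ; stop at the first #f

(define (my-ormap1 proc lst)
  (cond
    [(null? lst) #f]
    [else (or (proc (car lst))              ; first true result is returned
              (my-ormap1 proc (cdr lst)))]))

(my-andmap1 positive? '(1 -2 a))  ; => #f, the symbol a is never tested
(my-ormap1 positive? '(1 2 a))    ; => #t, stops at 1
```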

3.3  Numbers

A number in MzScheme is one of the following:

MzScheme extends the number syntax of R5RS in three ways:

The special inexact numbers +inf.0, -inf.0, and +nan.0 have no exact form. Dividing by an inexact zero returns +inf.0 or -inf.0, depending on the sign of the dividend. The infinities are integers, and they answer #t for both even? and odd?. The +nan.0 value is not an integer and is not = to itself, but +nan.0 is eqv? to itself (see footnote 5). Similarly, (= 0.0 -0.0) is #t, but (eqv? 0.0 -0.0) is #f.
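A brief sketch of how the special values arise and compare, limited to the cases the paragraph above states explicitly:

```scheme
;; Dividing by an inexact zero yields an infinity whose sign follows the dividend.
(/ 1 0.0)          ; => +inf.0
(/ -1 0.0)         ; => -inf.0

;; +nan.0 is never = to itself.
(= +nan.0 +nan.0)  ; => #f
```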

All multi-argument arithmetic procedures operate pairwise on arguments from left to right.

The string->number procedure works on all number representations and exact integer radix values in the range 2 to 16 (inclusive). The number->string procedure accepts all number types and the radix values 2, 8, 10, and 16; however, if an inexact number is provided with a radix other than 10, the exn:fail:contract exception is raised.
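For example, under the radix rules just described (the specific inputs are illustrative):

```scheme
(string->number "ff" 16)   ; => 255
(string->number "12" 3)    ; => 5, any radix from 2 to 16 is accepted
(string->number "1/2")     ; => 1/2
(string->number "hello")   ; => #f, not a number representation

(number->string 255 16)    ; => "ff"
(number->string 255 2)     ; => "11111111"
;; (number->string 0.5 2) would raise exn:fail:contract, because the
;; number is inexact and the radix is not 10.
```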

The add1 and sub1 procedures work on any number:

The following procedures work on integers:

The following procedures work on exact integers in their (semi-infinite) two's complement representation:

The random procedure generates pseudo-random numbers:

The following procedures convert between Scheme numbers and common machine byte representations:

3.4  Characters

MzScheme characters range over Unicode scalar values (see section 1.2.1), which includes characters whose values range from #x0 to #x10FFFF, but not including #xD800 to #xDFFF. The procedure char->integer returns a character's code-point number, and integer->char converts a code-point number to a character. If integer->char is given an integer that is either outside #x0 to #x10FFFF or in the excluded range #xD800 to #xDFFF, the exn:fail:contract exception is raised.
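A small sketch of the conversions, using the #\u syntax described in the next paragraph:

```scheme
(char->integer #\A)       ; => 65
(char->integer #\u03BB)   ; => 955 (Greek small letter lambda)
(integer->char 955)       ; => #\u03BB
;; (integer->char #xD800) would raise exn:fail:contract, because #xD800
;; lies in the excluded range.
```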

Character constants include special named characters, such as #\newline, plus octal representations (e.g., #\251), and Unicode-style hexadecimal representations (e.g., #\u03BB). See section 11.2.4 for more information on character constants.

The character comparison procedures char=?, char<?, char-ci=?, etc. take two or more character arguments and check the arguments pairwise (like the numerical comparison procedures). Two characters are eq? whenever they are char=?. The expression (char<? char1 char2) produces the same result as (< (char->integer char1) (char->integer char2)), etc. The case-independent -ci procedures compare characters after case-folding with char-foldcase (described below).
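The pairwise checking of several arguments might look like this in practice:

```scheme
(char<? #\a #\b #\c)          ; => #t, each adjacent pair is in order
(char<? #\a #\c #\b)          ; => #f, the second pair is out of order
(char-ci=? #\a #\A)           ; => #t, compared after case-folding
(eq? (integer->char 97) #\a)  ; => #t, char=? characters are eq?
```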

The character predicates produce results consistent with the Unicode database (see footnote 7) and (usually) SRFI-14. These procedures are fully portable; their results do not depend on the current platform or locale.

Character conversions are also consistent with the 1-to-1 code point mapping defined by Unicode. String procedures (see section 3.5) handle the case where Unicode defines a locale-independent mapping from the code point to a code-point sequence (in addition to the 1-to-1 mapping on scalar values).

(make-known-char-range-list) produces a list of three-element lists, where each three-element list represents a set of consecutive code points for which the Unicode standard specifies character properties. Each three-element list contains two integers and a boolean; the first integer is a starting code-point value (inclusive), the second integer is an ending code-point value (inclusive), and the boolean is #t when all characters in the code-point range have identical results for all of the character predicates above. The three-element lists are ordered in the overall result list such that later lists represent larger code-point values, and all three-element lists are separated from every other by at least one code-point value that is not specified by Unicode.

(char-utf-8-length char) produces the same result as (bytes-length (string->bytes/utf-8 (string char))).
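The stated equivalence can be checked directly. The helper name utf-8-length-of is hypothetical and simply spells out the right-hand side of the equivalence.

```scheme
;; Hypothetical helper that mirrors the definition given above.
(define (utf-8-length-of c)
  (bytes-length (string->bytes/utf-8 (string c))))

(char-utf-8-length #\A)      ; => 1
(char-utf-8-length #\u03BB)  ; => 2
(utf-8-length-of #\u03BB)    ; => 2
```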

3.5  Strings

Since a string consists of a sequence of characters, a string in MzScheme is a Unicode code-point sequence. MzScheme also provides byte strings, as well as functions to convert between byte strings and strings with respect to various encodings, including UTF-8 and the current locale's encoding. See section 1.2 for an overview of Unicode, locales, and encodings, and see section 3.6 for more specific information on byte-string conversions.

A string can be mutable or immutable. When an immutable string is provided to a procedure like string-set!, the exn:fail:contract exception is raised. String constants generated by read are immutable. (string->immutable-string string) returns an immutable string with the same content as string, and it returns string itself if string is immutable. (See also immutable? in section 3.10.)

(substring string start-k [end-k]) returns a mutable string, even if the string argument is immutable. The end-k argument defaults to (string-length string).

(string-copy! dest-string dest-start-k src-string [src-start-k src-end-k]) changes the characters of dest-string from positions dest-start-k (inclusive) to dest-end-k (exclusive) to match the characters in src-string from src-start-k (inclusive) to src-end-k (exclusive), where dest-end-k is dest-start-k plus the length of the source region, (- src-end-k src-start-k). If src-start-k is not provided, it defaults to 0. If src-end-k is not provided, it defaults to (string-length src-string). The strings dest-string and src-string can be the same string, and in that case the destination region can overlap with the source region; the destination characters after the copy match the source characters from before the copy. If any of dest-start-k, src-start-k, or src-end-k are out of range (taking into account the sizes of the strings and the source and destination regions), the exn:fail:contract exception is raised.
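A minimal sketch of string-copy!, including the overlapping case; the variable names are illustrative.

```scheme
;; Copy a whole source string into the middle of a destination.
(define s (make-string 5 #\-))
(string-copy! s 1 "abc")
s  ; => "-abc-"

;; Source and destination may be the same string with overlapping regions;
;; the result matches the source characters as they were before the copy.
(define t (string-copy "abcdef"))
(string-copy! t 2 t 0 4)
t  ; => "ababcd"
```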

When a string is created with make-string without a fill value, it is initialized with the null character (#\nul) in all positions.

The string comparison procedures string=?, string<?, string-ci=?, etc. take two or more string arguments and check the arguments pairwise (like the numerical comparison procedures). String comparisons are performed through pairwise comparison of characters; for the -ci operations, the two strings are first case-folded using string-foldcase (described below). Comparisons using all of these functions are fully portable; the results do not depend on the current platform or locale.
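A brief sketch of these comparisons. The last expression spells out the case-folding step explicitly with string-foldcase rather than relying on a -ci procedure.

```scheme
(string<? "apple" "banana" "cherry")  ; => #t, each adjacent pair is in order
(string<? "apple" "cherry" "banana")  ; => #f

;; Case-insensitive equality expressed through explicit folding:
(string=? (string-foldcase "Stra\xDFe")
          (string-foldcase "STRASSE"))  ; => #t, both fold to "strasse"
```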

The following string-conversion procedures take into account Unicode's locale-independent conversion rules that map code-point sequences to code-point sequences (instead of simply mapping a 1-to-1 function on code points over the string). In each case, the string produced by the conversion can be longer than the input string.

Examples:

(string-upcase "abc!") ; => "ABC!"
(string-upcase "Stra\xDFe") ; => "STRASSE"

(string-downcase "aBC!") ; => "abc!"
(string-downcase "Stra\xDFe") ; => "stra\xDFe"
(string-downcase "\u039A\u0391\u039F\u03A3") ; => "\u03BA\u03b1\u03BF\u03C2"
(string-downcase "\u03A3") ; => "\u03C3"

(string-titlecase "aBC  twO") ; => "Abc  Two"
(string-titlecase "y2k") ; => "Y2K"
(string-titlecase "main stra\xDFe") ; => "Main Stra\xDFe"
(string-titlecase "stra \xDFe") ; => "Stra Sse"

(string-foldcase "aBC!") ; => "abc!"
(string-foldcase "Stra\xDFe") ; => "strasse"
(string-foldcase "\u039A\u0391\u039F\u03A3") ; => "\u03BA\u03b1\u03BF\u03C3"

In addition to the character-based string procedures, MzScheme provides the following locale-sensitive procedures (see also section 1.2.2 and section 7.9.1.11):

These procedures depend only on the current locale's case-conversion and collation rules, and not on its encoding rules.

MzScheme provides four Unicode-normalization procedures:

For each of the normalization procedures, if the given string is already in the corresponding Unicode normal form, the string may be returned directly as the result (instead of a newly allocated string).

3.6  Byte Strings

A byte string is like a string, but it is a sequence of bytes instead of characters. A byte is an exact integer between 0 and 255 inclusive; (byte? v) produces #t if v is such an exact integer, #f otherwise. Two byte strings are equal? if they are bytewise equal, and two byte strings are eqv? only if they are eq?.
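For instance, assuming the bytes constructor (the byte-string counterpart of string, which is not named in this text):

```scheme
(byte? 255)  ; => #t
(byte? 256)  ; => #f, out of range
(byte? 1.0)  ; => #f, not exact

;; Two separately allocated byte strings with the same content:
(define b1 (bytes 1 2 3))
(define b2 (bytes 1 2 3))
(equal? b1 b2)  ; => #t, bytewise equal
(eqv? b1 b2)    ; => #f, not the same object
```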

MzScheme provides byte-string operations in parallel to the character-string operations:

A byte-string constant is written like a string, but prefixed with # (with no space between # and the opening double-quote). A byte-string constant can contain escape sequences, as in #"\n", just like strings; an exn:fail:read exception is raised if a ``\u'' sequence appears within a byte string and the given hexadecimal value is larger than 255.

Like character strings, byte strings generated by read are immutable, and when an immutable byte string is provided to a procedure like bytes-set!, the exn:fail:contract exception is raised.

The following procedures convert between byte strings and character strings:
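As one illustration of such a conversion, here is a UTF-8 round trip. string->bytes/utf-8 appears elsewhere in this chapter; bytes->string/utf-8 and bytes->list are assumed here as its standard counterparts.

```scheme
;; U+03BB encodes as the two bytes #xCE #xBB in UTF-8.
(define bs (string->bytes/utf-8 "\u03BB"))
(bytes-length bs)         ; => 2
(bytes->list bs)          ; => (206 187)
(bytes->string/utf-8 bs)  ; => "\u03BB"
```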

A string converter can be used to convert directly from one byte-string encoding of characters to another byte-string encoding.

3.7  Symbols

For information about symbol parsing and printing, see section 11.2.4 and section 11.2.5, respectively.

MzScheme provides two ways of generating an uninterned symbol, i.e., a symbol that is not eq?, eqv?, or equal? to any other symbol, although it may print the same as another symbol:

Regular (interned) symbols are only weakly held by the internal symbol table. This weakness can never affect the result of an eq?, eqv?, or equal? test, but a symbol may disappear when placed into a weak box (see section 13.1), used as the key in a weak hash table (see section 3.14), or used as an ephemeron key (see section 13.2).
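The distinction between interned and uninterned symbols can be sketched as follows. This text does not name the two generating procedures; string->uninterned-symbol and gensym are assumed here.

```scheme
;; Interned symbols with the same name are the same object.
(eq? 'apple (string->symbol "apple"))  ; => #t

;; An uninterned symbol prints the same but is a distinct value.
(define u (string->uninterned-symbol "apple"))
(symbol? u)      ; => #t
(eq? u 'apple)   ; => #f

;; Each gensym call produces a fresh uninterned symbol.
(eq? (gensym) (gensym))  ; => #f
```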

3.8  Keywords

A symbol-like datum that starts with a hash and colon (``#:'') is parsed as a keyword constant. Keywords behave like symbols -- two keywords are eq? if and only if they print the same -- but they are a distinct set of values.

Like symbols, keywords are only weakly held by the internal keyword table; see section 3.7 for more information.
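A short sketch of keyword behavior. The procedures keyword?, string->keyword, and keyword->string are not listed in this text and are assumed here; the keywords are quoted to keep the example independent of how unquoted keywords are treated in expressions.

```scheme
(keyword? '#:apple)                       ; => #t
(symbol? '#:apple)                        ; => #f, keywords are a distinct set
(eq? '#:apple (string->keyword "apple"))  ; => #t, same printed form
(keyword->string '#:apple)                ; => "apple"
```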

3.9  Vectors

When a vector is created with make-vector without a fill value, it is initialized with 0 in all positions. A vector can be immutable, such as a vector returned by syntax-e, but vectors generated by read are mutable. (See also immutable? in section 3.10.)

(vector->immutable-vector vec) returns an immutable vector with the same content as vec, and it returns vec itself if vec is immutable. (See also immutable? in section 3.10.)

(vector-immutable v ···1) is like (vector v ···1) except that the resulting vector is immutable. (See also immutable? in section 3.10.)
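For example (the variable name iv is illustrative):

```scheme
(make-vector 3)  ; => #(0 0 0), filled with 0 by default

(define iv (vector-immutable 1 2 3))
(immutable? iv)              ; => #t
(immutable? (vector 1 2 3))  ; => #f

;; An already-immutable vector is returned as-is.
(eq? iv (vector->immutable-vector iv))  ; => #t
```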

3.10  Lists

A cons cell can be mutable or immutable. When an immutable cons cell is provided to a procedure like set-cdr!, the exn:fail:contract exception is raised. Cons cells generated by read are always mutable.

The global variable null is bound to the empty list.

(reverse! list) is the same as (reverse list), but list is destructively reversed using set-cdr! (i.e., each cons cell in list is mutated).

(append! list ···1) is like (append list ···1), but it destructively appends the lists (i.e., except for the last list, the last cons cell of each list is mutated to append the lists; empty lists are essentially dropped).

(list* v ···1) is similar to (list v ···1) but the last argument is used directly as the cdr of the last pair constructed for the list:

(list* 1 2 3 4) ; => '(1 2 3 . 4)

(cons-immutable v1 v2) returns an immutable pair whose car is v1 and cdr is v2.

(list-immutable v ···1) is like (list v ···1), but using immutable pairs.

(list*-immutable v ···1) is like (list* v ···1), but using immutable pairs.

(immutable? v) returns #t if v is an immutable cons cell, string, vector, box, or hash table, #f otherwise.
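A minimal sketch combining the constructors above with immutable?; the variable name p is illustrative.

```scheme
(define p (cons-immutable 1 2))
(car p)                  ; => 1
(immutable? p)           ; => #t
(immutable? (cons 1 2))  ; => #f

;; list* uses its last argument as the tail, so a list there extends the result.
(list* 1 2 '(3 4))       ; => (1 2 3 4)
```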

The list-ref and list-tail procedures accept an improper list as a first argument. If either procedure is applied to an improper list and an index that would require taking the car or cdr of a non-cons-cell, the exn:fail:contract exception is raised.

The member, memv, and memq procedures accept an improper list as a second argument. If the membership search reaches the improper tail, the exn:fail:contract exception is raised.

The assoc, assv, and assq procedures accept an improperly formed association list as a second argument. If the association search reaches an improper list tail or a list element that is not a pair, the exn:fail:contract exception is raised.
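The improper-list cases succeed as long as the search or index never needs to go past the improper tail, for example:

```scheme
(list-ref '(a b . c) 1)          ; => b
(list-tail '(a b . c) 2)         ; => c
(memq 'b '(a b . c))             ; => (b . c)
(assq 'b '((a . 1) (b . 2) . c)) ; => (b . 2)
;; (list-ref '(a b . c) 2) and (memq 'z '(a b . c)) would raise
;; exn:fail:contract, because they reach the improper tail.
```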

3.11  Boxes

MzScheme provides boxes, which are records that have a single field:

Two boxes are equal? if the contents of the boxes are equal?.
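A small sketch of box usage. set-box! is named later in this section; box and unbox are assumed here as the constructor and accessor.

```scheme
(define b (box 1))
(unbox b)       ; => 1
(set-box! b 2)
(unbox b)       ; => 2

;; Distinct boxes with equal? contents are equal? but not eq?.
(equal? (box "a") (box "a"))  ; => #t
(eq? (box "a") (box "a"))     ; => #f
```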

A box returned by syntax-e (see section 12.2.2) is immutable; if set-box! is applied to such a box, the exn:fail:contract exception is raised. A box produced by read (via #&) is mutable. (See also immutable? in section 3.10.)

3.12  Procedures

See section 4.6 for information on defining new procedure types.

3.12.1  Arity

MzScheme's procedure-arity procedure returns the input arity of a procedure:

Examples:

(procedure-arity cons) ; => 2
(procedure-arity list) ; => #<struct:arity-at-least>
(arity-at-least? (procedure-arity list)) ; => #t
(arity-at-least-value (procedure-arity list)) ; => 0
(arity-at-least-value (procedure-arity (lambda (x . y) x))) ; => 1
(procedure-arity (case-lambda [(x) 0] [(x y) 1])) ; => '(1 2)
(procedure-arity-includes? cons 2) ; => #t
(procedure-arity-includes? display 3) ; => #f

When compiling a lambda or case-lambda expression, MzScheme looks for a 'method-arity-error property attached to the expression (see section 12.6.2). If it is present with a true value, and if no case of the procedure accepts zero arguments, then the procedure is marked so that an exn:fail:contract:arity exception involving the procedure will hide the first argument, if one was provided. (Hiding the first argument is useful when the procedure implements a method, where the first argument is implicit in the original source.) The property affects only the format of exn:fail:contract:arity exceptions, not the result of procedure-arity.

3.12.2  Primitives

A primitive procedure is a built-in procedure that is implemented in a low-level language. Not all built-in procedures are primitives, but almost all R5RS procedures are primitives, as are most of the procedures described in this manual.

3.12.3  Procedure Names

See section 6.2.4 for information about the names of primitives, and the names inferred for lambda and case-lambda procedures.

3.12.4  Closure Equality

(procedure-closure-contents-eq? proc1 proc2) returns #t if the procedures proc1 and proc2 refer to the same code closed over the same values, where each value is compared with eq?.

Inlining and other compiler optimizations limit the usefulness of this procedure, because code can be duplicated or merged. Since the amount of duplication from inlining is limited, however, procedure-closure-contents-eq? is useful for some caching purposes.

Example:

(let ([f #f])
  ;; Using set! likely prevents inlining:
  (set! f (lambda (x) (lambda () x)))
  (procedure-closure-contents-eq? (f 'a) (f 'a)) ; => #t, probably
  (procedure-closure-contents-eq? (f 'a) (f 'b))) ; => #f, definitely
 
(let ([f (lambda (x) (lambda () x))])
  (procedure-closure-contents-eq? (f 'a) (f 'a)))
;; => #f, probably, because inlining likely duplicates f's body

3.13  Promises

The force procedure can only be applied to values returned by delay, and promises are never implicitly forced.

(promise? v) returns #t if v is a promise created by delay, #f otherwise.
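A brief sketch showing that a promise's body runs only when forced, and only once; the counter variable is illustrative.

```scheme
(define count 0)
(define p (delay (begin (set! count (+ count 1)) 'done)))

(promise? p)  ; => #t
count         ; => 0, nothing has been evaluated yet
(force p)     ; => done
(force p)     ; => done, the cached result is reused
count         ; => 1
```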

3.14  Hash Tables

(make-hash-table [flag-symbol flag-symbol]) creates and returns a new hash table. If provided, each flag-symbol must be one of the following:

By default, key comparisons use eq?. If the second flag-symbol is redundant, the exn:fail:contract exception is raised.

Two hash tables are equal? if they are created with the same flags, and if they map the same keys to equal? values (where ``same key'' means either eq? or equal?, depending on the way the hash table compares keys).

(make-immutable-hash-table assoc-list [flag-symbol]) creates an immutable hash table. (See also immutable? in section 3.10.) The assoc-list must be a list of pairs, where the car of each pair is a key, and the cdr is the corresponding value. The mappings are added to the table in the order that they appear in assoc-list, so later mappings can hide earlier mappings. If the optional flag-symbol argument is provided, it must be 'equal, and the created hash table compares keys with equal?; otherwise, the created table compares keys with eq?.
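For example, with a repeated key in the association list (lookups use hash-table-get, described below):

```scheme
;; The later mapping for a hides the earlier one.
(define ih (make-immutable-hash-table '((a . 1) (b . 2) (a . 3))))
(hash-table-get ih 'a)  ; => 3
(hash-table-get ih 'b)  ; => 2
(immutable? ih)         ; => #t
```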

(hash-table? v [flag-symbol flag-symbol]) returns #t if v was created by make-hash-table or make-immutable-hash-table with the given flag-symbols (or more), #f otherwise. Each provided flag-symbol must be a distinct flag supported by make-hash-table; if the second flag-symbol is redundant, the exn:fail:contract exception is raised.

(hash-table-put! hash-table key-v v) maps key-v to v in hash-table, overwriting any existing mapping for key-v. If hash-table is immutable, the exn:fail:contract exception is raised.

(hash-table-get hash-table key-v [failure-thunk]) returns the value for key-v in hash-table. If no value is found for key-v, then the result of invoking failure-thunk (a procedure of no arguments) is returned. If failure-thunk is not provided, the exn:fail:contract exception is raised when no value is found for key-v.

(hash-table-remove! hash-table key-v) removes the value mapping for key-v if it exists in hash-table. If hash-table is immutable, the exn:fail:contract exception is raised.
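The basic operations can be sketched as follows, contrasting an equal?-based table with the default eq?-based one. The keys and values are illustrative.

```scheme
;; equal?-based table: structurally equal keys find the same entry.
(define h (make-hash-table 'equal))
(hash-table-put! h (list 1 2) 'first)
(hash-table-get h (list 1 2))                      ; => first
(hash-table-get h 'absent (lambda () 'no-value))   ; => no-value
(hash-table-remove! h (list 1 2))
(hash-table-get h (list 1 2) (lambda () 'removed)) ; => removed

;; eq?-based table (the default): a fresh list is a different key.
(define e (make-hash-table))
(hash-table-put! e (list 1 2) 'first)
(hash-table-get e (list 1 2) (lambda () 'missing)) ; => missing
```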

(hash-table-map hash-table proc) applies the procedure proc to each element in hash-table, accumulating the results into a list. The procedure proc must take two arguments: a key and its value. See the caveat below about concurrent modification.

(hash-table-for-each hash-table proc) applies the procedure proc to each element in hash-table (for the side-effects of proc) and returns void. The procedure proc must take two arguments: a key and its value. See the caveat below about concurrent modification.
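A minimal sketch of both traversal procedures. The results are summed so that the example does not depend on the order in which entries are visited, which this text does not specify.

```scheme
(define h (make-hash-table))
(hash-table-put! h 'a 1)
(hash-table-put! h 'b 2)

;; hash-table-map collects one result per entry.
(apply + (hash-table-map h (lambda (k v) v)))  ; => 3

;; hash-table-for-each is used for side effects only.
(define total 0)
(hash-table-for-each h (lambda (k v) (set! total (+ total v))))
total  ; => 3
```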

(hash-table-count hash-table) returns the number of keys mapped by hash-table. If hash-table is not created with 'weak, then the result is computed in constant time and atomically. If hash-table is created with 'weak, see the caveat below about concurrent modification.

(hash-table-copy hash-table) returns a mutable hash table with the same mappings, same key-comparison mode, and same key-holding strength as hash-table.

(eq-hash-code v) returns an exact integer; for any two eq? values, the returned integer is the same. Furthermore, for the result integer k and any other exact integer j, (= k j) implies (eq? k j).

(equal-hash-code v) returns an exact integer; for any two equal? values, the returned integer is the same. Furthermore, for the result integer k and any other exact integer j, (= k j) implies (eq? k j). If v contains a cycle through pairs, vectors, boxes, and inspectable structure fields, then equal-hash-code applied to v will loop indefinitely.

Caveat concerning concurrent modification: A hash table can be manipulated with hash-table-get, hash-table-put!, and hash-table-remove! concurrently by multiple threads, and the operations are protected by a table-specific semaphore as needed. A few caveats apply, however:

Caveat concerning mutable keys: If a key into an equal?-based hash table is mutated (e.g., a key string is modified with string-set!), then the hash table's behavior for put and get operations becomes unpredictable.


4 30 bits for a 32-bit architecture, 62 bits for a 64-bit architecture.

5 This definition of eqv? technically contradicts R5RS, but R5RS does not address strange ``numbers'' like +nan.0.

6 The random number generator uses a 54-bit version of L'Ecuyer's MRG32k3a algorithm.

7 The current version of MzScheme uses Unicode version 4.1.

8 See also the Latin-1 footnote of section 1.2.3.

9 See also the Latin-1 footnote of section 1.2.3.

10 In PLT's software distributions for Windows, a suitable iconv.dll is included with libmzschVERS.dll.

11 The arity-at-least structure type is transparent to all inspectors (see section 4.5).