The syntax of a Scheme program is defined by
a read phase that processes a character stream into a syntax object; and
The expand phase recursively processes a syntax object to produce a complete parse of the program. Binding information in a syntax object drives the expansion process, and when the expansion process encounters a binding form, it extends syntax objects for sub-expression with new binding information.
An identifier is source-program entity. Parsing (i.e., expanding) a Scheme program reveals that some identifiers correspond to variables, some refer to syntactic forms, and some are quoted to produce a symbol or a syntax object.
An identifier binds another (i.e., it is a binding) when the former is parsed as a variable and the latter is parsed as a reference to the former; the latter is bound. The scope of a binding is the set of source forms to which it applies. The environment of a form is the set of bindings whose scope includes the form. A binding for a sub-expression shadows any bindings (i.e., it is shadowing) in its environment, so that uses of an identifier refer to the shadowing binding.
For example, as a bit of source, the text
(let ([x 5]) x)
includes two identifiers: let and x (which appears twice). When this source is parsed in a typical environment, x turns out to represent a variable (unlike let). In particular, the first x binds the second x.
A top-level binding is a binding from a definition at the top-level; a module binding is a binding from a definition in a module; and a local binding is another other kind of binding. There is no difference between an unbound identifier and one with a top-level binding; within a module, references to top-level bindings are disallowed, and so such identifiers are called unbound in a module context.
Throughout the documentation, identifiers are typeset to suggest the way that they are parsed. A black, boldface identifier like lambda indicates as a reference to a syntactic form. A plain blue identifier like x is a variable or a reference to an unspecified top-level variable. A hyperlinked identifier cons is a reference to a specific top-level variable.
Every binding has a phase level in which it can be referenced, where a phase level normally corresponds to an integer (but the special label phase level does not correspond to an integer). Phase level 0 corresponds to the run time of the enclosing module (or the run time of top-level expressions). Bindings in phase level 0 constitute the base environment. Phase level 1 corresponds to the time during which the enclosing module (or top-level expression) is expanded; bindings in phase level 1 constitute the transformer environment. Phase level -1 corresponds to the run time of a different module for which the enclosing module is imported for use at phase level 1 (relative to the importing module); bindings in phase level -1 constitute the template environment. The label phase level does not correspond to any execution time; it is used to track bindings (e.g., to identifiers within documentation) without implying an execution dependency.
If an identifier has a local binding, then it is the same for all phase levels, though the reference is allowed only at a particular phase level. Attempting to reference a local binding in a different phase level from the binding’s context produces a syntax error. If an identifier has a top-level binding or module binding, then it can have different such bindings in different phase levels.
A syntax object combines a simpler Scheme value, such as a symbol or pair, with lexical information about bindings, source-location information, syntax properties, and syntax certificates. In particular, an identifier is represented as a symbol object that combines a symbol and lexical and other information.
For example, a car identifier might have lexical information that designates it as the car from the scheme/base language (i.e., the built-in car). Similarly, a lambda identifier’s lexical information may indicate that it represents a procedure form. Some other identifier’s lexical information may indicate that it references a top-level variable.
When a syntax object represents a more complex expression than an identifier or simple constant, its internal components can be extracted. Even for extracted identifier, detailed information about binding is available mostly indirectly; two identifiers can be compared to see if they refer to the same binding (i.e., free-identifier=?), or whether each identifier would bind the other if one was in a binding position and the other in an expression position (i.e., bound-identifier=?).
For example, the when the program written as
is represented as a syntax object, then two syntax objects can be extracted for the two xs. Both the free-identifier=? and bound-identifier=? predicates will indicate that the xs are the same. In contrast, the let identifier is not free-identifier=? or bound-identifier=? to either x.
The lexical information in a syntax object is independent of the other half, and it can be copied to a new syntax object in combination with an arbitrary other Scheme value. Thus, identifier-binding information in a syntax object is predicated on the symbolic name of the identifier as well as the identifier’s lexical information; the same question with the same lexical information but different base value can produce a different answer.
For example, combining the lexical information from let in the program above to 'x would not produce an identifier that is free-identifier=? to either x, since it does not appear in the scope of the x binding. Combining the lexical context of the 6 with 'x, in contrast, would produce an identifier that is bound-identifier=? to both xs.
The quote-syntax form bridges the evaluation of a program and the representation of a program. Specifically, (quote-syntax datum) produces a syntax object that preserves all of the lexical information that datum had when it was parsed as part of the quote-syntax form.
Expansion recursively processes a syntax object in a particular phase level, starting with phase level 0. Bindings from the syntax object’s lexical information drive the expansion process, and cause new bindings to be introduced for the lexical information of sub-expressions. In some cases, a sub-expression is expanded in a deeper phase than the enclosing expression.
A complete expansion produces a syntax object matching the following grammar:
Beware that the symbolic names of identifiers in a fully expanded program may not match the symbolic names in the grammar. Only the binding (according to free-identifier=?) matters.
||||(begin top-level-form ...)|
||||(#%provide raw-provide-spec ...)|
||||(define-values (id ...) expr)|
||||(define-syntaxes (id ...) expr)|
||||(define-values-for-syntax (id ...) expr)|
||||(#%require raw-require-spec ...)|
||||(#%plain-lambda formals expr ...+)|
||||(case-lambda (formals expr ...+) ...)|
||||(if expr expr expr)|
||||(begin expr ...+)|
||||(begin0 expr expr ...)|
||||(set! id expr)|
||||(with-continuation-mark expr expr expr)|
||||(#%plain-app expr ...+)|
||||(#%top . id)|
||||(#%variable-reference (#%top . id))|
||||(id ...+ . id)|
More specifically, the typesetting of identifiers in the above grammar is significant. For example, the second case for expr is a syntax-object list whose first element is an identifier, where the identifier’s lexical information specifies a binding to the #%plain-lambda of the scheme/base language (i.e., the identifier is free-identifier=? to one whose binding is #%plain-lambda). In all cases, identifiers above typeset as syntactic-form names refer to the bindings defined in Syntactic Forms.
Only phase levels 0 and 1 are relevant for the parse of a program (though the datum in a quote-syntax form preserves its information for all phase levels). In particular, the relevant phase level is 0, except for the exprs in a define-syntax, define-syntaxes, define-for-syntax, or define-values-for-syntax form, in which case the relevant phase level is 1 (for which comparisons are made using free-transformer-identifier=? instead of free-identifier=?).
If it is an identifier (i.e., a syntax-object symbol), then a binding is determined by the identifier’s lexical information. If the identifier has a binding other than as a top-level variable, that binding is used to continue. If the identifier has no binding, a new syntax-object symbol '#%top is created using the lexical information of the identifier; if this #%top identifier has no binding (other than as a top-level variable), then parsing fails with an exn:fail:syntax exception. Otherwise, the new identifier is combined with the original identifier in a new syntax-object pair (also using the same lexical information as the original identifier), and the #%top binding is used to continue.
If it is a syntax-object pair of any other form, then a new syntax-object symbol '#%app is created using the lexical information of the pair. If the resulting #%app identifier has no binding, parsing fails with an exn:fail:syntax exception. Otherwise, the new identifier is combined with the original pair to form a new syntax-object pair (also using the same lexical information as the original pair), and the #%app binding is used to continue.
If it is any other syntax object, then a new syntax-object symbol '#%datum is created using the lexical information of the original syntax object. If the resulting #%datum identifier has no binding, parsing fails with an exn:fail:syntax exception. Otherwise, the new identifier is combined with the original syntax object in a new syntax-object pair (using the same lexical information as the original pair), and the #%datum binding is used to continue.
A transformer binding, such as introduced by define-syntax or let-syntax. If the associated value is a procedure of one argument, the procedure is called as a syntax transformer (described below), and parsing starts again with the syntax-object result. If the transformer binding is to any other kind of value, parsing fails with an exn:fail:syntax exception. The call to the syntax transformer is parameterized to set current-namespace to a namespace that shares bindings and variables with the namespace being used to expand, except that its base phase is one greater.
A variable binding, such as introduced by a module-level define or by let. In this case, if the form being parsed is just an identifier, then it is parsed as a reference to the corresponding variable. If the form being parsed is a syntax-object pair, then an #%app is added to the front of the syntax-object pair in the same way as when the first item in the syntax-object pair is not an identifier (third case in the previous enumeration), and parsing continues.
A core syntactic form, which is parsed as described for each form in Syntactic Forms. Parsing a core syntactic form typically involves recursive parsing of sub-forms, and may introduce bindings that determine the parsing of sub-forms.
Each expansion step occurs in a particular context, and transformers and core syntactic forms may expand differently for different contexts. For example, a module form is allowed only in a top-level context, and it fails in other contexts. The possible contexts are as follows:
top-level context : outside of any module, definition, or expression, except that sub-expressions of a top-level begin form are also expanded as top-level forms.
Different core syntactic forms parse sub-forms using different contexts. For example, a let form always parses the right-hand expressions of a binding in an expression context, but it starts parsing the body in an internal-definition context.
When a require form is encountered at the top level or module level, all lexical information derived from the top level or the specific module’s level are extended with bindings from the specified modules. If not otherwise indicated in the require form, bindings are introduced at the phase levels specified by the exporting modules: phase level 0 for each normal provide, phase level 1 for each for-syntax provide, and so on. The for-meta provide form allows exports at an arbitrary phase level (as long as a binding exists within the module at the phase level).
A for-syntax sub-form within require imports similarly, but the resulting bindings have a phase level that is one more than the exported phase levels, when exports for the label phase level are still imported at the label phase level. More generally, a for-meta sub-form within require imports with the specified phase level shift; if the specified shift is #f, or if for-label is used to import, then all bindings are imported into the label phase level.
When a define, define-values, define-syntax, or define-syntaxes form is encountered at the top level or module level, all lexical information derived from the top level or the specific module’s level is extended with bindings for the specified identifiers at phase level 0 (i.e., the base environment is extended).
When a define-for-syntax or define-values-for-syntax form is encountered at the top level or module level, bindings are introduced as for define-values, but at phase level 1 (i.e., the transformer environment is extended).
When a let-values form is encountered, the body of the let-values form is extended (by creating new syntax objects) with bindings for the specified identifiers. The same bindings are added to the identifiers themselves, so that the identifiers in binding position are bound-identifier=? to uses in the fully expanded form, and so they are not bound-identifier=? to other identifiers. The bindings are available for use at the phase level at which the let-values form is expanded.
A new binding in lexical information maps to a new variable. The identifiers mapped to this variable are those that currently have the same binding (i.e., that are currently bound-identifier=?) to the identifier associated with the binding.
For example, in
the binding introduced for x applies to the x in the body, but not the y n the body, because (at the point in expansion where the let-values form is encountered) the binding x and the body y are not bound-identifier=?.
In a top-level context or module context, when the expander encounters a define-syntaxes form, the binding that it introduces for the defined identifiers is a transformer binding. The value of the binding exists at expansion time, rather than run time (though the two times can overlap), though the binding itself is introduced with phase level 0 (i.e., in the base environment).
The value for the binding is obtained by evaluating the expression in the define-syntaxes form. This expression must be expanded (i.e. parsed) before it can be evaluated, and it is expanded at phase level 1 (i.e., in the transformer environment) instead of phase level 0.
If the resulting value is a procedure of one argument or the result of make-set!-transformer on a procedure, then it is used as a syntax transformer (a.k.a. macro). The procedure is expected to accept a syntax object and return a syntax object. A use of the binding (at phase level 0) triggers a call of the syntax transformer by the expander; see Expansion Steps.
Before the expander passes a syntax object to a transformer, the syntax object is extended with a syntax mark (that applies to all sub-syntax objects). The result of the transformer is similarly extended with the same syntax mark. When a syntax object’s lexical information includes the same mark twice in a row, the marks effectively cancel. Otherwise, two identifiers are bound-identifier=? (that is, one can bind the other) only if they have the same binding and if they have the same marks – counting only marks that were added after the binding.
This marking process helps keep binding in an expanded program consistent with the lexical structure of the source program. For example, the expanded form of the program
|(define x 12)|
|[(_ id) (let ([x 10]) id)]))|
|(define x 12)|
|[(_ id) (let ([x 10]) id)]))|
|(let-values ([(x) 10]) x)|
However, the result of the last expression is 12, not 10. The reason is that the transformer bound to m introduces the binding x, but the referencing x is present in the argument to the transformer. The introduced x is the one left with a mark, and the reference x has no mark, so the binding x is not bound-identifier=? to the body x.
The set! form works with the make-set!-transformer and prop:set!-transformer property to support assignment transformers that transform set! expressions. An assignment transformer contains a procedure that is applied by set! in the same way as a normal transformer by the expander.
The make-rename-transformer procedure or prop:rename-transformer property creates a value that is also handled specially by the expander and by set! as a transformer binding’s value. When id is bound to a rename transformer produced by make-rename-transformer, it is replaced with the target identifier passed to make-rename-transformer. In addition, as long as the target identifier does not have a true value for the 'not-free-identifier=? syntax property, the lexical information that contains the binding of id is also enriched so that id is free-identifier=? to the target identifier, identifier-binding returns the same results for both identifiers, and provide exports id as the target identifier. Finally, the binding is treated specially by syntax-local-value, and syntax-local-make-delta-introducer as used by syntax transformers.
In addition to using marks to track introduced identifiers, the expander tracks the expansion history of a form through syntax properties such as 'origin. See Syntax Object Properties for more information.
The expander’s handling of letrec-values+syntaxes is similar to its handling of define-syntaxes. A letrec-values+syntaxes mist be expanded in an arbitrary phase level n (not just 0), in which case the expression for the transformer binding is expanded at phase level n+1.
The expression in a define-for-syntax or define-values-for-syntax form is expanded and evaluated in the same way as for syntax. However, the introduced binding is a variable binding at phase level 1 (not a transformer binding at phase level 0).
In certain contexts, such as an internal-definition context or module context, forms are partially expanded to determine whether they represent definitions, expressions, or other declaration forms. Partial expansion works by cutting off the normal recursion expansion when the relevant binding is for a primitive syntactic form.
As a special case, when expansion would otherwise add an #%app, #%datum, or #%top identifier to an expression, and when the binding turns out to be the primitive #%app, #%datum, or #%top form, then expansion stops without adding the identifier.
An internal-definition context corresponds to a partial expansion step (see Partial Expansion). A form that supports internal definitions starts by expanding its first form in an internal-definition context, but only partially. That is, it recursively expands only until the form becomes one of the following:
A define-values or define-syntaxes form, for any form other than the last one: The definition form is not expanded further. Instead, the next form is expanded partially, and so on. As soon as an expression form is found, the accumulated definition forms are converted to a letrec-values (if no define-syntaxes forms were found) or letrec-syntaxes+values form, moving the expression forms to the body to be expanded in expression context.
When a define-values form is discovered, the lexical context of all syntax objects for the body sequence is immediately enriched with bindings for the define-values form before expansion continues. When a define-syntaxes form is discovered, the right-hand side is expanded and evaluated (as for a letrec-values+syntaxes form), and a transformer binding is installed for the body sequence before expansion continues.
A primitive expression form other than begin: The expression is expanded in an expression context, along with all remaining body forms. If any definitions were found, this expansion takes place after conversion to a letrec-values or letrec-syntaxes+values form. Otherwise, the expressions are expanded immediately.
A begin form: The sub-forms of the begin are spliced into the internal-definition sequence, and partial expansion continues with the first of the newly-spliced forms (or the next form, if the begin had no sub-forms).
A require form not only introduces bindings at expansion time, but also visits the referenced module when it is encountered by the expander. That is, the expander instantiates any define-for-syntaxed variables defined in the module, and also evaluates all expressions for define-syntaxes transformer bindings.
Module visits propagate through requires in the same way as module instantiation. Moreover, when a module is visited at phase 0, any module that it requires for-syntax is instantiated at phase 1, while further requires for-template leading back to phase 0 causes the required module to be visited at phase 0 (i.e., not instantiated).
During compilation, the top-level of module context is itself implicitly visited. Thus, when the expander encounters (require (for-syntax ....)), it immediately instantiates the required module at phase 1, in addition to adding bindings at phase level 1 (i.e., the transformer environment). Similarly, the expander immediately evaluates any define-values-for-syntax form that it encounters.
Phases beyond 0 are visited on demand. For example, when the right-hand side of a phase-0 let-syntax is to be expanded, then modules that are available at phase 1 are visited. More generally, initiating expansion at phase n visits modules at phase n, which in turn instantiates modules at phase n+1. These visits and instantiations apply to available modules in the enclosing namespace.
When the expander encounters require and (require (for-syntax ....)) within a module context, the resulting visits and instantiations are specific to the expansion of the enclosing module, and are kept separate from visits and instantiations triggered from a top-level context or from the expansion of a different module. Along the same lines, when a module is attached to a namespace through namespace-attach-module, modules that it requires are transitively attached, but instances are attached only at phases at or below the namespace’s base phase.
When a top-level definition binds an identifier that originates from a macro expansion, the definition captures only uses of the identifier that are generated by the same expansion. This behavior is consistent with expansion in internal-definition contexts, where the defined identifier turns into a fresh lexical binding.
|> (define x 1)|
|> (def-and-use-of-x 2)|
|> (def-and-use x 3)|
For a top-level definition (outside of a module), the order of evaluation affects the binding of a generated definition for a generated identifier use. If the use precedes the definition, then the use refers to a non-generated binding, just as if the generated definition were not present. (No such dependency on order occurs within a module, since a module binding covers the entire module body.) To support the declaration of an identifier before its use, the define-syntaxes form avoids binding an identifier if the body of the define-syntaxes declaration produces zero results.
|> (define bucket-1 0)|
|> (define bucket-2 0)|
|> (define x 1)|
|> (def-and-set!-use-of-x 2)|
reference to undefined identifier: even
Macro-generated "require" and "provide" clauses also introduce and reference generation-specific bindings:
In require, for a require-spec of the form (rename-in [orig-id bind-id]) or (only-in .... [orig-id bind-id]), the bind-id is bound only for uses of the identifier generated by the same macro expansion as bind-id. In require for other require-specs, the generator of the require-spec determines the scope of the bindings.
In provide, for a provide-spec of the form id, the exported identifier is the one that binds id within the module in a generator-specific way, but the external name is the plain id. The exceptions for all-except-out are similarly determined in a generator-specific way, as is the orig-id binding of a rename-out form, but plain identifiers are used for the external names. For all-defined-out, only identifiers with definitions having the same generator as the (all-defined-out) form are exported; the external name is the plain identifier from the definition.
Before expanded code is evaluated, it is first compiled. A compiled form has essentially the same information as the corresponding expanded form, though the internal representation naturally dispenses with identifiers for syntactic forms and local bindings. One significant difference is that a compiled form is almost entirely opaque, so the information that it contains cannot be accessed directly (which is why some identifiers can be dropped). At the same time, a compiled form can be marshaled to and from a byte string, so it is suitable for saving and re-loading code.
Although individual read, expand, compile, and evaluate operations are available, the operations are often combined automatically. For example, the eval procedure takes a syntax object and expands it, compiles it, and evaluates it.
See Namespaces for functions that manipulate namespaces.
A namespace is a top-level mapping from symbols to binding information. It is the starting point for expanding an expression; a syntax object produced by read-syntax has no initial lexical context; the syntax object can be expanded after initializing it with the mappings of a particular namespace. A namespace is also the starting point evaluating expanded code, where the first step in evaluation is linking the code to specific module instances and top-level variables.
For expansion purposes, a namespace maps each symbol in each phase level to one of three possible bindings:
a particular module binding from a particular module
a top-level transformer binding named by the symbol
a top-level variable named by the symbol
An “empty” namespace maps all symbols to top-level variables. Certain evaluations extend a namespace for future expansions; importing a module into the top-level adjusts the namespace bindings for all of the imported named, and evaluating a top-level define form updates the namespace’s mapping to refer to a variable (in addition to installing a value into the variable).
For evaluation, each namespace encapsulates a distinct set of top-level variables at various phases, as well as a potentially distinct set of module instances in each phase. That is, even though module declarations are shared for all phase levels, module instances are distinct for each phase. Each namespace has a base phase, which corresponds to the phase used by reflective operations such as eval and dynamic-require. In particular, using eval on a require form instantiates a module in the namespace’s base phase.
After a namespace is created, module instances from existing namespaces can be attached to the new namespace. In terms of the evaluation model, top-level variables from different namespaces essentially correspond to definitions with different prefixes, but attaching a module uses the same prefix for the module’s definitions in namespaces where it is attached. The first step in evaluating any compiled expression is to link its top-level variable and module-level variable references to specific variables in the namespace.
At all times during evaluation, some namespace is designated as the current namespace. The current namespace has no particular relationship, however, with the namespace that was used to expand the code that is executing, or with the namespace that was used to link the compiled form of the currently evaluating code. In particular, changing the current namespace during evaluation does not change the variables to which executing expressions refer. The current namespace only determines the behavior of reflective operations to expand code and to start evaluating expanded/compiled code.
|> (define x 'orig) ; define in the original namespace|
|; The following let expression is compiled in the original|
|; namespace, so direct references to x see 'orig.|
If an identifier is bound to syntax or to an import, then defining the identifier as a variable shadows the syntax or import in future uses of the environment. Similarly, if an identifier is bound to a top-level variable, then binding the identifier to syntax or an import shadows the variable; the variable’s value remains unchanged, however, and may be accessible through previously evaluated expressions.
|> (define x 5)|
|> (define (f) x)|
|> (define-syntax x (syntax-id-rules () [_ 10]))|
|> (define x 7)|
|> (module m mzscheme (define x 8) (provide x))|
|> (require 'm)|
To improve error reporting, names are inferred at compile-time for certain kinds of values, such as procedures. For example, evaluating the following expression:
produces an error message because too many arguments are provided to the procedure. The error message is able to report f as the name of the procedure. In this case, Scheme decides, at compile-time, to name as 'f all procedures created by the let-bound lambda.
Names are inferred whenever possible for procedures. Names closer to an expression take precedence. For example, in
|(let ([f (lambda () 0)]) f))|
the procedure bound to my-f will have the inferred name 'f.
When an 'inferred-name property is attached to a syntax object for an expression (see Syntax Object Properties), the property value is used for naming the expression, and it overrides any name that was inferred from the expression’s context. Normally, the property value should be a symbol or an identifier.
When an inferred name is not available, but a source location is available, a name is constructed using the source location information. Inferred and property-assigned names are also available to syntax transformers, via syntax-local-name.