Syntax and Macros

MzScheme supports the R5RS define-syntax, let-syntax, and letrec-syntax forms with syntax-rules, with minor pattern and template extensions described in section 12.1.

In addition to syntax-rules, MzScheme supports macros that perform arbitrary transformations on syntax. In particular, a transformer expression -- the right-hand side of a define-syntax, let-syntax, or letrec-syntax binding -- can be an arbitrary expression, and it is evaluated in a transformer environment. When the expression produces a procedure, it is associated as a syntax transformer to the identifier bound by define-syntax, let-syntax, or letrec-syntax. This more general, mostly hygienic macro system is based on syntax-case by Dybvig, Hieb, and Bruggeman (see ``Syntactic abstraction in Scheme'' in Lisp and Symbolic Computation, December 1993).

A transformer procedure consumes a syntax object and produces a new syntax object. A syntax object encodes S-expression structure, but also includes source-location information and lexical-binding information for each element within the S-expression. A syntax object is a first-class value, and it can exist at run-time. However, syntax objects are more typically used at syntax-expansion time -- which is the run-time of a transformer procedure.

Unlike traditional defmacro systems, MzScheme keeps the top-level transformer environment separate from the normal top-level environment. The environments are separated because the expressions in the different environments are evaluated at different times (transformer expressions are evaluated at syntax-expansion time, while normal expressions are evaluated at run time). Separating each environment ensures that compilation and analysis tools can process programs properly. See section 12.3.3 for more information.

Also unlike traditional macro systems, a transformer procedure is invoked whenever its identifier is used in an expression position, not in application positions only. Even more generally, a transformer expression might not produce a procedure value, in which case the non-procedure is associated to its identifier as a generic expansion-time value. For example, a unit signature (see Chapter 55 in PLT MzLib: Libraries Manual) is associated to an identifier through an expansion-time value. See section 12.6 for more information on transformer applications and expansion-time values.

12.1  syntax-rules Extensions

MzScheme extends the pattern language for syntax-rules so that a pattern of the form

(... pattern)

is equivalent to pattern where ... is treated like any other identifier. Similarly, a template of the form

(... template)

is equivalent to template where ... is treated like any other identifier.

In a pattern, additional patterns can follow ..., but only one ... can appear in a sequence:

(pattern ···1 ... pattern ···)

Furthermore, a sequence containing ... can end with a dotted pair:

(pattern ···1 ... pattern ··· . pattern)

but in this case, the final pattern is never matched to a syntactic list.

A template element consists of any number of ...s after a template. For each ... after the first one, the preceding element (with earlier ...s) is conceptually wrapped with parentheses for generating output, and then wrapping parentheses in the output are removed. If a pattern identifier is followed by more ellipses in a template than in the pattern, then the pattern's match is expanded normally for inner ellipses (up to the number of ellipses that appear in the pattern), and then the match is replicated as necessary to satisfy outer ellipses.

To mesh gracefully with modules, literal identifiers are compared with module-identifier=?, which is equivalent to the comparison behavior of R5RS in the absence of modules; see section 12.3.1 for more information on identifier syntax comparisons.

Examples:

(define-syntax ex1
  (syntax-rules ()
   [(ex1 a) '(a (... ...))]))
(ex1 1) ; => '(1 ...)

(define-syntax ex2
  (syntax-rules ()
   [(ex2 a ... b) '(b a ...)]))
(ex2 1 2 3) ; => '(3 1 2)

(define-syntax ex3
  (syntax-rules ()
   [(ex3 a ... b . c) '(b a ... c)]))
(ex3 1 2 3 4) ; syntax error
(ex3 1 2 3 . 4) ; => '(3 1 2 4)

(define-syntax ex4
  (syntax-rules ()
   [(ex4 (a ...) ... b) '(a ... ... b)]))
(ex4 (1) (2 3) 4) ; => '(1 2 3 4)

The syntax-id-rules form has the same syntax as syntax-rules, except that each pattern is used in its entirety (instead of starting with a keyword placeholder that is ignored). Furthermore, when an identifier id is bound as syntax to a syntax-id-rules transformer, the transformer is applied whenever id appears in an expression position -- not just when it is in the application position -- or when id appears as the target of an assignment. When the identifier appears in an application position, (id expr ···), the entire ``application'' is provided to the transformer, and when the identifier appears as a set! target, (set! id expr), the entire set! expression is provided to the transformer; otherwise, the id is provided alone to the transformer. Typically, set! is included as a keyword in a syntax-id-rules use, and three patterns match the three possible uses of the identifier.

(define-syntax pwd
  ; For this macro to work, the set! case must 
  ;  be first, and the pwd case must be last
  (syntax-id-rules (set!)
    [(set! pwd expr) (current-directory expr)]
    [(pwd expr ...) ((current-directory) expr ...)]
    [pwd (current-directory)]))

(set! pwd "/tmp") ; sets current-directory parameter
pwd ; => "/tmp"
(current-directory) ; => "/tmp"
(current-directory "/usr/tmp")
pwd ; => "/usr/tmp"

12.2  Syntax Objects

(read-syntax [source-name-v input-port]) is like read, except that it produces a syntax object with source-location information. The source-name-v is used as the source field of the syntax object; it can be an arbitrary value, but it should generally be a path for the source file. The default source-name-v is the input port's name (according to object-name; see section 6.2.3). See section 11.2.4 for more information about read and read-syntax, see section 11.2.1.1 for information about port locations, and see section 12.6.2 for information on the 'paren-shape property and original-indicator property attached to a syntax object by read-syntax.

The result of read-syntax is a syntax object with source-location information, but no lexical information. Syntax objects acquire lexical information during expansion, so that by the time a transformer is called, the provided syntax object has lexical information.

The eval, compile, expand, expand-once, and expand-to-top-form procedures work on syntax objects, especially syntax objects with no lexical context. (If one of these procedures is given a non-syntax S-expression, the S-expression is converted to a syntax object containing no source information and no lexical context.) Each of these procedures adds context to the syntax object using namespace-syntax-introduce before expanding the syntax (but see section 14.1 for information on the special handling of module). In contrast, the eval-syntax, compile-syntax, expand-syntax, expand-syntax-once, and expand-syntax-to-top-form procedures do not add context to a given syntax object before expanding.

The syntax object produced by expand, expand-syntax, etc. includes lexical information that influences future expansion and compilation of the syntax object. Thus, a syntax object produced by read-syntax should be passed to eval or expand (or another procedure without -syntax in its name), but a syntax object returned by expand should be passed to eval-syntax (or another procedure with -syntax in its name), since the result from expand has acquired a lexical context.

For example, if the following text is parsed by read-syntax,

(lambda (x) (+ x y))

the result is a syntax object that contains the S-expression structure '(lambda (x) (+ x y)), but also source information indicating that the first x is in column 9, etc. If expand is applied to the syntax object with a normal top-level environment, then the result will be a similar syntax object (with the source-location information intact), but the second x in the syntax object will have lexical information that ties it to the first x, and y in the syntax object will be annotated as a free variable. Even the syntax object's 'lambda will have lexical information tying it to the built-in lambda form.
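The read, expand, and evaluate steps described above can be sketched as follows. This is a minimal sketch, assuming it is run at the top level with the default namespace; the source name "example" and the variable names stx, expanded, and f are illustrative only.

;; Read a syntax object: source locations, but no lexical context yet.
(define stx
  (read-syntax "example" (open-input-string "(lambda (x) (+ x 1))")))
(syntax-object->datum stx)   ; => (lambda (x) (+ x 1))

;; expand adds the current namespace's context before expanding.
(define expanded (expand stx))

;; The expanded object has already acquired lexical context,
;; so it is passed to eval-syntax rather than eval.
(define f (eval-syntax expanded))
(f 2)                        ; => 3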

Compilation (often as a prelude to interactive evaluation) strips away source and context information as it processes a syntax object. The compilation of a quote-syntax form is an exception:

(quote-syntax datum)

The quote-syntax form produces a syntax object that preserves the source-location information for datum. It also encapsulates lexical-binding information accumulated by compilation in the quote-syntax expression's environment. A quote-syntax expression rarely appears in normal expressions; quote-syntax is more typically used within a transformer expression.

In addition to location and lexical information, a syntax object may have properties and certificates attached. Properties are added or inspected using syntax-property, as described in section 12.6.2. Certificates validate references to identifiers that are not exported from a macro, as described in section 12.6.3.

The syntax-object->datum procedure strips away location, lexical, property, and certificate information from a syntax object to produce a plain S-expression. The datum->syntax-object procedure wraps syntax information onto an S-expression, copying the source-location information of a given syntax object, the lexical information of another syntax object, and the properties of a third syntax object (where some or all three of the given objects can be the same). The syntax-e procedure unwraps only the immediate S-expression structure from a syntax object, leaving nested structure in place. These procedures are described in section 12.2.2.

Although procedures such as syntax-object->datum permit arbitrary manipulation of syntax objects, a syntax transformer is more likely to use the pattern-matching syntax-case and syntax forms, which are described in the following subsection.

12.2.1  Syntax Patterns

The syntax-case form pattern-matches and deconstructs a syntax object:

(syntax-case stx-expr (literal-identifier ···) 
  syntax-clause 
  ···) 

syntax-clause is one of
  (pattern expr) 
  (pattern fender-expr expr)

If the stx-expr expression does not produce a syntax object, its value is converted to one using datum->syntax-object with the lexical context of the expression (see section 12.2.2). The syntax is then compared to the pattern in each syntax-clause until a match is found, and the result of the corresponding expr is the result of the syntax-case expression. If a syntax-clause contains a fender-expr, the clause matches only when both the pattern matches the syntax object and the fender-expr returns a true value. If no pattern matches, a ``bad syntax'' exn:fail:syntax exception is raised.

A pattern is nearly the same as a syntax-rules pattern (see R5RS), with the ellipsis-escaping extension (see section 12.1). The difference is that the first identifier in pattern is not ignored, unlike the leading keyword in a syntax-rules pattern.

As in syntax-rules, a non-literal identifier in a pattern is bound to a corresponding part of the syntax object within the clause's expr and optional fender-expr. The identifier cannot be used directly, however; a use of the identifier in an expression position is a syntax error. Instead, the identifier can be used only in syntax expressions within the binding's scope.

A syntax expression has the form

(syntax template)

where template is as in syntax-rules (extended, as usual, for escaped ellipses). The result of a syntax expression is a syntax object. Identifiers in the template that are bound by a syntax-case pattern are replaced with their bindings in the generated syntax object. A syntax expression that contains no pattern identifiers is equivalent to a quote-syntax expression, except that unlike quote-syntax, the syntax form always fails to compile (i.e., it loops forever) when template is cyclic.
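As an illustration of a clause with a fender-expr, the following sketch defines a hypothetical swap! macro (the names swap!, tmp, x, and y are invented for this example). The clause matches only when both operands are identifiers; otherwise no clause matches and the ``bad syntax'' exception described above is raised.

(define-syntax swap!
  (lambda (stx)
    (syntax-case stx ()
      [(_ a b)
       ;; fender: accept the clause only if a and b are identifiers
       (and (identifier? (syntax a)) (identifier? (syntax b)))
       ;; tmp is introduced hygienically, so it cannot capture a or b
       (syntax (let ([tmp a])
                 (set! a b)
                 (set! b tmp)))])))

(define x 1)
(define y 2)
(swap! x y)
(list x y) ; => (2 1)
(swap! 1 2) ; no clause matches, so a bad syntax error is raised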

The syntax-rules form can be expressed as a syntax-case form wrapped in lambda:

 (syntax-rules (literal-identifier ···)
   ((ignored-identifier . pattern) template)
   ···)
=expands=>
 (lambda (stx)
   (syntax-case stx (literal-identifier ···)
     ((generated-identifier . pattern) (syntax template))
     ···))

Note that the implicit lambda of syntax-rules for the transformer procedure is made explicit with syntax-case. The define-syntax form supports define-style abbreviations for transformer procedures (see section 2.8.1).
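For instance, the three definitions below are intended to be interchangeable. The macro names are hypothetical and chosen only to distinguish the three styles.

;; syntax-rules: the lambda and the syntax wrapper are implicit
(define-syntax my-if-1
  (syntax-rules ()
    [(_ c t e) (if c t e)]))

;; syntax-case with an explicit lambda
(define-syntax my-if-2
  (lambda (stx)
    (syntax-case stx ()
      [(_ c t e) (syntax (if c t e))])))

;; syntax-case with the define-style abbreviation
(define-syntax (my-if-3 stx)
  (syntax-case stx ()
    [(_ c t e) (syntax (if c t e))]))

(my-if-1 #t 'yes 'no) ; => yes
(my-if-2 #f 'yes 'no) ; => no
(my-if-3 #t 'yes 'no) ; => yes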

The following example shows one reason to use syntax-case instead of syntax-rules: custom error reporting.

(define-syntax (let1 stx)
  (syntax-case stx ()
    [(_ id val body)
     (begin
       ;; If id is not an identifier, report an error in terms of let1 instead of let:
       (unless (identifier? (syntax id))
         (raise-syntax-error #f "expected an identifier" stx (syntax id)))
       (syntax (let ([id val]) body)))]))
(let1 x 10 (add1 x)) ; => 11
(let1 2 10 (add1 x)) ; => let1: expected an identifier at: 2 in: (let1 2 10 (add1 x))

Another reason to use syntax-case is to implement ``non-hygienic'' macros that introduce capturing identifiers:

(define-syntax (if-it stx)
  (syntax-case stx ()
    [(src-if-it test then else)
     (syntax-case (datum->syntax-object (syntax src-if-it) 'it) ()
       [it (syntax (let ([it test]) (if it then else)))])]))
(if-it (memq 'b '(a b c)) it 'nope) ; => '(b c)

The nested syntax-case is used to bind the pattern variable it. The syntax for it is generated with datum->syntax-object using the context of src-if-it, which means that the introduced variable has the same lexical context as if-it at the macro's use; in other words, it acts as if it existed in the input syntax, so it can bind uses of it in then and else.

The syntax-case* form is a generalization of syntax-case where the procedure for comparing literal-identifiers is determined by a comparison-proc-expr:

(syntax-case* stx-expr (literal-identifier ···) comparison-proc-expr 
  syntax-clause 
  ···)

The result of comparison-proc-expr must be a procedure that accepts two arguments. The first argument is an identifier from stx-expr, and the second argument is an identifier from a syntax-clause pattern that is module-identifier=? to one of the literal-identifiers. A true result from the comparison procedure indicates that the first identifier matches the second.
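A sketch of one possible use: comparing a literal by its symbolic name rather than by its binding. The macro name else-arg? and the result symbols are hypothetical. The comparison procedure here is symmetric, so the argument order does not matter.

(define-syntax (else-arg? stx)
  (syntax-case* stx (else)
                ;; compare identifiers by name only, ignoring bindings
                (lambda (a b) (eq? (syntax-e a) (syntax-e b)))
    [(_ else) (syntax 'matched-else)]
    [(_ other) (syntax 'no-match)]))

(else-arg? else) ; => matched-else
(else-arg? 7)    ; => no-match
;; Even a locally rebound else matches, because only names are compared:
(let ([else 5]) (else-arg? else)) ; => matched-else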

12.2.1.1  Binding Pattern Variables

The with-syntax form is a let-like form for binding pattern variables:

(with-syntax ((pattern stx-expr) 
              ···) 
  expr)

The patterns are matched against the stx-expr values, and all pattern identifiers are bound in expr. The pattern identifiers across all patterns must be distinct. If a stx-expr expression does not produce a syntax object, its result is converted using datum->syntax-object and the lexical context of the stx-expr (see section 12.2.2). If the result of a stx-expr does not match its pattern, the exn:fail:syntax exception is raised.

The if-it example can be written more simply using with-syntax:

(define-syntax (if-it stx)
  (syntax-case stx ()
    [(src-if-it test then else)
     (with-syntax ([it (datum->syntax-object (syntax src-if-it) 'it)])
       (syntax (let ([it test]) (if it then else))))]))

Macros that expand to non-hygienic macros rarely work as intended. For example:

(define-syntax (cond-it stx)
  (syntax-case stx ()
    [(_ (test body) . rest)
     (syntax (if-it test body (cond-it . rest)))]
    [(_) (syntax (void))]))
(cond-it [(memq 'b '(a b c)) it] [#t 'nope]) ; => undefined variable it

The problem is that cond-it introduces if-it (hygienically), so cond-it effectively introduces it (hygienically), which doesn't bind it in the source use of cond-it. In general, the solution is to avoid macros that expand to uses of non-hygienic macros.

12.2.1.2  Quasiquoting Templates

The quasisyntax form is like syntax, except with quasiquoting within the template:

(quasisyntax quasitemplate)

A quasitemplate is the same as a template, except that unsyntax and unsyntax-splicing escape to an expression:

(unsyntax expr)
(unsyntax-splicing expr)

The expression must produce a syntax object (or syntax list) to be substituted in place of the unsyntax or unsyntax-splicing form within the quasiquoting template, just like unquote and unquote-splicing within quasiquote. (If the escaped expression does not generate a syntax object, it is converted to one in the same way as for the right-hand sides of with-syntax.) Nested quasisyntaxes introduce quasiquoting layers in the same way as nested quasiquotes.

Also analogous to quote and quasiquote, the reader converts #' to syntax, #` to quasisyntax, #, to unsyntax, and #,@ to unsyntax-splicing. See also section 11.2.4.

Example:

(with-syntax ([(v ...) (list 1 2 3)])
  #`(0 v ... #,(+ 2 2) #,@(list 5 6) 7)) ; => syntax for (0 1 2 3 4 5 6 7)

12.2.1.3  Assigning Source Location

The syntax/loc form is like syntax, except that the immediate resulting syntax object takes its source-location information from a supplied syntax object, unless the template is just a pattern variable:

(syntax/loc location-stx-expr template)

Use syntax/loc instead of syntax whenever possible to help tools that report source locations. For example, the earlier if-it example should have been written with syntax/loc:

(define-syntax (if-it stx)
  (syntax-case stx ()
    [(src-if-it test then else)
     (with-syntax ([it (datum->syntax-object (syntax src-if-it) 'it)])
       (syntax/loc stx (let ([it test]) (if it then else))))]))

The quasisyntax/loc form is the quasiquoting analogue of syntax/loc:

(quasisyntax/loc location-stx-expr quasitemplate)

12.2.2  Syntax Object Content

(syntax? v) returns #t if v is a syntax object, #f otherwise.

(syntax-source stx) returns the source for the syntax object stx, or #f if none is known. The source is represented by an arbitrary value (e.g., one passed to read-syntax), but it is typically a file path string. See also section 14.3.

(syntax-line stx) returns the line number (positive exact integer) for the start of the syntax object in its source, or #f if the line number or source is unknown. The result is #f if and only if (syntax-column stx) produces #f. See also section 11.2.1.1 and section 14.3.

(syntax-column stx) returns the column number (non-negative exact integer) for the start of the syntax object in its source, or #f if the source column is unknown. The result is #f if and only if (syntax-line stx) produces #f. See also section 11.2.1.1 and section 14.3.

(syntax-position stx) returns the character position (positive exact integer) for the start of the syntax object in its source, or #f if the source position is unknown. See also section 11.2.1.1 and section 14.3.

(syntax-span stx) returns the span (non-negative exact integer) in characters of the syntax object in its source, or #f if the span is unknown. See also section 14.3.

(syntax-original? stx) returns #t if stx has the property that read-syntax and read-honu-syntax attach to the syntax objects that they generate (see section 12.6.2), and if stx's lexical information does not indicate that the object was introduced by a syntax transformer (see section 12.3). The result is #f otherwise. This predicate can be used to distinguish syntax objects in an expanded expression that were directly present in the original expression, as opposed to syntax objects inserted by macros.
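The location accessors above can be exercised on a freshly read syntax object. This is a sketch under the assumption that line counting is enabled on the port with port-count-lines! (without it, line and column information is unavailable); the source name "demo" and the variable names are illustrative.

(define in (open-input-string "(a b)"))
(port-count-lines! in)            ; enable line and column tracking
(define stx (read-syntax "demo" in))

(syntax-source stx)    ; => "demo"
(syntax-line stx)      ; => 1
(syntax-column stx)    ; => 0
(syntax-position stx)  ; => 1
(syntax-span stx)      ; => 5
(syntax-original? stx) ; => #t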

(syntax-source-module stx) returns a module path index or symbol (see section 5.4.2) for the module whose source contains stx, or #f if stx has no source module.

(syntax-e stx) unwraps the immediate S-expression structure from a syntax object, leaving nested syntax structure (if any) in place. The result of (syntax-e stx) is one of the following:

A syntax pair is a pair containing a syntax object as its first element, and either the empty list, a syntax pair, or a syntax object as its second element.

A syntax object that is the result of read-syntax reflects the use of dots (.) in the input by creating a syntax object for every pair of parentheses in the source, and by creating a pair-valued syntax object only for parentheses in the source. For example:

input        read-syntax result
(a b)        stx, where
               (syntax-e stx) is equivalent to (list a-stx b-stx)
               and (syntax-e a-stx) is equivalent to 'a
               and (syntax-e b-stx) is equivalent to 'b
(a . (b))    stx, where
               (syntax-e stx) is equivalent to (cons a-stx sb-stx)
               and (syntax-e a-stx) is equivalent to 'a
               and (syntax-e sb-stx) is equivalent to (list b-stx)
               and (syntax-e b-stx) is equivalent to 'b

(syntax->list stx) returns an immutable list of syntax objects or #f. The result is a list of syntax objects when (syntax-object->datum stx) would produce a list. In other words, syntax pairs in (syntax-e stx) are flattened.

(syntax-object->datum stx) returns an S-expression by stripping the syntactic information from stx. Graph structure is preserved by the conversion.
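A short sketch contrasting full stripping with one-level access (the variable names stx and elems are illustrative):

(define stx (syntax (a (b c))))

;; Strip everything, including nested structure:
(syntax-object->datum stx)          ; => (a (b c))

;; Flatten only the outer level; the elements remain syntax objects:
(define elems (syntax->list stx))
(length elems)                      ; => 2
(syntax? (cadr elems))              ; => #t
(syntax-e (car elems))              ; => a
(syntax-object->datum (cadr elems)) ; => (b c)

;; A non-list syntax object has no list form:
(syntax->list (syntax a))           ; => #f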

(datum->syntax-object ctxt-stx v [src-stx-or-list prop-stx cert-stx]) converts the S-expression v to a syntax object, using syntax objects already in v in the result. Converted objects in v are given the lexical context information of ctxt-stx and the source-location information of src-stx-or-list. If v is not already a syntax object, then the resulting immediate syntax object is given the properties (see section 12.6.2) of prop-stx and the inactive certificates (see section 12.6.3) of cert-stx. Any of ctxt-stx, src-stx-or-list, prop-stx, or cert-stx can be #f, in which case the resulting syntax has no lexical context, source information, new properties, and/or certificates.

If src-stx-or-list is not #f or a syntax object, it must be a list of five elements:

  (list source-name-v line-k column-k position-k span-k)

where source-name-v is an arbitrary value for the source name; line-k is a positive, exact integer for the source line, or #f; column-k is a non-negative, exact integer for the source column, or #f; position-k is a positive, exact integer for the source position, or #f; and span-k is a non-negative, exact integer for the source span, or #f. The line-k and column-k values must both be numbers or both be #f, otherwise the exn:fail exception is raised.
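For example, the following sketch attaches a hand-written location to a datum. The source name "made-up.ss" and the numeric values are invented for illustration, and ctx is simply a convenient syntax object from which to take lexical context.

(define ctx (quote-syntax here))
(define stx
  (datum->syntax-object ctx
                        '(+ 1 2)
                        ;; source-name-v line-k column-k position-k span-k
                        (list "made-up.ss" 3 4 20 7)))

(syntax-source stx)        ; => "made-up.ss"
(syntax-line stx)          ; => 3
(syntax-column stx)        ; => 4
(syntax-position stx)      ; => 20
(syntax-span stx)          ; => 7
(syntax-object->datum stx) ; => (+ 1 2)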

Graph structure is preserved by the conversion of v to a syntax object, but graph structure that is distributed among distinct syntax objects in v may be hidden from future applications of syntax-object->datum and syntax-graph? to the new syntax object.

(syntax-graph? stx) returns #t if stx might be preservably shared within a syntax object created by read-syntax, read-honu-syntax, or datum->syntax-object. In general, sharing detection is approximate -- datum->syntax-object can construct syntax objects with sharing that is hidden from syntax-graph? -- but syntax-graph? reliably returns #t for at least one syntax object in a cyclic structure. Meanwhile, deconstructing a syntax object with procedures such as syntax-e and comparing the results with eq? can also fail to detect sharing (even cycles), due to the way lexical information is lazily propagated; only syntax-object->datum reliably exposes sharing in a way that can be detected with eq?.

(identifier? v) returns #t if v is a syntax object and (syntax-e v) produces a symbol.

(generate-temporaries stx-pair) returns a list of identifiers that are distinct from all other identifiers. The list contains as many identifiers as stx-pair contains elements. The stx-pair argument must be a syntax pair that can be flattened into a list. The elements of stx-pair can be anything, but string, symbol, and identifier elements will be embedded in the corresponding generated name (useful for debugging purposes). The generated identifiers are built with interned symbols (not gensyms), so the limitations described in section 14.3 do not apply.
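A typical use of generate-temporaries is to create one fresh identifier per input form. The sketch below defines a hypothetical parallel-assignment macro, pset!, that evaluates all right-hand sides before performing any assignment; the macro and variable names are illustrative.

(define-syntax (pset! stx)
  (syntax-case stx ()
    [(_ (id val) ...)
     ;; one temporary per id, bound as the pattern variables tmp ...
     (with-syntax ([(tmp ...) (generate-temporaries (syntax (id ...)))])
       (syntax (let ([tmp val] ...)
                 (set! id tmp) ...)))]))

(define a 1)
(define b 2)
(pset! (a b) (b a)) ; both right-hand sides are evaluated first
(list a b) ; => (2 1)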

12.3  Syntax and Lexical Scope

Hygienic macro expansion depends on information associated with each syntax object that records the lexical context of the site where the syntax object is introduced. This information includes the identifiers that are bound by lambda, let, letrec, etc., at the syntax object's introduction site, the required identifiers at the introduction site, and the macro expansion that introduces the object.

Based on this information, a particular identifier syntax object falls into one of three classifications: lexical, module-imported, or free.

The identifier-binding procedure (described in section 12.3.2) reports an identifier's classification. Further information about a lexical identifier is available only in relative terms, such as whether two identifiers refer to the same binding (see bound-identifier=? in section 12.3.1). For module-imported identifiers, information about the module source is available.

In a freshly read syntax object, identifiers have no lexical information, so they are all classified as free. During expansion, some identifiers acquire lexical or module-import classifications. An identifier that becomes classified as lexical will remain so classified, though its binding might shift as expansion proceeds (i.e., as nested binding expressions are parsed, and as macro introductions are tracked). An identifier classified as module-imported might similarly shift to the lexical classification, but if it remains module-imported, its source-module designation will never change.

Lexical information is used to expand and parse syntax in a way that obeys lexical and module scopes. In addition, an identifier's lexical information encompasses a second dimension, which distinguishes the environment of normal expressions from the environment of transformer expressions. The module bindings of each environment can be different, so an identifier may be classified differently depending on whether it is ultimately used in a normal expression or in a transformer expression. See section 12.3.3 and section 12.3.4 for more information on the two environments.

12.3.1  Syntax Object Comparisons

(bound-identifier=? a-id-stx b-id-stx) returns #t if the identifier a-id-stx would bind b-id-stx (or vice-versa) if the identifiers were substituted in a suitable expression context, #f otherwise.

(free-identifier=? a-id-stx b-id-stx) returns #t if a-id-stx and b-id-stx access the same lexical, module, or top-level binding and return the same result for syntax-e, #f otherwise.

(module-identifier=? a-id-stx b-id-stx) returns #t if a-id-stx and b-id-stx access the same lexical, module, or top-level binding in the normal environment. ``Same module binding'' means that the identifiers refer to the same original definition site, not necessarily the require or provide site. Due to renaming in require and provide, the identifiers may return distinct results with syntax-e.

(module-transformer-identifier=? a-id-stx b-id-stx) returns #t if a-id-stx and b-id-stx access the same lexical, module, or top-level binding in the identifiers' transformer environments (see section 12.3.3).

(module-template-identifier=? a-id-stx b-id-stx) returns #t if a-id-stx and b-id-stx access the same lexical or module binding in the identifiers' template environments (see section 12.3.4).

(check-duplicate-identifier id-stx-list) compares each identifier in id-stx-list with every other identifier in the list with bound-identifier=?. If any comparison returns #t, one of the duplicate identifiers is returned (the first one in id-stx-list that is a duplicate), otherwise the result is #f.
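The following sketch shows two of these procedures inside transformers. Both macros (my-let and same-binding?) are hypothetical and exist only to illustrate the comparisons.

;; check-duplicate-identifier (bound-identifier=?) guards a binding form:
(define-syntax (my-let stx)
  (syntax-case stx ()
    [(_ ([id val] ...) body)
     (let ([dup (check-duplicate-identifier
                 (syntax->list (syntax (id ...))))])
       (when dup
         (raise-syntax-error #f "duplicate binding" stx dup))
       (syntax ((lambda (id ...) body) val ...)))]))

(my-let ([x 1] [y 2]) (+ x y)) ; => 3
(my-let ([x 1] [x 2]) x)       ; syntax error: duplicate binding

;; module-identifier=? asks whether two identifiers access the same binding:
(define-syntax (same-binding? stx)
  (syntax-case stx ()
    [(_ a b)
     (if (module-identifier=? (syntax a) (syntax b))
         (syntax #t)
         (syntax #f))]))

(same-binding? car car)                   ; => #t
(same-binding? car cdr)                   ; => #f
(let ([kar car]) (same-binding? kar car)) ; => #f, kar is a distinct lexical binding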

12.3.2  Syntax Object Bindings

(identifier-binding id-stx) returns one of three kinds of values, depending on the binding of id-stx in its normal environment:

(identifier-transformer-binding id-stx) is like identifier-binding, except that the reported information is for the identifier's bindings in the transformer environment (see section 12.3.3), instead of the normal environment. If the result is 'lexical for either of identifier-binding or identifier-transformer-binding, then the result is always 'lexical for both.

(identifier-template-binding id-stx) is like identifier-binding, except that the reported information is for the identifier's bindings in the template environment (see section 12.3.4), instead of the normal environment. If the result is 'lexical for either of identifier-binding or identifier-template-binding, then the result is always 'lexical for both.

(identifier-binding-export-position id-stx) returns either #f or an exact non-negative integer. It returns an integer only when identifier-binding returns a list, when id-stx represents an imported binding, and when the source module assigns internal positions for its definitions. This function is intended for use by mzc.

(identifier-transformer-binding-export-position id-stx) is like identifier-binding-export-position, except that the reported information is for the transformer environment. This function is intended for use by mzc.

12.3.3  Transformer Environments

The top-level environment for transformer expressions is separate from the normal top-level environment. Consequently, top-level definitions are not available for use in top-level transformer definitions. For example, the following program does not work:

(define count 0)
(define (inc!) (set! count (add1 count)))
(define-syntax (let1 stx)
  (syntax-case stx ()
    [(_ x v b)
     (begin
       (printf "expanding ~a~n" count) ; DOESN'T WORK
       (inc!)                          ; ALSO DOESN'T WORK
       (syntax (let ([x v]) b)))]))
(let1 x 2 (add1 x))

The variables count and inc! are bound in the normal top-level environment, but they are not bound in the transformer environment, so the attempt to expand (let1 x 2 (add1 x)) will result in an undefined-variable error.

In the same way that define binds only in the normal environment, a require expression imports only into the normal environment, and the imported bindings are not made visible in the transformer environment. A top-level require-for-syntax imports into the transformer environment without affecting the normal environment. Furthermore, the require and require-for-syntax forms create separate instantiations of any module that is imported into both environments, in keeping with the separation of the environments.

The initial namespace created by the stand-alone MzScheme application imports all of MzScheme's built-in syntax, procedures, and constants into the transformer environment. To extend this environment, use define-for-syntax, begin-for-syntax, or require-for-syntax.

In particular, the example above can be repaired by replacing

(define count 0)
(define (inc!) (set! count (add1 count)))

with either

(define-for-syntax count 0)
(define-for-syntax (inc!) (set! count (add1 count)))

or

(begin-for-syntax
 (define count 0)
 (define (inc!) (set! count (add1 count))))

or

(module counter mzscheme
  (define count 0)
  (define (inc!) (set! count (add1 count)))
  (provide count inc!))
(require-for-syntax counter)

When an identifier binding is introduced by a form other than module or a top-level definition, it extends the environment for both normal and transformer expressions within its scope, but the binding is only accessible by expressions resolved in the proper environment (i.e., the one in which it was introduced). In particular, a transformer expression in a let-syntax or letrec-syntax expression cannot access identifiers bound by enclosing forms, and an identifier bound in a transformer expression should not appear as an expression in the result of the transformer. Such out-of-context uses of an identifier are flagged as syntax errors when attempting to resolve the identifier.

A let-syntax or letrec-syntax expression can never usefully appear as a transformer expression, because MzScheme provides no mechanism for importing into the meta-transformer environment that would be used by meta-transformer expressions to operate on transformer expressions. In other words, an expression of the form

(let-syntax ([identifier (let-syntax ([identifier expr]) 
                                 body-expr)]) 
  ...)

is always illegal, assuming that let-syntax is bound in both the normal and transformer environments to the let-syntax of mzscheme. No syntax (not even function application) is bound in expr's environment. This restriction in the mzscheme language is of little consequence, however, since for-syntax exports allow the definition of syntax applicable to the above body-expr.

12.3.4  Module Environments

In the same way that the normal and transformer environments are kept separate at the top level, a module's normal and transformer environments are also separated. Normal imports and definitions in a module -- both variable and syntax -- contribute only to the module's normal environment.

For example, the module expression

(module m mzscheme 
  (define (id x) x)
  (define-syntax (macro stx)
    (id (syntax (printf "hi~n")))))

is ill-formed because id is not bound in the transformer environment for the macro implementation. To make id usable from the transformer, the body of the module m would have to be executed -- which is impossible in general, because a syntax definition such as macro affects the expansion of the rest of the module body.

Consequently, if a procedure such as id is to be used in a transformer, it must either remain local to the transformer expression, or reside in a different module. For example, the above module is trivially repaired as

(module m mzscheme 
  (define-syntax macro
    (let ([id (lambda (x) x)])
      (lambda (stx)
        (id (syntax (printf "hi~n")))))))

The define-for-syntax, begin-for-syntax, and define-syntaxes forms (see section 12.3.3 and section 12.4) are useful for defining multiple macros that share helper functions.
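
A minimal sketch of the define-syntaxes approach follows. The module name m2 and the macro names say-hi and say-bye are hypothetical. Both transformers close over the same local helper id, so no binding from the module's normal environment is needed:

(module m2 mzscheme
  (define-syntaxes (say-hi say-bye)
    (let ([id (lambda (x) x)])   ; helper shared by both transformers
      (values
       (lambda (stx) (id (syntax (printf "hi~n"))))
       (lambda (stx) (id (syntax (printf "bye~n")))))))
  (say-hi)
  (say-bye))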

In the mzscheme language, the base environment for a transformer expression includes all of MzScheme. The mzscheme language also provides a require-for-syntax form (in the normal environment) for importing bindings from another module into the importing module's transformer environment:

(require-for-syntax require-spec ···)

A for-syntax import of M within N causes M to be executed at N's expansion time, instead of (or possibly in addition to) run time for N. The syntax and variable identifiers exported by the for-syntax module are visible within the module's transformer environment, but not its normal environment. Like a normal expression, a transformer expression in a module cannot contain free variables.
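
As a sketch, the ill-formed module m shown earlier in this section could also be repaired by moving id into a separate module and importing it for syntax. The module name helper is an assumption made for illustration:

(module helper mzscheme
  (define (id x) x)
  (provide id))

(module m mzscheme
  ; id becomes visible only in m's transformer environment
  (require-for-syntax helper)
  (define-syntax (macro stx)
    (id (syntax (printf "hi~n")))))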

Finally, mzscheme provides the require-for-template form, which is roughly dual to require-for-syntax:

(require-for-template require-spec ···)

A for-template import of M within N causes the referenced module to be executed at the run-time of any P that includes a for-syntax import of N. In other words, require-for-template introduces bindings that become available in a future run time.

Transformer expressions and imports for a module M are executed once each time a module is expanded using M's syntax bindings or using M as a for-syntax import. After the module is expanded, its transformer environment is destroyed, including bindings from modules used at expansion time.

Example:

 (module rt mzscheme
   (printf "RT here~n")
   (define mx (lambda () 7))
   (provide mx))

 (module tt mzscheme
   (printf "RT here, too~n")
   (define x 700)
   (provide x))

 (module et mzscheme
   (require-for-template tt)
   (printf "ET here~n")
   ;; The x below is future-time:
   (define mx (lambda () (syntax x)))
   (provide mx))

 (module m mzscheme
   (require-for-syntax mzscheme)
   (require rt)               ; rt provides run-time mx
   (require-for-syntax et)    ; et provides exp-time mx

   ;; The mx below is run-time:
   (printf "~a~n" (mx))       ; prints 7 when run

   ;; The mx below is exp-time:
   (define-syntax onem (lambda (stx) (mx)))
   (printf "~a~n" (onem))    ; prints 700 when run

   ;; The mx below is run-time:
   (define-syntax twom (lambda (stx) (syntax (mx))))
   (printf "~a~n" (twom)))    ; prints 7 when run

 ;; "ET here" is printed during the expansion of m

 (require m) ; prints "ET here" (for later macro expansion in the top level, if any)
             ; and "RT here, too" and "RT here" in some order,
             ; then 7, then 700, then 7

This expansion-time execution model explains the need to execute declared modules only when they are invoked. If a declared module is imported into other modules only for syntax, then the module is needed only at expansion time and can be ignored at run time. The separation of declaration and execution also allows a for-syntax module to be executed once for each module that it expands through require-for-syntax.

The hierarchy of run times avoids confusion among expansion and execution layers that can prevent separate compilation. By ensuring that the layers are separate, a compiler or programming environment can expand, partially expand, or re-expand a module without affecting the module's run-time behavior, whether the module is currently executing or not.

Since transformer expressions may themselves use macros defined by modules with for-syntax imports (to implement the macros), expansion of a module creates a hierarchy of run times (or "tower of expanders"). The expansion time of each layer corresponds to the run time of the next deeper layer.

In the absence of let-syntax and letrec-syntax, the hierarchy of run times would be limited to three levels, since the transformer expressions for run-time imports would have been expanded before the importing module must be expanded. The let-syntax and letrec-syntax forms, however, allow syntax visible in a for-syntax import's transformers to appear in the expansion of transformer expressions in the module. Consequently, the hierarchy is bounded in principle only by the number of declared modules. In practice, the hierarchy will rarely exceed a few levels.

12.3.5  Macro-Generated Top-Level and Module Definitions

When a top-level definition binds an identifier that originates from a macro expansion, the definition captures only uses of the identifier that are generated by the same expansion. This behavior is consistent with internal definitions (see section 2.8.5), where the defined identifier turns into a fresh lexical binding.

Example:

(define-syntax def-and-use-of-x
  (syntax-rules ()
    [(def-and-use-of-x val)
     ; x below originates from this macro:
     (begin (define x val) x)]))
(define x 1)
x ; => 1
(def-and-use-of-x 2) ; => 2
x ; => 1

(define-syntax def-and-use
  (syntax-rules ()
    [(def-and-use x val)
     ; x below was provided by the macro use:
     (begin (define x val) x)]))
(def-and-use x 3) ; => 3
x ; => 3

For a top-level definition (outside of module), the order of evaluation affects the binding of a generated definition for a generated identifier use. If the use precedes the definition, then the use refers to a non-generated binding, just as if the generated definition were not present. (No such dependency on order occurs within a module, since a module binding covers the entire module body.) To support the declaration of an identifier before its use, the define-syntaxes form avoids binding an identifier if the body of the define-syntaxes declaration produces zero results (see also section 12.4).

Example:

(define bucket-1 0)
(define bucket-2 0)
(define-syntax def-and-set!-use-of-x
  (syntax-rules ()
    [(def-and-set!-use-of-x val)
     (begin (set! bucket-1 x) (define x val) (set! bucket-2 x))]))
(define x 1)
(def-and-set!-use-of-x 2)
x ; => 1
bucket-1 ; => 1
bucket-2 ; => 2

(define-syntax defs-and-uses/fail
  (syntax-rules ()
    [(defs-and-uses/fail)
     (begin
      ; Initial reference to even precedes definition:
      (define (odd x) (if (zero? x) #f (even (sub1 x))))
      (define (even x) (if (zero? x) #t (odd (sub1 x))))
      (odd 17))]))
(defs-and-uses/fail) ; => error: undefined identifier even
     
(define-syntax defs-and-uses
  (syntax-rules ()
    [(defs-and-uses)
     (begin
      ; Declare before definition via no-values define-syntaxes:
      (define-syntaxes (odd even) (values))
      (define (odd x) (if (zero? x) #f (even (sub1 x))))
      (define (even x) (if (zero? x) #t (odd (sub1 x))))
      (odd 17))]))
(defs-and-uses) ; => #t

Within a module, macro-generated require and provide clauses also introduce and reference generation-specific bindings.

12.4  Binding Multiple Syntax Identifiers

In addition to define-syntax, let-syntax, and letrec-syntax, MzScheme provides define-syntaxes, let-syntaxes, and letrec-syntaxes. These forms are analogous to define-values, let-values, and letrec-values, allowing multiple syntax bindings at once (see section 2.8).

(define-syntaxes (identifier ···) expr)

(let-syntaxes (((identifier ···) expr)
               ···)
   expr ···1)

(letrec-syntaxes (((identifier ···) expr)
                  ···)
   expr ···1)

At the top level, define-syntaxes accepts zero results for any number of identifiers, and in that case, it neither binds the identifiers nor signals an error. This behavior is useful for identifiers that are introduced by a macro that produces top-level defines. See section 12.3.5 for more information.

MzScheme also provides a letrec-syntaxes+values form for binding both values and syntax in a single, mutually recursive scope:

(letrec-syntaxes+values (((identifier ···) expr) ···)
                        (((identifier ···) expr) ···)
   expr ···1)

The first set of bindings are syntax bindings (as in letrec-syntaxes), and the second set of bindings are normal variable bindings (as in letrec-values).

Examples:

;; Defines let/cc and let-current-continuation as the same macro:
(define-syntaxes (let/cc let-current-continuation)
  (let ([macro (syntax-rules ()
                 [(_ id body1 body ...) 
                  (call/cc (lambda (id) body1 body ...))])])
    (values macro macro)))

(letrec-syntaxes+values ([(get-id) (syntax-rules ()
                                    [(_) id])])
                        ([(id) (lambda (x) x)]
                         [(x) (get-id)])
   x) ; => the id identity procedure
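
The let-syntaxes form can be sketched in the same spirit. The macro names first-of and second-of are hypothetical; a single transformer expression returns two transformers through values:

(let-syntaxes ([(first-of second-of)
                (values (syntax-rules () [(_ a b) a])
                        (syntax-rules () [(_ a b) b]))])
  (list (first-of 1 2) (second-of 1 2))) ; => (1 2)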

12.5  Special Syntax Identifiers

To enable the definition of syntax transformers for application forms and other data (numbers, vectors, etc.), the syntax expander treats #%app, #%top, and #%datum as special identifiers.

Any expandable expression of the form

(datum . datum)

where the first datum is not an identifier bound to an expansion-time value, is treated as

(#%app datum . datum)

so that the syntax transformer bound to #%app is applied. In addition, () is treated as (#%app). Similarly, an expression

identifier

where identifier has no binding other than a top-level binding, is treated as

(#%top . identifier)

Finally, an expression

datum

where datum is not an identifier or pair, is treated as

(#%datum . datum)

The mzscheme module provides #%app, #%top, and #%datum as regular application, top-level variable reference, and implicit quote, respectively. A module can export different transformers with these names to support languages different from conventional Scheme.

In the case of read-eval-print-loop or the default load handler, every input datum is wrapped with #%top-interaction:

(#%top-interaction . datum)

The mzscheme module provides #%top-interaction as a macro that expands to just the datum.

Within module, #%module-begin is used as a transformer for the module body. A #%module-begin is implicitly added around a module body when it contains multiple S-expressions, or when the S-expression expands to a core form other than #%module-begin or #%plain-module-begin; the lexical context for the introduced #%module-begin identifier includes only the exports of the module's initial import. After such wrapping, if any, and before any expansion, an 'enclosing-module-name property is attached to the module-body syntax object; the property's value is a symbol for the module name as specified after the module keyword.

The mzscheme module binds #%module-begin to a form that inserts a for-syntax import of mzscheme, so that mzscheme bindings can be used in syntax definitions. It also exports #%plain-module-begin, which can be substituted for #%module-begin to avoid the for-syntax import of mzscheme. Any other transformer used for #%module-begin must expand to mzscheme's #%module-begin or #%plain-module-begin.

When an expression is fully expanded, all applications, top-level variable references, and literal datum expressions will appear as explicit #%app, #%top, and #%datum forms, respectively. Those forms can also be used directly by source code. The #%module-begin form can never usefully appear in an expression, and the body of a fully expanded module declaration is not wrapped with #%module-begin; instead, it is wrapped with #%plain-module-begin.
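
For instance, assuming the mzscheme bindings for #%app and #%datum, the second expression below writes the application and the literals explicitly and should evaluate in the same way as the first. It is a sketch of direct use, not necessarily the exact output of the expander:

(+ 1 2)
(#%app + (#%datum . 1) (#%datum . 2))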

The following example shows how the special syntax identifiers can be defined to create a non-Scheme module language:

(module lambda-calculus mzscheme 
  
  ; Restrict lambda to one argument: 
  (define-syntax lc-lambda 
    (syntax-rules () 
      [(_ (x) E) (lambda (x) E)])) 
  
  ; Restrict application to two expressions:
  (define-syntax lc-app 
    (syntax-rules () 
      [(_ E1 E2) (E1 E2)])) 
  
  ; Restrict a lambda calculus module to one body expression: 
  (define-syntax lc-module-begin  
    (syntax-rules () 
      [(_ E) (#%module-begin E)])) 
  
  ; Disallow numbers, vectors, etc. 
  (define-syntax lc-datum 
    (syntax-rules ())) 
  
  ; Provide (with renaming): 
  (provide #%top ; keep mzscheme's free-variable error 
           (rename lc-lambda lambda) 
           (rename lc-app #%app) 
           (rename lc-module-begin #%module-begin) 
           (rename lc-datum #%datum))) 
  
(module m lambda-calculus 
  ; The only syntax defined by lambda-calculus is 
  ; unary lambda, unary application, and variables. 
  ; Also, the module must contain exactly one expression. 
  ((lambda (y) (y y)) 
   (lambda (y) (y y)))) 
  
(require m)     ; executes m, loops forever

12.6  Macro Expansion

A define-syntax, let-syntax, or letrec-syntax form associates an identifier to an expansion-time value. If the expansion-time value is a procedure of one argument, then the procedure is applied by the syntax expander when the identifier is used in the scope of the syntax binding.

The transformer for an identifier is applied whenever the identifier appears in an expression position -- not just when it appears after a parenthesis as (identifier ...). When it does appear as (identifier ...), the entire (identifier ...) expression is provided as the argument to the transformer. Otherwise, only identifier is provided to the transformer.

A typical transformer is implemented as

(lambda (stx) 
  (syntax-case stx ()
    [(_ rest-of-pattern) expr]))

so that identifier by itself does not match the pattern; thus, the exn:fail:syntax exception is raised when identifier does not appear as (identifier ...).
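
A transformer that should also accept the bare identifier can add a second clause guarded by identifier?. The sketch below uses a hypothetical macro named five to show both uses:

(define-syntax (five stx)
  (syntax-case stx ()
    ; used in application position: (five arg ...)
    [(_ arg ...) (syntax (+ 5 arg ...))]
    ; used as a bare identifier in an expression position
    [id (identifier? (syntax id)) (syntax 5)]))

five       ; => 5
(five 1 2) ; => 8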

(make-set!-transformer proc) also creates a transformer procedure. The proc argument must be a procedure of one argument; if the result of (make-set!-transformer proc) is bound as syntax to identifier, then proc is applied as a transformer when identifier is used in an expression position, or when it is used as the target of a set! assignment: (set! identifier expr). When the identifier appears as a set! target, the entire set! expression is provided to the transformer.

Example:

(let ([x 1]
      [y 2])
  (let-syntax ([x (make-set!-transformer
                    (lambda (stx)
                     (syntax-case stx (set!)
                       ; Redirect mutation of x to y
                       [(set! id v) (syntax (set! y v))]
                       ; Normal use of x really gets x
                       [id (identifier? (syntax id)) (syntax x)])))])
    (