Syntax and Macros

MzScheme supports the R5RS define-syntax, let-syntax, and letrec-syntax forms with syntax-rules, with minor pattern and template extensions described in section 12.1.

In addition to syntax-rules, MzScheme supports macros that perform arbitrary transformations on syntax. In particular, a transformer expression -- the right-hand side of a define-syntax, let-syntax, or letrec-syntax binding -- can be an arbitrary expression, and it is evaluated in a transformer environment. When the expression produces a procedure, it is associated as a syntax transformer to the identifier bound by define-syntax, let-syntax, or letrec-syntax. This more general, mostly hygienic macro system is based on syntax-case by Dybvig, Hieb, and Bruggeman (see ``Syntactic abstraction in Scheme'' in Lisp and Symbolic Computation, December 1993).

A transformer procedure consumes a syntax object and produces a new syntax object. A syntax object encodes S-expression structure, but also includes source-location information and lexical-binding information for each element within the S-expression. A syntax object is a first-class value, and it can exist at run-time. However, syntax objects are more typically used at syntax-expansion time -- which is the run-time of a transformer procedure.

Unlike traditional defmacro systems, MzScheme keeps the top-level transformer environment separate from the normal top-level environment. The environments are separated because the expressions in the different environments are evaluated at different times (transformer expressions are evaluated at syntax-expansion time, while normal expressions are evaluated at run time). Separating each environment ensures that compilation and analysis tools can process programs properly. See section 12.3.3 for more information.

Also unlike traditional macro systems, a transformer procedure is invoked whenever its identifier is used in an expression position, not in application positions only. Even more generally, a transformer expression might not produce a procedure value, in which case the non-procedure is associated to its identifier as a generic expansion-time value. For example, a unit signature (see Chapter 52 in PLT MzLib: Libraries Manual) is associated to an identifier through an expansion-time value. See section 12.6 for more information on transformer applications and expansion-time values.

12.1  syntax-rules Extensions

MzScheme extends the pattern language for syntax-rules so that a pattern of the form

(... pattern)

is equivalent to pattern where ... is treated like any other identifier. Similarly, a template of the form

(... template)

is equivalent to template where ... is treated like any other identifier.

In a pattern, additional patterns can follow ..., but only one ... can appear in a sequence:

(pattern ···1 ... pattern ···)

Furthermore, a sequence containing ... can end with a dotted pair:

(pattern ···1 ... pattern ··· . pattern)

but in this case, the final pattern is never matched to a syntactic list.

A template element consists of any number of ...s after a template. For each ... after the first one, the preceding element (with earlier ...s) is conceptually wrapped with parentheses for generating output, and then wrapping parentheses in the output are removed. If a pattern identifier is followed by more ellipses in a template than in the pattern, then the pattern's match is expanded normally for inner ellipses (up to the number of ellipses that appear in the pattern), and then the match is replicated as necessary to satisfy outer ellipses.

To mesh gracefully with modules, literal identifiers are compared with module-identifier=?, which is equivalent to the comparison behavior of R5RS in the absence of modules; see section 12.3.1 for more information on identifier syntax comparisons.

Examples:

(define-syntax ex1
  (syntax-rules ()
   [(ex1 a) '(a (... ...))]))
(ex1 1) ; => '(1 ...)

(define-syntax ex2
  (syntax-rules ()
   [(ex2 a ... b) '(b a ...)]))
(ex2 1 2 3) ; => '(3 1 2)

(define-syntax ex3
  (syntax-rules ()
   [(ex3 a ... b . c) '(b a ... c)]))
(ex3 1 2 3 4) ; syntax error
(ex3 1 2 3 . 4) ; => '(3 1 2 4)

(define-syntax ex4
  (syntax-rules ()
   [(ex4 (a ...) ... b) '(a ... ... b)]))
(ex4 (1) (2 3) 4) ; => '(1 2 3 4)

The syntax-id-rules form has the same syntax as syntax-rules, except that each pattern is used in its entirety (instead of starting with a keyword placeholder that is ignored). Furthermore, when an identifier id is bound as syntax to a syntax-id-rules transformer, the transformer is applied whenever id appears in an expression position -- not just when it is in the application position -- or when id appears as the target of an assignment. When the identifier appears in an application position, (id expr ···), the entire ``application'' is provided to the transformer, and when the identifier appears as a set! target, (set! id expr), the entire set! expression is provided to the transformer; otherwise, the id is provided alone to the transformer. Typically, set! is included as a keyword in a syntax-id-rules use, and three patterns match the three possible uses of the identifier.

(define-syntax pwd
  ; For this macro to work, the set! case must 
  ;  be first, and the pwd case must be last
  (syntax-id-rules (set!)
    [(set! pwd expr) (current-directory expr)]
    [(pwd expr ...) ((current-directory) expr ...)]
    [pwd (current-directory)]))

(set! pwd "/tmp") ; sets current-directory parameter
pwd ; => "/tmp"
(current-directory) ; => "/tmp"
(current-directory "/usr/tmp")
pwd ; => "/usr/tmp"

12.2  Syntax Objects

(read-syntax [source-name-v input-port]) is like read, except that it produces a syntax object with source-location information. The source-name-v is used as the source field of the syntax object; it can be an arbitrary value, but it should generally be a path for the source file. The default source-name-v is the input port's name (according to object-name; see section 6.2.3). See section 11.2.4 for more information about read and read-syntax, see section 11.2.1.1 for information about port locations, and see section 12.6.2 for information on the 'paren-shape property and original-indicator property attached to a syntax object by read-syntax.

The result of read-syntax is a syntax object with source-location information, but no lexical information. Syntax objects acquire lexical information during expansion, so that by the time a transformer is called, the provided syntax object has lexical information.

The eval, compile, expand, expand-once, and expand-to-top-form procedures work on syntax objects, especially syntax objects with no lexical context. (If one of these procedures is given a non-syntax S-expression, the S-expression is converted to a syntax object containing no source information and no lexical context.) Each of these procedures adds context to the syntax object using namespace-syntax-introduce before expanding the syntax (but see section 14.1 for information on the special handling of module). In contrast, the eval-syntax, compile-syntax, expand-syntax, expand-syntax-once, and expand-syntax-to-top-form procedures do not add context to a given syntax object before expanding.

The syntax object produced by expand, expand-syntax, etc. includes lexical information that influences future expansion and compilation of the syntax object. Thus, a syntax object produced by read-syntax should be passed to eval or expand (or another procedure without -syntax in its name), but a syntax object returned by expand should be passed to eval-syntax (or another procedure with -syntax in its name), since the result from expand has acquired a lexical context.
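
As a minimal sketch of this pairing, the following top-level interaction reads a syntax object from a string port and evaluates it both ways. The source name 'example and the input string are illustrative assumptions, not part of any real program.

(define stx
  (read-syntax 'example (open-input-string "(+ 1 2)")))
(eval stx)                       ; eval adds namespace context first; => 3
(define expanded (expand stx))   ; the expanded form has a lexical context
(eval-syntax expanded)           ; no further context is added; => 3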

For example, if the following text is parsed by read-syntax,

(lambda (x) (+ x y))

the result is a syntax object that contains the S-expression structure '(lambda (x) (+ x y)), but also source information indicating that the first x is in column 9, etc. If expand is applied to the syntax object with a normal top-level environment, then the result will be a similar syntax object (with the source-location information intact), but the second x in the syntax object will have lexical information that ties it to the first x, and y in the syntax object will be annotated as a free variable. Even the syntax object's 'lambda will have lexical information tying it to the built-in lambda form.

Compilation (often as a prelude to interactive evaluation) strips away source and context information as it processes a syntax object. The compilation of a quote-syntax form is an exception:

(quote-syntax datum)

The quote-syntax form produces a syntax object that preserves the source-location information for datum. It also encapsulates lexical-binding information accumulated by compilation in the quote-syntax expression's environment. A quote-syntax expression rarely appears in normal expressions; quote-syntax is more typically used within a transformer expression.
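
A small sketch of quote-syntax used in a normal expression (the quoted datum is arbitrary and chosen only for illustration):

(define stx (quote-syntax (lambda (x) x)))
(syntax? stx)                ; => #t
(syntax-object->datum stx)   ; => '(lambda (x) x)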

In addition to location and lexical information, a syntax object may have properties and certificates attached. Properties are added or inspected using syntax-property, as described in section 12.6.2. Certificates validate references to identifiers that are not exported from a macro, as described in section 12.6.3.

The syntax-object->datum procedure strips away location, lexical, property, and certificate information from a syntax object to produce a plain S-expression. The datum->syntax-object procedure wraps syntax information onto an S-expression, copying the source-location information of a given syntax object, the lexical information of another syntax object, and the properties of a third syntax object (where some or all three of the given objects can be the same). The syntax-e procedure unwraps only the immediate S-expression structure from a syntax object, leaving nested structure in place. These procedures are described in section 12.2.2.

Although procedures such as syntax-object->datum permit arbitrary manipulation of syntax objects, a syntax transformer is more likely to use the pattern-matching syntax-case and syntax forms, which are described in the following subsection.

12.2.1  Syntax Patterns

The syntax-case form pattern-matches and deconstructs a syntax object:

(syntax-case stx-expr (literal-identifier ···)
  syntax-clause 
  ···) 

syntax-clause is one of
  (pattern expr) 
  (pattern fender-expr expr)

If the stx-expr expression does not produce a syntax object value, its result is converted to one using datum->syntax-object with the lexical context of the expression (see section 12.2.2). The syntax is then compared to the pattern in each syntax-clause until a match is found, and the result of the corresponding expr is the result of the syntax-case expression. If a syntax-clause contains a fender-expr, the clause matches only when both the pattern matches the syntax object and the fender-expr returns a true value. If no pattern matches, a ``bad syntax'' exn:fail:syntax exception is raised.

A pattern is nearly the same as a syntax-rules pattern (see R5RS), with the ellipsis-escaping extension (see section 12.1). The difference is that the first identifier in pattern is not ignored, unlike the leading keyword in a syntax-rules pattern.

As in syntax-rules, a non-literal identifier in a pattern is bound to a corresponding part of the syntax object within the clause's expr and optional fender-expr. The identifier cannot be used directly, however; a use of the identifier in an expression position is a syntax error. Instead, the identifier can be used only in syntax expressions within the binding's scope.

A syntax expression has the form

(syntax template)

where template is as in syntax-rules (extended, as usual, for escaped ellipses). The result of a syntax expression is a syntax object. Identifiers in the template that are bound by a syntax-case pattern are replaced with their bindings in the generated syntax object. A syntax expression that contains no pattern identifiers is equivalent to a quote-syntax expression, except that unlike quote-syntax, the syntax form always fails to compile (i.e., it loops forever) when template is cyclic.

The syntax-rules form can be expressed as a syntax-case form wrapped in lambda:

 (syntax-rules (literal-identifier ···)
   ((ignored-identifier . pattern) template)
   ···)
=expands=>
 (lambda (stx)
   (syntax-case stx (literal-identifier ···)
     ((generated-identifier . pattern) (syntax template))
     ···))

Note that the implicit lambda of syntax-rules for the transformer procedure is made explicit with syntax-case. The define-syntax form supports define-style abbreviations for transformer procedures (see section 2.8.1).
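
As a concrete instance of this correspondence, the two definitions below should behave the same way. The names swap! and swap!-case are hypothetical and used only for illustration; the second definition uses the define-style abbreviation for the transformer procedure.

(define-syntax swap!
  (syntax-rules ()
    [(_ a b) (let ([tmp a]) (set! a b) (set! b tmp))]))

(define-syntax (swap!-case stx)
  (syntax-case stx ()
    [(_ a b) (syntax (let ([tmp a]) (set! a b) (set! b tmp)))]))

(define x 1)
(define y 2)
(swap!-case x y)
(list x y) ; => '(2 1)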

The following example shows one reason to use syntax-case instead of syntax-rules: custom error reporting.

(define-syntax (let1 stx)
  (syntax-case stx ()
    [(_ id val body)
     (begin
       ;; If id is not an identifier, report an error in terms of let1 instead of let:
       (unless (identifier? (syntax id))
         (raise-syntax-error #f "expected an identifier" stx (syntax id)))
       (syntax (let ([id val]) body)))]))
(let1 x 10 (add1 x)) ; => 11
(let1 2 10 (add1 x)) ; => let1: expected an identifier at: 2 in: (let1 2 10 (add1 x))
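
For comparison, here is a sketch of the same check written as a fender-expr (the name let1/fender is hypothetical). When the fender produces #f, the clause does not match, so the use falls through to the generic ``bad syntax'' error instead of a custom message.

(define-syntax (let1/fender stx)
  (syntax-case stx ()
    [(_ id val body)
     (identifier? (syntax id))        ; fender-expr: clause matches only for identifiers
     (syntax (let ([id val]) body))]))
(let1/fender x 10 (add1 x)) ; => 11
(let1/fender 2 10 (add1 x)) ; => bad syntax error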

Another reason to use syntax-case is to implement ``non-hygienic'' macros that introduce capturing identifiers:

(define-syntax (if-it stx)
  (syntax-case stx ()
    [(src-if-it test then else)
     (syntax-case (datum->syntax-object (syntax src-if-it) 'it) ()
       [it (syntax (let ([it test]) (if it then else)))])]))
(if-it (memq 'b '(a b c)) it 'nope) ; => '(b c)

The nested syntax-case is used to bind the pattern variable it. The syntax for it is generated with datum->syntax-object using the context of src-if-it, which means that the introduced variable has the same lexical context as if-it at the macro's use; in other words, it acts as if it existed in the input syntax, so it can bind uses of it in then and else.

The syntax-case* form is a generalization of syntax-case where the procedure for comparing literal-identifiers is determined by a comparison-proc-expr:

(syntax-case* stx-expr (literal-identifier ···) comparison-proc-expr
  syntax-clause 
  ···)

The result of comparison-proc-expr must be a procedure that accepts two arguments. The first argument is an identifier from stx-expr, and the second argument is an identifier from a syntax-clause pattern that is module-identifier=? to one of the literal-identifiers. A true result from the comparison procedure indicates that the first identifier matches the second.
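
The following sketch assumes a comparison that looks only at symbolic names and ignores bindings. Both symbolic-identifier=? and clause-kind are hypothetical names introduced for this illustration.

; Hypothetical comparison: identifiers match when their symbols are eq?
(define (symbolic-identifier=? a b)
  (eq? (syntax-e a) (syntax-e b)))

(define (clause-kind stx)
  (syntax-case* stx (else) symbolic-identifier=?
    [(else expr) 'else-clause]
    [(test expr) 'test-clause]))

(clause-kind (datum->syntax-object #f '(else 1)))      ; => 'else-clause
(clause-kind (datum->syntax-object #f '((zero? n) 1))) ; => 'test-clause

Because the comparison ignores lexical context, the else literal matches even in a syntax object that was built with no context at all.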

12.2.1.1  Binding Pattern Variables

The with-syntax form is a let-like form for binding pattern variables:

(with-syntax ((pattern stx-expr) 
              ···) 
  expr)

The patterns are matched against the stx-expr values, and all pattern identifiers are bound in expr. The pattern identifiers across all patterns must be distinct. If a stx-expr expression does not produce a syntax object, its result is converted using datum->syntax-object and the lexical context of the stx-expr (see section 12.2.2). If the result of a stx-expr does not match its pattern, the exn:fail:syntax exception is raised.

The if-it example can be written more simply using with-syntax:

(define-syntax (if-it stx)
  (syntax-case stx ()
    [(src-if-it test then else)
     (with-syntax ([it (datum->syntax-object (syntax src-if-it) 'it)])
       (syntax (let ([it test]) (if it then else))))]))

Macros that expand to non-hygienic macros rarely work as intended. For example:

(define-syntax (cond-it stx)
  (syntax-case stx ()
    [(_ (test body) . rest)
     (syntax (if-it test body (cond-it . rest)))]
    [(_) (syntax (void))]))
(cond-it [(memq 'b '(a b c)) it] [#t 'nope]) ; => undefined variable it

The problem is that cond-it introduces if-it (hygienically), so cond-it effectively introduces it (hygienically), which doesn't bind it in the source use of cond-it. In general, the solution is to avoid macros that expand to uses of non-hygienic macros.

12.2.1.2  Quasiquoting Templates

The quasisyntax form is like syntax, except with quasiquoting within the template:

(quasisyntax quasitemplate)

A quasitemplate is the same as a template, except that unsyntax and unsyntax-splicing escape to an expression:

(unsyntax expr)
(unsyntax-splicing expr)

The expression must produce a syntax object (or syntax list) to be substituted in place of the unsyntax or unsyntax-splicing form within the quasiquoting template, just like unquote and unquote-splicing within quasiquote. (If the escaped expression does not generate a syntax object, it is converted to one in the same way as for the right-hand sides of with-syntax.) Nested quasisyntaxes introduce quasiquoting layers in the same way as nested quasiquotes.

Also analogous to quote and quasiquote, the reader converts #' to syntax, #` to quasisyntax, #, to unsyntax, and #,@ to unsyntax-splicing. See also section 11.2.4.

Example:

(with-syntax ([(v ...) (list 1 2 3)])
  #`(0 v ... #,(+ 2 2) #,@(list 5 6) 7)) ; => syntax for (0 1 2 3 4 5 6 7)

12.2.1.3  Assigning Source Location

The syntax/loc form is like syntax, except that the immediate resulting syntax object takes its source-location information from a supplied syntax object, unless the template is just a pattern variable:

(syntax/loc location-stx-expr template)

Use syntax/loc instead of syntax whenever possible to help tools that report source locations. For example, the earlier if-it example should have been written with syntax/loc:

(define-syntax (if-it stx)
  (syntax-case stx ()
    [(src-if-it test then else)
     (with-syntax ([it (datum->syntax-object (syntax src-if-it) 'it)])
       (syntax/loc stx (let ([it test]) (if it then else))))]))

The quasisyntax/loc form is the quasiquoting analogue of syntax/loc:

(quasisyntax/loc location-stx-expr quasitemplate)

12.2.2  Syntax Object Content

(syntax? v) returns #t if v is a syntax object, #f otherwise.

(syntax-source stx) returns the source for the syntax object stx, or #f if none is known. The source is represented by an arbitrary value (e.g., one passed to read-syntax), but it is typically a file path string. See also section 14.3.

(syntax-line stx) returns the line number (positive exact integer) for the start of the syntax object in its source, or #f if the line number or source is unknown. The result is #f if and only if (syntax-column stx) produces #f. See also section 11.2.1.1 and section 14.3.

(syntax-column stx) returns the column number (non-negative exact integer) for the start of the syntax object in its source, or #f if the source column is unknown. The result is #f if and only if (syntax-line stx) produces #f. See also section 11.2.1.1 and section 14.3.

(syntax-position stx) returns the character position (positive exact integer) for the start of the syntax object in its source, or #f if the source position is unknown. See also section 11.2.1.1 and section 14.3.

(syntax-span stx) returns the span (non-negative exact integer) in characters of the syntax object in its source, or #f if the span is unknown. See also section 14.3.
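
A sketch of the location accessors applied to a freshly read syntax object. The source name 'example and the input text are assumptions for illustration, and the sketch assumes that line and column counting must be enabled on the port with port-count-lines! before reading (see section 11.2.1.1).

(define in (open-input-string "(lambda (x) (+ x y))"))
(port-count-lines! in)                 ; enable line and column tracking
(define stx (read-syntax 'example in))
(syntax-source stx)    ; the source-name-v given to read-syntax
(syntax-line stx)      ; line where the opening parenthesis starts
(syntax-column stx)    ; column where the opening parenthesis starts
(syntax-position stx)  ; character position of the opening parenthesis
(syntax-span stx)      ; number of characters covered by the whole form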

(syntax-original? stx) returns #t if stx has the property that read-syntax and read-honu-syntax attach to the syntax objects that they generate (see section 12.6.2), and if stx's lexical information does not indicate that the object was introduced by a syntax transformer (see section 12.3). The result is #f otherwise. This predicate can be used to distinguish syntax objects in an expanded expression that were directly present in the original expression, as opposed to syntax objects inserted by macros.

(syntax-source-module stx) returns a module path index or symbol (see section 12.6.5) for the module whose source contains stx, or #f if stx has no source module.

(syntax-e stx) unwraps the immediate S-expression structure from a syntax object, leaving nested syntax structure (if any) in place. The result of (syntax-e stx) is one of the following:

A syntax pair is a pair containing a syntax object as its first element, and either the empty list, a syntax pair, or a syntax object as its second element.

A syntax object that is the result of read-syntax reflects the use of dots (.) in the input by creating a syntax object for every pair of parentheses in the source, and by creating a pair-valued syntax object only for parentheses in the source. For example:

input        read-syntax result
(a b)        stx, where
               (syntax-e stx) is equivalent to (list a-stx b-stx)
               and (syntax-e a-stx) is equivalent to 'a
               and (syntax-e b-stx) is equivalent to 'b
(a . (b))    stx, where
               (syntax-e stx) is equivalent to (cons a-stx sb-stx)
               and (syntax-e a-stx) is equivalent to 'a
               and (syntax-e sb-stx) is equivalent to (list b-stx)
               and (syntax-e b-stx) is equivalent to 'b

(syntax->list stx) returns an immutable list of syntax objects or #f. The result is a list of syntax objects when (syntax-object->datum stx) would produce a list. In other words, syntax pairs in (syntax-e stx) are flattened.

(syntax-object->datum stx) returns an S-expression by stripping the syntactic information from stx. Graph structure is preserved by the conversion.
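
The difference between the three unwrapping procedures can be sketched as follows (the template is arbitrary):

(define stx (syntax (a (b c) d)))
(syntax-e stx)                                 ; immediate structure; elements are still syntax objects
(syntax->list stx)                             ; a list of three syntax objects
(map syntax-object->datum (syntax->list stx))  ; => '(a (b c) d)
(syntax-object->datum stx)                     ; => '(a (b c) d)
(syntax->list (syntax (a . b)))                ; => #f, since the datum is not a list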

(datum->syntax-object ctxt-stx v [src-stx-or-list prop-stx]) converts the S-expression v to a syntax object, using syntax objects already in v in the result. Converted objects in v are given the lexical context information of ctxt-stx and the source-location information of src-stx-or-list; if the resulting syntax object has no properties, then it is given the properties of prop-stx. Any of ctxt-stx, src-stx-or-list, or prop-stx can be #f, in which case the resulting syntax has no lexical context, source information, and/or new properties. If src-stx-or-list is not #f or a syntax object, it must be a list of five elements:

  (list source-name-v line-k column-k position-k span-k)

where source-name-v is an arbitrary value for the source name; line-k is a positive, exact integer for the source line, or #f; column-k is a non-negative, exact integer for the source column, or #f; position-k is a positive, exact integer for the source position, or #f; and span-k is a non-negative, exact integer for the source span, or #f. The line-k and column-k values must both be numbers or both be #f; otherwise, the exn:fail exception is raised. Graph structure is preserved by the conversion, but graph structure that is distributed among distinct syntax objects in v may be hidden from future applications of syntax-object->datum and syntax-graph? to the new syntax object.
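
As an illustration of the five-element list form, the sketch below builds a syntax object with no lexical context and a made-up source location. The file name "generated.ss" and all of the numbers are assumptions chosen for the example.

(define stx
  (datum->syntax-object #f '(+ 1 2) (list "generated.ss" 3 0 41 7)))
(syntax-source stx)         ; => "generated.ss"
(syntax-line stx)           ; => 3
(syntax-column stx)         ; => 0
(syntax-position stx)       ; => 41
(syntax-span stx)           ; => 7
(syntax-object->datum stx)  ; => '(+ 1 2)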

(syntax-graph? stx) returns #t if stx might be preservably shared within a syntax object created by read-syntax, read-honu-syntax, or datum->syntax-object. In general, sharing detection is approximate -- datum->syntax-object can construct syntax objects with sharing that is hidden from syntax-graph? -- but syntax-graph? reliably returns #t for at least one syntax object in a cyclic structure. Meanwhile, deconstructing a syntax object with procedures such as syntax-e and comparing the results with eq? can also fail to detect sharing (even cycles), due to the way lexical information is lazily propagated; only syntax-object->datum reliably exposes sharing in a way that can be detected with eq?.

(identifier? v) returns #t if v is a syntax object and (syntax-e v) produces a symbol.

(generate-temporaries stx-pair) returns a list of identifiers that are distinct from all other identifiers. The list contains as many identifiers as stx-pair contains elements. The stx-pair argument must be a syntax pair that can be flattened into a list. The elements of stx-pair can be anything, but string, symbol, and identifier elements will be embedded in the corresponding generated name (useful for debugging purposes). The generated identifiers are built with interned symbols (not gensyms), so the limitations described in section 14.3 do not apply.
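
A typical use of generate-temporaries is to create one fresh binding per sub-form of a macro use. The parallel-assignment macro below is a sketch under that assumption; the name pset! is hypothetical, and the macro expects at least one clause.

(define-syntax (pset! stx)
  (syntax-case stx ()
    [(_ [id expr] ...)
     ;; one temporary per target, so every expr is evaluated before any assignment
     (with-syntax ([(tmp ...) (generate-temporaries (syntax (id ...)))])
       (syntax (let ([tmp expr] ...)
                 (set! id tmp) ...)))]))

(define a 1)
(define b 2)
(pset! [a b] [b a])
(list a b) ; => '(2 1)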

12.3  Syntax and Lexical Scope

Hygienic macro expansion depends on information associated with each syntax object that records the lexical context of the site where the syntax object is introduced. This information includes the identifiers that are bound by lambda, let, letrec, etc., at the syntax object's introduction site, the required identifiers at the introduction site, and the macro expansion that introduces the object.

Based on this information, a particular identifier syntax object falls into one of three classifications: lexical, module-imported, or free.

The identifier-binding procedure (described in section 12.3.2) reports an identifier's classification. Further information about a lexical identifier is available only in relative terms, such as whether two identifiers refer to the same binding (see bound-identifier=? in section 12.3.1). For module-imported identifiers, information about the module source is available.

In a freshly read syntax object, identifiers have no lexical information, so they are all classified as free. During expansion, some identifiers acquire lexical or module-import classifications. An identifier that becomes classified as lexical will remain so classified, though its binding might shift as expansion proceeds (i.e., as nested binding expressions are parsed, and as macro introductions are tracked). An identifier classified as module-imported might similarly shift to the lexical classification, but if it remains module-imported, its source-module designation will never change.

Lexical information is used to expand and parse syntax in a way that it obeys lexical and module scopes. In addition, an identifier's lexical information encompasses a second dimension, which distinguishes the environment of normal expressions from the environment of transformer expressions. The module bindings of each environment can be different, so an identifier may be classified differently depending on whether it is ultimately used in a normal expression or in a transformer expression. See section 12.3.3 and section 12.3.4 for more information on the two environments.

12.3.1  Syntax Object Comparisons

(bound-identifier=? a-id-stx b-id-stx) returns #t if the identifier a-id-stx would bind b-id-stx (or vice-versa) if the identifiers were substituted in a suitable expression context, #f otherwise.

(free-identifier=? a-id-stx b-id-stx) returns #t if a-id-stx and b-id-stx access the same lexical, module, or top-level binding and return the same result for syntax-e, #f otherwise.

(module-identifier=? a-id-stx b-id-stx) returns #t if a-id-stx and b-id-stx access the same lexical, module, or top-level binding in the normal environment. ``Same module binding'' means that the identifiers refer to the same original definition site, not necessarily the require or provide site. Due to renaming in require and provide, the identifiers may return distinct results with syntax-e.

(module-transformer-identifier=? a-id-stx b-id-stx) returns #t if a-id-stx and b-id-stx access the same lexical, module, or top-level binding in the identifiers' transformer environments (see section 12.3.3).

(module-template-identifier=? a-id-stx b-id-stx) returns #t if a-id-stx and b-id-stx access the same lexical or module binding in the identifiers' template environments (see section 12.3.4).

(check-duplicate-identifier id-stx-list) compares each identifier in id-stx-list with every other identifier in the list with bound-identifier=?. If any comparison returns #t, one of the duplicate identifiers is returned (the first one in id-stx-list that is a duplicate), otherwise the result is #f.
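
For example, a binding form can use check-duplicate-identifier to reject repeated binders before it expands. The macro my-let below is a hypothetical, simplified let written only to illustrate the check.

(define-syntax (my-let stx)
  (syntax-case stx ()
    [(_ ([id val] ...) body)
     (let ([dup (check-duplicate-identifier
                 (syntax->list (syntax (id ...))))])
       ;; dup is #f when all binders are distinct under bound-identifier=?
       (when dup
         (raise-syntax-error #f "duplicate binding" stx dup))
       (syntax ((lambda (id ...) body) val ...)))]))

(my-let ([x 1] [y 2]) (+ x y)) ; => 3
(my-let ([x 1] [x 2]) x)       ; => syntax error: duplicate binding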

12.3.2  Syntax Object Bindings

(identifier-binding id-stx) returns one of three kinds of values, depending on the binding of id-stx in its normal environment:

(identifier-transformer-binding id-stx) is like identifier-binding, except that the reported information is for the identifier's bindings in the transformer environment (see section 12.3.3), instead of the normal environment. If the result is 'lexical for either of identifier-binding or identifier-transformer-binding, then the result is always 'lexical for both.

(identifier-template-binding id-stx) is like identifier-binding, except that the reported information is for the identifier's bindings in the template environment (see section 12.3.4), instead of the normal environment. If the result is 'lexical for either of identifier-binding or identifier-template-binding, then the result is always 'lexical for both.

(identifier-binding-export-position id-stx) returns either #f or an exact non-negative integer. It returns an integer only when identifier-binding returns a list, when id-stx represents an imported binding, and when the source module assigns internal positions for its definitions. This function is intended for use by mzc.

(identifier-transformer-binding-export-position id-stx) is like identifier-binding-export-position, except that the reported information is for the transformer environment. This function is intended for use by mzc.

12.3.3  Transformer Environments

The top-level environment for transformer expressions is separate from the normal top-level environment. Consequently, top-level definitions are not available for use in top-level transformer definitions. For example, the following program does not work:

(define count 0)
(define (inc!) (set! count (add1 count)))
(define-syntax (let1 stx)
  (syntax-case stx ()
    [(_ x v b)
     (begin
       (printf "expanding ~a~n" count) ; DOESN'T WORK
       (inc!)                          ; ALSO DOESN'T WORK
       (syntax (let ([x v]) b)))]))
(let1 x 2 (add1 x))

The variables count and inc! are bound in the normal top-level environment, but they are not bound in the transformer environment, so the attempt to expand (let1 x 2 (add1 x)) will result in an undefined-variable error.

In the same way that define binds only in the normal environment, a require expression imports only into the normal environment, and the imported bindings are not made visible in the transformer environment. A top-level require-for-syntax imports into the transformer environment without affecting the normal environment. Furthermore, the require and require-for-syntax forms create separate instantiations of any module that is imported into both environments, in keeping with the separation of the environments.

The initial namespace created by the stand-alone MzScheme application imports all of MzScheme's built-in syntax, procedures, and constants into the transformer environment. To extend this environment, use a form such as define-for-syntax, begin-for-syntax, or require-for-syntax.

In particular, the example above can be repaired by replacing

(define count 0)
(define (inc!) (set! count (add1 count)))

with either

(define-for-syntax count 0)
(define-for-syntax (inc!) (set! count (add1 count)))

or

(begin-for-syntax
 (define count 0)
 (define (inc!) (set! count (add1 count))))

or

(module counter mzscheme
  (define count 0)
  (define (inc!) (set! count (add1 count)))
  (provide count inc!))
(require-for-syntax counter)

When an identifier binding is introduced by a form other than module or a top-level definition, it extends the environment for both normal and transformer expressions within its scope, but the binding is only accessible by expressions resolved in the proper environment (i.e., the one in which it was introduced). In particular, a transformer expression in a let-syntax or letrec-syntax expression cannot access identifiers bound by enclosing forms, and an identifier bound in a transformer expression should not appear as an expression in the result of the transformer. Such out-of-context uses of an identifier are flagged as syntax errors when attempting to resolve the identifier.

A let-syntax or letrec-syntax expression can never usefully appear as a transformer expression, because MzScheme provides no mechanism for importing into the meta-transformer environment that would be used by meta-transformer expressions to operate on transformer expressions. In other words, an expression of the form

(let-syntax ([identifier (let-syntax ([identifier expr]) 
                                 body-expr)]) 
  ...)

is always illegal, assuming that let-syntax is bound in both the normal and transformer environments to the let-syntax of mzscheme. No syntax (not even function application) is bound in expr's environment. This restriction in the mzscheme language is of little consequence, however, since for-syntax exports allow the definition of syntax applicable to the above body-expr.

12.3.4  Module Environments

In the same way that the normal and transformer environments are kept separate at the top level, a module's normal and transformer environments are also separated. Normal imports and definitions in a module -- both variable and syntax -- contribute to the module's normal environment, only.

For example, the module expression

(module m mzscheme 
  (define (id x) x)
  (define-syntax (macro stx)
    (id (syntax (printf "hi~n")))))

is ill-formed because id is not bound in the transformer environment for the macro implementation. To make id usable from the transformer, the body of the module m would have to be executed -- which is impossible in general, because a syntax definition such as macro affects the expansion of the rest of the module body.

Consequently, if a procedure such as id is to be used in a transformer, it must either remain local to the transformer expression, or reside in a different module. For example, the above module is trivially repaired as

(module m mzscheme 
  (define-syntax macro
    (let ([id (lambda (x) x)])
      (lambda (stx)
        (id (syntax (printf "hi~n")))))))

The define-for-syntax, begin-for-syntax, and define-syntaxes forms (see section 12.3.3 and section 12.4) are useful for defining multiple macros that share helper functions.
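
For instance, the ill-formed module above could also be repaired with define-for-syntax, which places id in the module's transformer environment. This is only a sketch; the module name m2 and the trailing use of macro are assumptions added for illustration:

(module m2 mzscheme
  ; id is defined for expansion time, so transformers can call it:
  (define-for-syntax (id x) x)
  (define-syntax (macro stx)
    (id (syntax (printf "hi~n"))))
  (macro))

(require m2) ; prints "hi"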

In the mzscheme language, the base environment for a transformer expression includes all of MzScheme. The mzscheme language also provides a require-for-syntax form (in the normal environment) for importing bindings from another module into the importing module's transformer environment:

(require-for-syntax require-spec ···)

A for-syntax import of M within N causes M to be executed at N's expansion time, instead of (or possibly in addition to) run time for N. The syntax and variable identifiers exported by the for-syntax module are visible within the importing module's transformer environment, but not its normal environment. Like a normal expression, a transformer expression in a module cannot contain free variables.
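
A small sketch of a for-syntax import, reusing the id procedure from the earlier ill-formed example. The module names helper and m3 are hypothetical:

(module helper mzscheme
  (define (id x) x)
  (provide id))

(module m3 mzscheme
  ; helper is executed at m3's expansion time, and id becomes
  ; visible in m3's transformer environment only:
  (require-for-syntax helper)
  (define-syntax (macro stx)
    (id (syntax (printf "hi~n"))))
  (macro))

(require m3) ; prints "hi"

Because helper is imported only for syntax, a reference to id in the body of m3 outside of a transformer expression would be unbound.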

Finally, mzscheme provides the require-for-template form, which is roughly dual to require-for-syntax:

(require-for-template require-spec ···)

A for-template import of M within N causes the referenced module to be executed at the run-time of any P that includes a for-syntax import of N. In other words, require-for-template introduces bindings that become available in a future run time.

Transformer expressions and imports for a module M are executed once each time a module is expanded using M's syntax bindings or using M as a for-syntax import. After the module is expanded, its transformer environment is destroyed, including bindings from modules used at expansion time.

Example:

 (module rt mzscheme
   (printf "RT here~n")
   (define mx (lambda () 7))
   (provide mx))

 (module tt mzscheme
   (printf "RT here, too~n")
   (define x 700)
   (provide x))

 (module et mzscheme
   (require-for-template tt)
   (printf "ET here~n")
   ;; The x below is future-time:
   (define mx (lambda () (syntax x)))
   (provide mx))

 (module m mzscheme
   (require-for-syntax mzscheme)
   (require rt)               ; rt provides run-time mx
   (require-for-syntax et)    ; et provides exp-time mx

   ;; The mx below is run-time:
   (printf "~a~n" (mx))       ; prints 7 when run

   ;; The mx below is exp-time:
   (define-syntax onem (lambda (stx) (mx)))
   (printf "~a~n" (onem))    ; prints 700 when run

   ;; The mx below is run-time:
   (define-syntax twom (lambda (stx) (syntax (mx))))
   (printf "~a~n" (twom)))    ; prints 7 when run

 ;; "ET here" is printed during the expansion of m

 (require m) ; prints "ET here" (for later macro expansion in the top level, if any)
             ; and "RT here, too" and "RT here" in some order,
             ; then 7, then 700, then 7

This expansion-time execution model explains the need to execute declared modules only when they are invoked. If a declared module is imported into other modules only for syntax, then the module is needed only at expansion time and can be ignored at run time. The separation of declaration and execution also allows a for-syntax module to be executed once for each module that it expands through require-for-syntax.

Since transformer expressions may themselves use macros defined by modules with for-syntax imports (to implement the macros), expansion of a module creates a hierarchy of run times (or "tower of expanders"). The expansion time of each layer corresponds to the run time of the next deeper layer.

The hierarchy of run times avoids confusion among expansion and executing layers that can prevent separate compilation. By ensuring that the layers are separate, a compiler or programming environment can expand, partially expand, or re-expand a module without affecting the module's run-time behavior, whether the module is currently executing or not.

In the absence of let-syntax and letrec-syntax, the hierarchy of run times would be limited to three levels, since the transformer expressions for run-time imports would have been expanded before the importing module must be expanded. The let-syntax and letrec-syntax forms, however, allow syntax visible in a for-syntax import's transformers to appear in the expansion of transformer expressions in the module. Consequently, the hierarchy is bounded in principle only by the number of declared modules. In practice, the hierarchy will rarely exceed a few levels.

12.3.5  Macro-Generated Top-Level and Module Definitions

When a top-level definition binds an identifier that originates from a macro expansion, the definition captures only uses of the identifier that are generated by the same expansion. This behavior is consistent with internal definitions (see section 2.8.5), where the defined identifier turns into a fresh lexical binding.

Example:

(define-syntax def-and-use-of-x
  (syntax-rules ()
    [(def-and-use-of-x val)
     ; x below originates from this macro:
     (begin (define x val) x)]))
(define x 1)
x ; => 1
(def-and-use-of-x 2) ; => 2
x ; => 1

(define-syntax def-and-use
  (syntax-rules ()
    [(def-and-use x val)
     ; x below was provided by the macro use:
     (begin (define x val) x)]))
(def-and-use x 3) ; => 3
x ; => 3

For a top-level definition (outside of module), the order of evaluation affects the binding of a generated definition for a generated identifier use. If the use precedes the definition, then the use refers to a non-generated binding, just as if the generated definition were not present. (No such dependency on order occurs within a module, since a module binding covers the entire module body.) To support the declaration of an identifier before its use, the define-syntaxes form avoids binding an identifier if the body of the define-syntaxes declaration produces zero results (see also section 12.4).

Example:

(define bucket-1 0)
(define bucket-2 0)
(define-syntax def-and-set!-use-of-x
  (syntax-rules ()
    [(def-and-set!-use-of-x val)
     (begin (set! bucket-1 x) (define x val) (set! bucket-2 x))]))
(define x 1)
(def-and-set!-use-of-x 2)
x ; => 1
bucket-1 ; => 1
bucket-2 ; => 2

(define-syntax defs-and-uses/fail
  (syntax-rules ()
    [(def-and-use)
     (begin
      ; Initial reference to even precedes definition:
      (define (odd x) (if (zero? x) #f (even (sub1 x))))
      (define (even x) (if (zero? x) #t (odd (sub1 x))))
      (odd 17))]))
(defs-and-uses/fail) ; => error: undefined identifier even
     
(define-syntax defs-and-uses
  (syntax-rules ()
    [(def-and-use)
     (begin
      ; Declare before definition via no-values define-syntaxes:
      (define-syntaxes (odd even) (values))
      (define (odd x) (if (zero? x) #f (even (sub1 x))))
      (define (even x) (if (zero? x) #t (odd (sub1 x))))
      (odd 17))]))
(defs-and-uses) ; => #t

Within a module, macro-generated require and provide clauses also introduce and reference generation-specific bindings:

12.4  Binding Multiple Syntax Identifiers

In addition to define-syntax, let-syntax, and letrec-syntax, MzScheme provides define-syntaxes, let-syntaxes, and letrec-syntaxes. These forms are analogous to define-values, let-values, and letrec-values, allowing multiple syntax bindings at once (see section 2.8).

(define-syntaxes (identifier ···) expr)

(let-syntaxes (((identifier ···) expr)
               ···)
   expr ···1)

(letrec-syntaxes (((identifier ···) expr)
                  ···)
   expr ···1)

At the top level, define-syntaxes accepts zero results for any number of identifiers, and in that case, it neither binds the identifiers nor signals an error. This behavior is useful for identifiers that are introduced by a macro that produces top-level defines. See section 12.3.5 for more information.

MzScheme also provides a letrec-syntaxes+values form for binding both values and syntax in a single, mutually recursive scope:

(letrec-syntaxes+values (((identifier ···) expr) ···)
                        (((identifier ···) expr) ···)
   expr ···1)

The first set of bindings are syntax bindings (as in letrec-syntaxes), and the second set of bindings are normal variable bindings (as in letrec-values).

Examples:

;; Defines let/cc and let-current-continuation as the same macro:
(define-syntaxes (let/cc let-current-continuation)
  (let ([macro (syntax-rules ()
                 [(_ id body1 body ...) 
                  (call/cc (lambda (id) body1 body ...))])])
    (values macro macro)))

(letrec-syntaxes+values ([(get-id) (syntax-rules ()
                                    [(_) id])])
                        ([(id) (lambda (x) x)]
                         [(x) (get-id)])
   x) ; => the id identity procedure
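
The let-syntaxes form can be used in the same style. The following sketch binds two hypothetical macros, one and two, from a single transformer expression that returns two values:

(let-syntaxes ([(one two) (values (syntax-rules () [(_) 1])
                                  (syntax-rules () [(_) 2]))])
  (list (one) (two))) ; => '(1 2)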

12.5  Special Syntax Identifiers

To enable the definition of syntax transformers for application forms and other data (numbers, vectors, etc.), the syntax expander treats #%app, #%top, and #%datum as special identifiers.

Any expandable expression of the form

(datum . datum)

where the first datum is not an identifier bound to an expansion-time value, is treated as

(#%app datum . datum)

so that the syntax transformer bound to #%app is applied. In addition, () is treated as (#%app). Similarly, an expression

identifier

where identifier has no binding other than a top-level binding, is treated as

(#%top . identifier)

Finally, an expression

datum

where datum is not an identifier or pair, is treated as

(#%datum . datum)

The mzscheme module binds #%app, #%top, and #%datum as regular application, top-level variable reference, and implicit quote, respectively. A module can export different transformers with these names to support languages different from conventional Scheme.

In addition, #%module-begin is used as a transformer for a module body. A #%module-begin is implicitly added around a module body when it contains multiple S-expressions, or when the S-expression expands to a core form other than #%module-begin or #%plain-module-begin; the lexical context for the introduced #%module-begin identifier includes only the exports of the module's initial import. After such wrapping, if any, and before any expansion, an 'enclosing-module-name property is attached to the module-body syntax object; the property's value is a symbol for the module name as specified after the module keyword.

The mzscheme module binds #%module-begin to a form that inserts a for-syntax import of mzscheme, so that mzscheme bindings can be used in syntax definitions. It also exports #%plain-module-begin, which can be substituted for #%module-begin to avoid the for-syntax import of mzscheme. Any other transformer used for #%module-begin must expand to mzscheme's #%module-begin or #%plain-module-begin.

When an expression is fully expanded, all applications, top-level variable references, and literal datum expressions will appear as explicit #%app, #%top, and #%datum forms, respectively. Those forms can also be used directly by source code. The #%module-begin form can never usefully appear in an expression, and the body of a fully expanded module declaration is not wrapped with #%module-begin; instead, it is wrapped with #%plain-module-begin.
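
As a brief illustration of using the explicit forms directly in source code at the top level (the variable x is an assumption introduced for the example):

(define x 12)
(#%app + 1 2)   ; same as (+ 1 2), => 3
(#%datum . 5)   ; same as 5, => 5
(#%top . x)     ; same as x at the top level, => 12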

The following example shows how the special syntax identifiers can be defined to create a non-Scheme module language:

(module lambda-calculus mzscheme 
  
  ; Restrict lambda to one argument: 
  (define-syntax lc-lambda 
    (syntax-rules () 
      [(_ (x) E) (lambda (x) E)])) 
  
  ; Restrict application to two expressions:
  (define-syntax lc-app 
    (syntax-rules () 
      [(_ E1 E2) (E1 E2)])) 
  
  ; Restrict a lambda calculus module to one body expression: 
  (define-syntax lc-module-begin  
    (syntax-rules () 
      [(_ E) (#%module-begin E)])) 
  
  ; Disallow numbers, vectors, etc. 
  (define-syntax lc-datum 
    (syntax-rules ())) 
  
  ; Provide (with renaming): 
  (provide #%top ; keep mzscheme's free-variable error 
           (rename lc-lambda lambda) 
           (rename lc-app #%app) 
           (rename lc-module-begin #%module-begin) 
           (rename lc-datum #%datum))) 
  
(module m lambda-calculus 
  ; The only syntax defined by lambda-calculus is 
  ; unary lambda, unary application, and variables. 
  ; Also, the module must contain exactly one expression. 
  ((lambda (y) (y y)) 
   (lambda (y) (y y)))) 
  
(require m)     ; executes m, loops forever

12.6  Macro Expansion

A define-syntax, let-syntax, or letrec-syntax form associates an identifier to an expansion-time value. If the expansion-time value is a procedure of one argument, then the procedure is applied by the syntax expander when the identifier is used in the scope of the syntax binding.

The transformer for an identifier is applied whenever the identifier appears in an expression position -- not just when it appears after a parenthesis as (identifier ...). When it does appear as (identifier ...), the entire (identifier ...) expression is provided as the argument to the transformer. Otherwise only identifier is provided to the transformer.

A typical transformer is implemented as

(lambda (stx) 
  (syntax-case stx ()
    [(_ rest-of-pattern) expr]))

so that identifier by itself does not match the pattern; thus, the exn:fail:syntax exception is raised when identifier does not appear as (identifier ...).
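
A transformer can instead handle both kinds of use by adding a clause for the bare identifier. The sketch below uses a hypothetical macro named five, with the same fender style as the set!-transformer example later in this section:

(define-syntax (five stx)
  (syntax-case stx ()
    ; Used in application position: (five expr ...)
    [(_ arg ...) (syntax (list 5 arg ...))]
    ; Used as a bare identifier: five
    [id (identifier? (syntax id)) (syntax 5)]))

five        ; => 5
(five 1 2)  ; => '(5 1 2)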

(make-set!-transformer proc) also creates a transformer procedure. The proc argument must be a procedure of one argument; if the result of (make-set!-transformer proc) is bound as syntax to identifier, then proc is applied as a transformer when identifier is used in an expression position, or when it is used as the target of a set! assignment: (set! identifier expr). When the identifier appears as a set! target, the entire set! expression is provided to the transformer.

Example:

(let ([x 1]
      [y 2])
  (let-syntax ([x (make-set!-transformer
                    (lambda (stx)
                     (syntax-case stx (set!)
                       ; Redirect mutation of x to y
                       [(set! id v) (syntax (set! y v))]
                       ; Normal use of x really gets x
                       [id (identifier? (syntax id)) (syntax x)])))])
    (begin
      (set! x 3)
      (list x y)))) ; => '(1 3)

(set!-transformer? v) returns #t if v is a value created by make-set!-transformer, #f otherwise.

(set!-transformer-procedure transformer) returns the procedure passed to make-set!-transformer to create transformer.
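
A short sketch of the predicate and accessor, assuming (as for other mzscheme procedures) that they can be called in ordinary run-time code. The name t is hypothetical:

(define t (make-set!-transformer (lambda (stx) stx)))
(set!-transformer? t)                        ; => #t
(set!-transformer? (lambda (stx) stx))       ; => #f
(procedure? (set!-transformer-procedure t))  ; => #t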

(make-rename-transformer id-stx) creates a transformer procedure that inserts the identifier id-stx in place of whatever identifier binds the transformer, including in non-application positions, and in set! expressions. Such a transformer could be written manually, but the one created by make-rename-transformer cooperates specially with syntax-local-value (see below).

(rename-transformer? v) returns #t if v is a value created by make-rename-transformer, #f otherwise.

(rename-transformer-target transformer) returns the identifier passed to make-rename-transformer to create transformer.
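
Example (a sketch; y and r are hypothetical names):

(let ([x 1])
  (let-syntax ([y (make-rename-transformer (syntax x))])
    ; Both the assignment and the reference to y are redirected to x:
    (set! y 2)
    (list x y))) ; => '(2 2)

(define r (make-rename-transformer (syntax car)))
(rename-transformer? r)        ; => #t
(rename-transformer? car)      ; => #f
(rename-transformer-target r)  ; the identifier car, as a syntax object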

If a transformer expression produces a non-procedure value, the value is associated with the identifier as a generic expansion-time value. Any use of the identifier in an expression position is rejected as a syntax error, but syntax transformers can access the value. For example, the define-signature form (see Chapter 52 in PLT MzLib: Libraries Manual) associates a component interface description to the defined identifier.

When a syntax transformer is applied, it can query the bindings of identifiers in the lexical environment of the expression being transformed. For example, the unit/sig form can access a named interface description with syntax-local-value:
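
The following is a minimal sketch of the idea, not the implementation of unit/sig. It binds a hypothetical identifier info to a non-procedure expansion-time value and reads that value back from another hypothetical macro, get-info, using syntax-local-value:

; info is bound to a generic expansion-time value:
(define-syntax info 42)

(define-syntax (get-info stx)
  (syntax-case stx ()
    [(_ id)
     (identifier? (syntax id))
     ; Look up the expansion-time value and quote it in the result:
     (with-syntax ([v (syntax-local-value (syntax id))])
       (syntax (quote v)))]))

(get-info info) ; => 42

Using info by itself in an expression position would still be rejected as a syntax error, as described above.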

A transformer can also expand or partially expand subexpressions from its input syntax object:

To track the introduction of identifiers by a macro (see section 12.3), the syntax expander adds a special ``mark'' to a syntax object that is provided to a transformer, and also marks the result of the transformer. Consecutive marks cancel, and each transformer application has a distinct mark, so the only parts of the resulting syntax object with marks are the parts that were introduced by the transformer. A transformer can explicitly add a current mark to a syntax object using syntax-local-introduce or the result of make-syntax-introducer:

Explicit marking is useful on syntax objects that flow into or out of a transformer without being the transformer argument or result. For example, DrScheme's Check Syntax tool recognizes 'disappeared-binding and 'disappeared-use properties, which specify bound-binding identifier pairs in the source program that do not appear in the expansion. Example:

(define-syntax (match-list stx)
  (syntax-case stx ()
    [(_ expr (id ...) result-id)
     (let ([ids (syntax->list (syntax (id ...)))]
           [result-id (syntax result-id)])
       ;; Make sure the expression is well formed:
       (for-each (lambda (id)
                   (unless (identifier? id)
                     (raise-syntax-error #f "not an identifier" stx id)))
                 (append ids (list result-id)))
       ;; Find the matching identifier and produce a list-ref expression:
       (let loop ([ids ids] [pos 0])
         (cond
           [(null? ids) (raise-syntax-error #f "no pattern binding" stx result-id)]
           [(bound-identifier=? (car ids) result-id)
            ;; Found it; produce the list-ref expression, and
            ;; tell the Check Syntax tool about the pattern-variable binding:
            (with-syntax ([pos pos])
              (syntax-property
               (syntax-property
                (syntax (list-ref expr pos)) ; the expansion result
                'disappeared-binding
                (syntax-local-introduce (car ids)))
               'disappeared-use
               (syntax-local-introduce result-id)))]
           [else (loop (cdr ids) (add1 pos))])))]))

;; Test it:
(match-list '(1 2 3) (a b c) b) ; => 2

In this example, Check Syntax will draw a binding arrow from the first b to the second b. Without the calls to syntax-local-introduce, the identifiers stored in the property would appear to have originated from the transformer, instead of from the transformer's argument; consequently, Check Syntax would not draw the arrow, because it would not know that the bs exist in the source program.

12.6.1  Expanding Expressions to Primitive Syntax

(expand stx-or-sexpr) expands all non-primitive syntax in stx-or-sexpr, and returns a syntax object for the expanded expression. See below for the grammar of fully expanded expressions. Before stx-or-sexpr is expanded, its lexical context is enriched with namespace-syntax-introduce as for eval (see section 8.3 and section 14.1). Use syntax-object->datum to convert the returned syntax object into an S-expression.

(expand-syntax stx) is like (expand stx), except that the argument must be a syntax object, and its lexical context is not enriched before expansion.

(expand-once stx-or-sexpr) partially expands syntax in stx-or-sexpr and returns a syntax object for the partially-expanded expression. Due to limitations in the expansion mechanism, some context information may be lost. In particular, calling expand-once on the result may produce a result that is different from expansion via expand. Before stx-or-sexpr is expanded, its lexical context is enriched with namespace-syntax-introduce as for eval (see section 8.3 and section 14.1).

(expand-syntax-once stx) is like (expand-once stx), except that the argument must be a syntax object, and its lexical context is not enriched before expansion.

(expand-to-top-form stx-or-sexpr) partially expands syntax in stx-or-sexpr to reveal the outermost syntactic form. This partial expansion is mainly useful for detecting top-level uses of begin. Unlike expanding the result of expand-once, expanding the result of expand-to-top-form with expand produces the same result as using expand on the original syntax. Before stx-or-sexpr is expanded, its lexical context is enriched with namespace-syntax-introduce as for eval (see section 8.3 and section 14.1).

(expand-syntax-to-top-form stx) is like (expand-to-top-form stx), except that the argument must be a syntax object, and its lexical context is not enriched before expansion.
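
A sketch of typical use at the top level follows. The name expanded is hypothetical, and the exact printed form of the results is not shown because it depends on the bindings in the current namespace:

(define expanded (expand '(let ([x 1]) (or x 2))))
(syntax? expanded)               ; => #t
; Convert to an S-expression to inspect the primitive forms:
(syntax-object->datum expanded)
; Perform only a partial expansion of the outermost form:
(syntax-object->datum (expand-once '(or x 2)))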

The possible shapes of a fully expanded expression are defined by top-level-expr:

top-level-expr is one of
  general-top-level-expr
  (module identifier name (#%plain-module-begin module-level-expr ···))
  (begin top-level-expr ···)

module-level-expr is one of
  general-top-level-expr
  (provide provide-spec ...)

general-top-level-expr is one of
  expr
  (define-values (variable ···) expr)
  (define-syntaxes (identifier ···) expr)
  (define-values-for-syntax (variable ···) expr)
  (require require-spec ···)
  (require-for-syntax require-spec ···)
  (require-for-template require-spec ···)

expr is one of
  variable
  (lambda formals expr ···1)
  (case-lambda (formals expr ···1) ···)
  (if expr expr)
  (if expr expr expr)
  (begin expr ···1)
  (begin0 expr expr ···)
  (let-values (((variable ···) expr) ···) expr ···1)
  (letrec-values (((variable ···) expr) ···) expr ···1)
  (set! variable expr)
  (quote datum)
  (quote-syntax datum)
  (with-continuation-mark expr expr expr)
  (#%app expr ···1)
  (#%datum . datum)
  (#%top . variable)
  (#%variable-reference variable)
  (#%variable-reference (#%top . variable))

where formals is defined in section 2.9, and require-spec and provide-spec are defined in section 5.2.

When a variable expression appears in a fully-expanded expression, it either refers to a variable bound by lambda, case-lambda, let-values, letrec-values, or define (within the current module), or it refers to an imported variable. (In other words, a variable not wrapped by #%top never refers to a top-level variable.)

The keywords in the above grammar are placeholders for identifiers that are module-identifier=? (or module-transformer-identifier=? for define-syntax expressions) to the same-named exports of mzscheme. Due to import renamings, the printed identifier names can be different in the expanded expression.

12.6.2  Syntax Object Properties

Every syntax object has an associated property list, which can be queried or extended with syntax-property:
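
A minimal sketch of both uses, following the three-argument call that appears in the match-list example of section 12.6. The key 'note and the name s are hypothetical:

; Three arguments: return a syntax object extended with the property
(define s (syntax-property (syntax (+ 1 2)) 'note "hello"))
; Two arguments: query a property
(syntax-property s 'note)   ; => "hello"
(syntax-property s 'other)  ; => #f, since no such property is attached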

The read-syntax procedure attaches a 'paren-shape property to any pair or vector syntax object generated from parsing a pair of square brackets (``['' and ``]'') or curly braces (``{'' and ``}'').47 The property value is #\[ in the former case, and #\{ in the latter case. The syntax form copies any 'paren-shape property from the source of a template to corresponding generated syntax.
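
For example, the property can be inspected on syntax read from a string port. This is a sketch; the source name 'example and the names stx and stx2 are arbitrary:

(define stx (read-syntax 'example (open-input-string "[a b]")))
(syntax-property stx 'paren-shape)   ; => #\[

(define stx2 (read-syntax 'example (open-input-string "(a b)")))
(syntax-property stx2 'paren-shape)  ; => #f, no property for parentheses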

Both the syntax input to a transformer and the syntax result of a transformer may have associated properties. The two sets of properties are merged by the syntax expander: each property in the original and not present in the result is copied to the result, and the values of properties present in both are combined with cons-immutable (result value first, original value second).

Before performing the merge, however, the syntax expander automatically adds a property to the original syntax object using the key 'origin. If the source syntax has no 'origin property, it is set to the empty list. Then, still before the merge, the identifier that triggered the macro expansion (as syntax) is cons-immutabled onto the 'origin property so far.

The 'origin property thus records (in reverse order) the sequence of macro expansions that produced an expanded expression. Usually, the 'origin value is an immutable list of identifiers. However, a transformer might return syntax that has already been expanded, in which case an 'origin list can contain other lists after a merge.

For example, the expression

(or x y)

expands to

(let ((or-part x)) (if or-part or-part (or y)))

which, in turn, expands to

(let-values ([(or-part) x]) (if or-part or-part y))

The syntax object for the final expression will have an 'origin property whose value is (list-immutable (quote-syntax let) (quote-syntax or)).

(syntax-track-origin new-stx orig-stx id-stx) adds properties to new-stx in the same way that macro expansion adds properties to a transformer result. In particular, it merges the properties of orig-stx into new-stx, first adding id-stx as an 'origin property, and it returns the property-extended syntax object. Use the syntax-track-origin procedure in a macro transformer that discards syntax (corresponding to orig-stx with a keyword id-stx) leaving some other syntax in its place (corresponding to new-stx).

Besides 'origin tracking for general macro expansion, MzScheme adds properties to expanded syntax (often using syntax-track-origin) to record additional expansion details:

The syntax-original? procedure and the 'origin, 'disappeared-binding, and 'disappeared-use properties are used by program-processing tools (such as Check Syntax in DrScheme) to relate source code to its expanded form. Implementors of macro transformers should consider whether the properties added automatically by MzScheme are sufficient for tools to make sense of the expansion result, and implementors should use syntax-track-origin and syntax-property as necessary to fill in gaps (see section 12.6 for an example).

See section 12.6.5 for information about properties generated by the expansion of a module declaration. See section 3.12.1 and section 6.2.3 for information about properties recognized when compiling a procedure. See section 14.3 for information on properties and byte codes.

12.6.3  Certificates for Protected References

As illustrated in section 5.3, a macro can expand into a use of an identifier that is not exported from the macro's module. In general, such an identifier must not be extracted from the expanded expression and used in a different context, because using the identifier in a different context may break invariants of the macro's module. For example, the following module exports a macro go that expands to a use of unchecked-go:

(module m mzscheme
  (provide go)
  (define (unchecked-go n x) 
    ;; to avoid disaster, n must be a number
    (+ n 17))
  (define-syntax (go stx)
    (syntax-case stx ()
     [(_ x)
      #'(unchecked-go 8 x)])))

If the reference to unchecked-go is extracted from the expansion of (go 'a), then it might be inserted into a new expression, (unchecked-go #f 'a), leading to disaster. The datum->syntax-object procedure can be used similarly to construct references to an unexported identifier, even when no macro expansion includes a reference to the identifier.

To prevent such abuses of unexported identifiers, MzScheme's macro expander and compiler reject references to unexported identifiers unless they appear in certified syntax objects. The macro expander always certifies a syntax object that is produced by a transformer. For example, when (go 'a) is expanded to (unchecked-go 8 'a), a certificate is attached to the result (unchecked-go 8 'a). Extracting just unchecked-go removes the identifier from the certified expression, so that the reference is disallowed when it is inserted into (unchecked-go #f 'a).

In addition to checking module references, the macro expander disallows references to local bindings where the binding identifier is less certified than the reference. Otherwise, the expansion of (go 'a) could be wrapped with a local binding that redirects #%app to values, thus obtaining the value of unchecked-go. Note that a capturing #%app would have to be extracted from the expansion of (go 'a), since lexical scope would prevent an arbitrary #%app from capturing. The act of extracting #%app removes its certification, whereas the #%app within the expansion is still certified; comparing these certifications, the macro expander rejects the local-binding reference, and unchecked-go remains protected.

In much the same way that the macro expander copies properties from a transformer's input to its output, the expander copies certificates from a transformer's input to its output. Building on the previous example,

(module n mzscheme
  (require m)
  (provide go-more)
  (define y 'hello)
  (define-syntax (go-more stx)
    #'(go y)))

the expansion of (go-more) introduces a reference to the unexported y in (go y), and a certificate allows the reference to y. As (go y) is expanded to (unchecked-go 8 y), the certificate that allows y is copied over, in addition to the certificate that allows the reference to unchecked-go.

When a protected identifier becomes inaccessible by direct reference (i.e., when the current code inspector is changed so that it does not control the module's invocation; see section 9.4), the protected identifier is treated like an unexported identifier.

12.6.3.1  Certificate Propagation

When the result of a macro expansion contains a quote-syntax form, the macro expansion's certificate must be attached to the resulting syntax object to support macro-generating macros. In general, when the macro expander encounters quote-syntax, it attaches all certificates from enclosing expressions to the quoted syntax constant. However, the certificates are attached to the syntax constant as inactive certificates, and inactive certificates do not count directly for certifying identifier access. Inactive certificates become active when the macro expander certifies the result of a macro expansion; at that time, the expander removes all inactive certificates within the expansion result and attaches active versions of the certificates to the overall expansion result.

For example, suppose that the go macro is implemented through a macro:

(module m mzscheme
  (provide def-go)
  (define (unchecked-go n x) 
    (+ n 17))
  (define-syntax (def-go stx)
   (syntax-case stx ()
     [(_ go)
      #'(define-syntax (go stx)
          (syntax-case stx ()
           [(_ x)
            #'(unchecked-go 8 x)]))])))

When def-go is used inside another module, the generated macro should legally generate expressions that use unchecked-go, since def-go in m had complete control over the generated macro.

(module n mzscheme
  (require m)
  (def-go go)
  (go 10)) ; access to unchecked-go is allowed

This example works because the expansion of (def-go go) is certified to access protected identifiers in m, including unchecked-go. Specifically, the certified expansion is a definition of the macro go, which includes a syntax-object constant unchecked-go. Since the enclosing macro declaration is certified, the unchecked-go syntax constant gets an inactive certificate to access protected identifiers of m. When (go 10) is expanded, the inactive certificate on unchecked-go is activated for the macro result (unchecked-go 8 10), and the access of unchecked-go is allowed.

To see why unchecked-go as a syntax constant must be given an inactive certificate instead of an active one, it's helpful to write the def-go macro as follows:

(define-syntax (def-go stx)
 (syntax-case stx ()
   [(_ go)
    #'(define-syntax (go stx)
        (syntax-case stx ()
         [(_ x)
          (with-syntax ([ug (quote-syntax unchecked-go)])
            #'(ug 8 x))]))]))

In this case, unchecked-go is clearly quoted as an immediate syntax object in the expansion of (def-go go). If this syntax object were given an active certificate, then it would keep the certificate -- directly on the identifier unchecked-go -- in the result (unchecked-go 8 10). Consequently, the unchecked-go identifier could be extracted and used with its certificate intact. Attaching an inactive certificate to unchecked-go and activating it only for the complete result (unchecked-go 8 10) ensures that unchecked-go is used only in the way intended by the implementor of def-go.

12.6.3.2  Internal Certificates

In some cases, a macro implementor intends to allow limited destructuring of a macro result without losing the result's certificate. For example, given the following define-like-y macro,

(module q mzscheme
  (provide define-like-y)
  (define y 'hello)
  (define-syntax (define-like-y stx)
    (syntax-case stx ()
      [(_ id) #'(define-values (id) y)])))

someone may use the macro in an internal definition:

(let ()
  (define-like-y x)
  x)

The implementor of the q module most likely intended to allow such uses of define-like-y. To convert an internal definition into a letrec binding, however, the define form produced by define-like-y must be deconstructed, which would normally lose the certificate that allows the reference to y.

The internal use of define-like-y is allowed because the macro expander gives special treatment to a transformer result that is a syntax list beginning with define-values. In that case, instead of attaching the certificate to the overall expression, the expander attaches it to each individual element of the syntax list, and pushes the certificate into the second element of the list so that it is attached to the defined identifiers. Thus, a certificate is attached to define-values, x, and y in the expansion result (define-values (x) y), and the definition can be deconstructed for conversion to letrec.

Just like the new certificate that is added to a transformer result, old certificates from the input are similarly moved to syntax-list elements when the result starts with define-values. Thus, define-like-y could have been implemented to produce (define id y), using define instead of define-values. In that case, the certificate to allow reference to y would be attached initially to the expansion result (define x y), but as the define is expanded to define-values, the certificate would be moved to the parts.

The macro expander treats syntax-list results starting with define-syntaxes in the same way that it treats results starting with define-values. Syntax-list results starting with begin are treated similarly, except that the second element of the syntax list is treated like all the other elements (i.e., the certificate is attached to the element instead of its content). Furthermore, the macro expander applies this special handling recursively, in case a macro produces a begin form that contains nested define-values forms.
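As a sketch of the recursive handling of begin, the following variant of the q module produces a begin form that contains two nested define-values forms. The names q2 and define-two-like-y are hypothetical, introduced only for illustration.

(module q2 mzscheme
  (provide define-two-like-y)
  (define y 'hello)
  (define-syntax (define-two-like-y stx)
    (syntax-case stx ()
      [(_ id1 id2)
       ;; a begin result whose elements are define-values forms
       #'(begin
           (define-values (id1) y)
           (define-values (id2) y))])))

(require q2)

(let ()
  (define-two-like-y a b)
  (list a b))

Under the rules described above, the certificate should be pushed through the begin form into each nested define-values form, so that both internal definitions can be converted to letrec bindings while the references to y remain certified.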

The default application of certificates can be overridden by attaching a 'certify-mode property (see section 12.6.2) to the result syntax object of a macro transformer. If the property value is 'opaque, then the certificate is attached to the syntax object and not its parts. If the property value is 'transparent, then the certificate is attached to the syntax object's parts. If the property value is 'transparent-binding, then the certificate is attached to the syntax object's parts and to the sub-parts of the second part (as for define-values and define-syntaxes). The 'transparent and 'transparent-binding modes trigger recursive property checking at the parts, so that the certificate can be pushed arbitrarily deep into a transformer's result.
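As a minimal sketch of overriding the default, the following hypothetical variant of define-like-y uses syntax-property to mark its result as 'opaque. The module name q-opaque and the macro name define-like-y/opaque are assumptions made for illustration.

(module q-opaque mzscheme
  (provide define-like-y/opaque)
  (define y 'hello)
  (define-syntax (define-like-y/opaque stx)
    (syntax-case stx ()
      [(_ id)
       ;; keep the certificate on the whole form, not on its parts
       (syntax-property #'(define-values (id) y)
                        'certify-mode
                        'opaque)])))

With this property, the certificate should stay on the overall define-values form, so a use in an internal-definition position, where the form must be deconstructed, would presumably lose access to y. A use at the module level or top level, where the form is not deconstructed, should be unaffected.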

12.6.3.3  Checking and Transferring Certificates

In general, a certificate combines a mark (see section 12.6), a module name (more precisely, a module path index; see section 12.6.5), an inspector, and an arbitrary key object. Within a certified syntax object, the certificate's mark is attached to every piece of syntax that was introduced by the relevant macro transformation (see again section 12.6), so the certificate applies only to those pieces of syntax, and only to identifiers that are bound by the transformer's module. The certificate's inspector depends on the module that defined the transformer; specifically, it is the inspector for the module's declaration (see section 9.4). A certificate's key is hidden if it is introduced by macro expansion, but applying the result of syntax-local-certifier (see section 12.6) can introduce certificates with other keys.

To check access to an unexported identifier, the compiler or macro expander checks each of the identifier's marks and module bindings; if, for some mark, the identifier's enclosing expressions include a certificate with the mark, the identifier's binding module, and with an inspector that controls the module's invocation (as opposed to the module's declaration; see again section 9.4), then the access is allowed. To check access to a protected identifier, only the certificate's mark and inspector are used (i.e., the module that bound the transformer is irrelevant, as long as it was evaluated with a sufficiently powerful inspector). The certificate key is not used in checking references.

To check access to a locally bound identifier, the compiler or macro expander checks the marks of the binding and reference identifiers. For every mark that they have in common, if the reference identifier has a certificate for the mark from an enclosing expression, then the binding identifier must also have a certificate for the mark from an enclosing expression; otherwise, the reference is disallowed. (The reference identifier can have additional certificates for marks that are not attached to the binding identifier.) The binding module (if any) and the certificate key are not used for checking a local reference.

The datum->syntax-object procedure never transfers a certificate from one syntax object to another, so it cannot be used to gain access to an unexported identifier. The syntax-recertify procedure can be used to transfer a certificate from one syntax object to another, but only if the certificate's key is provided, or if a sufficiently powerful inspector is provided. Thus, a certificate's inspector serves two roles: it determines the certificate's power to grant access, and also allows the certificate to be moved arbitrarily by anyone with a more powerful inspector.

(syntax-recertify new-stx old-stx inspector key-v) copies certain certificates of old-stx to new-stx: a certificate is copied if its inspector is either inspector or controlled by inspector, or if the certificate's key is key-v; otherwise the certificate is not copied. The result is a syntax object like new-stx, but with the copied certificates. (The new-stx object itself is not modified.) Both active and inactive certificates are copied.
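The following sketch shows one way a transformer might call syntax-recertify after rebuilding a form. The macro call-swapped is hypothetical. It passes the current code inspector as the inspector argument and #f as a placeholder key, on the assumption that no certificate key is known, so only the inspector test can cause a certificate to be copied.

(define-syntax (call-swapped stx)
  (syntax-case stx ()
    [(_ (f a b))
     ;; old-stx is the original (f a b) sub-form, which may carry certificates
     (let ([old-stx (cadr (syntax->list stx))])
       ;; rebuild the form with swapped arguments, then copy over whichever
       ;; certificates the current code inspector controls
       (syntax-recertify #'(f b a) old-stx (current-code-inspector) #f))]))

For example, (call-swapped (- 1 10)) expands to (- 10 1). Whether any certificates are actually copied depends on the inspector in effect when the transformer runs.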

12.6.4  Information on Structure Types

The define-struct form (see section 4.1) binds the name of a structure type to an expansion-time value that records the identifiers bound to the structure type, the constructor procedure, the predicate procedure, and the field accessor and mutator procedures. This information can be used during the expansion of other expressions by transformers that call syntax-local-value (see section 12.6).

For example, the define-struct variant for subtypes (see section 4.2) uses the base type name t to find the variable struct:t containing the base type's descriptor; it also folds the field accessor and mutator information for the base type into the information for the subtype. The match form (see Chapter 25 in PLT MzLib: Libraries Manual) uses a type name to find the predicates and field accessors for the structure type.
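As a rough sketch of how a transformer might obtain this information, the hypothetical macro struct-info-size below looks up the expansion-time value bound to a structure type name and expands to the number of items it contains. It assumes only that the value is a list, as described in this section; the structure type point is also introduced just for illustration.

(define-struct point (x y))

(define-syntax (struct-info-size stx)
  (syntax-case stx ()
    [(_ name)
     ;; the failure thunk returns #f when name has no expansion-time value
     (let ([info (syntax-local-value #'name (lambda () #f))])
       (if (list? info)
           (with-syntax ([n (length info)])
             #'(quote n))
           (raise-syntax-error #f "not bound to a list of structure information" stx #'name)))]))

(struct-info-size point)

A real client such as match would go on to inspect the individual items of the list rather than just count them.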

Besides using the information, other syntactic forms can even generate information with the same shape. For example, the struct form in an imported signature for unit/sig (see Chapter 52 in PLT MzLib: Libraries Manual) causes the unit/sig transformer to generate information about imported structure types, so that match and subtyping define-struct expressions work within the unit.

The expansion-time information for a structure type is represented as an immutable list of six items:

The implementor of a syntactic form can expect users of the form to know what kind of information is available about a structure type. For example, the match implementation works with structure information containing an incomplete set of accessor bindings, because the user is assumed to know what information is available in the context of the match expression. In particular, the match expression can appear in a unit/sig form with an imported structure type, in which case the user is expected to know the set of fields that are listed in the signature for the structure type.

12.6.5  Information on Expanded and Compiled Modules

MzScheme provides an interface for obtaining information about an expanded or compiled module declaration's imports and exports. This information is intended for use by tools such as a compilation manager. The information usually identifies modules through a module path index, which is a semi-interned48 opaque value that encodes a relative module path (see section 5.4) and another index to which it is relative.

Where an index is expected, a symbol can usually take its place, representing a literal module name. A symbol is used instead of an index when a module is imported using its name directly with require instead of a module path.

An index that returns #f for its path and base index represents ``self'' -- i.e., the module declaration that was the source of the index -- and such an index is always used as the root for a chain of indices. For example, when extracting information about an identifier's binding within a module, if the identifier is bound by a definition within the same module, the identifier's source module will be reported using the ``self'' index. If the identifier is instead defined in a module that is imported via a module path (as opposed to a literal module name), then the identifier's source module will be reported using an index that contains the required module path and the ``self'' index.

Information for an expanded module declaration is stored in a set of properties attached to the syntax object:

(compiled-module-expression? v) returns #t if v is a compiled expression for a module declaration, #f otherwise. See also section 14.3.

(module-compiled-name compiled-module-code) takes a module declaration in compiled form (see section 14.3) and returns a symbol for the module's declared name.

(module-compiled-imports compiled-module-code) takes a module declaration in compiled form (see section 14.3) and returns three values: an immutable list of module path indices (and symbols) for the module's explicit imports, an immutable list of module path indices (and symbols) for the module's explicit for-syntax imports, and an immutable list of module path indices (and symbols) for the module's explicit for-template imports.

(module-compiled-exports compiled-module-code) takes a module declaration in compiled form (see section 14.3) and returns two values: an immutable list of symbols for the module's explicit variable exports, and an immutable list of symbols for the module's explicit syntax exports.
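The procedures above might be combined as in the following sketch, which compiles a small module declaration and collects its name, imports, and exports. The module demo is hypothetical, and the sketch assumes that compile is applied in a namespace where module has its usual binding.

(let ([code (compile '(module demo mzscheme
                        (provide x)
                        (define x 1)))])
  (if (compiled-module-expression? code)
      ;; three import lists: normal, for-syntax, and for-template
      (let-values ([(imports stx-imports template-imports)
                    (module-compiled-imports code)]
                   ;; two export lists: variables and syntax
                   [(var-exports stx-exports)
                    (module-compiled-exports code)])
        (list (module-compiled-name code)
              imports
              var-exports
              stx-exports))
      'not-a-module))

A compilation manager could use the import lists to decide which other modules must be compiled first.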


44 In general, modules and for-syntax imports create a hierarchy of run times and expansion times. See section 12.3.4 for more information.

45 In this particular case, Shriram Krishnamurthi points out that changing if-it to use (datum->syntax-object (syntax test) 'it) solves the problem in a sensible way.

46 In contrast, a namespace created by (scheme-report-environment 5) imports only syntax-rules into the transformer environment.

47 More precisely, the property is attached by the default read handler in syntax mode when using the default readtable.

48 Multiple references to the same relative module tend to use the same index value, but not always.