Input and Output

11.1  Ports

By definition, ports in MzScheme produce and consume bytes. When a port is provided to a character-based operation, such as read, the port's bytes are read and interpreted as a UTF-8 encoding of characters (see also section 1.2.3). Thus, reading a single character may require reading multiple bytes, and a procedure like char-ready? may need to peek several bytes into the stream to determine whether a character is available. In the case of a byte stream that does not correspond to a valid UTF-8 encoding, functions such as read-char may need to peek one byte ahead in the stream to discover that the stream is not a valid encoding.

When an input port produces a sequence of bytes that is not a valid UTF-8 encoding in a character-reading context, then bytes that constitute an invalid sequence are converted to the character ``?''. Specifically, bytes 255 and 254 are always converted to ``?'', bytes in the range 192 to 253 produce ``?'' when they are not followed by bytes that form a valid UTF-8 encoding, and bytes in the range 128 to 191 are converted to ``?'' when they are not part of a valid encoding that was started by a preceding byte in the range 192 to 253. To put it another way, when reading a sequence of bytes as characters, a minimal set of bytes is changed to 63 (the byte value of ``?'') so that the entire sequence of bytes is a valid UTF-8 encoding.
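As a small sketch of the decoding rules above (the port names are illustrative and not part of the manual), the same two bytes can be consumed either as one character or as individual bytes. The last expression feeds an invalid byte to a character-reading procedure; its result is stated only as the text above predicts it.

```scheme
;; 206 187 is the UTF-8 encoding of the Greek letter lambda (U+03BB).
(define as-chars (open-input-bytes (bytes 206 187)))
(define as-bytes (open-input-bytes (bytes 206 187)))

(char->integer (read-char as-chars)) ; => 955: both bytes form one character
(read-byte as-bytes)                 ; => 206: byte operations see raw bytes

;; Byte 255 can never appear in a valid UTF-8 encoding, so by the
;; rule described above it should be read as the character ?.
(define bad (open-input-bytes (bytes 104 255 105)))
(read-string 3 bad) ; expected per the text above: "h?i"
```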

See section 3.6 for procedures that facilitate conversions using UTF-8 or other encodings. See also reencode-input-port and reencode-output-port in Chapter 35 in PLT MzLib: Libraries Manual for obtaining a UTF-8-based port from one that uses a different encoding of characters.

(port? v) returns #t if either (input-port? v) or (output-port? v) is #t, #f otherwise.

(port-closed? port) returns #t if the input or output port port is closed, #f otherwise.

(file-stream-port? port) returns #t if the given port is a file-stream port (see section 11.1.6), #f otherwise.

(terminal-port? port) returns #t if the given port is attached to an interactive terminal, #f otherwise.

11.1.1  End-of-File Constant

The global variable eof is bound to the end-of-file value. The standard Scheme predicate eof-object? returns #t only when applied to this value.

Reading from a port produces an end-of-file result when the port has no more data, but some ports may also return end-of-file mid-stream. For example, a port connected to a Unix terminal returns an end-of-file when the user types control-d; if the user provides more input, the port returns additional bytes after the end-of-file.
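A minimal illustration of the end-of-file value, using a string port (section 11.1.5) as the data source; the name in is chosen only for this sketch:

```scheme
(define in (open-input-string "a"))

(read-char in)               ; => #\a
(eof-object? (read-char in)) ; => #t: the port has no more data
(eq? eof (read-char in))     ; => #t: the result is the value bound to eof
```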

11.1.2  Current Ports

The standard Scheme procedures current-input-port and current-output-port are implemented as parameters in MzScheme. See section 7.9.1.2 for more information.
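Because the current ports are parameters, they can be rebound for a dynamic extent with parameterize. The following sketch (the name o is illustrative) redirects display into a string port:

```scheme
(define o (open-output-string))

;; display writes to the current output port, which is o
;; for the duration of the parameterize body.
(parameterize ([current-output-port o])
  (display "captured"))

(get-output-string o) ; => "captured"
```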

11.1.3  Opening File Ports

The open-input-file and open-output-file procedures accept an optional flag argument after the filename that specifies a mode for the file:

The open-output-file procedure can also take a flag argument that specifies how to proceed when a file with the specified name already exists:

The open-input-output-file procedure takes the same arguments as open-output-file, but it produces two values: an input port and an output port. The two ports are connected in that they share the underlying file device. This procedure is intended for use with special devices that can be opened by only one process, such as COM1 in Windows. For regular files, sharing the device can be confusing. For example, using one port does not automatically flush the other port's buffer (see section 11.1.6 for more information about buffers), and reading or writing in one port moves the file position (if any) for the other port. For regular files, use separate open-input-file and open-output-file calls to avoid confusion.

Extra flag arguments are passed to open-output-file in any order. Appropriate flag arguments can also be passed as the last argument(s) to call-with-input-file, with-input-from-file, call-with-output-file, and with-output-to-file. When conflicting flag arguments (e.g., both 'error and 'replace) are provided to open-output-file, with-output-to-file, or call-with-output-file, the exn:fail:contract exception is raised.

Both with-input-from-file and with-output-to-file close the port they create if control jumps out of the supplied thunk (either through a continuation or an exception), and the port remains closed if control jumps back into the thunk. The current input or output port is installed and restored with parameterize (see section 7.9.2).

See section 11.1.6 for more information on file ports. When an input or output file-stream port is created, it is placed into the management of the current custodian (see section 9.2).
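The flag-passing convention can be sketched as follows. The file name is hypothetical, the file is created in the current directory, and 'replace is the only flag used because it is the one named in the text above.

```scheme
(define path "gr-port-example.txt") ; hypothetical file name

;; 'replace is passed as the last argument, after the procedure.
(call-with-output-file path
  (lambda (out) (write '(1 2 3) out))
  'replace)

(call-with-input-file path read) ; => (1 2 3)

;; with-output-to-file installs the new port as the current output
;; port while the thunk runs, then closes it.
(with-output-to-file path
  (lambda () (display "hello"))
  'replace)

(call-with-input-file path read) ; => hello (read back as a symbol)
(delete-file path)
```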

11.1.4  Pipes

(make-pipe [limit-k input-name-v output-name-v]) returns two port values (see section 2.2): the first port is an input port and the second is an output port. Data written to the output port is read from the input port. The ports do not need to be explicitly closed.

The optional limit-k argument can be #f or a positive exact integer. If limit-k is omitted or #f, the new pipe holds an unlimited number of unread bytes (i.e., limited only by the available memory). If limit-k is a positive number, then the pipe will hold at most limit-k unread/unpeeked bytes; writing to the pipe's output port thereafter will block until a read or peek from the input port makes more space available. (Peeks effectively extend the port's capacity until the peeked bytes are read.)

The optional input-name-v and output-name-v are used as the names for the returned input and output ports, respectively, if they are supplied. Otherwise, the name of each port is 'pipe.

(pipe-content-length pipe-port) returns the number of bytes contained in a pipe, where pipe-port is either of the pipe's ports produced by make-pipe. The pipe's content length counts all bytes that have been written to the pipe and not yet read (though possibly peeked).
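A short sketch of a pipe in use, showing that peeked bytes still count toward the content length while read bytes do not (the names in and out are illustrative):

```scheme
(define-values (in out) (make-pipe))

(display "abc" out)
(pipe-content-length in)  ; => 3
(peek-char in)            ; => #\a
(pipe-content-length out) ; => 3: a peeked byte is still unread
(read-char in)            ; => #\a
(pipe-content-length in)  ; => 2
```

Either port of the pipe can be given to pipe-content-length, as the two calls above show.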

11.1.5  String Ports

Scheme input and output can be read from or collected into a string or byte string:

String input and output ports do not need to be explicitly closed. The file-position procedure, described in section 11.1.6, works for string ports in position-setting mode.

Example:

(define i (open-input-string "hello world"))
(define o (open-output-string))
(write (read i) o)
(get-output-string o) ; => "hello"

11.1.6  File-Stream Ports

A port created by open-input-file, open-output-file, subprocess, and related functions is a file-stream port. The initial input, output, and error ports in stand-alone MzScheme are also file-stream ports. The file-stream-port? predicate recognizes file-stream ports.

An input port is block buffered by default, which means that on any read, the buffer is filled with immediately-available bytes to speed up future reads. Thus, if a file is modified between a pair of reads to the file, the second read can produce stale data. Calling file-position to set an input port's file position flushes its buffer.

Most output ports are block buffered by default, but a terminal output port is line buffered, and the error output port is unbuffered. An output buffer is filled with a sequence of written bytes to be committed as a group, either when the buffer is full (in block mode) or when a newline is written (in line mode).

A port's buffering can be changed via file-stream-buffer-mode (described below). The two ports produced by open-input-output-file have independent buffers.
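The effect of output buffering can be observed roughly as follows. This is a sketch under assumptions: the file name is hypothetical, and the mode symbol 'none (for an unbuffered port) is assumed here, since the list of modes accepted by file-stream-buffer-mode is not reproduced in this section.

```scheme
(define path "gr-buffer-example.txt") ; hypothetical file name
(define out (open-output-file path 'replace))

(display "x" out)  ; with block buffering, likely held in the buffer
(flush-output out) ; commit the buffered byte to the file
(file-size path)   ; => 1

;; Assumption: 'none selects unbuffered output.
(file-stream-buffer-mode out 'none)
(display "y" out)  ; committed without an explicit flush
(file-size path)   ; => 2

(close-output-port out)
(delete-file path)
```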

The following procedures work primarily on file-stream ports:

11.1.7  Custom Ports

The make-input-port and make-output-port procedures create custom ports with arbitrary control procedures. Correctly implementing a custom port can be tricky, because it amounts to implementing a device driver. Custom ports are mainly useful to obtain fine control over the action of committing bytes as read or written.

Many simple port variations can be implemented using threads and pipes. For example, if get-next-char is a function that produces either a character or eof, it can be turned into an input port as follows:

(let-values ([(r w) (make-pipe 4096)])
  ;; Create a thread to move chars from get-next-char to the pipe
  (thread (lambda () (let loop ()
                       (let ([v (get-next-char)])
                         (if (eof-object? v)
                             (close-output-port w)
                             (begin
                               (write-char v w)
                               (loop)))))))
  ;; Return the read end of the pipe
  r)
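To try the thread-and-pipe technique end to end, the expression can be wrapped in a function and given a concrete producer. Both make-char-producer-port and the list-backed get-next-char below are hypothetical helpers written for this sketch.

```scheme
;; Hypothetical wrapper around the thread-and-pipe technique.
(define (make-char-producer-port get-next-char)
  (let-values ([(r w) (make-pipe 4096)])
    (thread (lambda ()
              (let loop ()
                (let ([v (get-next-char)])
                  (if (eof-object? v)
                      (close-output-port w)
                      (begin
                        (write-char v w)
                        (loop)))))))
    r))

;; Hypothetical producer: hands out the characters of "abc", then eof.
(define remaining (string->list "abc"))
(define (get-next-char)
  (if (null? remaining)
      eof
      (let ([c (car remaining)])
        (set! remaining (cdr remaining))
        c)))

(define p (make-char-producer-port get-next-char))
(read-string 3 p)           ; => "abc"
(eof-object? (read-char p)) ; => #t, once the writer end is closed
```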

The port.ss library in MzLib provides several other port constructors; see Chapter 35 in PLT MzLib: Libraries Manual.

11.1.7.1  Custom Input

(make-input-port name-v read-proc optional-peek-proc close-proc [optional-progress-evt-proc optional-commit-proc optional-location-proc count-lines!-proc init-position optional-buffer-mode-proc]) creates an input port. The port is immediately open for reading. If the close-proc procedure has no side effects, then the port need not be explicitly closed.

When read-proc or optional-peek-proc (or an event produced by one of these) returns a procedure, the procedure is used to obtain a non-byte result. The procedure is called by read, read-syntax, read-honu, read-honu-syntax, read-byte-or-special, read-char-or-special, peek-byte-or-special, or peek-char-or-special. The special-value procedure can return an arbitrary value, and it will be called zero or one times (not necessarily before further reads or peeks from the port). See section 11.2.9 for more details on the procedure's arguments and result.

If read-proc or optional-peek-proc returns a special procedure when called by any reading procedure other than read, read-syntax, read-honu, read-honu-syntax, read-char-or-special, peek-char-or-special, read-byte-or-special, or peek-byte-or-special, then the exn:fail:contract exception is raised.

Examples:

;; A port with no input...
;; Easy: (open-input-bytes #"")
;; Hard:
(define /dev/null-in 
  (make-input-port 'null
                   (lambda (s) eof)
                   (lambda (s skip progress-evt) eof)
                   void
                   (lambda () never-evt)
                   (lambda (k progress-evt done-evt)
                     (error "no successful peeks!"))))
(read-char /dev/null-in) ; => eof
(peek-char /dev/null-in) ; => eof
(read-byte-or-special /dev/null-in)     ; => eof
(peek-byte-or-special /dev/null-in 100) ; => eof

;; A port that produces a stream of 1s:
(define infinite-ones 
  (make-input-port
   'ones
   (lambda (s) 
     (bytes-set! s 0 (char->integer #\1)) 1)
   #f
   void))
(read-string 5 infinite-ones) ; => "11111"

;; But we can't peek ahead arbitrarily far, because the
;; automatic peek must record the skipped bytes:
(peek-string 5 (expt 2 5000) infinite-ones) ; => error: out of memory

;; An infinite stream of 1s with a specific peek procedure:
(define infinite-ones 
  (let ([one! (lambda (s) 
                (bytes-set! s 0 (char->integer #\1)) 1)])
    (make-input-port
     'ones
     one!
     (lambda (s skip progress-evt) (one! s))
     void)))
(read-string 5 infinite-ones) ; => "11111"

;; Now we can peek ahead arbitrarily far:
(peek-string 5 (expt 2 5000) infinite-ones) ; => "11111"

;; The port doesn't supply procedures to implement progress events:
(port-provides-progress-evts? infinite-ones) ; => #f
(port-progress-evt infinite-ones) ; error: no progress events

;; Non-byte port results:
(define infinite-voids
  (make-input-port
   'voids
   (lambda (s) (lambda args 'void))
   (lambda (s skip progress-evt) (lambda args 'void))
   void))
(read-char infinite-voids) ; => error: non-char in an unsupported context
(read-char-or-special infinite-voids) ; => 'void

;; This port produces 0, 1, 2, 0, 1, 2, etc., but it is not
;; thread-safe, because multiple threads might read and change n.
(define mod3-cycle/one-thread
  (let* ([n 2]
         [mod! (lambda (s delta)
                 (bytes-set! s 0 (+ 48 (modulo (+ n delta) 3)))
                 1)])
    (make-input-port
     'mod3-cycle/not-thread-safe
     (lambda (s) 
       (set! n (modulo (add1 n) 3))
       (mod! s 0))
     (lambda (s skip progress-evt)
       (mod! s skip))
     void)))
(read-string 5 mod3-cycle/one-thread) ; => "01201"
(peek-string 5 (expt 2 5000) mod3-cycle/one-thread) ; => "20120"

;; Same thing, but thread-safe and kill-safe, and with progress
;; events. Only the server thread touches the stateful part
;; directly. (See the output port examples for a simpler thread-safe
;; example, but this one is more general.)
(define (make-mod3-cycle)
  (define read-req-ch (make-channel))
  (define peek-req-ch (make-channel))
  (define progress-req-ch (make-channel))
  (define commit-req-ch (make-channel))
  (define close-req-ch (make-channel))
  (define closed? #f)
  (define n 0)
  (define progress-sema #f)
  (define (mod! s delta)
    (bytes-set! s 0 (+ 48 (modulo (+ n delta) 3)))
    1)
  ;; ----------------------------------------
  ;; The server has a list of outstanding commit requests,
  ;;  and it also must service each port operation (read, 
  ;;  progress-evt, etc.)
  (define (serve commit-reqs response-evts)
    (apply
     sync
     (handle-evt read-req-ch (handle-read commit-reqs response-evts))
     (handle-evt progress-req-ch (handle-progress commit-reqs response-evts))
     (handle-evt commit-req-ch (add-commit commit-reqs response-evts))
     (handle-evt close-req-ch (handle-close commit-reqs response-evts))
     (append
      (map (make-handle-response commit-reqs response-evts) response-evts)
      (map (make-handle-commit commit-reqs response-evts) commit-reqs))))
  ;; Read/peek request: fill in the string and commit
  (define ((handle-read commit-reqs response-evts) r)
    (let ([s (car r)]
          [skip (cadr r)]
          [ch (caddr r)]
          [nack (cadddr r)]
          [peek? (cddddr r)])
      (unless closed?
        (mod! s skip)
        (unless peek?
          (commit! 1)))
      ;; Add an event to respond:
      (serve commit-reqs
             (cons (choice-evt nack
                               (channel-put-evt ch (if closed? 0 1)))
                   response-evts))))
  ;; Progress request: send a peek evt for the current 
  ;;  progress-sema
  (define ((handle-progress commit-reqs response-evts) r)
    (let ([ch (car r)]
          [nack (cdr r)])
      (unless progress-sema
        (set! progress-sema (make-semaphore (if closed? 1 0))))
      ;; Add an event to respond:
      (serve commit-reqs
             (cons (choice-evt nack
                               (channel-put-evt
                                ch
                                (semaphore-peek-evt progress-sema)))
                   response-evts))))
  ;; Commit request: add the request to the list
  (define ((add-commit commit-reqs response-evts) r)
    (serve (cons r commit-reqs) response-evts))
  ;; Commit handling: watch out for progress, in which case
  ;;  the response is a commit failure; otherwise, try
  ;;  to sync for a commit. In either event, remove the
  ;;  request from the list
  (define ((make-handle-commit commit-reqs response-evts) r)
    (let ([k (car r)]
          [progress-evt (cadr r)]
          [done-evt (caddr r)]
          [ch (cadddr r)]
          [nack (cddddr r)])
      ;; Note: we don't check that k is < the sum of
      ;;  previous peeks, because the entire stream is actually
      ;;  known, but we could send an exception in that case.
      (choice-evt
       (handle-evt progress-evt
                   (lambda (x) 
                     (sync nack (channel-put-evt ch #f))
                     (serve (remq r commit-reqs) response-evts)))
       ;; Only create an event to satisfy done-evt if progress-evt
       ;;  isn't already ready.
       ;; Afterward, if progress-evt becomes ready, then this
       ;;  event-making function will be called again, because
       ;;  the server controls all posts to progress-evt.
       (if (sync/timeout 0 progress-evt)
           never-evt
           (handle-evt done-evt
                       (lambda (v)
                         (commit! k)
                         (sync nack (channel-put-evt ch #t))
                         (serve (remq r commit-reqs) response-evts)))))))
  ;; Response handling: as soon as the respondee listens,
  ;;  remove the response
  (define ((make-handle-response commit-reqs response-evts) evt)
    (handle-evt evt
                (lambda (x)
                  (serve commit-reqs
                         (remq evt response-evts)))))
  ;; Close handling: post the progress sema, if any, and set
  ;;   the closed? flag
  (define ((handle-close commit-reqs response-evts) r)
    (let ([ch (car r)]
          [nack (cdr r)])
      (set! closed? #t)
      (when progress-sema
        (semaphore-post progress-sema))
      (serve commit-reqs
             (cons (choice-evt nack
                               (channel-put-evt ch (void)))
                   response-evts))))
  ;; Helper for reads and post-peek commits:
  (define (commit! k)
    (when progress-sema
      (semaphore-post progress-sema)
      (set! progress-sema #f))
    (set! n (+ n k)))
  ;; Start the server thread:
  (define server-thread (thread (lambda () (serve null null))))
  ;; ----------------------------------------
  ;; Client-side helpers:
  (define (req-evt f)
    (nack-guard-evt
     (lambda (nack)
       ;; Be sure that the server thread is running:
       (thread-resume server-thread (current-thread))
       ;; Create a channel to hold the reply:
       (let ([ch (make-channel)])
         (f ch nack)
         ch))))
  (define (read-or-peek-evt s skip peek?)
    (req-evt (lambda (ch nack)
               (channel-put read-req-ch (list* s skip ch nack peek?)))))
  ;; Make the port:
  (make-input-port 'mod3-cycle
                   ;; Each handler for the port just sends
                   ;;  a request to the server
                   (lambda (s) (read-or-peek-evt s 0 #f))
                   (lambda (s skip progress-evt) (read-or-peek-evt s skip #t))
                   (lambda () ; close
                     (sync (req-evt
                            (lambda (ch nack)
                              (channel-put close-req-ch (list* ch nack))))))
                   (lambda () ; progress-evt
                     (sync (req-evt
                            (lambda (ch nack)
                              (channel-put progress-req-ch (list* ch nack))))))
                   (lambda (k progress-evt done-evt)  ; commit
                     (sync (req-evt
                            (lambda (ch nack)
                              (channel-put commit-req-ch
                                           (list* k progress-evt done-evt ch nack))))))))

(let ([mod3-cycle (make-mod3-cycle)])
  (let ([result1 #f]
        [result2 #f])
    (let ([t1 (thread (lambda ()
                        (set! result1 (read-string 5 mod3-cycle))))]
          [t2 (thread (lambda ()
                        (set! result2 (read-string 5 mod3-cycle))))])
      (thread-wait t1)
      (thread-wait t2)
      (string-append result1 "," result2))) ; => "02120,10201", maybe
  (let ([s (make-bytes 1)]
        [progress-evt (port-progress-evt mod3-cycle)])
    (peek-bytes-avail! s 0 progress-evt mod3-cycle) ; => 1
    s                                    ; => #"1"
    (port-commit-peeked 1 progress-evt (make-semaphore 1)
                           mod3-cycle)   ; => #t
    (sync/timeout 0 progress-evt)        ; => progress-evt
    (peek-bytes-avail! s 0 progress-evt mod3-cycle) ; => 0
    (port-commit-peeked 1 progress-evt (make-semaphore 1) 
                           mod3-cycle))  ; => #f
  (close-input-port mod3-cycle))

11.1.7.2  Custom Output

(make-output-port name-v evt write-proc close-proc [optional-write-special-proc optional-write-evt-proc optional-special-evt-proc optional-location-proc count-lines!-proc init-position optional-buffer-mode-proc]) creates an output port. The port is immediately open for writing. If the close-proc procedure has no side effects, then the port need not be explicitly closed. The port can buffer data within its write-proc and optional-write-special-proc procedures.

Examples:

;; A port that writes anything to nowhere:
(define /dev/null-out
  (make-output-port 
   'null
   always-evt
   (lambda (s start end non-block? breakable?) (- end start))
   void
   (lambda (special non-block? breakable?) #t)
   (lambda (s start end) (wrap-evt
                          always-evt
                          (lambda (x)
                            (- end start))))
   (lambda (special) always-evt)))
(display "hello" /dev/null-out)            ; => void
(write-bytes-avail #"hello" /dev/null-out) ; => 5
(write-special 'hello /dev/null-out)       ; => #t
(sync (write-bytes-avail-evt #"hello" /dev/null-out)) ; => 5

;; A port that accumulates bytes as characters in a list,
;;  but not in a thread-safe way:
(define accum-list null)
(define accumulator/not-thread-safe
  (make-output-port 
   'accum/not-thread-safe
   always-evt
   (lambda (s start end non-block? breakable?)
     (set! accum-list
           (append accum-list
                   (map integer->char
                        (bytes->list (subbytes s start end)))))
     (- end start))
   void))
(display "hello" accumulator/not-thread-safe)
accum-list ; => '(#\h #\e #\l #\l #\o)

;; Same as before, but with simple thread-safety:
(define accum-list null)
(define accumulator 
  (let* ([lock (make-semaphore 1)]
         [lock-peek-evt (semaphore-peek-evt lock)])
    (make-output-port
     'accum
     lock-peek-evt
     (lambda (s start end non-block? breakable?)
       (if (semaphore-try-wait? lock)
           (begin
             (set! accum-list
                   (append accum-list
                           (map integer->char
                                (bytes->list (subbytes s start end)))))
             (semaphore-post lock)
             (- end start))
           ;; Cheap strategy: block until the list is unlocked,
           ;;   then return 0, so we get called again
           (wrap-evt
            lock-peek-evt
            (lambda (x) 0))))
     void)))
(display "hello" accumulator)
accum-list ; => '(#\h #\e #\l #\l #\o)

;; A port that transforms data before sending it on
;;  to another port. Atomic writes exploit the
;;  underlying port's ability for atomic writes.
(define (make-latin-1-capitalize port)
  (define (byte-upcase s start end)
    (list->bytes
     (map (lambda (b) (char->integer
                       (char-upcase
                        (integer->char b))))
          (bytes->list (subbytes s start end)))))
  (make-output-port
   'byte-upcase
   ;; This port is ready when the original is ready:
   port
   ;; Writing procedure:
   (lambda (s start end non-block? breakable?)
     (let ([s (byte-upcase s start end)])
       (if non-block?
           (write-bytes-avail* s port)
           (begin
             (display s port)
             (bytes-length s)))))
   ;; Close procedure --- close original port:
   (lambda () (close-output-port port))
   #f
   ;; Write event:
   (and (port-writes-atomic? port)
        (lambda (s start end)
          (write-bytes-avail-evt (byte-upcase s start end) port)))))
(define orig-port (open-output-string))
(define cap-port (make-latin-1-capitalize orig-port))
(display "Hello" cap-port)
(get-output-string orig-port) ; => "HELLO"
(sync (write-bytes-avail-evt #"Bye" cap-port)) ; => 3
(get-output-string orig-port) ; => "HELLOBYE"

11.2  Reading and Writing

MzScheme's support for reading and writing includes many extensions compared to R5RS, both at the level of individual bytes and characters and at the level of S-expressions.

11.2.1  Reading Bytes, Characters, and Strings

In addition to the standard reading procedures, MzScheme provides byte-reading procedures, block-reading procedures such as read-line, and more.

11.2.1.1  Counting Positions, Lines, and Columns

By default, MzScheme keeps track of the position in a port as the number of bytes that have been read from or written to any port (independent of the read/write position, which is accessed or changed with file-position). Optionally, however, MzScheme can track the position in terms of characters (after UTF-8 decoding), instead of bytes, and it can track line locations and column locations; this optional tracking must be specifically enabled for a port via port-count-lines! or the port-count-lines-enabled parameter (see section 7.9.1.2). Position, line, and column locations for a port are used by read-syntax (see section 12.2 for more information) and read-honu-syntax. Position and line locations are numbered from 1; column locations are numbered from 0.

When counting lines, MzScheme treats linefeed, return, and return-linefeed combinations as a line terminator and as a single position (on all platforms). Each tab advances the column count to one before the next multiple of 8. When a sequence of bytes in the range 128 to 253 forms a UTF-8 encoding of a character, the position/column is incremented once for each byte, and then decremented appropriately when a complete encoding sequence is discovered. See also section 11.1 for more information on UTF-8 decoding for ports.

A position is known for any port as long as its value can be expressed as a fixnum (which is more than enough tracking for realistic applications in, say, syntax-error reporting). If the position for a port exceeds the value of the largest fixnum, then the position for the port becomes unknown, and line and column tracking is disabled. Return-linefeed combinations are treated as a single character position only when line and column counting is enabled.
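The numbering conventions can be checked with a string port. This sketch assumes port-next-location, which is not described in this section, returns the line, column, and position of the next item to be read, as three values.

```scheme
(define in (open-input-string "ab\ncd"))
(port-count-lines! in) ; enable line and column tracking for this port

(read-string 4 in) ; consumes "ab", the newline, and "c"

;; Assumption: port-next-location returns line, column, and position.
(let-values ([(line col pos) (port-next-location in)])
  (list line col pos)) ; => (2 1 5)
```

The result reflects the conventions stated above: the next character is on line 2 (lines count from 1), at column 1 (columns count from 0), and at position 5 (positions count from 1).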

Certain kinds of exceptions (see section 6.1) encapsulate source-location information using a srcloc structure, which has five fields:

The fields of a srcloc structure are immutable, so no field-mutator procedures are defined for srcloc. The srcloc structure type is transparent to all inspectors (see section 4.5).

11.2.2  Writing Bytes, Characters, and Strings

In addition to the standard printing procedures, MzScheme provides byte-writing procedures, block-writing procedures such as write-string, and more.

11.2.3  Writing Structured Data

The print procedure is used to print Scheme values in a context where a programmer expects to see a value:

The rationale for providing print is that display and write both have standard output conventions, and this standardization restricts the ways that an environment can change the behavior of these procedures. No output conventions should be assumed for print so that environments are free to modify the actual output generated by print in any way. Unlike the port display and write handlers, a global port print handler can be installed through the global-port-print-handler parameter (see section 7.9.1.2).

The fprintf, printf, and format procedures create formatted output:

When an illegal format string is supplied to one of these procedures, the exn:fail:contract exception is raised. The same exception is raised when the format string requires more additional arguments than are supplied, and when more additional arguments are supplied than are used by the format string.

For example,

(fprintf port "~a as a string is ~s.~n" '(3 4) "(3 4)")

prints this message to port:

(3 4) as a string is "(3 4)".

followed by a newline.
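The same directives can be tried with format, which returns the formatted text as a string instead of printing it. The handler result 'argument-mismatch in the second expression is an illustrative value chosen for this sketch.

```scheme
;; ~a displays, ~s writes, and ~n emits a newline.
(format "~a as a string is ~s.~n" '(3 4) "(3 4)")
; => "(3 4) as a string is \"(3 4)\".\n"

;; Too few arguments for the format string raises exn:fail:contract.
(with-handlers ([exn:fail:contract? (lambda (e) 'argument-mismatch)])
  (format "~a and ~a" 1))
; => argument-mismatch
```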

11.2.4  Default Reader

MzScheme's input parser obeys the following non-standard rules. See also section 11.2.8 for information on configuring the input parser through a readtable.

Reading from a custom port can produce arbitrary values generated by the port; see section 11.1.7 for details. If the port generates a non-character value in a position where a character is required (e.g., within a string), the exn:fail:read:non-char exception is raised.

11.2.5  Default Printer

MzScheme's printer obeys the following non-standard rules (though the rules for print do not apply when the print-honu parameter is set to #t; see section 7.9.1.4).

11.2.5.1  Sharing Structure in Input and Output

MzScheme can read and print Common LISP-style graphs, values with shared structure (including cycles). Graphs are described by tagging the shared structure once with #n= (using some decimal integer n with no more than eight digits) and then referencing it later with #n# (using the same number n). For example, the following datum represents the infinite list of ones:

#0=(1 . #0#)

If this graph is entered into MzScheme's read-eval-print loop, MzScheme's compiler will loop forever, trying to compile an infinite expression. In contrast, the following expression defines ones to be the infinite list of ones, using quote to hide the infinite list from the compiler:

(define ones (quote #0=(1 . #0#)))

A tagged structure can be referenced multiple times. Here, v is defined to be a vector containing the same cons cell in all three slots:

(define v #(#1=(1 . 2) #1# #1#))

A tag #n= must appear to the left of all references #n#, and all references must appear in the same top-level datum as the tag. By default, MzScheme's printer will display a value without showing the shared structure:

#((1 . 2) (1 . 2) (1 . 2))

Graph reading and printing are controlled with the read-accept-graph and print-graph boolean parameters (see section 7.9.1.4). Graph reading is enabled by default, and graph printing is disabled by default. However, when the printer encounters a graph containing a cycle, graph printing is automatically enabled, temporarily. (For this reason, the display, write, and print procedures require memory proportional to the depth of the value being printed.) When graph reading is disabled and a graph is provided as input, the exn:fail:read exception is raised.

If the n in a #n= form or a #n# form contains more than eight digits, the exn:fail:read exception is raised. If a #n# form is not preceded by a #n= form using the same n, the exn:fail:read exception is raised. If two #n= forms are in the same expression for the same n, the exn:fail:read exception is raised.
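Sharing can be observed by reading a datum from a string port and comparing the elements with eq?. This is a sketch; the handler result 'read-error is an illustrative value.

```scheme
;; With graph notation, all three slots hold the same pair.
(define shared (read (open-input-string "#(#1=(1 . 2) #1# #1#)")))
(eq? (vector-ref shared 0) (vector-ref shared 2)) ; => #t

;; Without it, each slot holds a distinct pair with equal contents.
(define copies (read (open-input-string "#((1 . 2) (1 . 2) (1 . 2))")))
(eq? (vector-ref copies 0) (vector-ref copies 2))    ; => #f
(equal? (vector-ref copies 0) (vector-ref copies 2)) ; => #t

;; A reference that is not preceded by its tag raises exn:fail:read.
(with-handlers ([exn:fail:read? (lambda (e) 'read-error)])
  (read (open-input-string "(#1# #1=a)")))
; => read-error
```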

11.2.6  Replacing the Reader

Each input port has its own port read handler. This handler is invoked to read from the port when the built-in read or read-syntax procedure is applied to the port. A port read handler is applied to either one argument or two arguments:

A port's read handler is configured with port-read-handler:

The default port read handler reads standard Scheme expressions with MzScheme's built-in parser (see section 11.2.4). It handles a special result from a custom input port (see section 11.1.7.1) by treating it as a single expression, except that special-comment values (see section 11.2.9.1) are treated as whitespace.

The read and read-syntax procedures themselves can be customized through a readtable; see section 11.2.8 for more information.

11.2.7  Replacing the Printer

Each output port has its own port display handler, port write handler, and port print handler. These handlers are invoked to output to the port when the standard display, write or print procedure is applied to the port. A port display/write/print handler takes two arguments: the value to be printed and the destination port. The handler's return value is ignored.

The default port display and write handlers print Scheme expressions with MzScheme's built-in printer (see section 11.2.5). The default print handler calls the global port print handler (the value of the global-port-print-handler parameter; see section 7.9.1.2); the default global port print handler is the same as the default write handler.
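A sketch of installing a display handler on a single port follows. It assumes the procedure port-display-handler, which is not described in this section, returns the current handler when given only a port and installs a new handler when given a port and a procedure. The new handler delegates to the saved default handler; calling display from inside the handler would invoke the handler again.

```scheme
(define o (open-output-string))

;; Assumption: with one argument, port-display-handler returns
;; the handler currently installed for the port.
(define default-display (port-display-handler o))

;; The handler takes the value and the destination port.
(port-display-handler
 o
 (lambda (v port)
   (default-display "<" port)
   (default-display v port)
   (default-display ">" port)))

(display 'x o)
(get-output-string o) ; => "<x>"
```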

11.2.8  Customizing the Reader through Readtables

A readtable configures MzScheme's built-in reader by adjusting the way that individual characters are parsed. MzScheme readtables are just like readtables in Common LISP, except that an individual readtable is immutable, and the procedures for creating and inspecting readtables are somewhat different than the Common LISP procedures.

The readtable is consulted at specific times by the reader:

In particular, after parsing a character that is mapped to the default behavior of semi-colon (``;''), the readtable is ignored until the comment's terminating newline is discovered. Similarly, the readtable does not affect string parsing until a closing double-quote is found. Meanwhile, if a character is mapped to the default behavior of an open parenthesis (``(''), then it starts a sequence that is closed by any character that is mapped to a close parenthesis (``)''). An apparent exception is that the default parsing of a vertical bar (``|'') quotes a symbol until a matching character is found, but the parser is simply using the character that started the quote; it does not consult the readtable.

For many contexts, #f identifies the default readtable for MzScheme. In particular, #f is the initial value for the current-readtable parameter (see section 7.9.1.3), which causes the reader to behave as described in section 11.2.4. Adjust MzScheme's default reader by setting the current-readtable parameter to a readtable created with make-readtable.

(make-readtable readtable [char-or-false symbol-or-char readtable-or-proc ···1]) creates a new readtable that is like readtable (which can be #f), except that the reader's behavior is modified for each char-or-false according to the given symbol-or-char and readtable-or-proc. The ···1 for make-readtable applies to all three of char-or-false, symbol-or-char, and readtable-or-proc; in other words, the total number of arguments to make-readtable must be one modulo three.

The possible combinations for char-or-false, symbol-or-char, and readtable-or-proc are as follows:

If multiple 'dispatch-macro mappings are provided for a single char-or-false, all but the last one are ignored. Similarly, if multiple non-'dispatch-macro mappings are provided for a single char-or-false, all but the last one are ignored.

A reader macro proc must accept six arguments, and it can optionally accept two arguments. See section 11.2.9 for information on the procedure's arguments and results.

A reader macro normally reads characters from the given input port to produce a value to be used as the ``reader macro-expansion'' of the consumed characters. The reader macro might produce a special-comment value to cause the consumed character to be treated as whitespace, and it might use read/recursive or read-syntax/recursive; see section 11.2.9.1 and section 11.2.9.2 for more information on these topics.

(readtable-mapping readtable char), where readtable is not #f, produces information about the mappings in readtable for char. The result is three values:

Note that reader-macro procedures for the default readtable are not directly accessible. To invoke default behaviors, use read/recursive or read-syntax/recursive (see section 11.2.9.2) with a character and the #f readtable.
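
Before the extended example, a smaller sketch may help: it maps a single character to the default behavior of another character and then inspects the mapping. The name underscore-table is made up for the illustration.

(define underscore-table
  ;; Make `_' behave like a space in the default readtable
  (make-readtable #f #\_ #\space #f))

(parameterize ([current-readtable underscore-table])
  (read (open-input-string "__(1__2)")))
;; => (1 2)

;; readtable-mapping returns three values; the first one
;; describes what `_' is mapped to
(let-values ([(like-ch/sym proc dispatch-proc)
              (readtable-mapping underscore-table #\_)])
  like-ch/sym)

The parameterize form matters here: elements inside the parentheses are parsed with the readtable in current-readtable, so the underscores between 1 and 2 are also treated as whitespace.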

Extended example:

;; Provides raise-read-error and raise-read-eof-error
(require (lib "readerr.ss" "syntax"))

(define (skip-whitespace port)
  ;; Skips whitespace characters, sensitive to the current
  ;; readtable's definition of whitespace
  (let ([ch (peek-char port)])
    (unless (eof-object? ch)
      ;; Consult current readtable:
      (let-values ([(like-ch/sym proc dispatch-proc) 
                    (readtable-mapping (current-readtable) ch)])
        ;; If like-ch/sym is whitespace, then ch is whitespace
        (when (and (char? like-ch/sym)
                   (char-whitespace? like-ch/sym))
          (read-char port)
          (skip-whitespace port))))))

(define (skip-comments read-one port src)
  ;; Recursive read, but skip comments and detect EOF
  (let loop ()
    (let ([v (read-one)])
      (cond
       [(special-comment? v) (loop)]
       [(eof-object? v)
        (let-values ([(l c p) (port-next-location port)])
          (raise-read-eof-error "unexpected EOF in tuple" src l c p 1))]
       [else v]))))

(define (parse port read-one src)
  ;; First, check for empty tuple
  (skip-whitespace port)
  (if (eqv? #\> (peek-char port))
      ;; Empty tuple: consume the closer before returning
      (begin (read-char port) null)
      (let ([elem (read-one)])
        (if (special-comment? elem)
            ;; Found a comment, so look for > again
            (parse port read-one src)
            ;; Non-empty tuple:
            (cons elem
                  (parse-nonempty port read-one src))))))

(define (parse-nonempty port read-one src)
  ;; Need a comma or closer
  (skip-whitespace port)
  (case (peek-char port)
    [(#\>) (read-char port)
     ;; Done
     null]
    [(#\,) (read-char port)
     ;; Read next element and recur
     (cons (skip-comments read-one port src)
           (parse-nonempty port read-one src))]
    [else
     ;; Either a comment or an error; grab location (in case
     ;; of error) and read recursively to detect comments
     (let-values ([(l c p) (port-next-location port)]
                  [(v) (read-one)])
       (cond
        [(special-comment? v)
         ;; It was a comment, so try again
         (parse-nonempty port read-one src)]
        [else
         ;; Wasn't a comment, comma, or closer; error
         ((if (eof-object? v) raise-read-eof-error raise-read-error)
          "expected `,' or `>'" src l c p 1)]))]))

(define (make-delims-table)
  ;; Table to use for recursive reads to disallow delimiters
  ;;  (except those in sub-expressions)
  (letrec ([misplaced-delimiter 
            (case-lambda
             [(ch port) (misplaced-delimiter ch port #f #f #f #f)]
             [(ch port src line col pos)
              (raise-read-error 
               (format "misplaced `~a' in tuple" ch) src line col pos 1)])])
    (make-readtable (current-readtable)
                    #\, 'terminating-macro misplaced-delimiter
                    #\> 'terminating-macro misplaced-delimiter)))

(define (wrap l) 
  `(make-tuple (list ,@l)))

(define parse-open-tuple
  (case-lambda
   [(ch port) 
    ;; `read' mode
    (wrap (parse port 
                 (lambda () (read/recursive port #f 
                                            (make-delims-table)))
                 (object-name port)))]
   [(ch port src line col pos)
    ;; `read-syntax' mode
    (datum->syntax-object
     #f
     (wrap (parse port 
                  (lambda () (read-syntax/recursive src port #f 
                                                    (make-delims-table)))
                  src))
     (let-values ([(l c p) (port-next-location port)])
       (list src line col pos (and pos (- p pos)))))]))
    

(define tuple-readtable
  (make-readtable #f #\< 'terminating-macro parse-open-tuple))

(parameterize ([current-readtable tuple-readtable])
  (read (open-input-string "<1 , 2 , \"a\">")))
;; => '(make-tuple (list 1 2 "a"))

(parameterize ([current-readtable tuple-readtable])
  (read (open-input-string "< #||# 1 #||# , #||# 2 #||# , #||# \"a\" #||# >")))
;; => '(make-tuple (list 1 2 "a"))

(define tuple-readtable+
  (make-readtable tuple-readtable
                  #\* 'terminating-macro (lambda a (make-special-comment #f))
                  #\_ #\space #f))
(parameterize ([current-readtable tuple-readtable+])
  (read (open-input-string "< * 1 __,__  2 __,__ * \"a\" * >")))
;; => '(make-tuple (list 1 2 "a"))

11.2.9  Reader-Extension Procedures

MzScheme's reader can be extended in three ways: through a reader-macro procedure in a readtable (see section 11.2.8), through a #reader form (see section 11.2.4), or through a custom-port byte reader that returns a ``special'' result procedure (see section 11.1.7.1). All three kinds of procedures accept similar arguments, and their results are treated in the same way by read and read-syntax (or, more precisely, by the default read handler; see section 11.2.6).

Calls to these reader-extension procedures can be triggered through read, read/recursive, read-syntax, or read-honu-syntax. In addition, a special-read procedure can be triggered by calls to read-honu, read-honu/recursive, read-honu-syntax, read-honu-syntax/recursive, read-char-or-special, or by the context of read-bytes-avail!, read-bytes-avail!*, peek-bytes-avail!, and peek-bytes-avail!*.

Optional arities for reader-macro and special-result procedures allow them to distinguish reads via read, etc. from reads via read-syntax, etc. in the case that the source value is #f and no other location information is available.

Procedure arguments

A reader-macro procedure must accept six arguments, and it can optionally accept two arguments. The first two arguments are always the character that triggered the reader macro and the input port for reading. When the reader macro is triggered by read-syntax (or read-syntax/recursive), the procedure is passed four additional arguments that represent a source location. When the reader macro is triggered by read (or read/recursive), the procedure is passed only two arguments if it accepts two arguments, otherwise it is passed six arguments where the last four are all #f.

A #reader-loaded procedure accepts the same arguments as either read or read-syntax, depending on whether the procedure was loaded through read, etc. or through read-syntax, etc.

A special-result procedure must accept four arguments, and it can optionally accept zero arguments. When the special read is triggered by read-syntax (or read-honu-syntax, read-syntax/recursive, etc.), the procedure is passed four arguments that represent a source location. When the special read is triggered by read (or read-char-or-special, read-honu, read/recursive, etc.), the procedure is passed no arguments if it accepts zero arguments, otherwise it is passed four arguments that are all #f.

Procedure result

When a reader-extension procedure is called in syntax-reading mode (via read-syntax, etc.), it should generally return a syntax object that has no lexical context (e.g., a syntax object created using datum->syntax-object with #f as the first argument and with the given location information as the third argument). Another possible result is a special-comment value (see section 11.2.9.1). If the procedure's result is not a syntax object and not a special-comment value, it is converted to one using datum->syntax-object.

When a reader-extension procedure is called in non-syntax-reading modes, it should generally not return a syntax object. If a syntax object is returned, it is converted to a plain value using syntax-object->datum.

In either context, when the result from a reader-extension procedure is a special-comment value (see section 11.2.9.1), then read, read-syntax, etc. treat the value as a delimiting comment and otherwise ignore it.

Also in either context, the result may be copied to prevent mutation to pairs, vectors, or boxes before the read result is completed, and to support the construction of graphs with cycles. Mutable pairs, boxes, and vectors are copied, along with any pairs, boxes, or vectors that lead to such mutable values, to placeholders produced by a recursive read (see section 11.2.9.2), or to references of a shared value. Graph structure (including cycles) is preserved in the copy.

11.2.9.1  Special Comments

(make-special-comment v) creates a special-comment value that encapsulates v. The read, read-syntax, etc. procedures treat values constructed with make-special-comment as delimiting whitespace when returned by a reader-extension procedure (see section 11.2.9).

(special-comment? v) returns #t if v is the result of make-special-comment, #f otherwise.

(special-comment-value sc) returns the value encapsulated by the special-comment value sc. This value is never used directly by a reader, but it might be used by the context of a read-char-or-special, etc. call that detects a special comment.
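
A short sketch of the three procedures, using a made-up name sc and an arbitrary encapsulated value:

(define sc (make-special-comment 'note))

(special-comment? sc)       ; => #t
(special-comment? 'note)    ; => #f
(special-comment-value sc)  ; => note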

11.2.9.2  Recursive Reads

(read/recursive [input-port char-or-false readtable]) is similar to calling read, but it is normally used during the dynamic extent of read within a reader-extension procedure (see section 11.2.9). The main effect of using read/recursive instead of read is that graph-structure annotations (see section 11.2.5.1) in the nested read are considered part of the overall read. Since the result is wrapped in a placeholder, however, it is not directly inspectable.

If char-or-false is provided and not #f, it is effectively prefixed to the beginning of input-port's stream for the read. (To prefix multiple characters, use input-port-append from MzLib's port library; see Chapter 35 in PLT MzLib: Libraries Manual.)

The readtable argument, which defaults to (current-readtable), is used for top-level parsing to satisfy the read request; recursive parsing within the read (e.g., to read the elements of a list) instead uses the current readtable as determined by the current-readtable parameter. A reader macro might call read/recursive with a character and readtable to effectively invoke the readtable's behavior for the character. If readtable is #f, the default readtable is used for top-level parsing.

When called within the dynamic extent of read, the read/recursive procedure produces either an opaque placeholder value, a special-comment value, or an end-of-file. The result is a special-comment value (see section 11.2.9.1) when the input stream's first non-whitespace content parses as a comment. The result is end-of-file when read/recursive encounters an end-of-file. Otherwise, the result is a placeholder that protects graph references that are not yet resolved. When this placeholder is returned within an S-expression that is produced by any reader-extension procedure (see section 11.2.9) for the same outermost read, it will be replaced with the actual read value before the outermost read returns.

(read-syntax/recursive [source-name-v input-port char-or-false readtable]) is analogous to calling read/recursive, but the resulting value encapsulates S-expression structure with source-location information. As with read/recursive, when read-syntax/recursive is used within the dynamic extent of read-syntax, the result from read-syntax/recursive is either a special-comment value, end-of-file, or an opaque graph-structure placeholder (not a syntax object). The placeholder can be embedded in an S-expression or syntax object returned by a reader macro, etc., and it will be replaced with the actual syntax object before the outermost read-syntax returns.

Using read/recursive within the dynamic extent of read-syntax does not allow graph structure for reading to be included in the outer read-syntax parsing, and neither does using read-syntax/recursive within the dynamic extent of read. In those cases, read/recursive and read-syntax/recursive produce results like read and read-syntax.

See section 11.2.8 for an extended example that uses read/recursive and read-syntax/recursive.
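
For a more compact illustration, the following sketch defines a reader macro that wraps the next datum in a list headed by the symbol dollar. The names read-dollar and dollar-readtable are made up for the illustration, and the sketch ignores the possibility that the recursive read produces a special-comment value or an end-of-file.

(define read-dollar
  (case-lambda
   [(ch port)
    ;; `read' mode: the placeholder returned by read/recursive is
    ;; embedded in the result and replaced with the actual value
    ;; before the outermost read returns
    (list 'dollar (read/recursive port))]
   [(ch port src line col pos)
    ;; `read-syntax' mode
    (datum->syntax-object
     #f
     (list 'dollar (read-syntax/recursive src port))
     (list src line col pos #f))]))

(define dollar-readtable
  (make-readtable #f #\$ 'terminating-macro read-dollar))

(parameterize ([current-readtable dollar-readtable])
  (read (open-input-string "$(a $b)")))
;; => (dollar (a (dollar b)))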

11.2.10  Customizing the Printer through Custom-Write Procedures

The built-in prop:custom-write structure type property associates a procedure with a structure type. The procedure is used by the default printer to display or write (or print) instances of the structure type.

See section 4.4 for general information on structure type properties.

The procedure for a prop:custom-write value takes three arguments: the structure to be printed, the target port, and a boolean that is #t for write mode and #f for display mode. The procedure should print the value to the given port using write, display, fprintf, write-special, etc.

The write handler, display handler, and print handler are specially configured for a port given to a custom-write procedure. Printing to the port through display, write, or print prints a value recursively with sharing annotations. To avoid a recursive print (i.e., to print without regard to sharing with a value currently being printed), print instead to a string or pipe and transfer the result to the target port using write-string and write-special. To recursively print but to a port other than the one given to the custom-write procedure, copy the given port's write handler, display handler, and print handler to the other port.

The port given to a custom-write procedure is not necessarily the actual target port. In particular, to detect cycles and sharing, the printer invokes a custom-write procedure with a port that records recursive prints, and does not retain any other output.

Recursive print operations may trigger an escape from the call to the custom-write procedure (e.g., for pretty-printing where a tentative print attempt overflows the line, or for printing error output of a limited width).

The following example definition of a tuple type includes custom-write procedures that print the tuple's list content using angle brackets in write mode and no brackets in display mode. Elements of the tuple are printed recursively, so that graph and cycle structure can be represented.

(define (tuple-print tuple port write?)
  (when write? (write-string "<" port))
  (let ([l (tuple-ref tuple 0)])
    (unless (null? l)
      ((if write? write display) (car l) port)
      (for-each (lambda (e)
                  (write-string ", " port)
                  ((if write? write display) e port))
                (cdr l))))
  (when write? (write-string ">" port)))

(define-values (s:tuple make-tuple tuple? tuple-ref tuple-set!)
  (make-struct-type 'tuple #f 1 0 #f
                    (list (cons prop:custom-write tuple-print))))

(display (make-tuple '(1 2 "a"))) ; prints 1, 2, a

(let ([t (make-tuple (list 1 2 "a"))])
  (set-car! (tuple-ref t 0) t)
  (write t))  ; prints #0=<#0#, 2, "a">

11.3  Filesystem Utilities

MzScheme provides many operations for accessing and modifying filesystems in a (mostly) platform-independent manner. Additional filesystem utilities are in MzLib; see also Chapter 20 in PLT MzLib: Libraries Manual.

11.3.1  Paths

The format of a filesystem path varies across platforms. For example, under Unix, directories are separated by ``/'' while Windows uses both ``/'' and ``\''. Furthermore, for most Unix filesystems, the true name of a file is a byte string, but users prefer to see the bytes decoded in a locale-specific way when the filename is printed. MzScheme therefore provides a path datatype for managing filesystem paths, and procedures such as build-path, path->string, and bytes->path for manipulating paths.

When a MzScheme procedure takes a filesystem path as an argument, the path can be provided either as a string or as an instance of the path datatype. If a string is provided, it is converted to a path using string->path. A MzScheme procedure that generates a filesystem path always generates a path value.

By default, paths are created and manipulated for the current platform, but procedures that merely manipulate paths (without using the filesystem) can manipulate paths using conventions for other supported platforms. The bytes->path procedure accepts an optional argument that indicates the platform for the path, either 'unix or 'windows. For other functions, such as build-path or simplify-path, the behavior is sensitive to the kind of path that is supplied. Unless otherwise specified, a procedure that requires a path accepts only paths for the current platform.

Two path values are equal? when they use the same convention type and when their byte-string representations are equal?. A path string (or byte string) cannot be empty, and it cannot contain a nul character or byte. When an empty string or a string containing nul is provided as a path to any procedure except absolute-path?, relative-path?, or complete-path?, the exn:fail:contract exception is raised.
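
The distinction between strings and path values can be sketched as follows. The file and directory names are arbitrary, and the printed form of the path depends on the platform.

(define p (build-path "docs" "notes.txt"))

(path? p)         ; => #t
(path? "docs")    ; => #f: a string is accepted where a path is
                  ;    expected, but it is not itself a path value
(path->string p)  ; "docs/notes.txt" under Unix

;; Paths built from the same ASCII name for the current platform
;; have equal byte-string representations, so they are equal?
(equal? (string->path "docs") (bytes->path #"docs"))
;; => #t

;; A path that follows Windows conventions, regardless of the
;; current platform
(define wp (bytes->path #"C:\\temp\\notes.txt" 'windows))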

Most MzScheme primitives that take a path perform an expansion on the path before using it. Procedures that build paths or merely check the form of a path do not perform this expansion, with the exception of simplify-path for Windows paths. For more information about path expansion and other platform-specific details, see section 20.1 for Unix and Mac OS X paths and section 20.2 for Windows paths.

The basic path utilities are as follows:

11.3.2  Locating Paths

The find-system-path and find-executable-path procedures locate useful files and directories:

11.3.3  Files

The file management utilities are:

11.3.4  Directories

The directory management utilities are:

11.4  Networking

MzScheme supports networking with the TCP and UDP protocols.

11.4.1  TCP

For information about TCP in general, see TCP/IP Illustrated, Volume 1 by W. Richard Stevens.
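
As a rough sketch of a TCP round trip on one machine, the following listens on a port, connects to it, and passes one line from the client side to the server side. The port number 45678 is an arbitrary assumption (it must be free), and the sketch assumes that "localhost" resolves to an address on which the listener accepts connections.

;; Listen on the port, allowing up to 4 waiting connections and
;; allowing the port to be reused
(define listener (tcp-listen 45678 4 #t))

;; Client side: connect to the listener on the local machine
(define-values (client-in client-out)
  (tcp-connect "localhost" 45678))

;; Server side: accept the pending connection
(define-values (server-in server-out)
  (tcp-accept listener))

(write-string "ping\n" client-out)
(flush-output client-out)  ; TCP output ports are buffered
(read-line server-in)
;; => "ping"

(close-output-port client-out)
(close-input-port client-in)
(close-output-port server-out)
(close-input-port server-in)
(tcp-close listener)

A real server would normally call tcp-accept in a loop and handle each connection in its own thread; the single-threaded version here only works because the connection request waits in the listener's queue until it is accepted.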

11.4.2  UDP

For information about UDP in general, see TCP/IP Illustrated, Volume 1 by W. Richard Stevens (which discusses UDP in addition to TCP).


30 63 is the same as (char->integer #\?).

31 Flushing is performed by the default port read handler (see section 11.2.6) rather than by read itself.

32 This non-byte result is not intended to return a character or eof; in particular, read-char raises an exception if it encounters a non-byte from a port.

33 More precisely, the procedure is used by the default port read handler; see also section 11.2.6.

34 A temporary string of size k is allocated while reading the input, even if the size of the result is less than k characters.

35 Only mid-stream eofs can be committed. An eof when the port is exhausted does not correspond to data in the stream.

36 Assuming that the current port display and write handlers are the default ones; see section 11.2.7 for more information.

37 The port read handler is not used for read/recursive or read-syntax/recursive.

38 For MrEd, the executable path is the name of a MrEd executable.

39 Under Windows, file-exists? reports #t for all variations of the special filenames (e.g., "LPT1", "x:/baddir/LPT1").

40 For FAT filesystems under Windows, directories do not have modification dates. Therefore, the creation date is returned for a directory (but the modification date is returned for a file).

41 MzScheme implements a listener with multiple sockets, if necessary, to accommodate multiple addresses with different protocol families. Under Linux, if hostname-string-or-false maps to both IPv4 and IPv6 addresses, then the behavior depends on whether IPv6 is supported and IPv6 sockets can be configured to listen to only IPv6 connections: if IPv6 is not supported or IPv6 sockets are not configurable, then the IPv6 addresses are ignored; otherwise, each IPv6 listener accepts only IPv6 connections.

42 If hostname-string is associated with multiple addresses, they are tried one at a time until a connection succeeds. The name "localhost" generally specifies the local machine.

43 The TCP protocol does not include a ``no longer reading'' state on connections, so tcp-abandon-port is equivalent to close-input-port on input TCP ports.

44 For most machines, the answer corresponds to the current machine's only Internet address. But when a machine serves multiple addresses, the result is connection-specific.