Chapter 11

Input and Output

11.1 Ports

By definition, ports in MzScheme produce and consume bytes. When a port is provided to a character-based operation, such as read, the port's bytes are read and interpreted as a UTF-8 encoding of characters (see also section 1.2.3). Thus, reading a single character may require reading multiple bytes, and a procedure like char-ready? may need to peek several bytes into the stream to determine whether a character is available. In the case of a byte stream that does not correspond to a valid UTF-8 encoding, functions such as read-char may need to peek one byte ahead in the stream to discover that the stream is not a valid encoding.

When an input port produces a sequence of bytes that is not a valid UTF-8 encoding in a character-reading context, then bytes that constitute an invalid sequence are converted to the character ``?''. Specifically, bytes 255 and 254 are always converted to ``?'', bytes in the range 192 to 253 produce ``?'' when they are not followed by bytes that form a valid UTF-8 encoding, and bytes in the range 128 to 191 are converted to ``?'' when they are not part of a valid encoding that was started by a preceding byte in the range 192 to 253. To put it another way, when reading a sequence of bytes as characters, a minimal set of bytes is changed to 63 ³⁰ so that the entire sequence of bytes is a valid UTF-8 encoding.
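As a small illustration of the decoding rule, the following sketch reads characters from byte-string ports (open-input-bytes is described in section 11.1.5). The names valid-in, invalid-in, and so on are chosen only for this example.

```scheme
;; Bytes 206 and 187 are the UTF-8 encoding of the single character U+03BB.
(define valid-in (open-input-bytes (bytes 206 187)))
(define decoded (read-char valid-in))
(define decoded-code (char->integer decoded))   ; => 955
(define bytes-consumed (file-position valid-in)) ; => 2: one character, two bytes

;; Byte 255 can never appear in a valid UTF-8 encoding, so by the rule
;; described above it is expected to be read as the character ``?''.
(define invalid-in (open-input-bytes (bytes 255 65)))
(define replaced (read-char invalid-in))
;; The following byte, 65, is valid on its own and decodes normally.
(define following (read-char invalid-in))        ; => #\A
```

Only the invalid byte is replaced; the valid byte after it is unaffected.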

See section 3.6 for procedures that facilitate conversions using UTF-8 or other encodings. See also reencode-input-port and reencode-output-port in Chapter 35 in PLT MzLib: Libraries Manual for obtaining a UTF-8-based port from one that uses a different encoding of characters.

(port? v) returns #t if either (input-port? v) or (output-port? v) is #t, #f otherwise.

(port-closed? port) returns #t if the input or output port port is closed, #f otherwise.

(file-stream-port? port) returns #t if the given port is a file-stream port (see section 11.1.6), #f otherwise.

(terminal-port? port) returns #t if the given port is attached to an interactive terminal, #f otherwise.
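The predicates above can be tried on a string port (see section 11.1.5), which is a port but neither a file-stream port nor a terminal port. This is a minimal sketch; the name p is used only for illustration.

```scheme
(define p (open-input-string "abc"))

(define is-port? (port? p))                ; => #t
(define string-is-port? (port? "abc"))     ; => #f: a string is not a port
(define is-file-stream? (file-stream-port? p)) ; => #f
(define is-terminal? (terminal-port? p))   ; => #f
(define closed-before? (port-closed? p))   ; => #f

(close-input-port p)
(define closed-after? (port-closed? p))    ; => #t
```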

11.1.1 End-of-File Constant

The global variable eof is bound to the end-of-file value. The standard Scheme predicate eof-object? returns #t only when applied to this value.

Reading from a port produces an end-of-file result when the port has no more data, but some ports may also return end-of-file mid-stream. For example, a port connected to a Unix terminal returns an end-of-file when the user types control-d; if the user provides more input, the port returns additional bytes after the end-of-file.
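A minimal sketch of the end-of-file value, using a string port (section 11.1.5) so that the example does not depend on a terminal or a file:

```scheme
(define in (open-input-string "a"))

(define only-char (read-char in))        ; => #\a
(define at-end (read-char in))           ; no more data: the end-of-file value
(define is-eof? (eof-object? at-end))    ; => #t
(define same-as-eof? (eq? at-end eof))   ; => #t: eof is bound to that value
(define still-eof? (eof-object? (read-char in))) ; => #t on a string port
```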

11.1.2 Current Ports

The standard Scheme procedures current-input-port and current-output-port are implemented as parameters in MzScheme. See section 7.9.1.2 for more information.
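Because current-output-port is a parameter, output can be redirected for the dynamic extent of an expression with parameterize. The sketch below captures output in a string port (section 11.1.5); the name collected is illustrative.

```scheme
(define collected (open-output-string))

;; Within the body, display writes to collected instead of the
;; original current output port.
(parameterize ([current-output-port collected])
  (display "captured"))

(define result (get-output-string collected)) ; => "captured"
```

After the parameterize body returns, the previous current output port is in effect again.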

11.1.3 Opening File Ports

The open-input-file and open-output-file procedures accept an optional flag argument after the filename that specifies a mode for the file:

'binary -- bytes are returned from the port exactly as they are read from the file. Binary mode is the default mode.
'text -- return and linefeed bytes (13 and 10) that are written to and read from the file are filtered by the port in a platform-specific manner:
- Unix and Mac OS X: no filtering occurs.
- Windows reading: a return-linefeed combination from a file is returned by the port as a single linefeed; no filtering occurs for return bytes that are not followed by a linefeed, or for a linefeed that is not preceded by a return.
- Windows writing: a linefeed written to the port is translated into a return-linefeed combination in the file; no filtering occurs for returns.
In Windows, 'text mode works only with regular files; attempting to use 'text with other kinds of files triggers an exn:fail:filesystem exception.

The open-output-file procedure can also take a flag argument that specifies how to proceed when a file with the specified name already exists:

'error -- raise exn:fail:filesystem (this is the default)
'replace -- remove the old file and write a new one
'truncate -- overwrite the old data
'truncate/replace -- try 'truncate; if it fails, try 'replace
'append -- append to the end of the file under Unix and Mac OS X; under Windows, 'append is equivalent to 'update, except that the file position is immediately set to the end of the file after opening it
'update -- open an existing file without truncating it; if the file does not exist, the exn:fail:filesystem exception is raised

The open-input-output-file procedure takes the same arguments as open-output-file, but it produces two values: an input port and an output port. The two ports are connected in that they share the underlying file device. This procedure is intended for use with special devices that can be opened by only one process, such as COM1 in Windows. For regular files, sharing the device can be confusing. For example, using one port does not automatically flush the other port's buffer (see section 11.1.6 for more information about buffers), and reading or writing in one port moves the file position (if any) for the other port. For regular files, use separate open-input-file and open-output-file calls to avoid confusion.

Extra flag arguments are passed to open-output-file in any order. Appropriate flag arguments can also be passed as the last argument(s) to call-with-input-file, with-input-from-file, call-with-output-file, and with-output-to-file. When conflicting flag arguments (e.g., both 'error and 'replace) are provided to open-output-file, with-output-to-file, or call-with-output-file, the exn:fail:contract exception is raised.

Both with-input-from-file and with-output-to-file close the port they create if control jumps out of the supplied thunk (either through a continuation or an exception), and the port remains closed if control jumps back into the thunk. The current input or output port is installed and restored with parameterize (see section 7.9.2).
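The flag arguments described in this section might be combined as in the following sketch. The file name flags-demo.txt is hypothetical, and the example assumes that the current directory is writable.

```scheme
;; A hypothetical file name in the current directory.
(define demo-file "flags-demo.txt")

;; 'truncate overwrites the old data if the file already exists.
(call-with-output-file demo-file
  (lambda (out)
    (display "first line" out)
    (newline out))
  'truncate)

;; Flags are passed as the last arguments, in any order. Here the
;; file is opened in text mode and new output is appended.
(with-output-to-file demo-file
  (lambda ()
    (display "second line")
    (newline))
  'text 'append)

;; Reading in 'text mode undoes any platform-specific line filtering.
(define lines
  (call-with-input-file demo-file
    (lambda (in)
      (let* ([l1 (read-line in)]
             [l2 (read-line in)])
        (list l1 l2)))
    'text))
;; lines => ("first line" "second line")
```

Passing both 'truncate and 'error in one call would instead raise exn:fail:contract, as described above.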

See section 11.1.6 for more information on file ports. When an input or output file-stream port is created, it is placed into the management of the current custodian (see section 9.2).

11.1.4 Pipes

(make-pipe [limit-k input-name-v output-name-v]) returns two port values (see section 2.2): the first port is an input port and the second is an output port. Data written to the output port is read from the input port. The ports do not need to be explicitly closed.

The optional limit-k argument can be #f or a positive exact integer. If limit-k is omitted or #f, the new pipe holds an unlimited number of unread bytes (i.e., limited only by the available memory). If limit-k is a positive number, then the pipe will hold at most limit-k unread/unpeeked bytes; writing to the pipe's output port thereafter will block until a read or peek from the input port makes more space available. (Peeks effectively extend the port's capacity until the peeked bytes are read.)

The optional input-name-v and output-name-v are used as the names for the returned input and output ports, respectively, if they are supplied. Otherwise, the name of each port is 'pipe.

(pipe-content-length pipe-port) returns the number of bytes contained in a pipe, where pipe-port is either of the pipe's ports produced by make-pipe. The pipe's content length counts all bytes that have been written to the pipe and not yet read (though possibly peeked).
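The following sketch shows a bounded pipe in use. The limit of 8 bytes and the port names are arbitrary choices for illustration.

```scheme
(define-values (in out) (make-pipe 8))

(write-bytes #"hello" out)
(define len-after-write (pipe-content-length in))  ; => 5

;; Peeking does not remove bytes, so the content length is unchanged.
;; Either port of the pipe can be used to ask for the length.
(define peeked (peek-bytes 2 0 in))                ; => #"he"
(define len-after-peek (pipe-content-length out))  ; => 5

(define data (read-bytes 5 in))                    ; => #"hello"
(define len-after-read (pipe-content-length in))   ; => 0

;; Closing the output end makes the input end report end-of-file
;; once all written bytes have been read.
(close-output-port out)
(define drained? (eof-object? (read-byte in)))     ; => #t
```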

11.1.5 String Ports

Scheme input and output can be read from or collected into a string or byte string:

(open-input-bytes bytes [name-v]) creates an input port that reads bytes from bytes (see section 3.6). Modifying bytes afterward does not affect the byte stream produced by the port. The optional name-v argument is used as the name for the returned port; the default is 'string.
(open-input-string string [name-v]) creates an input port that reads bytes from the UTF-8 encoding (see section 1.2.3) of string. The optional name-v argument is used as the name for the returned port; the default is 'string.
(open-output-bytes [name-v]) creates an output port that accumulates the output into a byte string. The optional name-v argument is used as the name for the returned port; the default is 'string.
(open-output-string [name-v]) creates an output port that accumulates the output into a byte string. This procedure is the same as open-output-bytes.
(get-output-bytes string-output-port [reset? start-k end-k]) returns the bytes accumulated in string-output-port so far in a freshly-allocated byte string (including any bytes written after the port's current position, if any). If reset? is true, then all bytes are removed from the port, and the port's position is reset to 0; if reset? is #f (the default), then all bytes remain in the port for further accumulation (so they are returned for later calls to get-output-bytes or get-output-string), and the port's position is unchanged. The start-k and end-k arguments specify the range of bytes in the port to return; supplying start-k and end-k is the same as using subbytes on the result of get-output-bytes, but supplying them to get-output-bytes can avoid an allocation. The end-k argument can be #f, which corresponds to not passing a second argument to subbytes.
(get-output-string string-output-port) returns (bytes->string/utf-8 (get-output-bytes string-output-port) #\?); see also section 3.6.

String input and output ports do not need to be explicitly closed. The file-position procedure, described in section 11.1.6, works for string ports in position-setting mode.

Example:

(define i (open-input-string "hello world"))
(define o (open-output-string))
(write (read i) o)
(get-output-string o) ; => "hello"
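The optional arguments to get-output-bytes can be sketched as follows; the variable names are illustrative only.

```scheme
(define out (open-output-bytes))
(write-bytes #"abcdef" out)

;; Without reset?, the accumulated bytes stay in the port.
(define all (get-output-bytes out))            ; => #"abcdef"

;; start-k and end-k select a range, like subbytes on the full result.
(define middle (get-output-bytes out #f 1 3))  ; => #"bc"

;; With a true reset?, the bytes are returned and then removed.
(define drained (get-output-bytes out #t))     ; => #"abcdef"
(define after-reset (get-output-bytes out))    ; => #""
```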

11.1.6 File-Stream Ports

A port created by open-input-file, open-output-file, subprocess, and related functions is a file-stream port. The initial input, output, and error ports in stand-alone MzScheme are also file-stream ports. The file-stream-port? predicate recognizes file-stream ports.

An input port is block buffered by default, which means that on any read, the buffer is filled with immediately-available bytes to speed up future reads. Thus, if a file is modified between a pair of reads to the file, the second read can produce stale data. Calling file-position to set an input port's file position flushes its buffer.

Most output ports are block buffered by default, but a terminal output port is line buffered, and the error output port is unbuffered. An output buffer is filled with a sequence of written bytes to be committed as a group, either when the buffer is full (in block mode) or when a newline is written (in line mode).

A port's buffering can be changed via file-stream-buffer-mode (described below). The two ports produced by open-input-output-file have independent buffers.

The following procedures work primarily on file-stream ports:

(flush-output [output-port]) forces all buffered data in the given output port to be physically written. If output-port is omitted, then the current output port is flushed. Only file-stream ports and custom ports (see section 11.1.7) use buffers; when called on a port without a buffer, flush-output has no effect.

By default, a file-stream port is block-buffered, but this behavior can be modified with file-stream-buffer-mode. In addition, the initial current output and error ports are automatically flushed when read³¹, read-line, read-bytes, read-string, etc. are performed on the initial standard input port.
(file-stream-buffer-mode port [mode-symbol]) gets or sets the buffer mode for port, if possible. All file-stream ports support setting the buffer mode, TCP ports (see section 11.4) support setting and getting the buffer mode, and custom ports (see section 11.1.7) may support getting and setting buffer modes.

If mode-symbol is provided, it must be one of 'none, 'line (output only), or 'block, and the port's buffering is set accordingly. If the port does not support setting the mode, the exn:fail exception is raised. If port is an input port and mode-symbol is 'line, the exn:fail:contract exception is raised.

If mode-symbol is not provided, the current mode is returned, or #f is returned if the mode cannot be determined.

For an input port, peeking always places peeked bytes into the port's buffer, even when the port's buffer mode is 'none; furthermore, on some platforms, testing the port for input (via char-ready? or sync) may be implemented with a peek. If an input port's buffer mode is 'none, then at most one byte is read for read-bytes-avail!*, read-bytes-avail!, peek-bytes-avail!*, or peek-bytes-avail!; if any bytes are buffered in the port (e.g., to satisfy a previous peek), the procedures may access multiple buffered bytes, but no further bytes are read.
(file-position port) returns the current read/write position of port. For file-stream and string ports, (file-position port k-or-eof) sets the read/write position to k-or-eof relative to the beginning of the file/string if k-or-eof is a number, or to the current end of the file/string if k-or-eof is eof. In position-setting mode, file-position raises the exn:fail:contract exception for port kinds other than file-stream and string ports. Calling file-position without a position on a non-file/non-string input port returns the number of bytes that have been read from that port if the position is known (see section 11.2.1.1), otherwise the exn:fail:filesystem exception is raised.

When (file-position port k) sets the position k beyond the current size of an output file or string, the file/string is enlarged to size k and the new region is filled with #\nul. If k is beyond the end of an input file or string, then reading thereafter returns eof without changing the port's position.

Not all file-stream ports support setting the position. If file-position is called with a position argument on such a file-stream port, the exn:fail:filesystem exception is raised.

When changing the file position for an output port, the port is first flushed if its buffer is not empty. Similarly, setting the position for an input port clears the port's buffer (even if the new position is the same as the old position). However, although input and output ports produced by open-input-output-file share the file position, setting the position via one port does not flush the other port's buffer.
(port-file-identity file-stream-port) returns an exact positive integer that represents the identity of the device and file read or written by file-stream-port. For two ports whose open times overlap, the result of port-file-identity is the same for both ports if and only if the ports access the same device and file. For ports whose open times do not overlap, no guarantee is provided for the port identities (even if the ports actually access the same file) -- except as can be inferred through relationships with other ports. If file-stream-port is closed, the exn:fail exception is raised. Under Windows 95, 98, and Me, if file-stream-port is connected to a pipe instead of a file, the exn:fail:filesystem exception is raised.
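The procedures in this section might be used together as in the sketch below. The file name position-demo.txt is hypothetical, and the example assumes a regular file in a writable current directory.

```scheme
;; A hypothetical file name used only for this sketch.
(define demo-file "position-demo.txt")

(define out (open-output-file demo-file 'truncate))
(file-stream-buffer-mode out 'none)          ; commit each write immediately
(define mode (file-stream-buffer-mode out))  ; => 'none
(display "hello world" out)
(define end-pos (file-position out))         ; => 11
(file-position out 0)                        ; move back to the start
(display "J" out)                            ; overwrite the first byte
(close-output-port out)

(define in (open-input-file demo-file))
(define in2 (open-input-file demo-file))
;; Both ports are open at the same time on the same file, so their
;; identities are the same.
(define same-file?
  (= (port-file-identity in) (port-file-identity in2))) ; => #t

(file-position in 6)                         ; also clears the input buffer
(define word (read in))                      ; => world
(file-position in 0)
(define line (read-line in))                 ; => "Jello world"

(close-input-port in)
(close-input-port in2)
```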

11.1.7 Custom Ports

The make-input-port and make-output-port procedures create custom ports with arbitrary control procedures. Correctly implementing a custom port can be tricky, because it amounts to implementing a device driver. Custom ports are mainly useful to obtain fine control over the action of committing bytes as read or written.

Many simple port variations can be implemented using threads and pipes. For example, if get-next-char is a function that produces either a character or eof, it can be turned into an input port as follows:

(let-values ([(r w) (make-pipe 4096)])
  ;; Create a thread to move chars from get-next-char to the pipe
  (thread (lambda () (let loop ()
                       (let ([v (get-next-char)])
                         (if (eof-object? v)
                             (close-output-port w)
                             (begin
                               (write-char v w)
                               (loop)))))))
  ;; Return the read end of the pipe
  r)

The port.ss library in MzLib provides several other port constructors; see Chapter 35 in PLT MzLib: Libraries Manual.

11.1.7.1 Custom Input

(make-input-port name-v read-proc optional-peek-proc close-proc [optional-progress-evt-proc optional-commit-proc optional-location-proc count-lines!-proc init-position optional-buffer-mode-proc]) creates an input port. The port is immediately open for reading. If the close-proc procedure has no side effects, then the port need not be explicitly closed.

name-v -- the name for the input port, which is reported by object-name (see section 6.2.3).
read-proc -- a procedure that takes a single argument: a mutable byte string to receive read bytes. The procedure's result is one of the following:
- the number of bytes read, as an exact, non-negative integer;
- eof;
- a procedure of arity four (representing a ``special'' result, as discussed further below) and optionally of arity zero, but a procedure result is allowed only when optional-peek-proc is not #f; or
- a synchronizable event (see section 7.7) that becomes ready when the read is complete (roughly): the event's value can be one of the above three results or another event like itself; in the last case, a reading process loops with sync until it gets a non-event result.
The read-proc procedure must not block indefinitely. If no bytes are immediately available for reading, the read-proc must return 0 or an event, and preferably an event (to avoid busy waits). The read-proc should not return 0 (or an event whose value is 0) when data is available in the port, otherwise polling the port will behave incorrectly. An event result from an event can also break polling.

If the result of a read-proc call is not one of the above values, the exn:fail:contract exception is raised. If a returned integer is larger than the supplied byte string's length, the exn:fail:contract exception is raised. If optional-peek-proc is #f and a procedure for a special result is returned, the exn:fail:contract exception is raised.

The read-proc procedure can report an error by raising an exception, but only if no bytes are read. Similarly, no bytes should be read if eof, an event, or a procedure is returned. In other words, no bytes should be lost due to spurious exceptions or non-byte data.

A port's reading procedure may be called in multiple threads simultaneously (if the port is accessible in multiple threads), and the port is responsible for its own internal synchronization. Note that improper implementation of such synchronization mechanisms might cause a non-blocking read procedure to block indefinitely.

If optional-peek-proc, optional-progress-evt-proc, and optional-commit-proc are all provided and non-#f, then the following is an acceptable implementation of read-proc:
```
   (lambda (bstr)
     (let* ([progress-evt (optional-progress-evt-proc)]
            [v (optional-peek-proc bstr 0 progress-evt)])
       (cond
        [(sync/timeout 0 progress-evt) 0] ; try again
        [(evt? v) (wrap-evt v (lambda (x) 0))] ; sync, then try again
        [(and (number? v) (zero? v)) 0] ; try again
        [else
         (if (optional-commit-proc (if (number? v) v 1)
                                   progress-evt
                                   always-evt)
             v      ; got a result
             0)]))) ; try again
```
An implementor may choose not to implement the optional- procedures, however, and even an implementor who does supply optional- procedures may provide a different read-proc that uses a fast path for non-blocking reads.
optional-peek-proc -- either #f or a procedure that takes three arguments:
- a mutable byte string to receive peeked bytes;
- a non-negative number of bytes (or specials) to skip before peeking; and
- either #f or a progress event produced by optional-progress-evt-proc.
The results and conventions for optional-peek-proc are mostly the same as for read-proc. The main difference is in the handling of the progress event, if it is not #f. If the given progress event becomes ready, the optional-peek-proc must abort any skip attempts and not peek any values. In particular, optional-peek-proc must not peek any values if the progress event is initially ready.

Unlike read-proc, optional-peek-proc should produce #f (or an event whose value is #f) if no bytes were peeked because the progress event became ready. Like read-proc, a 0 result indicates that another attempt is likely to succeed, so 0 is inappropriate when the progress event is ready. Also like read-proc, optional-peek-proc must not block indefinitely.

The skip count provided to optional-peek-proc is a number of bytes (or specials) that must remain present in the port -- in addition to the peek results -- when the peek results are reported. If a progress event is supplied, then the peek is effectively canceled when another process reads data before the given number can be skipped. If a progress event is not supplied and data is read, then the peek must effectively restart with the original skip count.

The system does not check that multiple peeks return consistent results, or that peeking and reading produce consistent results.

If optional-peek-proc is #f, then peeking for the port is implemented automatically in terms of reads, but with several limitations. First, the automatic implementation is not thread-safe. Second, the automatic implementation cannot handle special results (non-byte and non-eof), so read-proc cannot return a procedure for a special when optional-peek-proc is #f. Finally, the automatic peek implementation is incompatible with progress events, so if optional-peek-proc is #f, then optional-progress-evt-proc and optional-commit-proc must be #f. See also make-input-port/peek-to-read in Chapter 35 in PLT MzLib: Libraries Manual.
close-proc -- a procedure of zero arguments that is called to close the port. The port is not considered closed until the closing procedure returns. The port's procedures will never be used again via the port after it is closed. However, the closing procedure can be called simultaneously in multiple threads (if the port is accessible in multiple threads), and it may be called during a call to the other procedures in another thread; in the latter case, any outstanding reads and peeks should be terminated with an error.
optional-progress-evt-proc -- either #f (the default), or a procedure that takes no arguments and returns an event. The event must become ready only after data is next read from the port or the port is closed. After the event becomes ready, it must remain so. (See also semaphore-peek-evt in section 7.4.)

If optional-progress-evt-proc is #f, then port-provides-progress-evts? applied to the port will produce #f, and the port will not be a valid argument to port-progress-evt.
optional-commit-proc -- either #f (the default), or a procedure that takes three arguments:
- an exact, positive integer k_r;
- a progress event produced by optional-progress-evt-proc;
- an event, done-evt, that is either a channel-put event, channel, semaphore, semaphore-peek event, always event, or never event.
A commit corresponds to removing data from the stream that was previously peeked, but only if no other process removed data first. (The removed data does not need to be reported, because it has been peeked already.) More precisely, assuming that k_p bytes, specials, and mid-stream eofs have been previously peeked or skipped at the start of the port's stream, optional-commit-proc must satisfy the following constraints:
- It must return only when the commit is complete or when the given progress event becomes ready.
- It must commit only if k_p is positive.
- If it commits, then it must do so with either k_r items or k_p items, whichever is smaller, and only if k_p is positive.
- It must never choose done-evt in a synchronization after the given progress event is ready, or after done-evt has been synchronized once.
- It must not treat any data as read from the port unless done-evt is chosen in a synchronization.
- It must not block indefinitely if done-evt is ready; it must return soon after the read completes or soon after the given progress event is ready, whichever is first.
- It can report an error by raising an exception, but only if no data is committed. In other words, no data should be lost due to an exception, including a break exception.
- It must return a true value if data is committed, #f otherwise. When it returns a value, the given progress event must be ready (perhaps because data was just committed).
- It must raise an exception if no data (including eof) has been peeked from the beginning of the port's stream, or if it would have to block indefinitely to wait for the given progress event to become ready.
A call to optional-commit-proc is wrapped in parameterize-break to disable breaks.
optional-location-proc -- either #f (the default), or a procedure that takes no arguments and returns three values: the line number for the next item in the port's stream (a positive number or #f), the column number for the next item in the port's stream (a non-negative number or #f), and the position for the next item in the port's stream (a positive number or #f). See also section 11.2.1.1.

This procedure is only called if line counting is enabled for the port via port-count-lines! (in which case count-lines!-proc is called). The read, read-syntax, read-honu, and read-honu-syntax procedures assume that reading a non-whitespace character increments the column and position by one.
count-lines!-proc -- a procedure of no arguments that is called if and when line counting is enabled for the port. The default procedure is void.
init-position -- an exact, positive integer that determines the position of the port's first item, used when line counting is not enabled for the port. The default is 1.
optional-buffer-mode-proc -- either #f (the default) or a procedure that accepts zero or one arguments. If optional-buffer-mode-proc is #f, then the resulting port does not support a buffer-mode setting. Otherwise, the procedure is called with one symbol argument ('block or 'none) to set the buffer mode, and it is called with zero arguments to get the current buffer mode. In the latter case, the result must be 'block, 'none, or #f (unknown). See section 11.1.6 for more information on buffer modes.

When read-proc or optional-peek-proc (or an event produced by one of these) returns a procedure, the procedure is used to obtain a non-byte result.³² The procedure is called by read,³³ read-syntax, read-honu, read-honu-syntax, read-byte-or-special, read-char-or-special, peek-byte-or-special, or peek-char-or-special. The special-value procedure can return an arbitrary value, and it will be called zero or one times (not necessarily before further reads or peeks from the port). See section 11.2.9 for more details on the procedure's arguments and result.

If read-proc or optional-peek-proc returns a special procedure when called by any reading procedure other than read, read-syntax, read-honu, read-honu-syntax, read-char-or-special, peek-char-or-special, read-byte-or-special, or peek-byte-or-special, then the exn:fail:contract exception is raised.

Examples:

;; A port with no input...
;; Easy: (open-input-bytes #"")
;; Hard:
(define /dev/null-in 
  (make-input-port 'null
                   (lambda (s) eof)
                   (lambda (s skip progress-evt) eof)
                   void
                   (lambda () never-evt)
                   (lambda (k progress-evt done-evt)
                     (error "no successful peeks!"))))
(read-char /dev/null-in) ; => eof
(peek-char /dev/null-in) ; => eof
(read-byte-or-special /dev/null-in)     ; => eof
(peek-byte-or-special /dev/null-in 100) ; => eof

;; A port that produces a stream of 1s:
(define infinite-ones 
  (make-input-port
   'ones
   (lambda (s) 
     (bytes-set! s 0 (char->integer #\1)) 1)
   #f
   void))
(read-string 5 infinite-ones) ; => "11111"

;; But we can't peek ahead arbitrarily far, because the
;; automatic peek must record the skipped bytes:
(peek-string 5 (expt 2 5000) infinite-ones) ; => error: out of memory

;; An infinite stream of 1s with a specific peek procedure:
(define infinite-ones 
  (let ([one! (lambda (s) 
                (bytes-set! s 0 (char->integer #\1)) 1)])
    (make-input-port
     'ones
     one!
     (lambda (s skip progress-evt) (one! s))
     void)))
(read-string 5 infinite-ones) ; => "11111"

;; Now we can peek ahead arbitrarily far:
(peek-string 5 (expt 2 5000) infinite-ones) ; => "11111"

;; The port doesn't supply procedures to implement progress events:
(port-provides-progress-evts? infinite-ones) ; => #f
(port-progress-evt infinite-ones) ; error: no progress events

;; Non-byte port results:
(define infinite-voids
  (make-input-port
   'voids
   (lambda (s) (lambda args 'void))
   (lambda (s skip progress-evt) (lambda args 'void))
   void))
(read-char infinite-voids) ; => error: non-char in an unsupported context
(read-char-or-special infinite-voids) ; => 'void

;; This port produces 0, 1, 2, 0, 1, 2, etc., but it is not
;; thread-safe, because multiple threads might read and change n.
(define mod3-cycle/one-thread
  (let* ([n 2]
         [mod! (lambda (s delta)
                 (bytes-set! s 0 (+ 48 (modulo (+ n delta) 3)))
                 1)])
    (make-input-port
     'mod3-cycle/not-thread-safe
     (lambda (s) 
       (set! n (modulo (add1 n) 3))
       (mod! s 0))
     (lambda (s skip progress-evt)
       (mod! s skip))
     void)))
(read-string 5 mod3-cycle/one-thread) ; => "01201"
(peek-string 5 (expt 2 5000) mod3-cycle/one-thread) ; => "20120"

;; Same thing, but thread-safe and kill-safe, and with progress
;; events. Only the server thread touches the stateful part
;; directly. (See the output port examples for a simpler thread-safe
;; example, but this one is more general.)
(define (make-mod3-cycle)
  (define read-req-ch (make-channel))
  (define peek-req-ch (make-channel))
  (define progress-req-ch (make-channel))
  (define commit-req-ch (make-channel))
  (define close-req-ch (make-channel))
  (define closed? #f)
  (define n 0)
  (define progress-sema #f)
  (define (mod! s delta)
    (bytes-set! s 0 (+ 48 (modulo (+ n delta) 3)))
    1)
  ;; ----------------------------------------
  ;; The server has a list of outstanding commit requests,
  ;;  and it also must service each port operation (read, 
  ;;  progress-evt, etc.)
  (define (serve commit-reqs response-evts)
    (apply
     sync
     (handle-evt read-req-ch (handle-read commit-reqs response-evts))
     (handle-evt progress-req-ch (handle-progress commit-reqs response-evts))
     (handle-evt commit-req-ch (add-commit commit-reqs response-evts))
     (handle-evt close-req-ch (handle-close commit-reqs response-evts))
     (append
      (map (make-handle-response commit-reqs response-evts) response-evts)
      (map (make-handle-commit commit-reqs response-evts) commit-reqs))))
  ;; Read/peek request: fill in the string and commit
  (define ((handle-read commit-reqs response-evts) r)
    (let ([s (car r)]
          [skip (cadr r)]
          [ch (caddr r)]
          [nack (cadddr r)]
          [peek? (cddddr r)])
      (unless closed?
        (mod! s skip)
        (unless peek?
          (commit! 1)))
      ;; Add an event to respond:
      (serve commit-reqs
             (cons (choice-evt nack
                               (channel-put-evt ch (if closed? 0 1)))
                   response-evts))))
  ;; Progress request: send a peek evt for the current 
  ;;  progress-sema
  (define ((handle-progress commit-reqs response-evts) r)
    (let ([ch (car r)]
          [nack (cdr r)])
      (unless progress-sema
        (set! progress-sema (make-semaphore (if closed? 1 0))))
      ;; Add an event to respond:
      (serve commit-reqs
             (cons (choice-evt nack
                               (channel-put-evt
                                ch
                                (semaphore-peek-evt progress-sema)))
                   response-evts))))
  ;; Commit request: add the request to the list
  (define ((add-commit commit-reqs response-evts) r)
    (serve (cons r commit-reqs) response-evts))
  ;; Commit handling: watch out for progress, in which case
  ;;  the response is a commit failure; otherwise, try
  ;;  to sync for a commit. In either event, remove the
  ;;  request from the list
  (define ((make-handle-commit commit-reqs response-evts) r)
    (let ([k (car r)]
          [progress-evt (cadr r)]
          [done-evt (caddr r)]
          [ch (cadddr r)]
          [nack (cddddr r)])
      ;; Note: we don't check that k is < the sum of
      ;;  previous peeks, because the entire stream is actually
      ;;  known, but we could send an exception in that case.
      (choice-evt
       (handle-evt progress-evt
                   (lambda (x) 
                     (sync nack (channel-put-evt ch #f))
                     (serve (remq r commit-reqs) response-evts)))
       ;; Only create an event to satisfy done-evt if progress-evt
       ;;  isn't already ready.
       ;; Afterward, if progress-evt becomes ready, then this
       ;;  event-making function will be called again, because
       ;;  the server controls all posts to progress-evt.
       (if (sync/timeout 0 progress-evt)
           never-evt
           (handle-evt done-evt
                       (lambda (v)
                         (commit! k)
                         (sync nack (channel-put-evt ch #t))
                         (serve (remq r commit-reqs) response-evts)))))))
  ;; Response handling: as soon as the respondee listens,
  ;;  remove the response
  (define ((make-handle-response commit-reqs response-evts) evt)
    (handle-evt evt
                (lambda (x)
                  (serve commit-reqs
                         (remq evt response-evts)))))
  ;; Close handling: post the progress sema, if any, and set
  ;;   the closed? flag
  (define ((handle-close commit-reqs response-evts) r)
    (let ([ch (car r)]
          [nack (cdr r)])
      (set! closed? #t)
      (when progress-sema
        (semaphore-post progress-sema))
      (serve commit-reqs
             (cons (choice-evt nack
                               (channel-put-evt ch (void)))
                   response-evts))))
  ;; Helper for reads and post-peek commits:
  (define (commit! k)
    (when progress-sema
      (semaphore-post progress-sema)
      (set! progress-sema #f))
    (set! n (+ n k)))
  ;; Start the server thread:
  (define server-thread (thread (lambda () (serve null null))))
  ;; ----------------------------------------
  ;; Client-side helpers:
  (define (req-evt f)
    (nack-guard-evt
     (lambda (nack)
       ;; Be sure that the server thread is running:
       (thread-resume server-thread (current-thread))
       ;; Create a channel to hold the reply:
       (let ([ch (make-channel)])
         (f ch nack)
         ch))))
  (define (read-or-peek-evt s skip peek?)
    (req-evt (lambda (ch nack)
               (channel-put read-req-ch (list* s skip ch nack peek?)))))
  ;; Make the port:
  (make-input-port 'mod3-cycle
                   ;; Each handler for the port just sends
                   ;;  a request to the server
                   (lambda (s) (read-or-peek-evt s 0 #f))
                   (lambda (s skip) (read-or-peek-evt s skip #t))
                   (lambda () ; close
                     (sync (req-evt
                            (lambda (ch nack)
                              (channel-put close-req-ch (list* ch nack))))))
                   (lambda () ; progress-evt
                     (sync (req-evt
                            (lambda (ch nack)
                              (channel-put progress-req-ch (list* ch nack))))))
                   (lambda (k progress-evt done-evt)  ; commit
                     (sync (req-evt
                            (lambda (ch nack)
                              (channel-put commit-req-ch
                                           (list* k progress-evt done-evt ch nack))))))))

(let ([mod3-cycle (make-mod3-cycle)])
  (let ([result1 #f]
        [result2 #f])
    (let ([t1 (thread (lambda ()
                        (set! result1 (read-string 5 mod3-cycle))))]
          [t2 (thread (lambda ()
                        (set! result2 (read-string 5 mod3-cycle))))])
      (thread-wait t1)
      (thread-wait t2)
      (string-append result1 "," result2))) ; => "02120,10201", maybe
  (let ([s (make-bytes 1)]
        [progress-evt (port-progress-evt mod3-cycle)])
    (peek-bytes-avail! s 0 progress-evt mod3-cycle) ; => 1
    s                                    ; => #"1"
    (port-commit-peeked 1 progress-evt (make-semaphore 1)
                           mod3-cycle)   ; => #t
    (sync/timeout 0 progress-evt)        ; => progress-evt
    (peek-bytes-avail! s 0 progress-evt mod3-cycle) ; => 0
    (port-commit-peeked 1 progress-evt (make-semaphore 1) 
                           mod3-cycle))  ; => #f
  (close-input-port mod3-cycle))

11.1.7.2 Custom Output

(make-output-port name-v evt write-proc close-proc [optional-write-special-proc optional-write-evt-proc optional-write-special-evt-proc optional-location-proc count-lines!-proc init-position optional-buffer-mode-proc]) creates an output port. The port is immediately open for writing. If the close-proc procedure has no side effects, then the port need not be explicitly closed. The port can buffer data within its write-proc and optional-write-special-proc procedures.

name-v -- the name for the output port, which is reported by object-name (see section 6.2.3).
evt -- a synchronization event (see section 7.7; e.g., a semaphore or another port). The event is used in place of the output port when the port is supplied to synchronization procedures like sync. Thus, the event should be unblocked when the port is ready for writing at least one byte without blocking, or ready to make progress in flushing an internal buffer without blocking. The event must not unblock unless the port is ready for writing; otherwise, the guarantees of sync will be broken for the output port. Use always-evt if writes to the port always succeed without blocking.
write-proc -- a procedure of five arguments:
- an immutable byte string containing bytes to write;
- a non-negative exact integer for a starting offset (inclusive) into the byte string;
- a non-negative exact integer for an ending offset (exclusive) into the byte string;
- a boolean; #f indicates that the port is allowed to keep the written bytes in a buffer, and that it is allowed to block indefinitely; #t indicates that the write should not block, and that the port should attempt to flush its buffer and completely write new bytes instead of buffering them;
- a boolean; #t indicates that if the port blocks for a write, then it should enable breaks while blocking (e.g., using sync/enable-break); this argument is always #f if the fourth argument is #t.
The procedure returns one of the following:
- a non-negative exact integer representing the number of bytes written or buffered;
- #f if no bytes could be written, perhaps because the internal buffer could not be completely flushed;
- a synchronizable event (see section 7.7) that acts like the result of write-bytes-avail-evt to complete the write.
Since write-proc can produce an event, an acceptable implementation of write-proc is to pass its first three arguments to the port's optional-write-evt-proc. Some port implementors, however, may choose not to provide optional-write-evt-proc (perhaps because writes cannot be made atomic), or may implement write-proc to enable a fast path for non-blocking writes or to enable buffering.

From a user's perspective, the difference between buffered and completely written data is (1) buffered data can be lost in the future due to a failed write, and (2) flush-output forces all buffered data to be completely written. Under no circumstances is buffering required.

If the start and end indices are the same, then the fourth argument to write-proc will be #f, and the write request is actually a flush request for the port's buffer (if any), and the result should be 0 for a successful flush (or if there is no buffer).

If the start and end indices are different, the result must never be 0; a 0 result in that case causes the exn:fail:contract exception to be raised. If a returned integer is larger than the supplied byte-string range, the exn:fail:contract exception is raised.

The #f result should be avoided, unless the next write attempt is likely to work. Otherwise, if data cannot be written, return an event instead.

An event returned by write-proc can return #f or another event like itself, in contrast to events produced by write-bytes-avail-evt or optional-write-evt-proc. A writing process loops with sync until it obtains a non-event result.

The write-proc procedure is always called with breaks disabled, independent of whether breaks were enabled when the write was requested by a client of the port. If breaks were enabled for a blocking operation, then the fifth argument to write-proc will be #t, which indicates that write-proc should re-enable breaks while blocking.

If the writing procedure raises an exception, due either to write or commit operations, it must not have committed any bytes (though it may have committed previously buffered bytes).

A port's writing procedure may be called in multiple threads simultaneously (if the port is accessible in multiple threads). The port is responsible for its own internal synchronization. Note that improper implementation of such synchronization mechanisms might cause a non-blocking write procedure to block.
close-proc -- a procedure of zero arguments that is called to close the port. The port is not considered closed until the closing procedure returns. The port's procedures will never be used again via the port after it is closed. However, the closing procedure can be called simultaneously in multiple threads (if the port is accessible in multiple threads), and it may be called during a call to the other procedures in another thread; in the latter case, any outstanding writes or flushes should be terminated immediately with an error.
optional-write-special-proc -- either #f (the default), or a procedure to handle write-special calls for the port. If #f, then the port does not support special output, and port-writes-special? will return #f when applied to the port.

If a procedure is supplied, it takes three arguments: the special value to write, a boolean that is #f if the procedure can buffer the special value and block indefinitely, and a boolean that is #t if the procedure should enable breaks while blocking. The result is one of the following:
- a non-event true value, which indicates that the special is written;
- #f if the special could not be written, perhaps because an internal buffer could not be completely flushed;
- a synchronizable event (see section 7.7) that acts like the result of write-special-evt to complete the write.
Since optional-write-special-proc can return an event, passing the first argument to an implementation of optional-write-special-evt-proc is acceptable as an optional-write-special-proc.

As for write-proc, the #f result is discouraged, since it can lead to busy waiting. Also as for write-proc, an event produced by optional-write-special-proc is allowed to produce #f or another event like itself. The optional-write-special-proc procedure is always called with breaks disabled, independent of whether breaks were enabled when the write was requested by a client of the port.
optional-write-evt-proc -- either #f (the default) or a procedure of three arguments:
- an immutable byte string containing bytes to write;
- a non-negative exact integer for a starting offset (inclusive) into the byte string, and
- a non-negative exact integer for an ending offset (exclusive) into the byte string.
The result is a synchronizable event (see section 7.7) to act as the result of write-bytes-avail-evt for the port (i.e., to complete a write or flush), which becomes available only as data is committed to the port's underlying device, and whose result is the number of bytes written.

If optional-write-evt-proc is #f, then port-writes-atomic? will produce #f when applied to the port, and the port will not be a valid argument to procedures such as write-bytes-avail-evt.

Otherwise, an event returned by optional-write-evt-proc must not cause data to be written to the port unless the event is chosen in a synchronization, and it must write to the port if the event is chosen (i.e., the write must appear atomic with respect to the synchronization).

If the event's result integer is larger than the supplied byte-string range, the exn:fail:contract exception is raised by a wrapper on the event. If the start and end indices are the same (i.e., no bytes are to be written), then the event should produce 0 when the buffer is completely flushed. (If the port has no buffer, then it is effectively always flushed.)

If the event raises an exception, due either to write or commit operations, it must not have committed any new bytes (though it may have committed previously buffered bytes).

Naturally, a port's events may be used in multiple threads simultaneously (if the port is accessible in multiple threads). The port is responsible for its own internal synchronization.
optional-write-special-evt-proc -- either #f (the default), or a procedure to handle write-special-evt calls for the port. This argument must be #f if either optional-write-special-proc or optional-write-evt-proc is #f, and it must be a procedure if both of those arguments are procedures.

If it is a procedure, it takes one argument: the special value to write. The resulting event (with its constraints) is analogous to the result of optional-write-evt-proc.

If the event raises an exception, due either to write or commit operations, it must not have committed the special value (though it may have committed previously buffered bytes and values).
optional-location-proc -- either #f (the default), or a procedure that takes no arguments and returns three values: the line number for the next item written to the port's stream (a positive number or #f), the column number for the next item written to the port's stream (a non-negative number or #f), and the position for the next item written to the port's stream (a positive number or #f). See also section 11.2.1.1.

This procedure is only called if line counting is enabled for the port via port-count-lines! (in which case count-lines!-proc is called).
count-lines!-proc -- a procedure of no arguments that is called if and when line counting is enabled for the port. The default procedure is void.
init-position -- an exact, positive integer that determines the position of the port's first output item, used when line counting is not enabled for the port. The default is 1.
optional-buffer-mode-proc -- either #f (the default) or a procedure that accepts zero or one arguments. If optional-buffer-mode-proc is #f, then the resulting port does not support a buffer-mode setting. Otherwise, the procedure is called with one symbol argument ('block, 'line, or 'none) to set the buffer mode, and it is called with zero arguments to get the current buffer mode. In the latter case, the result must be 'block, 'line, 'none, or #f (unknown). See section 11.1.6 for more information on buffer modes.

Examples:

;; A port that writes anything to nowhere:
(define /dev/null-out
  (make-output-port 
   'null
   always-evt
   (lambda (s start end non-block? breakable?) (- end start))
   void
   (lambda (special non-block? breakable?) #t)
   (lambda (s start end) (wrap-evt
                          always-evt
                          (lambda (x)
                            (- end start))))
   (lambda (special) always-evt)))
(display "hello" /dev/null-out)            ; => void
(write-bytes-avail #"hello" /dev/null-out) ; => 5
(write-special 'hello /dev/null-out)       ; => #t
(sync (write-bytes-avail-evt #"hello" /dev/null-out)) ; => 5

;; A port that accumulates bytes as characters in a list,
;;  but not in a thread-safe way:
(define accum-list null)
(define accumulator/not-thread-safe
  (make-output-port 
   'accum/not-thread-safe
   always-evt
   (lambda (s start end non-block? breakable?)
     (set! accum-list
           (append accum-list
                   (map integer->char
                        (bytes->list (subbytes s start end)))))
     (- end start))
   void))
(display "hello" accumulator/not-thread-safe)
accum-list ; => '(#\h #\e #\l #\l #\o)

;; Same as before, but with simple thread-safety:
(define accum-list null)
(define accumulator 
  (let* ([lock (make-semaphore 1)]
         [lock-peek-evt (semaphore-peek-evt lock)])
    (make-output-port
     'accum
     lock-peek-evt
     (lambda (s start end non-block? breakable?)
       (if (semaphore-try-wait? lock)
           (begin
             (set! accum-list
                   (append accum-list
                           (map integer->char
                                (bytes->list (subbytes s start end)))))
             (semaphore-post lock)
             (- end start))
           ;; Cheap strategy: block until the list is unlocked,
           ;;   then return #f, so we get called again
           (wrap-evt
            lock-peek-evt
            (lambda (x) #f))))
     void)))
(display "hello" accumulator)
accum-list ; => '(#\h #\e #\l #\l #\o)

;; A port that transforms data before sending it on
;;  to another port. Atomic writes exploit the
;;  underlying port's ability for atomic writes.
(define (make-latin-1-capitalize port)
  (define (byte-upcase s start end)
    (list->bytes
     (map (lambda (b) (char->integer
                       (char-upcase
                        (integer->char b))))
          (bytes->list (subbytes s start end)))))
  (make-output-port
   'byte-upcase
   ;; This port is ready when the original is ready:
   port
   ;; Writing procedure:
   (lambda (s start end non-block? breakable?)
     (let ([s (byte-upcase s start end)])
       (if non-block?
           (write-bytes-avail* s port)
           (begin
             (display s port)
             (bytes-length s)))))
   ;; Close procedure --- close original port:
   (lambda () (close-output-port port))
   #f
   ;; Write event:
   (and (port-writes-atomic? port)
        (lambda (s start end)
          (write-bytes-avail-evt (byte-upcase s start end) port)))))
(define orig-port (open-output-string))
(define cap-port (make-latin-1-capitalize orig-port))
(display "Hello" cap-port)
(get-output-string orig-port) ; => "HELLO"
(sync (write-bytes-avail-evt #"Bye" cap-port)) ; => 3
(get-output-string orig-port) ; => "HELLOBYE"
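
The flush convention described above for write-proc (an empty byte range is a flush request whose successful result is 0) can be sketched with one more port. This is a minimal illustration, not part of the library; the names flushes, total, and counting-out are hypothetical.

```scheme
;; A port that counts the bytes it is given and notes flush requests.
;; `flushes', `total', and `counting-out' are illustrative names.
(define flushes 0)
(define total 0)
(define counting-out
  (make-output-port
   'counting
   always-evt
   (lambda (s start end non-block? breakable?)
     (if (= start end)
         ;; Empty range: a flush request; 0 reports success
         (begin
           (set! flushes (+ flushes 1))
           0)
         ;; Non-empty range: accept every byte, so never return 0 here
         (begin
           (set! total (+ total (- end start)))
           (- end start))))
   void))
(write-bytes #"hello" counting-out)
(flush-output counting-out)
total   ; => 5
flushes ; at least 1, assuming flush-output issues an empty-range write
```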

11.2 Reading and Writing

MzScheme's support for reading and writing includes many extensions compared to R5RS, both at the level of individual bytes and characters and at the level of S-expressions.

11.2.1 Reading Bytes, Characters, and Strings

In addition to the standard reading procedures, MzScheme provides byte-reading procedures, block-reading procedures such as read-line, and more.

(read-line [input-port mode-symbol]) returns a string containing the next line of bytes from input-port. If input-port is omitted, the current input port is used.

Characters are read from input-port until a line separator or an end-of-file is read. The line separator is not included in the result string (but it is removed from the port's stream). If no characters are read before an end-of-file is encountered, eof is returned.

The mode-symbol argument determines the line separator(s). It must be one of the following symbols:
- 'linefeed breaks lines on linefeed characters; this is the default.
- 'return breaks lines on return characters.
- 'return-linefeed breaks lines on return-linefeed combinations. If a return character is not followed by a linefeed character, it is included in the result string; similarly, a linefeed that is not preceded by a return is included in the result string.
- 'any breaks lines on any of a return character, linefeed character, or return-linefeed combination. If a return character is followed by a linefeed character, the two are treated as a combination.
- 'any-one breaks lines on either a return or linefeed character, without recognizing return-linefeed combinations.
Return and linefeed characters are detected after the conversions that are automatically performed when reading a file in text mode. For example, reading a file in text mode under Windows automatically changes return-linefeed combinations to a linefeed. Thus, when a file is opened in text mode, 'linefeed is usually the appropriate read-line mode.
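
The effect of each mode-symbol can be seen by splitting the same string several ways. The helper lines-of below is a hypothetical name introduced only for this sketch; it collects the results of read-line until eof.

```scheme
;; Collect all lines of a string using the given read-line mode.
;; `lines-of' is an illustrative helper, not a library procedure.
(define (lines-of str mode)
  (let ([in (open-input-string str)])
    (let loop ([acc null])
      (let ([l (read-line in mode)])
        (if (eof-object? l)
            (reverse acc)
            (loop (cons l acc)))))))

(lines-of "a\r\nb\rc\nd" 'linefeed)        ; => ("a\r" "b\rc" "d")
(lines-of "a\r\nb\rc\nd" 'return)          ; => ("a" "\nb" "c\nd")
(lines-of "a\r\nb\rc\nd" 'return-linefeed) ; => ("a" "b\rc\nd")
(lines-of "a\r\nb\rc\nd" 'any)             ; => ("a" "b" "c" "d")
(lines-of "a\r\nb\rc\nd" 'any-one)         ; => ("a" "" "b" "c" "d")
```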
(read-bytes-line [input-port mode-symbol]) is analogous to read-line, but it reads bytes and produces a byte string.
(read-string k [input-port]) returns a string containing the next k characters from input-port. The default value of input-port is the current input port.

If k is 0, then the empty string is returned. Otherwise, if fewer than k characters are available before an end-of-file is encountered, then the returned string will contain only those characters before the end-of-file (i.e., the returned string's length will be less than k). ³⁴ If no characters are available before an end-of-file, then eof is returned.

If an error occurs during reading, some characters may be lost (i.e., if read-string successfully reads some characters before encountering an error, the characters are dropped).
(read-bytes k [input-port]) is analogous to read-string, but it reads bytes and produces a byte string.
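
As a short sketch of the end-of-file rules for read-string and read-bytes, using string and byte-string ports (the names rs-in and rb-in are illustrative):

```scheme
(define rs-in (open-input-string "abc"))
(read-string 0 rs-in) ; => ""
(read-string 2 rs-in) ; => "ab"
(read-string 5 rs-in) ; => "c", shorter than requested because eof came first
(read-string 5 rs-in) ; => eof

(define rb-in (open-input-bytes #"abc"))
(read-bytes 2 rb-in)  ; => #"ab"
```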
(read-string! string [input-port start-k end-k]) reads characters from input-port like read-string, but puts them into string starting from index start-k (inclusive) up to end-k (exclusive). The default value of input-port is the current input port. The default value of start-k is 0. The default value of end-k is the length of the string. Like substring, the exn:fail:contract exception is raised if start-k or end-k is out-of-range for string.

If the difference between start-k and end-k is 0, then 0 is returned and string is not modified. If no bytes are available before an end-of-file, then eof is returned. Otherwise, the return value is the number of characters read. If m characters are read and m < end-k - start-k, then string is not modified at indices start-k + m through end-k.
(read-bytes! bytes [input-port start-k end-k]) is analogous to read-string!, but it reads bytes and puts them into a byte string.
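
The following sketch shows how read-string! fills only the requested range and leaves the rest of the destination untouched. The names fill-buf and fill-in are illustrative.

```scheme
(define fill-buf (make-string 5 #\-))
(define fill-in (open-input-string "xyz"))
(read-string! fill-buf fill-in 1 3) ; => 2
fill-buf                            ; => "-xy--"
;; Only one character remains, so only index 0 is overwritten:
(read-string! fill-buf fill-in)     ; => 1
fill-buf                            ; => "zxy--"
(read-string! fill-buf fill-in)     ; => eof
```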
(read-bytes-avail! bytes [input-port start-k end-k]) is like read-bytes!, but returns without blocking after reading immediately-available bytes, and it may return a procedure for a ``special'' result. The read-bytes-avail! procedure blocks only if no bytes (or specials) are yet available. Also unlike read-bytes!, read-bytes-avail! never drops bytes; if read-bytes-avail! successfully reads some bytes and then encounters an error, it suppresses the error (treating it roughly like an end-of-file) and returns the read bytes. (The error will be triggered by future reads.) If an error is encountered before any bytes have been read, an exception is raised.

When input-port produces a special value, as described in section 11.1.7, the result is a procedure of four arguments. The four arguments correspond to the location of the special value within the port, as described in section 11.1.7. If the procedure is called more than once with valid arguments, the exn:fail:contract exception is raised. If read-bytes-avail! returns a special-producing procedure, then it does not place any bytes into bytes. Similarly, read-bytes-avail! places only as many bytes into bytes as are available before a special value in the port's stream.
(read-bytes-avail!* bytes [input-port start-k end-k]) is like read-bytes-avail!, except that it returns 0 immediately if no bytes (or specials) are available for reading and the end-of-file is not reached.
(read-bytes-avail!/enable-break bytes [input-port start-k end-k]) is like read-bytes-avail!, except that breaks are enabled during the read (see also section 6.7). If breaking is disabled when read-bytes-avail!/enable-break is called, and if the exn:break exception is raised as a result of the call, then no bytes will have been read from input-port.
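
The difference between the blocking and non-blocking variants can be sketched with a pipe, assuming that bytes written to a pipe are immediately available on its input end. The names pipe-in, pipe-out, and avail-buf are illustrative.

```scheme
(define-values (pipe-in pipe-out) (make-pipe))
(define avail-buf (make-bytes 8 0))
;; Nothing written yet, and no eof: the * variant returns 0 at once
(read-bytes-avail!* avail-buf pipe-in) ; => 0
(write-bytes #"hi" pipe-out)
;; Two bytes are available, so this returns without waiting for 8
(read-bytes-avail! avail-buf pipe-in)  ; => 2
(subbytes avail-buf 0 2)               ; => #"hi"
(close-output-port pipe-out)
(read-bytes-avail!* avail-buf pipe-in) ; => eof
```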
(peek-string k skip-k [input-port]) is similar to read-string, except that the returned characters are preserved in the port for future reads. (More precisely, undecoded bytes are left for future reads.) The skip-k argument indicates a number of bytes (not characters) in the input stream to skip before collecting characters to return; thus, in total, the next skip-k bytes plus k characters are inspected.

For most kinds of ports, inspecting skip-k bytes and k characters requires at least skip-k + k bytes of memory overhead associated with the port, at least until the bytes/characters are read. No such overhead is required when peeking into a string port (see section 11.1.5), a pipe port (see section 11.1.4), or a custom port with a specific peek procedure (depending on how the peek procedure is implemented; see section 11.1.7).

If a port produces eof mid-stream, peeks that skip beyond the eof always produce eof until the eof is read.
(peek-bytes k skip-k [input-port]) is analogous to peek-string, but it peeks bytes and produces a byte string.
(peek-string! string skip-k [input-port start-k end-k]) is like read-string!, but for peeking, and with a skip-k argument like peek-string.
(peek-bytes! bytes skip-k [input-port start-k end-k]) is analogous to peek-string!, but it peeks bytes and puts them into a byte string.
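
A brief sketch of peeking with a skip count follows. Note that skip-k counts bytes, which matters for characters whose UTF-8 encoding is longer than one byte. The names pk-in and pk-in2 are illustrative.

```scheme
(define pk-in (open-input-string "hello world"))
(peek-string 5 6 pk-in) ; => "world"
(peek-bytes 5 0 pk-in)  ; => #"hello"
(read-string 5 pk-in)   ; => "hello", peeking consumed nothing

;; Character 955 (Greek lambda) takes two bytes in UTF-8, so
;;  skipping two bytes lands on the second character:
(define pk-in2 (open-input-string (string (integer->char 955) #\x)))
(peek-string 1 2 pk-in2) ; => "x"
```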
(peek-bytes-avail! bytes skip-k [progress-evt input-port start-k end-k]) is like read-bytes-avail!, but for peeking, and with two extra arguments. The skip-k argument is as in peek-bytes. The progress-evt argument must be either #f (the default) or an event produced by port-progress-evt for input-port.

To peek, peek-bytes-avail! blocks until finding an end-of-file, at least one byte (or special) past the skipped bytes, or until a non-#f progress-evt becomes ready. Furthermore, if progress-evt is ready before bytes are peeked, no bytes are peeked or skipped, and progress-evt may cut short the skipping process if it becomes available during the peek attempt.

The result of peek-bytes-avail! is 0 only in the case that progress-evt becomes ready before bytes are peeked.
(peek-bytes-avail!* bytes skip-k [progress-evt input-port start-k end-k]) is like read-bytes-avail!*, but for peeking, and with skip-k and progress-evt arguments like peek-bytes-avail!. Since this procedure never blocks, it may return before even skip-k bytes are available from the port.
(peek-bytes-avail!/enable-break bytes skip-k [progress-evt input-port start-k end-k]) is the peeking version of read-bytes-avail!/enable-break, with skip-k and progress-evt arguments like peek-bytes-avail!.
(read-byte [input-port]) is analogous to read-char, but it reads and returns a byte (or eof) instead of a character.
(read-char-or-special [input-port]) is the same as read-char, except that if the input port returns a non-byte value (through a value-generating procedure in a custom port; see section 11.1.7 and section 11.2.9.1 for details), the non-byte value is returned.
(read-byte-or-special [input-port]) is analogous to read-char-or-special, but it reads and returns a byte instead of a character.
(peek-char [input-port skip-k]) extends the standard peek-char with an optional argument (defaulting to 0) that represents the number of bytes (not characters) to skip.
(peek-byte [input-port skip-k]) is analogous to peek-char, but it peeks and returns a byte instead of a character.
(peek-char-or-special [input-port skip-k]) is the same as peek-char, except that if the input port returns a non-byte value after skip-k byte positions, it is returned.
(peek-byte-or-special [input-port skip-k progress-evt]) is analogous to peek-char-or-special, but it reads and returns a byte instead of a character, and it supports a progress-evt argument (which is #f by default) like peek-bytes-avail!.
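
The single-item procedures can be sketched the same way (pc-in is an illustrative name):

```scheme
(define pc-in (open-input-string "abc"))
(peek-char pc-in 1) ; => #\b
(peek-byte pc-in 2) ; => 99, the byte for #\c
(read-byte pc-in)   ; => 97, the byte for #\a
(read-char pc-in)   ; => #\b
```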
(port-progress-evt [input-port]) returns an event that becomes ready after any subsequent read from input-port, or after input-port is closed. After the event becomes ready, it remains ready. If progress events are unavailable for input-port (as reported by port-provides-progress-evts?) the exn:fail:contract exception is raised.
(port-provides-progress-evts? input-port) returns #t if port-progress-evt can return an event for input-port. All built-in kinds of ports support progress events, but ports created with make-input-port (see section 11.1.7) may not.
(port-commit-peeked k progress-evt evt [input-port]) attempts to commit as read the first k previously peeked bytes, non-byte specials, and eofs from input-port, or the first eof or special value peeked from input-port.³⁵ The read commits only if progress-evt does not become ready first (i.e., if no other process reads from input-port first), and only if evt is chosen by a sync within port-commit-peeked (in which case the event result is ignored); the evt must be either a channel-put event, channel, semaphore, semaphore-peek event, always event, or never event. Suspending the thread that calls port-commit-peeked may or may not prevent the commit from proceeding. The result from port-commit-peeked is #t if data is committed, and #f otherwise.

If no data has been peeked from input-port and progress-evt is not ready, then the exn:fail:contract exception is raised. If fewer than k items have been peeked at the current start of input-port's stream, then only the peeked items are committed as read. If input-port's stream currently starts at an eof or a non-byte special value, then only the eof or special value is committed as read.

If progress-evt is not a result of port-progress-evt applied to input-port, then the exn:fail:contract exception is raised.
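
The peek-then-commit protocol can be sketched on a string port, relying on the statement above that all built-in kinds of ports support progress events. The names cm-in and cm-progress are illustrative, and always-evt serves as the commit event.

```scheme
(define cm-in (open-input-string "abcdef"))
(define cm-progress (port-progress-evt cm-in))
(peek-bytes 3 0 cm-in)                             ; => #"abc"
;; Commit two of the three peeked bytes as read:
(port-commit-peeked 2 cm-progress always-evt cm-in) ; => #t
;; The commit counts as progress, so the event is now ready:
(sync/timeout 0 cm-progress)                       ; => cm-progress
(read-bytes 4 cm-in)                               ; => #"cdef"
;; A commit guarded by an already-ready progress event fails:
(port-commit-peeked 1 cm-progress always-evt cm-in) ; => #f
```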

11.2.1.1 Counting Positions, Lines, and Columns

By default, MzScheme keeps track of the position in a port as the number of bytes that have been read from or written to any port (independent of the read/write position, which is accessed or changed with file-position). Optionally, however, MzScheme can track the position in terms of characters (after UTF-8 decoding), instead of bytes, and it can track line locations and column locations; this optional tracking must be specifically enabled for a port via port-count-lines! or the port-count-lines-enabled parameter (see section 7.9.1.2). Position, line, and column locations for a port are used by read-syntax (see section 12.2 for more information) and read-honu-syntax. Position and line locations are numbered from 1; column locations are numbered from 0.

(port-count-lines! port) turns on line and column counting for a port. Counting can be turned on at any time, though generally it is turned on before any data is read from or written to a port. When a port is created, if the value of the port-count-lines-enabled parameter is true (see section 7.9.1.2), then line counting is automatically enabled for the port. Line counting cannot be disabled for a port after it is enabled.

When counting lines, MzScheme treats linefeed, return, and return-linefeed combinations as a line terminator and as a single position (on all platforms). Each tab advances the column count to one before the next multiple of 8. When a sequence of bytes in the range 128 to 253 forms a UTF-8 encoding of a character, the position/column is incremented once for each byte, and then decremented appropriately when a complete encoding sequence is discovered. See also section 11.1 for more information on UTF-8 decoding for ports.

A position is known for any port as long as its value can be expressed as a fixnum (which is more than enough tracking for realistic applications in, say, syntax-error reporting). If the position for a port exceeds the value of the largest fixnum, then the position for the port becomes unknown, and line and column tracking is disabled. Return-linefeed combinations are treated as a single character position only when line and column counting is enabled.

(port-next-location port) returns three values: a positive exact integer or #f for the line number of the next read/written item, a non-negative exact integer or #f for the next item's column, and a positive exact integer or #f for the next item's position. The next column and position normally increase as bytes are read from or written to the port, but if line/character counting is enabled for port, the column and position results can decrease after reading or writing a byte that ends a UTF-8 encoding sequence.
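
The counting rules above can be sketched with string ports. The helper next-location, the port names p and q, and the sample strings are illustrative assumptions, not part of MzScheme; the helper only collects the three results of port-next-location into a list.

```scheme
;; Collect the three values from port-next-location into a list.
(define (next-location port)
  (call-with-values (lambda () (port-next-location port)) list))

(define p (open-input-string "ab\ncd"))
(port-count-lines! p)          ; enable line and column counting
(next-location p)              ; => (1 0 1): line 1, column 0, position 1
(read-line p)                  ; consumes "ab" and the linefeed
(next-location p)              ; => (2 0 4)

(define q (open-input-string "a\tb"))
(port-count-lines! q)
(read-char q)                  ; #\a
(read-char q)                  ; #\tab moves the column to a multiple of 8
(next-location q)              ; => (1 8 3)
```

Counting is enabled before any data is read, which matches the usual practice described above.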

Certain kinds of exceptions (see section 6.1) encapsulate source-location information using a srcloc structure, which has five fields:

source -- An arbitrary value identifying the source, often a path (see section 11.3.1).
line -- The line number, a positive exact integer (counts from 1) or #f (unknown).
column -- The column number, a non-negative exact integer (counts from 0) or #f (unknown).
position -- The starting position, a positive exact integer (counts from 1) or #f (unknown).
span -- The number of covered positions, a non-negative exact integer (counts from 0) or #f (unknown).

The fields of a srcloc structure are immutable, so no field-mutator procedures are defined for srcloc. The srcloc structure type is transparent to all inspectors (see section 4.5).
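
As a small sketch, a srcloc value can be built and inspected directly. This assumes the conventional constructor, predicate, and accessor names for a structure type named srcloc (make-srcloc, srcloc?, srcloc-source, and so on); the file name and numbers are made up for illustration.

```scheme
;; Hypothetical location: line 3, column 4, position 30, spanning 5 positions.
(define loc (make-srcloc "example.ss" 3 4 30 5))

(srcloc? loc)          ; => #t
(srcloc-source loc)    ; => "example.ss"
(srcloc-line loc)      ; => 3
(srcloc-column loc)    ; => 4
(srcloc-position loc)  ; => 30
(srcloc-span loc)      ; => 5
```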

11.2.2 Writing Bytes, Characters, and Strings

In addition to the standard printing procedures, MzScheme provides byte-writing procedures, block-writing procedures such as write-string, and more.

(write-string string [output-port start-k end-k]) writes characters to output-port from string starting from index start-k (inclusive) up to end-k (exclusive). The default value of output-port is the current output port. The default value of start-k is 0. The default value of end-k is the length of the string. Like substring, the exn:fail:contract exception is raised if start-k or end-k is out-of-range for string.

The result is the number of characters written to output-port, which is always (- end-k start-k).
(write-bytes bytes [output-port start-k end-k]) is analogous to write-string, but it writes a byte string.
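
The following sketch exercises the optional arguments of write-string and write-bytes against a string output port. The port name o and the sample data are illustrative only.

```scheme
(define o (open-output-string))

(write-string "hello world" o 0 5)   ; writes the characters "hello"
(write-bytes #" there" o)            ; writes all six bytes

;; An end-k beyond the end of the string raises exn:fail:contract.
(with-handlers ([exn:fail:contract? (lambda (e) 'out-of-range)])
  (write-string "abc" o 2 10))       ; => out-of-range

(get-output-string o)                ; => "hello there"
```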
(write-bytes-avail bytes [output-port start-k end-k]) is like write-bytes, but it returns without blocking after writing as many bytes as it can immediately flush. It blocks only if no bytes can be flushed immediately. The result is the number of bytes written and flushed to output-port; if start-k is the same as end-k, then the result can be 0 (indicating a successful flush of any buffered data), otherwise the result is at least 1 but possibly less than (- end-k start-k).

The write-bytes-avail procedure never drops bytes; if write-bytes-avail successfully writes some bytes and then encounters an error, it suppresses the error and returns the number of written bytes. (The error will be triggered by future writes.) If an error is encountered before any bytes have been written, an exception is raised.
(write-bytes-avail* bytes [output-port start-k end-k]) is like write-bytes-avail, except that it never blocks, it returns #f if the port contains buffered data that cannot be written immediately, and it returns 0 if the port's internal buffer (if any) is flushed but no additional bytes can be written immediately.
(write-bytes-avail/enable-break bytes [output-port start-k end-k]) is like write-bytes-avail, except that breaks are enabled during the write. The procedure provides a guarantee about the interaction of writing and breaks: if breaking is disabled when write-bytes-avail/enable-break is called, and if the exn:break exception is raised as a result of the call, then no bytes will have been written to output-port. See also section 6.7.
(write-byte byte [output-port]) is analogous to write-char, but for writing a byte instead of a character.
(write-special v [output-port]) writes v directly to output-port if it supports special writes, or raises exn:fail:contract if the port does not support special writes. The result is always #t, indicating that the write succeeded.
(write-special-avail* v [output-port]) is like write-special, but without blocking. If v cannot be written immediately, the result is #f without writing v, otherwise the result is #t and v is written.
(write-bytes-avail-evt bytes [output-port start-k end-k]) is similar to write-bytes-avail, but instead of writing bytes immediately, it returns a synchronizable event (see section 7.7). The output-port must support atomic writes, as indicated by port-writes-atomic?.

Synchronizing on the object starts a write from bytes, and the event becomes ready when bytes are written (unbuffered) to the port. If start-k and end-k are the same, then the synchronization result is 0 when the port's internal buffer (if any) is flushed, otherwise the result is a positive exact integer. If the event is not selected in a synchronization, then no bytes will have been written to output-port.
(write-special-evt v [output-port]) is similar to write-special, but instead of writing the special value immediately, it returns a synchronizable event (see section 7.7). The output-port must support atomic writes, as indicated by port-writes-atomic?.

Synchronizing on the object starts a write of the special value, and the event becomes ready when the value is written (unbuffered) to the port. If the event is not selected in a synchronization, then no value will have been written to output-port.
(port-writes-atomic? output-port) returns #t if write-bytes-avail/enable-break can provide an exclusive-or guarantee (break or write, but not both) for output-port, and if the port can be used with procedures like write-bytes-avail-evt. MzScheme's file-stream ports, pipes, string ports, and TCP ports all support atomic writes; ports created with make-output-port (see section 11.1.7) may support atomic writes.
(port-writes-special? output-port) returns #t if procedures like write-special can write arbitrary values to the port. MzScheme's file-stream ports, pipes, string ports, and TCP ports all reject special values, but ports created with make-output-port (see section 11.1.7) may support them.
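
A brief sketch of the two predicates and the non-blocking writers, using a string port. The names o, n, and m are illustrative, and sync is assumed to be the synchronization procedure described in section 7.7.

```scheme
(define o (open-output-string))

(port-writes-atomic? o)     ; => #t: string ports support atomic writes
(port-writes-special? o)    ; => #f: string ports reject special values

;; write-bytes-avail reports how many bytes were written and flushed;
;; here the result is at least 1 and at most 3.
(define n (write-bytes-avail #"abc" o))

;; Because the port supports atomic writes, it can be used with
;; write-bytes-avail-evt. Synchronizing on the event starts the write,
;; and the result is a positive exact integer.
(define m (sync (write-bytes-avail-evt #"def" o)))
```

Checking port-writes-atomic? first is a reasonable guard for ports created with make-output-port, which may or may not support atomic writes.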

11.2.3 Writing Structured Data

The print procedure is used to print Scheme values in a context where a programmer expects to see a value:

(print v [output-port]) outputs v to output-port. The default value of output-port is the current output port.

The rationale for providing print is that display and write both have standard output conventions, and this standardization restricts the ways that an environment can change the behavior of these procedures. No output conventions should be assumed for print so that environments are free to modify the actual output generated by print in any way. Unlike the port display and write handlers, a global port print handler can be installed through the global-port-print-handler parameter (see section 7.9.1.2).

The fprintf, printf, and format procedures create formatted output:

(fprintf output-port format-string v ···) prints formatted output to output-port, where format-string is a string that is printed; format-string can contain special formatting tags:
- ~n or ~% prints a newline
- ~a or ~A displays the next argument among the vs
- ~s or ~S writes the next argument among the vs
- ~v or ~V prints the next argument among the vs
- ~e or ~E outputs the next argument among the vs using the current error value conversion handler (see section 7.9.1.7) and current error printing width
- ~c or ~C write-chars the next argument in vs; if the next argument is not a character, the exn:fail:contract exception is raised
- ~b or ~B prints the next argument among the vs in binary; if the next argument is not an exact number, the exn:fail:contract exception is raised
- ~o or ~O prints the next argument among the vs in octal; if the next argument is not an exact number, the exn:fail:contract exception is raised
- ~x or ~X prints the next argument among the vs in hexadecimal; if the next argument is not an exact number, the exn:fail:contract exception is raised
- ~~ prints a tilde (~)
- ~w, where w is a whitespace character, skips characters in format-string until a non-whitespace character is encountered or until a second end-of-line is encountered (whichever happens first). An end-of-line is either #\return, #\newline, or #\return followed immediately by #\newline (on all platforms).
The return value is void.
(printf format-string v ···) is the same as fprintf with the current output port.
(format format-string v ···) is the same as fprintf with a string output port, where the final string is returned as the result.

When an illegal format string is supplied to one of these procedures, the exn:fail:contract exception is raised. When the format string requires more additional arguments than are supplied, the exn:fail:contract exception is raised. When more additional arguments are supplied than are used by the format string, the exn:fail:contract exception is raised.

For example,

(fprintf port "~a as a string is ~s.~n" '(3 4) "(3 4)")

prints this message to port:³⁶

(3 4) as a string is "(3 4)".

followed by a newline.
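
A few more formatting tags can be sketched with format, which returns the output as a string. The argument values are arbitrary, and the symbol bad-format-call is only a marker chosen for this illustration.

```scheme
(format "~a and ~s" "hi" "hi")       ; => "hi and \"hi\""
(format "~b ~o ~x" 10 10 10)         ; => "1010 12 a"
(format "~c~c" #\o #\k)              ; => "ok"
(format "100~~")                     ; => "100~"

;; Supplying more arguments than the format string uses raises
;; exn:fail:contract, which can be caught with with-handlers.
(with-handlers ([exn:fail:contract? (lambda (e) 'bad-format-call)])
  (format "~a" 1 2))                 ; => bad-format-call
```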

11.2.4 Default Reader

MzScheme's input parser obeys the following non-standard rules. See also section 11.2.8 for information on configuring the input parser through a readtable.

Square brackets (``['' and ``]'') and curly braces (``{'' and ``}'') can be used in place of parentheses. An open square bracket must be closed by a closing square bracket and an open curly brace must be closed by a closing curly brace. Whether square brackets are treated as parentheses is controlled by the read-square-bracket-as-paren parameter (see section 7.9.1.3). Similarly, the parsing of curly braces is controlled with the read-curly-brace-as-paren parameter. When square brackets and curly braces are not treated as parentheses, they are disallowed as input. By default, square brackets and curly braces are treated as parentheses.
Vector constants can be unquoted, and a vector size can be specified with a decimal integer between the # and opening parenthesis. If the specified size is larger than the number of vector elements that are provided, the last specified element is used to fill the remaining vector slots. For example, #4(1 2) is equivalent to #(1 2 2 2). If no vector elements are specified, the vector is filled with 0. If a vector size is provided and it is smaller than the number of elements provided, the exn:fail:read exception is raised.
Boxed constants can be created using #&. The datum following #& is treated as a quoted constant and put into the new box. (Space and comments following the #& are ignored.) Box reading is controlled with the read-accept-box boolean parameter (see section 7.9.1.3). Box reading is enabled by default. When box reading is disabled and #& is provided as input, the exn:fail:read exception is raised.
Expressions beginning with #' are wrapped with syntax in the same way that expressions starting with ' are wrapped with quote. Similarly, #` generates quasisyntax, #, generates unsyntax, and #,@ generates unsyntax-splicing. See also section 12.2.1.2.
The following character constants are recognized:
- #\nul or #\null (ASCII 0)
- #\backspace (ASCII 8)
- #\tab (ASCII 9)
- #\newline or #\linefeed (ASCII 10)
- #\vtab (ASCII 11)
- #\page (ASCII 12)
- #\return (ASCII 13)
- #\space (ASCII 32)
- #\rubout (ASCII 127)
Whenever #\ is followed by at least two alphabetic characters, characters are read from the input port until the next non-alphabetic character is returned. If the resulting string of letters does not match one of the above constants (case-insensitively), the exn:fail:read exception is raised.

Character constants can also be specified through direct Unicode values in octal notation (up to 255): #\n₁n₂n₃ where n₁ is in the range [0, 3] and n₂ and n₃ are in the range [0, 7]. Whenever #\ is followed by at least two characters in the range [0, 7], the next character must also be in this range, and the resulting octal number must be in the range 000₈ to 377₈.

Finally, character constants can be specified through direct Unicode values in hexadecimal notation: #\un₁...n_k or #\Un₁...n_k, where each n_i is a hexadecimal digit (0-9, a-f, or A-F), and k is no more than 4 for #\u or 6 for #\U. Whenever #\ is followed by a u or U and one hexadecimal digit, the character constant is terminated by either the first non-hexadecimal character in the stream, or the fourth/sixth hexadecimal character, whichever comes first. The resulting hexadecimal number must be a valid argument to integer->char, otherwise the exn:fail:read exception is raised.

Unless otherwise specified above, character-constants are terminated after the character following #\ . For example, if #\ is followed by an alphabetic character other than u and then a non-alphabetic character, then the character constant is terminated. If #\ is followed by an 8 or 9, then the constant is terminated. If #\ is followed by a non-alphabetic, non-decimal-digit character then the constant is terminated.
Within string constants, the following escape sequences are recognized in addition to \" and \\:
- \a: alarm (ASCII 7)
- \b: backspace (ASCII 8)
- \t: tab (ASCII 9)
- \n: linefeed (ASCII 10)
- \v: vertical tab (ASCII 11)
- \f: formfeed (ASCII 12)
- \r: return (ASCII 13)
- \e: escape (ASCII 27)
- \': quote (i.e., the backslash has no effect)
- \o, \oo, or \ooo: Unicode for octal o, oo, or ooo, where each o is 0, 1, 2, 3, 4, 5, 6, or 7. The \ooo form takes precedence over the \oo form, and \oo takes precedence over \o.
- \xh or \xhh: Unicode for hexadecimal h or hh, where each h is 0, 1, 2, 3, 4, 5, 6, 7, a, A, b, B, c, C, d, D, e, E, f, or F. The \xhh form takes precedence over the \xh form.
- \uh, \uhh, \uhhh, or \uhhhh: like \x, but with up to four hexadecimal digits (longer sequences take precedence). The resulting hexadecimal number must be a valid argument to integer->char, otherwise the exn:fail:read exception is raised.
- \Uh, \Uhh, \Uhhh, \Uhhhh, \Uhhhhh, \Uhhhhhh, \Uhhhhhhh, or \Uhhhhhhhh: like \x, but with up to eight hexadecimal digits (longer sequences take precedence). The resulting hexadecimal number must be a valid argument to integer->char, otherwise the exn:fail:read exception is raised.
Furthermore, a backslash followed by a linefeed, carriage return or return-linefeed combination is elided, allowing string constants to span lines. Any other use of backslash within a string constant is an error.
A string constant preceded by # is a byte-string constant. Byte string constants support the same escape sequences as character strings except \u and \U.
The sequence #<< starts a here string. The characters following #<< until a newline character define a terminator for the string. The content of the string includes all characters between the #<< line and a line whose only content is the specified terminator. More precisely, the content of the string starts after a newline following #<<, and it ends before a newline that is followed by the terminator, where the terminator is itself followed by either a newline or end-of-file. No escape sequences are recognized between the starting and terminating lines; all characters are included in the string (and terminator) literally. A return character is not treated as a line separator in this context. If no characters appear between #<< and a newline or end-of-file, or if an end-of-file is encountered before a terminating line, the exn:fail:read exception is raised.
The syntax for numbers is extended as described in section 3.3. Numbers containing a decimal point or exponent (e.g., 1.3, 2e78) are normally read as inexact. If the read-decimal-as-inexact parameter is set to #f, then such numbers are instead read as exact. The parameter does not affect the parsing of numbers with an explicit exactness tag (#e or #i).
A parenthesized sequence containing two delimited dots (``.'') triggers infix parsing. A single datum must appear between the dots, and one or more datums must appear before the first dot and after the last dot:
(left-datum ···¹ . first-datum . right-datum ···¹)
The resulting list consists of the datum between the dots, followed by the remaining datums in order:
(first-datum left-datum ···¹ right-datum ···¹)
Consequently, the input expression (1 . < . 2) produces #t, and (1 2 . + . 3 4 5) produces 15.
When the read-accept-dot parameter is set to #f, then a delimited dot (``.'') is disallowed in input. When the read-accept-quasiquote parameter is set to #f, then a backquote or comma is disallowed in input. These modes simplify Scheme's input model for students.
MzScheme's identifier and symbol syntax is considerably more liberal than the syntax specified by R5RS. When input is scanned for tokens, the following characters delimit an identifier in addition to whitespace:

" , ' ` ; ( ) [ ] { }

In addition, an identifier cannot start with a hash mark (``#'') unless the hash mark is immediately followed by a percent sign (``%''). The only other special characters are backslash (``\'') and quoting vertical bars (``|''); any other character is used as part of an identifier.

Symbols containing special characters (including delimiters) are expressed using an escaping backslash (``\'') or quoting vertical bars (``|''):
- A backslash preceding any character includes that character in the symbol literally; double backslashes produce a single backslash in the symbol.
- Characters between a pair of vertical bars are included in the symbol literally. Quoting bars can be used for any part of a symbol, or the whole symbol can be quoted. Backslashes and quoting bars can be mixed within a symbol, but a backslash is not a special character within a pair of quoting bars.
Characters quoted with a backslash or a vertical bar always preserve their case, even when identifiers are read case-insensitively.

An input token constructed in this way is an identifier when it is not a numerical constant (following the extended number syntax described in section 3.3). A token containing a backslash or vertical bars is never treated as a numerical constant.

Examples:
- (quote a\(b) produces the same symbol as (string->symbol "a(b").
- (quote A\B) produces the same symbol as (string->symbol "aB") when identifiers are read without case-sensitivity.
- (quote a\ b), (quote |a b|), and (quote a| |b) all produce the same symbol as (string->symbol "a b").
- (quote |a||b|) is the same as (quote |ab|), which produces the same symbol as (string->symbol "ab").
- (quote 10) is the number 10, but (quote |10|) produces the same symbol as (string->symbol "10").
Whether a vertical bar is used as a special or normal symbol character is controlled with the read-accept-bar-quote boolean parameter (see section 7.9.1.3). Vertical bar quotes are enabled by default. Quoting backslashes cannot be disabled.
By default, symbols are read case-sensitively. Case sensitivity for reading can be controlled in three ways:
- Quoting part of a symbol with an escaping backslash (``\'') or quoting vertical bar (``|'') always preserves the case of the quoted portion, as described above.
- The sequence #cs can be used as a prefix for any expression to make reading symbols within the expression case-sensitive. A #ci prefix similarly makes reading symbols in an expression case-insensitive. Whitespace can appear between a #cs or #ci prefix and its expression, and prefixes can be nested. Backslash and vertical-bar quotes override a #ci prefix.
- When the read-case-sensitive parameter (see section 7.9.1.3) is set to #t, then case is preserved when reading symbols. The default is #t, and it is set to #t while loading a module (see section 5.8). A #cs or #ci prefix overrides the parameter setting, as does backslash or vertical-bar quoting.
Symbol case conversions are not sensitive to the current locale (see section 1.2.2).
A symbol-like expression that starts with an unquoted hash and colon (``#:'') is parsed as a keyword constant. After the leading colon, backslashes, vertical bars, and case sensitivity are handled as for symbols, except that a keyword expression can never be interpreted as a number.
Expressions of the form #rxstring are literal regexp values (see section 10) where string is a string constant. The regexp produced by #rxstring is the same as produced by (regexp string). If string is not a valid pattern, the exn:fail:read exception is raised.

Expressions of the form #rx#string are similarly literal byte-regexp values. The regexp produced by #rx#string is the same as produced by (byte-regexp #string).
Expressions of the form #pxstring and #px#string are like the #rx variants, except that the regexp is as produced by pregexp and byte-pregexp (see section 10) instead of regexp and byte-regexp.
Expressions of the form #hash((key-datum . val-datum) ···) are literal immutable hash tables. The hash table maps each key-datum to its val-datum, comparing keys with equal?. The table is constructed by adding each key-datum mapping from left to right, so later mappings can hide earlier mappings if the key-datums are equal?. An expression of the form #hasheq((key-datum . val-datum) ···) produces an immutable hash table with keys compared using eq?. If the value of the read-square-bracket-as-paren parameter (see section 7.9.1.3) is true, matching parentheses in a #hash or #hasheq constant can be replaced by matching square brackets. Similarly, matching curly braces can be used if read-curly-brace-as-paren is true.
Values with shared structure are expressed using #n= and #n#, where n is a decimal integer. See section 11.2.5.1.
Expressions of the form #%x are symbols, where x can be a symbol or a number.
Expressions beginning with #~ are interpreted as compiled MzScheme code. See section 14.3.
Multi-line comments are started with #| and terminated with |#. Comments of this form can be nested arbitrarily.
A #; comments out the next datum. Whitespace and comments (including #; comments) may appear between the #; and the commented-out datum. Graph-structure annotations with #n= and #n# work within the comment as if the datum were not commented out (e.g., bindings can be introduced with #n= for use in parts of the datum that are not commented out). When #; appears at the beginning of a top-level datum, however, graph-structure bindings are discarded (along with the first following datum) before reading the second following datum.
If the first line of a loaded file begins with #!, it is ignored by the default load handler. If an ignored line ends with a backslash (``\''), then the next line is also ignored. (The #! convention is for shell scripts; see Chapter 18 for details.)
A #hx shifts the reader into H-expression mode (see section 19) for one H-expression. A #sx has no effect in normal mode, but in H-expression mode, it shifts the reader back to (normal) S-expression mode. The read-honu and read-honu-syntax procedures read as if the stream starts with #hx.
A #honu shifts the reader into H-expression mode (see section 19) and reads repeatedly until an end-of-file is encountered. The H-expression results are wrapped in a module-formed S-expression, as described in section 19.
A #reader must be followed by a datum. The datum is passed to the procedure that is the value of the current-reader-guard parameter (see section 7.9.1.3), and the result is used as a module path. The module path is passed to dynamic-require (see section 5.5) with either 'read or 'read-syntax (depending on whether parsing started with read or read-syntax). The resulting procedure should accept the same arguments as read or read-syntax (with all optional arguments as required). The procedure is given the port whose stream contained #reader, and it should produce a datum result. If the result is a syntax object in read mode it is converted to a datum using syntax-object->datum; if the result is not a syntax object in read-syntax mode, it is converted to one using datum->syntax-object. See also section 11.2.9.1 and section 11.2.9.2 for information on special-comment results and recursive reads. If the read-accept-reader parameter is set to #f, then #reader is disallowed as input.

Reading from a custom port can produce arbitrary values generated by the port; see section 11.1.7 for details. If the port generates a non-character value in a position where a character is required (e.g., within a string), the exn:fail:read:non-char exception is raised.
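
Several of the reader rules in this section can be tried by reading from string ports. The helper read-from-string is a hypothetical convenience defined here, not a MzScheme primitive, and the comments assume the default parameter settings described above.

```scheme
(define (read-from-string s)
  (read (open-input-string s)))

(read-from-string "[a {b c}]")            ; the list (a (b c))
(read-from-string "#4(1 2)")              ; a vector equal to #(1 2 2 2)
(read-from-string "#&17")                 ; a box containing 17
(read-from-string "(1 . < . 2)")          ; the list (< 1 2)
(read-from-string "|a b|")                ; same symbol as (string->symbol "a b")
(read-from-string "#ci Hello")            ; the symbol hello
(read-from-string "#;(skipped) kept")     ; the symbol kept
(read-from-string "\"\\x41\\101\"")       ; the string "AA"
(read-from-string "#\\nul")               ; the character with value 0
(read-from-string "#<<END\nline 1\nEND\n") ; the string "line 1"
```

Note that read only produces the datum: the infix form yields the list (< 1 2), which produces #t once it is evaluated.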

11.2.5 Default Printer

MzScheme's printer obeys the following non-standard rules (though the rules for print do not apply when the print-honu parameter is set to #t; see section 7.9.1.4).

A vector can be printed by write and print using the shorthand described in section 11.2.4, where the vector's length is printed between the leading # and the opening parenthesis and repeated tail elements are omitted. For example, #(1 2 2 2) is printed as #4(1 2). The display procedure does not output vectors using this shorthand. Shorthand vector printing is controlled with the print-vector-length boolean parameter (see section 7.9.1.4). Shorthand vector printing is enabled by default.
Boxes (see section 3.11) can be printed with the #& notation (see section 11.2.4). When box printing is disabled, all boxes are printed unreadably as #<box>. Box printing is controlled with the print-box boolean parameter (see section 7.9.1.4). Box printing is enabled by default.
Structures (see Chapter 4) can be printed using either a custom-write procedure or vector notation. See section 11.2.10 for information on custom-write procedures; the following information applies only when no custom-write procedure is specified. In the vector form of output, the first item is a symbol of the form struct:s -- where s is the name of the structure -- and the remaining elements are the elements of the structure, but the vector exposes only as much information about the structure as the current inspector can access (see section 4.5). When structure printing is disabled, or when no part of the structure is accessible to the current inspector, a structure is printed unreadably as #<struct:s>. Structure printing is controlled with the print-struct boolean parameter (see section 7.9.1.4). Structure printing is enabled by default.
Symbols containing spaces or special characters write using escaping backslashes and quoting vertical bars. When the read-case-sensitive parameter is set to #f, then symbols containing uppercase characters also use escaping backslashes or quoting vertical bars. In addition, symbols are quoted with vertical bars or a leading backslash when they would otherwise print the same as a numerical constant. If the value of the read-accept-bar-quote boolean parameter is #f (see section 7.9.1.3), then backslashes are always used to escape special characters instead of quoting them with vertical bars, and a vertical bar is not treated as a special character. Otherwise, quoting bars are used in printing when one bar at the beginning and one at the end suffice to print the symbol correctly. See section 11.2.4 for more information about symbol parsing. Symbols display without escaping or quoting special characters.
Keywords write and display the same as symbols, except with a leading hash and colon, and without special handling when the printed form matches a number (since the leading #: distinguishes the keyword).
Characters with the special names described in section 11.2.4 write using the same name. (Some characters have multiple names; the #\newline and #\nul names are used instead of #\linefeed and #\null). Other graphic characters (according to char-graphic?; see section 3.4) write as #\ followed by the single character, and all other characters are written in #\u notation with four digits or #\U notation with eight digits (using the latter only if the character value does not fit in four digits). All characters display as a single character.
Strings containing non-graphic, non-blank characters (according to char-graphic? and char-blank?; see section 3.4) write using the escape sequences described in section 11.2.4, using \a, \b, \t, \n, \v, \f, \r, or \e if possible, otherwise using \u with four hexadecimal digits or \U with eight hexadecimal digits (using the latter only if the character value does not fit into four digits). All strings display as their literal character sequences.
Byte strings write using #", where each byte in the string content is written using the corresponding ASCII decoding if the byte is between 0 and 127 and the character is graphic or blank (according to char-graphic? and char-blank?; see section 3.4). Otherwise, the byte is written using \a, \b, \t, \n, \v, \f, \r, or \e if possible, otherwise using \o with one to three octal digits (only as many as necessary). All byte strings display as their literal byte sequence; this byte sequence may not be a valid UTF-8 encoding, so it may not correspond to a sequence of characters.
Paths (see section 11.3.1) write like other unreadable values using #<path:...>. A path displays in the same way as the result of path->string applied to the path.
Regexp values print using the form #rxstring, where string is the write form of the regexp's source character string or byte string. Similarly, byte-regexp values print starting with #rx#.
Hash tables by default print unreadably as #<hash-table>. When the print-hash-table parameter is set to true (see section 7.9.1.4), hash tables print using the form #hash((key . val) ···) or #hasheq((key . val) ···) for tables using equal? or eq? key comparisons, respectively. Hash tables with weakly held keys always print unreadably as #<hash-table>.
Values with shared structure can be printed using #n= and #n#, where n is a decimal integer. See section 11.2.5.1.
A value with no readable format prints as #<...>, but only when the print-unreadable parameter is set to #t (the default; see also section 7.9.1.4). When the parameter's value is #f, attempting to print an unreadable value raises exn:fail:contract.
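
The difference between write and display, and the effect of the printer parameters, can be sketched by printing into string ports. The helpers write-to-string and display-to-string are hypothetical names defined here, and the results shown assume the default parameter values stated above.

```scheme
(define (write-to-string v)
  (let ([o (open-output-string)])
    (write v o)
    (get-output-string o)))

(define (display-to-string v)
  (let ([o (open-output-string)])
    (display v o)
    (get-output-string o)))

(write-to-string (vector 1 2 2 2))            ; => "#4(1 2)"
(display-to-string (vector 1 2 2 2))          ; => "#(1 2 2 2)"
(parameterize ([print-vector-length #f])
  (write-to-string (vector 1 2 2 2)))         ; => "#(1 2 2 2)"

(write-to-string (box 5))                     ; => "#&5"
(parameterize ([print-box #f])
  (write-to-string (box 5)))                  ; => "#<box>"

(write-to-string (string->symbol "a b"))      ; => "|a b|"
(display-to-string (string->symbol "a b"))    ; => "a b"
(write-to-string #\newline)                   ; => "#\\newline"
```

Using parameterize keeps each setting local to one expression, so the defaults are restored afterward.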

11.2.5.1 Sharing Structure in Input and Output

MzScheme can read and print Common LISP-style graphs, values with shared structure (including cycles). Graphs are described by tagging the shared structure once with #n= (using some decimal integer n with no more than eight digits) and then referencing it later with #n# (using the same number n). For example, the following datum represents the infinite list of ones:

#0=(1 . #0#)

If this graph is entered into MzScheme's read-eval-print loop, MzScheme's compiler will loop forever, trying to compile an infinite expression. In contrast, the following expression defines ones to be the infinite list of ones, using quote to hide the infinite list from the compiler:

(define ones (quote #0=(1 . #0#)))

A tagged structure can be referenced multiple times. Here, v is defined to be a vector containing the same cons cell in all three slots:

(define v #(#1=(1 . 2) #1# #1#))

A tag #n= must appear to the left of all references #n#, and all references must appear in the same top-level datum as the tag. By default, MzScheme's printer will display a value without showing the shared structure:

#((1 . 2) (1 . 2) (1 . 2))

Graph reading and printing are controlled with the read-accept-graph and print-graph boolean parameters (see section 7.9.1.4). Graph reading is enabled by default, and graph printing is disabled by default. However, when the printer encounters a graph containing a cycle, graph printing is automatically enabled, temporarily. (For this reason, the display, write, and print procedures require memory proportional to the depth of the value being printed.) When graph reading is disabled and a graph is provided as input, the exn:fail:read exception is raised.

If the n in a #n= form or a #n# form contains more than eight digits, the exn:fail:read exception is raised. If a #n# form is not preceded by a #n= form using the same n, the exn:fail:read exception is raised. If two #n= forms are in the same expression for the same n, the exn:fail:read exception is raised.
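
Sharing can be checked with eq? after reading a graph from a string port, which avoids handing a cyclic datum to the compiler. This is a sketch: read-from-string is a hypothetical helper, and graph reading is assumed to be enabled, as it is by default.

```scheme
(define (read-from-string s)
  (read (open-input-string s)))

(define ones (read-from-string "#0=(1 . #0#)"))
(car ones)                               ; => 1
(eq? ones (cdr ones))                    ; => #t: the list is its own tail

(define v (read-from-string "#(#1=(1 . 2) #1# #1#)"))
(eq? (vector-ref v 0) (vector-ref v 2))  ; => #t: one pair in three slots

;; With print-graph enabled, the printer marks the shared pair with
;; #n= and #n# tags instead of printing it as three separate pairs.
(parameterize ([print-graph #t])
  (write v))
```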

11.2.6 Replacing the Reader

Each input port has its own port read handler. This handler is invoked to read from the port when the built-in read or read-syntax procedure is applied to the port.³⁷ A port read handler is applied to either one argument or two arguments:

A single argument is supplied when the port is used with read; the argument is the port being read. The return value is the value that was read from the port (or end-of-file).
Two arguments are supplied when the port is used with read-syntax; the first argument is the port being read, and the second argument is a value indicating the source. The return value is a syntax object that was read from the port (or end-of-file).

A port's read handler is configured with port-read-handler:

(port-read-handler input-port) returns the current port read handler for input-port.
(port-read-handler input-port proc) sets the handler for input-port to proc.

The default port read handler reads standard Scheme expressions with MzScheme's built-in parser (see section 11.2.4). It handles a special result from a custom input port (see section 11.1.7.1) by treating it as a single expression, except that special-comment values (see section 11.2.9.1) are treated as whitespace.

The read and read-syntax procedures themselves can be customized through a readtable; see section 11.2.8 for more information.
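
As a minimal sketch of installing a port read handler, the following replaces the handler for one string port so that read returns a whole line as a string instead of parsing an S-expression. The names line-port and line-handler are hypothetical. The handler uses case-lambda to accept both the one-argument (read) and two-argument (read-syntax) forms.

```scheme
(define line-port (open-input-string "hello world"))

;; Hypothetical handler: treats each line of input as one datum.
(define line-handler
  (case-lambda
   [(port)
    ;; `read' mode: return the line (or end-of-file) directly
    (read-line port)]
   [(port src)
    ;; `read-syntax' mode: wrap the line as a syntax object,
    ;; passing end-of-file through unchanged
    (let ([line (read-line port)])
      (if (eof-object? line)
          line
          (datum->syntax-object #f line)))]))

(port-read-handler line-port line-handler)

(define line-result (read line-port))
```

With this handler installed, line-result is the string "hello world" rather than the symbol hello.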

11.2.7 Replacing the Printer

Each output port has its own port display handler, port write handler, and port print handler. These handlers are invoked to output to the port when the standard display, write, or print procedure is applied to the port. A port display/write/print handler takes two arguments: the value to be printed and the destination port. The handler's return value is ignored.

(port-display-handler output-port) returns the current port display handler for output-port.
(port-display-handler output-port proc) sets the display handler for output-port to proc.
(port-write-handler output-port) returns the current port write handler for output-port.
(port-write-handler output-port proc) sets the write handler for output-port to proc.
(port-print-handler output-port) returns the current port print handler for output-port.
(port-print-handler output-port proc) sets the print handler for output-port to proc.

The default port display and write handlers print Scheme expressions with MzScheme's built-in printer (see section 11.2.5). The default print handler calls the global port print handler (the value of the global-port-print-handler parameter; see section 7.9.1.2); the default global port print handler is the same as the default write handler.
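
The following sketch shows one way to wrap the default display handler for a single port. The names out, default-display, and bracketed are hypothetical; the sketch assumes that the procedure returned by port-display-handler can be called directly to obtain the built-in printing behavior.

```scheme
(define out (open-output-string))

;; Save the default handler so that the replacement can delegate to it.
(define default-display (port-display-handler out))

;; Replacement handler: surround every displayed value with brackets.
(port-display-handler
 out
 (lambda (v port)
   (write-string "[" port)
   (default-display v port)
   (write-string "]" port)))

(display 'x out)
(define bracketed (get-output-string out))
```

After the call to display, bracketed is "[x]". The write and print handlers for out are unchanged, so write and print on the same port still produce unbracketed output.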

11.2.8 Customizing the Reader through Readtables

A readtable configures MzScheme's built-in reader by adjusting the way that individual characters are parsed. MzScheme readtables are just like readtables in Common LISP, except that an individual readtable is immutable, and the procedures for creating and inspecting readtables are somewhat different than the Common LISP procedures.

The readtable is consulted at specific times by the reader:

when looking for the start of an S-expression;
when determining how to parse an S-expression that starts with hash (``#'');
when looking for a delimiter to terminate a symbol or number;
when looking for an opener (such as ``(''), closer (such as ``)''), or dot (``.'') after the first character parsed as a sequence for a list, vector, or hash table; or
when looking for an opener after #n in a vector of specified length n.

In particular, after parsing a character that is mapped to the default behavior of semi-colon (``;''), the readtable is ignored until the comment's terminating newline is discovered. Similarly, the readtable does not affect string parsing until a closing double-quote is found. Meanwhile, if a character is mapped to the default behavior of an open parenthesis (``(''), then it starts a sequence that is closed by any character that is mapped to a close parenthesis (``)''). An apparent exception is that the default parsing of a vertical bar (``|'') quotes a symbol until a matching character is found, but the parser is simply using the character that started the quote; it does not consult the readtable.

For many contexts, #f identifies the default readtable for MzScheme. In particular, #f is the initial value for the current-readtable parameter (see section 7.9.1.3), which causes the reader to behave as described in section 11.2.4. Adjust MzScheme's default reader by setting the current-readtable parameter to a readtable created with make-readtable.

(make-readtable readtable [char-or-false symbol-or-char readtable-or-proc ···¹]) creates a new readtable that is like readtable (which can be #f), except that the reader's behavior is modified for each char-or-false according to the given symbol-or-char and readtable-or-proc. The ···¹ for make-readtable applies to all three of char-or-false, symbol-or-char, and readtable-or-proc; in other words, the total number of arguments to make-readtable must be one modulo three.

The possible combinations for char-or-false, symbol-or-char, and readtable-or-proc are as follows:

char 'terminating-macro proc -- causes char to be parsed as a delimiter, and an unquoted/uncommented char in the input string triggers a call to the reader macro proc; the activity of proc is described further below. Conceptually, characters like semi-colon (``;'') and parentheses are mapped to terminating reader macros in the default readtable.
char 'non-terminating-macro proc -- like the 'terminating-macro variant, but char is not treated as a delimiter, so it can be used in the middle of an identifier or number. Conceptually, hash (``#'') is mapped to a non-terminating macro in the default readtable.
char 'dispatch-macro proc -- like the 'non-terminating-macro variant, but for char only when it follows a hash (``#'') -- or, more precisely, when the character follows one that has been mapped to the behavior of hash in the default readtable.
char like-char readtable -- causes char to be parsed in the same way that like-char is parsed in readtable, where readtable can be #f to indicate the default readtable. Mapping a character to the same actions as vertical bar (``|'') in the default reader means that the character starts quoting for symbols, and the same character terminates the quote; in contrast, mapping a character to the same action as a double quote means that the character starts a string, but the string is still terminated with a closing double quote. Finally, mapping a character to an action in the default readtable means that the character's behavior is sensitive to parameters that affect the original character; for example, mapping a character to the same action as a curly brace (``{'') in the default readtable means that the character is disallowed when the read-curly-brace-as-paren parameter is set to #f.
#f 'non-terminating-macro proc -- replaces the macro used to parse characters with no specific mapping: i.e., characters (other than hash or vertical bar) that can start a symbol or number with the default readtable.

If multiple 'dispatch-macro mappings are provided for a single char-or-false, all but the last one are ignored. Similarly, if multiple non-'dispatch-macro mappings are provided for a single char-or-false, all but the last one are ignored.

A reader macro proc must accept six arguments, and it can optionally accept two arguments. See section 11.2.9 for information on the procedure's arguments and results.

A reader macro normally reads characters from the given input port to produce a value to be used as the ``reader macro-expansion'' of the consumed characters. The reader macro might produce a special-comment value to cause the consumed character to be treated as whitespace, and it might use read/recursive or read-syntax/recursive; see section 11.2.9.1 and section 11.2.9.2 for more information on these topics.

(readtable-mapping readtable char), where readtable is not #f, produces information about the mappings in readtable for char. The result is three values:

either a character (mapping is to same behavior as the character in the default readtable), 'terminating-macro, or 'non-terminating-macro; this result reports the main (i.e., non-'dispatch-macro) mapping for char. When the result is a character, then char is mapped to the same behavior as the returned character in the default readtable.
either #f or a reader-macro procedure; the result is a procedure when the first result is 'terminating-macro or 'non-terminating-macro.
either #f or a reader-macro procedure; the result is a procedure when the character has a 'dispatch-macro mapping in readtable to override the default dispatch behavior.

Note that reader-macro procedures for the default readtable are not directly accessible. To invoke default behaviors, use read/recursive or read-syntax/recursive (see section 11.2.9.2) with a character and the #f readtable.
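
Before the extended example, a much smaller sketch may help: it maps underscore to the default behavior of a space and then inspects the mapping with readtable-mapping. The names underscore-table, datum, and mapping are hypothetical.

```scheme
;; Start from the default readtable (#f) and make #\_ behave
;; like #\space in the default readtable.
(define underscore-table
  (make-readtable #f #\_ #\space #f))

;; With the new readtable installed, _ separates the two symbols.
(define datum
  (parameterize ([current-readtable underscore-table])
    (read (open-input-string "(a_b)"))))

;; Collect the three results of readtable-mapping into a list.
(define mapping
  (call-with-values
   (lambda () (readtable-mapping underscore-table #\_))
   list))
```

Here datum is the two-element list (a b), and mapping is (#\space #f #f): the main mapping is to the behavior of a space, and there is neither a macro procedure nor a dispatch-macro procedure for the character.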

Extended example:

;; Provides raise-read-error and raise-read-eof-error
(require (lib "readerr.ss" "syntax"))

(define (skip-whitespace port)
  ;; Skips whitespace characters, sensitive to the current
  ;; readtable's definition of whitespace
  (let ([ch (peek-char port)])
    (unless (eof-object? ch)
      ;; Consult current readtable:
      (let-values ([(like-ch/sym proc dispatch-proc) 
                    (readtable-mapping (current-readtable) ch)])
        ;; If like-ch/sym is whitespace, then ch is whitespace
        (when (and (char? like-ch/sym)
                   (char-whitespace? like-ch/sym))
          (read-char port)
          (skip-whitespace port))))))

(define (skip-comments read-one port src)
  ;; Recursive read, but skip comments and detect EOF
  (let loop ()
    (let ([v (read-one)])
      (cond
       [(special-comment? v) (loop)]
       [(eof-object? v)
        (let-values ([(l c p) (port-next-location port)])
          (raise-read-eof-error "unexpected EOF in tuple" src l c p 1))]
       [else v]))))

(define (parse port read-one src)
  ;; First, check for empty tuple
  (skip-whitespace port)
  (if (eq? #\> (peek-char port))
      null
      (let ([elem (read-one)])
        (if (special-comment? elem)
            ;; Found a comment, so look for > again
            (parse port read-one src)
            ;; Non-empty tuple:
            (cons elem
                  (parse-nonempty port read-one src))))))

(define (parse-nonempty port read-one src)
  ;; Need a comma or closer
  (skip-whitespace port)
  (case (peek-char port)
    [(#\>) (read-char port)
     ;; Done
     null]
    [(#\,) (read-char port)
     ;; Read next element and recur
     (cons (skip-comments read-one port src)
           (parse-nonempty port read-one src))]
    [else
     ;; Either a comment or an error; grab location (in case
     ;; of error) and read recursively to detect comments
     (let-values ([(l c p) (port-next-location port)]
                  [(v) (read-one)])
       (cond
        [(special-comment? v)
         ;; It was a comment, so try again
         (parse-nonempty port read-one src)]
        [else
         ;; Wasn't a comment, comma, or closer; error
         ((if (eof-object? v) raise-read-eof-error raise-read-error)
          "expected `,' or `>'" src l c p 1)]))]))

(define (make-delims-table)
  ;; Table to use for recursive reads to disallow delimiters
  ;;  (except those in sub-expressions)
  (letrec ([misplaced-delimiter 
            (case-lambda
             [(ch port) (misplaced-delimiter ch port #f #f #f #f)]
             [(ch port src line col pos)
              (raise-read-error 
               (format "misplaced `~a' in tuple" ch) src line col pos 1)])])
    (make-readtable (current-readtable)
                    #\, 'terminating-macro misplaced-delimiter
                    #\> 'terminating-macro misplaced-delimiter)))

(define (wrap l) 
  `(make-tuple (list ,@l)))

(define parse-open-tuple
  (case-lambda
   [(ch port) 
    ;; `read' mode
    (wrap (parse port 
                 (lambda () (read/recursive port #f 
                                            (make-delims-table)))
                 (object-name port)))]
   [(ch port src line col pos)
    ;; `read-syntax' mode
    (datum->syntax-object
     #f
     (wrap (parse port 
                  (lambda () (read-syntax/recursive src port #f 
                                                    (make-delims-table)))
                  src))
     (let-values ([(l c p) (port-next-location port)])
       (list src line col pos (and pos (- p pos)))))]))
    

(define tuple-readtable
  (make-readtable #f #\< 'terminating-macro parse-open-tuple))

(parameterize ([current-readtable tuple-readtable])
  (read (open-input-string "<1 , 2 , \"a\">")))
;; => '(make-tuple (list 1 2 "a"))

(parameterize ([current-readtable tuple-readtable])
  (read (open-input-string "< #||# 1 #||# , #||# 2 #||# , #||# \"a\" #||# >")))
;; => '(make-tuple (list 1 2 "a"))

(define tuple-readtable+
  (make-readtable tuple-readtable
                  #\* 'terminating-macro (lambda a (make-special-comment #f))
                  #\_ #\space #f))
(parameterize ([current-readtable tuple-readtable+])
  (read (open-input-string "< * 1 __,__  2 __,__ * \"a\" * >")))
;; => '(make-tuple (list 1 2 "a"))

11.2.9 Reader-Extension Procedures

MzScheme's reader can be extended in three ways: through a reader-macro procedure in a readtable (see section 11.2.8), through a #reader form (see section 11.2.4), or through a custom-port byte reader that returns a ``special'' result procedure (see section 11.1.7.1). All three kinds of procedures accept similar arguments, and their results are treated in the same way by read and read-syntax (or, more precisely, by the default read handler; see section 11.2.6).

Calls to these reader-extension procedures can be triggered through read, read/recursive, read-syntax, or read-honu-syntax. In addition, a special-read procedure can be triggered by calls to read-honu, read-honu/recursive, read-honu-syntax, read-honu-syntax/recursive, read-char-or-special, or by the context of read-bytes-avail!, read-bytes-avail!*, peek-bytes-avail!, and peek-bytes-avail!*.

Optional arities for reader-macro and special-result procedures allow them to distinguish reads via read, etc. from reads via read-syntax, etc. in the case that the source value is #f and no other location information is available.

Procedure arguments

A reader-macro procedure must accept six arguments, and it can optionally accept two arguments. The first two arguments are always the character that triggered the reader macro and the input port for reading. When the reader macro is triggered by read-syntax (or read-syntax/recursive), the procedure is passed four additional arguments that represent a source location. When the reader macro is triggered by read (or read/recursive), the procedure is passed only two arguments if it accepts two arguments, otherwise it is passed six arguments where the last four are all #f.

A #reader-loaded procedure accepts the same arguments as either read or read-syntax, depending on whether the procedure was loaded through read, etc. or through read-syntax, etc.

A special-result procedure must accept four arguments, and it can optionally accept zero arguments. When the special read is triggered by read-syntax (or read-honu-syntax, read-syntax/recursive, etc.), the procedure is passed four arguments that represent a source location. When the special read is triggered by read (or read-char-or-special, read-honu, read/recursive, etc.), the procedure is passed no arguments if it accepts zero arguments, otherwise it is passed four arguments that are all #f.

Procedure result

When a reader-extension procedure is called in syntax-reading mode (via read-syntax, etc.), it should generally return a syntax object that has no lexical context (e.g., a syntax object created using datum->syntax-object with #f as the first argument and with the given location information as the third argument). Another possible result is a special-comment value (see section 11.2.9.1). If the procedure's result is not a syntax object and not a special-comment value, it is converted to one using datum->syntax-object.

When a reader-extension procedure is called in non-syntax-reading modes, it should generally not return a syntax object. If a syntax object is returned, it is converted to a plain value using syntax-object->datum.

In either context, when the result from a reader-extension procedure is a special-comment value (see section 11.2.9.1), then read, read-syntax, etc. treat the value as a delimiting comment and otherwise ignore it.

Also in either context, the result may be copied to prevent mutation to pairs, vectors, or boxes before the read result is completed, and to support the construction of graphs with cycles. Mutable pairs, boxes, and vectors are copied, along with any pairs, boxes, or vectors that lead to such mutable values, to placeholders produced by a recursive read (see section 11.2.9.2), or to references of a shared value. Graph structure (including cycles) is preserved in the copy.
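
The argument and result conventions above can be sketched with a small reader macro that accepts both arities. The macro, its name read-dollar, the wrapper symbol dollar, and the choice of ``$'' as the trigger character are all hypothetical. The sketch does not handle end-of-file or a comment immediately after the trigger character.

```scheme
;; Hypothetical reader macro: $datum reads as (dollar datum).
(define read-dollar
  (case-lambda
   [(ch port)
    ;; `read' mode: the result is a plain S-expression
    (list 'dollar (read/recursive port))]
   [(ch port src line col pos)
    ;; `read-syntax' mode: the result is a syntax object with
    ;; no lexical context and the given source location
    (datum->syntax-object
     #f
     (list 'dollar (read-syntax/recursive src port))
     (list src line col pos #f))]))

(define dollar-table
  (make-readtable #f #\$ 'terminating-macro read-dollar))

(define dollar-datum
  (parameterize ([current-readtable dollar-table])
    (read (open-input-string "$x"))))
```

Because the macro is triggered through read and accepts two arguments, only the two-argument case runs here, and dollar-datum is the list (dollar x). The placeholder returned by read/recursive is replaced with the symbol x before the outermost read returns.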

11.2.9.1 Special Comments

(make-special-comment v) creates a special-comment value that encapsulates v. The read, read-syntax, etc. procedures treat values constructed with make-special-comment as delimiting whitespace when returned by a reader-extension procedure (see section 11.2.9).

(special-comment? v) returns #t if v is the result of make-special-comment, #f otherwise.

(special-comment-value sc) returns the value encapsulated by the special-comment value sc. This value is never used directly by a reader, but it might be used by the context of a read-char-or-special, etc. call that detects a special comment.
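
A brief sketch of the three procedures, using a hypothetical value named sc:

```scheme
;; Encapsulate an arbitrary value in a special comment.
(define sc (make-special-comment 'note))

(special-comment? sc)        ; => #t
(special-comment? 'note)     ; => #f
(special-comment-value sc)   ; => note
```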

11.2.9.2 Recursive Reads

(read/recursive [input-port char-or-false readtable]) is similar to calling read, but it is normally used during the dynamic extent of read within a reader-extension procedure (see section 11.2.9). The main effect of using read/recursive instead of read is that graph-structure annotations (see section 11.2.5.1) in the nested read are considered part of the overall read. Since the result is wrapped in a placeholder, however, it is not directly inspectable.

If char-or-false is provided and not #f, it is effectively prefixed to the beginning of input-port's stream for the read. (To prefix multiple characters, use input-port-append from MzLib's port library; see Chapter 35 in PLT MzLib: Libraries Manual.)

The readtable argument, which defaults to (current-readtable), is used for top-level parsing to satisfy the read request; recursive parsing within the read (e.g., to read the elements of a list) instead uses the current readtable as determined by the current-readtable parameter. A reader macro might call read/recursive with a character and readtable to effectively invoke the readtable's behavior for the character. If readtable is #f, the default readtable is used for top-level parsing.

When called within the dynamic extent of read, the read/recursive procedure produces either an opaque placeholder value, a special-comment value, or an end-of-file. The result is a special-comment value (see section 11.2.9.1) when the input stream's first non-whitespace content parses as a comment. The result is end-of-file when read/recursive encounters an end-of-file. Otherwise, the result is a placeholder that protects graph references that are not yet resolved. When this placeholder is returned within an S-expression that is produced by any reader-extension procedure (see section 11.2.9) for the same outermost read, it will be replaced with the actual read value before the outermost read returns.

(read-syntax/recursive [source-name-v input-port char-or-false readtable]) is analogous to calling read/recursive, but the resulting value encapsulates S-expression structure with source-location information. As with read/recursive, when read-syntax/recursive is used within the dynamic extent of read-syntax, the result from read-syntax/recursive is either a special-comment value, end-of-file, or an opaque graph-structure placeholder (not a syntax object). The placeholder can be embedded in an S-expression or syntax object returned by a reader macro, etc., and it will be replaced with the actual syntax object before the outermost read-syntax returns.

Using read/recursive within the dynamic extent of read-syntax does not allow graph structure for reading to be included in the outer read-syntax parsing, and neither does using read-syntax/recursive within the dynamic extent of read. In those cases, read/recursive and read-syntax/recursive produce results like read and read-syntax.

See section 11.2.8 for an extended example that uses read/recursive and read-syntax/recursive.

11.2.10 Customizing the Printer through Custom-Write Procedures

The built-in prop:custom-write structure type property associates a procedure with a structure type. The procedure is used by the default printer to display or write (or print) instances of the structure type.

See section 4.4 for general information on structure type properties.

The procedure for a prop:custom-write value takes three arguments: the structure to be printed, the target port, and a boolean that is #t for write mode and #f for display mode. The procedure should print the value to the given port using write, display, fprintf, write-special, etc.

The write handler, display handler, and print handler are specially configured for a port given to a custom-write procedure. Printing to the port through display, write, or print prints a value recursively with sharing annotations. To avoid a recursive print (i.e., to print without regard to sharing with a value currently being printed), print instead to a string or pipe and transfer the result to the target port using write-string and write-special. To recursively print but to a port other than the one given to the custom-write procedure, copy the given port's write handler, display handler, and print handler to the other port.

The port given to a custom-write handler is not necessarily the actual target port. In particular, to detect cycles and sharing, the printer invokes a custom-write procedure with a port that records recursive prints, and does not retain any other output.

Recursive print operations may trigger an escape from the call to the custom-write procedure (e.g., for pretty-printing where a tentative print attempt overflows the line, or for printing error output of a limited width).

The following example definition of a tuple type includes custom-write procedures that print the tuple's list content using angle brackets in write mode and no brackets in display mode. Elements of the tuple are printed recursively, so that graph and cycle structure can be represented.

(define (tuple-print tuple port write?)
  (when write? (write-string "<" port))
  (let ([l (tuple-ref tuple 0)])
    (unless (null? l)
      ((if write? write display) (car l) port)
      (for-each (lambda (e)
                  (write-string ", " port)
                  ((if write? write display) e port))
                (cdr l))))
  (when write? (write-string ">" port)))

(define-values (s:tuple make-tuple tuple? tuple-ref tuple-set!)
  (make-struct-type 'tuple #f 1 0 #f
                    (list (cons prop:custom-write tuple-print))))

(display (make-tuple '(1 2 "a"))) ; prints 1, 2, a

(let ([t (make-tuple (list 1 2 "a"))])
  (set-car! (tuple-ref t 0) t)
  (write t))  ; prints #0=<#0#, 2, "a">

11.3 Filesystem Utilities

MzScheme provides many operations for accessing and modifying filesystems in a (mostly) platform-independent manner. Additional filesystem utilities are in MzLib; see also Chapter 20 in PLT MzLib: Libraries Manual.

11.3.1 Paths

The format of a filesystem path varies across platforms. For example, under Unix, directories are separated by ``/'' while Windows uses both ``/'' and ``\''. Furthermore, for most Unix filesystems, the true name of a file is a byte string, but users prefer to see the bytes decoded in a locale-specific way when the filename is printed. MzScheme therefore provides a path datatype for managing filesystem paths, and procedures such as build-path, path->string, and bytes->path for manipulating paths.

When a MzScheme procedure takes a filesystem path as an argument, the path can be provided either as a string or as an instance of the path datatype. If a string is provided, it is converted to a path using string->path. A MzScheme procedure that generates a filesystem path always generates a path value.

By default, paths are created and manipulated for the current platform, but procedures that merely manipulate paths (without using the filesystem) can manipulate paths using conventions for other supported platforms. The bytes->path procedure accepts an optional argument that indicates the platform for the path, either 'unix or 'windows. For other functions, such as build-path or simplify-path, the behavior is sensitive to the kind of path that is supplied. Unless otherwise specified, a procedure that requires a path accepts only paths for the current platform.

Two path values are equal? when they use the same convention type and when their byte-string representations are equal?. A path string (or byte string) cannot be empty, and it cannot contain a nul character or byte. When an empty string or a string containing nul is provided as a path to any procedure except absolute-path?, relative-path?, or complete-path?, the exn:fail:contract exception is raised.

Most MzScheme primitives that take a path perform an expansion on the path before using it. Procedures that build paths or merely check the form of a path do not perform this expansion, with the exception of simplify-path for Windows paths. For more information about path expansion and other platform-specific details, see section 20.1 for Unix and Mac OS X paths and section 20.2 for Windows paths.

The basic path utilities are as follows:

(path? v) returns #t if v is a path value for the current platform (not a string, and not a path for a different platform), #f otherwise.
(path-string? v) returns #t if v is either a path value or a non-empty string without nul characters, #f otherwise.
(path-for-some-system? v) returns #t if v is a path value for some platform (not a string), #f otherwise.
(string->path string) produces a path whose byte-string name is (string->bytes/locale string (char->integer #\?)); see section 3.6 for more information on string->bytes/locale. Beware that the current locale might not encode every string, in which case string->path can produce the same path for different strings. See also string->path-element, which should be used instead of string->path when a string represents a single path element.
(bytes->path bytes [type-symbol]) produces a path (for some platform) whose byte-string name is bytes. The optional type-symbol specifies the convention to use for the path, and it can be any possible result from system-path-convention-type (see below); it defaults to the value for the current platform. For converting relative path elements from literals, use instead bytes->path-element (described below), which applies a suitable encoding for individual elements.
(path->string path) produces a string that represents path by decoding path's byte-string name using the current locale's encoding; ``?'' is used in the result string where encoding fails, and if the encoding result is the empty string, then the result is "?". The resulting string is suitable for displaying to a user, string-ordering comparisons, etc., but it is not suitable for re-creating a path (possibly modified) via string->path, since decoding and re-encoding the path's byte string may lose information. Furthermore, for display and sorting based on individual path elements (such as pathless file names), use path-element->string, instead, to avoid special encodings used to represent some relative paths. See section 20.2 for specific information about the conversion of Windows paths.
(path->bytes path) produces path's byte string representation. No information is lost in this translation, so that (bytes->path (path->bytes path) (path-convention-type path)) always produces a path that is equal? to path. The path argument can be a path for any platform. Conversion to and from byte values is useful for marshaling and unmarshaling paths, but manipulating the byte form of a path is generally a mistake. In particular, the byte string may start with a \\?\REL encoding for Windows paths or a ./~ encoding for Unix and Mac OS X paths. Instead of path->bytes, use split-path and path-element->bytes (described below) to manipulate individual path elements.
(string->path-element string) is like string->path, except that string corresponds to a single relative element in a path, and it is encoded as necessary to convert it to a path. See section 20.1 for more information on the conversion for Unix and Mac OS X paths, and see section 20.2 for more information on the conversion for Windows paths. If string does not correspond to any path element (e.g., it is an absolute path, or it can be split), or if it corresponds to an up-directory or same-directory indicator under Unix and Mac OS X, then the exn:fail:contract exception is raised. As for string->path, information can be lost from string in the locale-specific conversion to a path.
(bytes->path-element bytes [type-symbol]) is like bytes->path, except that bytes corresponds to a single relative element in a path. In terms of conversions and restrictions on bytes, bytes->path-element is like string->path-element. The bytes->path-element procedure is generally the best choice for reconstructing a path based on another path (where the other path is deconstructed with split-path and path-element->bytes) when ASCII-level manipulation of path elements is necessary.
(path-element->string path) is like path->string, except any encoding prefix is removed. See section 20.1 for more information on the conversion for Unix and Mac OS X paths, and see section 20.2 for more information on the conversion for Windows paths. In addition, trailing path separators are removed, as by split-path. The path argument must be such that split-path applied to path would return 'relative as its first result and a path as its second result, otherwise the exn:fail:contract exception is raised. The path-element->string procedure is generally the best choice for presenting a pathless file or directory name to a user.
(path-element->bytes path) is like path->bytes, except that any encoding prefix is removed, etc., as for path-element->string. For any reasonable locale, consecutive ASCII characters in the printed form of path are mapped to consecutive byte values that match each character's code-point value, and a leading or trailing ASCII character is mapped to a leading or trailing byte, respectively. The path argument can be a path for any platform. The path-element->bytes procedure is generally the right choice (in combination with split-path) for extracting the content of a path to manipulate it at the ASCII level (then reassembling the result with bytes->path-element and build-path).
(path-convention-type path) accepts a path value (not a string) and returns its convention type. The possible results are the same as the possible results of system-path-convention-type (see below).
(system-path-convention-type) returns the path convention type of the current platform: 'unix for Unix and Mac OS X, 'windows for Windows.
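
The following sketch illustrates several of the utilities above. The file names and the helper add-suffix are hypothetical. The helper assumes that the last element of its argument is an ordinary name (not an up-directory or same-directory indicator), and it follows the recommendation to work on individual elements with split-path, path-element->bytes, and bytes->path-element rather than on the result of path->bytes.

```scheme
(define p (string->path "notes.txt"))

(path? p)                    ; => #t
(path? "notes.txt")          ; => #f, a string is not a path value
(path-string? "notes.txt")   ; => #t
(path-string? "")            ; => #f, empty strings are not legal paths

;; Converting to bytes and back loses no information.
(equal? p (bytes->path (path->bytes p) (path-convention-type p)))

;; Hypothetical helper: append ASCII bytes to the last path element.
(define (add-suffix path suffix-bytes)
  (let-values ([(base name must-be-dir?) (split-path path)])
    (let ([new-name (bytes->path-element
                     (bytes-append (path-element->bytes name)
                                   suffix-bytes))])
      ;; base is a path, 'relative, or #f
      (if (path? base)
          (build-path base new-name)
          new-name))))

(define backup (add-suffix (build-path "src" "main.ss") #".bak"))
```

The round-trip comparison should produce #t, and backup should be equal? to (build-path "src" "main.ss.bak").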

(build-path base-path sub-path ···) creates a path given a base path and any number of sub-path extensions. If base-path is an absolute path, the result is an absolute path; if base-path is a relative path, the result is a relative path. Each sub-path must be either a relative path, a directory name, the symbol 'up (indicating the relative parent directory), or the symbol 'same (indicating the relative current directory). For Windows paths, if base-path is a drive specification (with or without a trailing slash) the first sub-path can be an absolute (driveless) path. For all platforms, the last sub-path can be a filename.

The base-path and sub-paths arguments can be paths for any platform. The platform for the resulting path is inferred from the base-path and sub-path arguments, where string arguments imply a path for the current platform. If different arguments are for different platforms, the exn:fail:contract exception is raised. If no argument implies a platform (i.e., all are 'up or 'same), the generated path is for the current platform.

Each sub-path and base-path can optionally end in a directory separator. If the last sub-path ends in a separator, it is included in the resulting path.

If base-path or sub-path is an illegal path string (because it is empty or contains a nul character), the exn:fail:contract exception is raised.

The build-path procedure builds a path without checking the validity of the path or accessing the filesystem.

See section 20.1 for more information on the construction of Unix and Mac OS X paths, and see section 20.2 for more information on the construction of Windows paths.

The following examples assume that the current directory is /home/joeuser for Unix examples and C:\Joe's Files for Windows examples.

(define p1 (build-path (current-directory) "src" "scheme")) 
  ; Unix: p1 => "/home/joeuser/src/scheme"
  ; Windows: p1 => "C:\Joe's Files\src\scheme"
(define p2 (build-path 'up 'up "docs" "MzScheme")) 
  ; Unix: p2 => "../../docs/MzScheme"
  ; Windows: p2 => "..\..\docs\MzScheme"
(build-path p2 p1) 
  ; Unix and Windows: raises exn:fail:contract because p1 is absolute 
(build-path p1 p2) 
  ; Unix: => "/home/joeuser/src/scheme/../../docs/MzScheme"
  ; Windows: => "C:\Joe's Files\src\scheme\..\..\docs\MzScheme"

(build-path/convention-type type-symbol base-path sub-path ···) is like build-path, except a path convention type is specified explicitly. The type-symbol argument must be a possible result of system-path-convention-type (see above) for some platform.
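The following sketch shows one way to work with a path for a platform other than the current one. It assumes that bytes->path accepts an optional convention-type argument, which is how a path for a foreign platform is obtained here; the directory names are hypothetical.

```scheme
;; A path built from strings uses the current platform's convention.
(define local (build-path "docs" "MzScheme"))
(define same-convention?
  (eq? (path-convention-type local) (system-path-convention-type)))

;; Assumption: bytes->path takes an optional convention type, so these
;; are Windows paths even when the code runs under Unix.
(define win-base (bytes->path #"C:\\Joe's Files" 'windows))
(define win-sub (bytes->path #"src" 'windows))

;; Combine them under an explicitly specified convention.
(define win-path (build-path/convention-type 'windows win-base win-sub))
(define win-type (path-convention-type win-path)) ; 'windows
```

String arguments are avoided for the Windows path because strings imply a path for the current platform.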
(absolute-path? path) returns #t if path is an absolute path, #f otherwise. The path argument can be a path for any platform. If path is not a legal path string (e.g., it contains a nul character), #f is returned. This procedure does not access the filesystem.
(relative-path? path) returns #t if path is a relative path, #f otherwise. The path argument can be a path for any platform. If path is not a legal path string (e.g., it contains a nul character), #f is returned. This procedure does not access the filesystem.
(complete-path? path) returns #t if path is a completely determined path (not relative to a directory or drive), #f otherwise. The path argument can be a path for any platform. Note that for Windows paths, an absolute path can omit the drive specification, in which case the path is neither relative nor complete. If path is not a legal path string (e.g., it contains a nul character), #f is returned. This procedure does not access the filesystem.
(path->complete-path path [base-path]) returns path as a complete path. If path is already a complete path, it is returned as the result. Otherwise, path is resolved with respect to the complete path base-path. If base-path is omitted, path is resolved with respect to the current directory. If base-path is provided and it is not a complete path, the exn:fail:contract exception is raised. The path and base-path arguments can be paths for any platform, as long as both are supplied; if they are for different platforms, the exn:fail:contract exception is raised. This procedure does not access the filesystem.
(path->directory-path path) returns path if path syntactically refers to a directory and ends in a separator, otherwise it returns an extended version of path that specifies a directory and ends with a separator. For example, under Unix and Mac OS X, the path x/y/ syntactically refers to a directory and ends in a separator, but x/y would be extended to x/y/, and x/.. would be extended to x/../. The path argument can be a path for any platform, and the result will be for the same platform. This procedure does not access the filesystem.
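A brief sketch of the predicates and conversions above, assuming Unix conventions; the paths are hypothetical and none of these calls touches the filesystem.

```scheme
;; Unix conventions assumed. For Windows paths, "/usr/bin" would be
;; absolute but not complete, because it lacks a drive specification.
(define abs? (absolute-path? "/usr/bin"))     ; #t
(define rel? (relative-path? "src/scheme"))   ; #t
(define comp? (complete-path? "src/scheme"))  ; #f

;; Resolve a relative path against an explicit complete base path.
(define completed (path->complete-path "src/scheme" "/home/joeuser"))

;; Force a syntactic directory path by adding a trailing separator.
(define as-dir (path->directory-path "x/y"))
```

Under the stated assumptions, completed prints as /home/joeuser/src/scheme and as-dir prints as x/y/.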
(resolve-path path) expands path and returns a path that references the same file or directory as path. Under Unix and Mac OS X, if path is a soft link to another path, then the referenced path is returned (this may be a relative path with respect to the directory owning path); otherwise, path is returned (after expansion).
(expand-path path) returns the expanded version of path (as described at the beginning of this section). The filesystem might be accessed, but the source or expanded path might be a non-existent path.
(simplify-path path [use-filesystem?]) eliminates redundant path separators (except for a single trailing separator), up-directory (``..''), and same-directory (``.'') indicators in path, and changes forward-slash separators to backslash separators in Windows paths, such that the result accesses the same file or directory (if it exists) as path. In general, the pathname is normalized as much as possible -- without consulting the filesystem if use-filesystem? is #f, and (under Windows) without changing the case of letters within the path. If path syntactically refers to a directory, the result ends with a directory separator.

When path is simplified and use-filesystem? is true (the default), a complete path is returned; if path is relative, it is resolved with respect to the current directory, and up-directory indicators are removed taking into account soft links (so that the resulting path refers to the same directory as before).

When use-filesystem? is #f, up-directory indicators are removed by deleting a preceding path element, and the result can be a relative path with up-directory indicators remaining at the beginning of the path or, for Unix and Mac OS X paths, after an initial path element that starts with tilde (``~''); otherwise, up-directory indicators are dropped when they refer to the parent of a root directory. Similarly, the result can be the same as (build-path 'same) (but with a trailing separator) if eliminating up-directory indicators leaves only same-directory indicators, and the result can start with a same-directory indicator for Unix and Mac OS X paths if eliminating it would make the result start with a tilde (``~'').

The path argument can be a path for any platform when use-filesystem? is #f, and the resulting path is for the same platform.

The filesystem might be accessed when use-filesystem? is true, but the source or expanded path might be a non-existent path. If path cannot be simplified due to a cycle of links, the exn:fail:filesystem exception is raised (but a successfully simplified path may still involve a cycle of links if the cycle did not inhibit the simplification).

See section 20.1 for more information on simplifying Unix and Mac OS X paths, and see section 20.2 for more information on simplifying Windows paths.
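As an illustration of purely syntactic simplification, the sketch below passes #f for use-filesystem? so that the results do not depend on the current directory or on soft links. The paths are hypothetical and Unix conventions are assumed.

```scheme
;; Redundant separators, "." and ".." are removed syntactically.
(define s1 (simplify-path "a//b/../c/./d" #f))

;; Leading up-directory indicators have nothing to cancel, so they stay.
(define s2 (simplify-path "../../docs/./MzScheme" #f))

(define s1-string (path->string s1)) ; "a/c/d"
(define s2-string (path->string s2)) ; "../../docs/MzScheme"
```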
(normal-case-path path) returns path with ``normalized'' case letters. For Unix and Mac OS X paths, this procedure always returns the input path, because filesystems for these platforms can be case-sensitive. For Windows paths, if path does not start \\?\, the resulting string uses only lowercase letters, based on the current locale. In addition, for Windows paths when the path does not start \\?\, all forward slashes (``/'') are converted to backward slashes (``\''), and trailing spaces and periods are removed. The path argument can be a path for any platform, but beware that locale-sensitive decoding and conversion of the path may be different on the current platform than for the path's platform. This procedure does not access the filesystem.
(split-path path) deconstructs path into a smaller path and an immediate directory or file name. Three values are returned (see section 2.2):
- base is either
  - a path,
  - 'relative if path is an immediate relative directory or filename, or
  - #f if path is a root directory.
- name is either
  - a directory-name path,
  - a filename,
  - 'up if the last part of path specifies the parent directory of the preceding path (e.g., ``..'' under Unix), or
  - 'same if the last part of path specifies the same directory as the preceding path (e.g., ``.'' under Unix).
- must-be-dir? is #t if path explicitly specifies a directory (e.g., with a trailing separator), #f otherwise. Note that must-be-dir? does not specify whether name is actually a directory or not, but whether path syntactically specifies a directory.
Compared to path, redundant separators (if any) are removed in the result base and name. If base is #f, then name cannot be 'up or 'same. The path argument can be a path for any platform, and the resulting paths are for the same platform. This procedure does not access the filesystem.

See section 20.1 for more information on splitting Unix and Mac OS X paths, and see section 20.2 for more information on splitting Windows paths.
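The three results of split-path are usually received with let-values or define-values. The sketch below, assuming Unix conventions and a hypothetical path, splits one path and then uses a helper named path-elements (not part of MzScheme) that applies split-path repeatedly to list every element.

```scheme
;; One split: base is /home/joeuser/, name is src, must-be-dir? is #t.
(define-values (base name must-be-dir?)
  (split-path "/home/joeuser/src/"))

;; Hypothetical helper: split repeatedly until base is no longer a path
;; (that is, until base is 'relative or #f), collecting names in order.
(define (path-elements p)
  (let loop ([p p] [acc '()])
    (let-values ([(b n dir?) (split-path p)])
      (if (path? b)
          (loop b (cons n acc))
          (cons n acc)))))

(define elements (path-elements "/home/joeuser/src/"))
;; elements holds the paths /, home, joeuser and src, in that order
```

An element can also be 'up or 'same when the path contains up-directory or same-directory indicators, so callers that expect only paths should check for those symbols.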

(path-replace-suffix path string-or-bytes) returns a path that is the same as path, except that the suffix for the last element of the path is changed to string-or-bytes. If the last element of path has no suffix, then string-or-bytes is added to the path. A suffix is defined as a period followed by any number of non-period characters/bytes at the end of the path element. The path argument can be a path for any platform, and the result is for the same platform. If path represents a root, the exn:fail:contract exception is raised.
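For example, the following sketch replaces an existing suffix and adds a suffix to an element that has none. The file names are hypothetical and Unix conventions are assumed.

```scheme
;; The existing suffix ".ss" is replaced.
(define compiled (path-replace-suffix "collects/mzlib/list.ss" #".zo"))

;; With no suffix present, the new suffix is simply appended.
(define with-suffix (path-replace-suffix "README" ".txt"))

(define compiled-string (path->string compiled))       ; "collects/mzlib/list.zo"
(define with-suffix-string (path->string with-suffix)) ; "README.txt"
```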

11.3.2 Locating Paths

The find-system-path and find-executable-path procedures locate useful files and directories:

(find-system-path kind-symbol) returns a machine-specific path for a standard type of path specified by kind-symbol, which must be one of the following:
- 'home-dir -- the current user's home directory.
  
  Under Unix and Mac OS X, this directory is determined by expanding the path ~, which is expanded by first checking for a HOME environment variable. If none is defined, the USER and LOGNAME environment variables are consulted (in that order) to find a user name, and then system files are consulted to locate the user's home directory.
  
  Under Windows, the user's home directory is the user-specific profile directory as determined by the Windows registry. If the registry cannot provide a directory for some reason, the value of the USERPROFILE environment variable is used instead, as long as it refers to a directory that exists. If USERPROFILE also fails, the directory is the one specified by the HOMEDRIVE and HOMEPATH environment variables. If those environment variables are not defined, or if the indicated directory still does not exist, the directory containing the MzScheme executable is used as the home directory.
- 'pref-dir -- the standard directory for storing the current user's preferences. Under Unix, the directory is .plt-scheme in the user's home directory. Under Windows, it is PLT Scheme in the user's application-data folder as specified by the Windows registry; the application-data folder is usually Application Data in the user's profile directory. Under Mac OS X, it is Library/Preferences in the user's home directory. This directory might not exist.
- 'pref-file -- a file that contains a symbol-keyed association list of preference values. The file's directory path always matches the result returned for 'pref-dir. The file name is plt-prefs.ss under Unix and Windows, and it is org.plt-scheme.prefs.ss under Mac OS X. The file's directory might not exist. See also get-preference in Chapter 20 in PLT MzLib: Libraries Manual.
- 'temp-dir -- the standard directory for storing temporary files. Under Unix and Mac OS X, this is the directory specified by the TMPDIR environment variable, if it is defined.
- 'init-dir -- the directory containing the initialization file used by the stand-alone MzScheme application. It is the same as the current user's home directory.
- 'init-file -- the file loaded at start-up by the stand-alone MzScheme application. The directory part of the path is the same path as returned for 'init-dir. The file name is platform-specific:
  - Unix and Mac OS X: .mzschemerc
  - Windows: mzschemerc.ss
- 'addon-dir -- a directory for installing PLT Scheme extensions. It's the same as 'pref-dir, except under Mac OS X, where it's Library/PLT Scheme in the user's home directory. This directory might not exist.
- 'doc-dir -- the standard directory for storing the current user's documents. It's the same as 'home-dir under Unix and Mac OS X. Under Windows, it is the user's documents folder as specified by the Windows registry; the documents folder is usually My Documents in the user's home directory.
- 'desk-dir -- the directory for the current user's desktop. Under Unix, it's the same as 'home-dir. Under Windows, it is the user's desktop folder as specified by the Windows registry; the desktop folder is usually Desktop in the user's home directory. Under Mac OS X, it is the desktop directory, which is specifically ~/Desktop under Mac OS X.
- 'sys-dir -- the directory containing the operating system for Windows. Under Unix and Mac OS X, the result is "/".
- 'exec-file -- the path of the MzScheme executable as provided by the operating system for the current invocation.³⁸
- 'run-file -- the path of the current executable; this may be different from the result for 'exec-file because an alternate path was provided through a --name or -N command-line flag to stand-alone MzScheme (or MrEd), or because an embedding executable installed an alternate path. In particular a ``launcher'' script created by make-mzscheme-launcher sets this path to the script's path. In the stand-alone MzScheme application, this path is also bound initially to program.
- 'collects-dir -- a path to the main collection of libraries (see section 16). If this path is relative, it's relative to the directory of (find-system-path 'exec-file). This path is normally embedded in a stand-alone MzScheme executable, but it can be overridden by the --collects or -X command-line flag.
- 'orig-dir -- the current directory at start-up, which can be useful in converting a relative-path result from (find-system-path 'exec-file) or (find-system-path 'run-file) to a complete path.
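A short sketch of how these kinds might be queried. The helper show-system-paths is hypothetical, and the printed paths depend entirely on the machine and user.

```scheme
;; Hypothetical helper: print a few standard locations.
(define (show-system-paths)
  (for-each
   (lambda (kind)
     (printf "~a: ~a\n" kind (path->string (find-system-path kind))))
   '(home-dir pref-dir temp-dir init-file orig-dir)))

;; As noted for 'orig-dir, a possibly relative 'exec-file result can be
;; completed against the start-up directory.
(define exec-complete
  (path->complete-path (find-system-path 'exec-file)
                       (find-system-path 'orig-dir)))

(show-system-paths)
```

The completion step only makes the path complete; it does not search PATH, so it is meaningful mainly when the executable was started with an explicit relative path.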
(path-list-string->path-list string default-path-list) parses a string or byte string containing a list of paths, and returns a list of path strings. Under Unix and Mac OS X, paths in a path list are separated by a colon (``:''); under Windows, paths are separated by a semi-colon (``;''). Whenever the path list contains an empty path, the list default-path-list is spliced into the returned list of paths. Parts of string that do not form a valid path are not included in the returned list.
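As an illustration, the sketch below parses a Unix-style path list in which the empty entry between the two adjacent colons is replaced by the default list. The directories are hypothetical, and the colon separator assumes Unix or Mac OS X.

```scheme
;; The empty path between "::" is where default-path-list is spliced in.
(define search-path
  (path-list-string->path-list
   "/bin:/usr/bin::/opt/bin"
   (list (string->path "/usr/local/bin"))))

(define search-strings (map path->string search-path))
;; search-strings: ("/bin" "/usr/bin" "/usr/local/bin" "/opt/bin")
```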
(find-executable-path program-sub-path related-sub-path [deepest?]) finds a path for the executable program-sub-path, returning #f if the path cannot be found.

If related-sub-path is not #f, then it must be a relative path string, and the path found for program-sub-path must be such that the file or directory related-sub-path exists in the same directory as the executable. The result is then the full path for the found related-sub-path, instead of the path for the executable.

This procedure is used by MzScheme (as a stand-alone executable) to find the standard library collection directory (see Chapter 16). In this case, program-sub-path is the name used to start MzScheme and related-sub-path is "collects". The related-sub-path argument is used because, under Unix and Mac OS X, program-sub-path may involve a sequence of soft links; in this case, related-sub-path determines which link in the chain is relevant.

If related-sub-path is not #f, then when find-executable-path does not find a program-sub-path that is a link to another file path, the search can continue with the destination of the link. Further links are inspected until related-sub-path is found or the end of the chain of links is reached. If deepest? is #f (the default), then the result corresponds to the first path in a chain of links for which related-sub-path is found (and further links are not actually explored); otherwise, the result corresponds to the last link in the chain for which related-sub-path is found.

If program-sub-path is a pathless name, find-executable-path gets the value of the PATH environment variable; if this environment variable is defined, find-executable-path tries each path in PATH as a prefix for program-sub-path using the search algorithm described above for path-containing program-sub-paths. If the PATH environment variable is not defined, program-sub-path is prefixed with the current directory and used in the search algorithm above. (Under Windows, the current directory is always implicitly the first item in PATH, so find-executable-path checks the current directory first under Windows.)
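A minimal sketch of a PATH-based lookup. The program name "ls" is only an example that is likely to exist under Unix, and #f is passed for related-sub-path so that the executable's own path is returned.

```scheme
;; Look up a pathless program name through the PATH environment variable.
(define ls-path (find-executable-path "ls" #f))

;; The result is #f when nothing is found, so test it before use.
(if ls-path
    (printf "found: ~a\n" (path->string ls-path))
    (printf "ls was not found in PATH\n"))
```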

11.3.3 Files

The file management utilities are:

(file-exists? path) returns #t if a file (not a directory) path exists, #f otherwise.³⁹
(link-exists? path) returns #t if a link path exists (Unix and Mac OS X), #f otherwise. Note that the predicates file-exists? and directory-exists? work on the final destination of a link or series of links, while link-exists? only follows links to resolve the base part of path (i.e., everything except the last name in the path). This procedure never raises the exn:fail:filesystem exception.
(delete-file path) deletes the file with path path if it exists, returning void if a file was deleted successfully, otherwise the exn:fail:filesystem exception is raised. If path is a link, the link is deleted rather than the destination of the link.
(rename-file-or-directory old-path new-path [exists-ok?]) renames the file or directory with path old-path -- if it exists -- to the path new-path. If the file or directory is renamed successfully, void is returned, otherwise the exn:fail:filesystem exception is raised.

This procedure can be used to move a file/directory to a different directory (on the same disk) as well as rename a file/directory within a directory. Unless exists-ok? is provided as a true value, new-path cannot refer to an existing file or directory. Even if exists-ok? is true, new-path cannot refer to an existing file when old-path is a directory, and vice versa. (If new-path exists and is replaced, the replacement is atomic in the filesystem, except under Windows 95, 98, or Me. However, the check for existence is not included in the atomic action, which means that race conditions are possible when exists-ok? is false or not supplied.)

If old-path is a link, the link is renamed rather than the destination of the link, and it counts as a file for replacing any existing new-path.
(file-or-directory-modify-seconds path [secs-n fail-thunk]) returns the file or directory's last modification date as platform-specific seconds (see also section 15.1) when secs-n is not provided or is #f.⁴⁰ If secs-n is provided and not #f, the access and modification times of path are set to the given time. On error (e.g., if no such file exists), fail-thunk is called if it is provided, otherwise the exn:fail:filesystem exception is raised.
(file-or-directory-permissions path) returns a list containing 'read, 'write, and/or 'execute for the given file or directory path. On error (e.g., if no such file exists), the exn:fail:filesystem exception is raised. Under Unix and Mac OS X, permissions are checked for the current effective user instead of the real user.
(file-size path) returns the (logical) size of the specified file in bytes. (Under Mac OS X, this size excludes the resource-fork size.) On error (e.g., if no such file exists), the exn:fail:filesystem exception is raised.
(copy-file src-path dest-path) creates the file dest-path as a copy of src-path. If the file is successfully copied, void is returned, otherwise the exn:fail:filesystem exception is raised. If dest-path already exists, the copy will fail. File permissions are preserved in the copy. Under Mac OS X, the resource fork is also preserved in the copy. If src-path refers to a link, the target of the link is copied, rather than the link itself.
(make-file-or-directory-link to-path path) creates a link path to to-path under Unix and Mac OS X. The creation will fail if path already exists. The to-path need not refer to an existing file or directory, and to-path is not expanded before writing the link. If the link is created successfully, void is returned, otherwise the exn:fail:filesystem exception is raised. Under Windows, the exn:fail:unsupported exception is raised always.
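The sketch below exercises several of these procedures on a scratch file in the temporary directory. The file names are hypothetical and are made unique with current-milliseconds, on the assumption that no other process creates the same names at the same moment.

```scheme
;; Hypothetical scratch-file names in the standard temporary directory.
(define tmp (find-system-path 'temp-dir))
(define tag (number->string (current-milliseconds)))
(define src (build-path tmp (string-append "mz-example-" tag ".txt")))
(define copy (build-path tmp (string-append "mz-example-" tag "-copy.txt")))
(define moved (build-path tmp (string-append "mz-example-" tag "-moved.txt")))

;; Create a five-byte file; open-output-file fails if src already exists.
(let ([out (open-output-file src)])
  (display "hello" out)
  (close-output-port out))

(define size (file-size src))          ; 5
(copy-file src copy)                   ; fails if copy already exists
(rename-file-or-directory copy moved)  ; moved must not exist yet
(define copy-gone? (not (file-exists? copy)))
(define moved-there? (file-exists? moved))
(define mtime (file-or-directory-modify-seconds moved))
(define perms (file-or-directory-permissions moved))

;; Clean up both scratch files.
(delete-file src)
(delete-file moved)
```

A more defensive version would wrap the calls in a handler for exn:fail:filesystem so that the scratch files are removed even when one step fails.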

11.3.4 Directories

The directory management utilities are:

(current-directory) returns the current directory and (current-directory path) sets the current directory to path. This procedure is actually a parameter, as described in section 7.9.1.1.
(current-drive) returns the current drive name under Windows. For other platforms, the exn:fail:unsupported exception is raised. The current drive is always the drive of the current directory.
(directory-exists? path) returns #t if path refers to a directory, #f otherwise.
(make-directory path) creates a new directory with the path path. If the directory is created successfully, void is returned, otherwise the exn:fail:filesystem exception is raised.
(delete-directory path) deletes an existing directory with the path path. If the directory is deleted successfully, void is returned, otherwise the exn:fail:filesystem exception is raised.
(rename-file-or-directory old-path new-path exists-ok?), as described in the previous section, renames directories.
(file-or-directory-modify-seconds path), as described in the previous section, gets directory dates.
(file-or-directory-permissions path), as described in the previous section, gets directory permissions.
(directory-list [path]) returns a list of all files and directories in the directory specified by path. If path is omitted, a list of files and directories in the current directory is returned. Under Unix and Mac OS X, an element of the list can start with period-slash-tilde (``./~'') if it would otherwise start with tilde (``~''). Under Windows, an element of the list may start with \\?\REL\\.
(filesystem-root-list) returns a list of all current root directories. Obtaining this list can be particularly slow under Windows.
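A minimal sketch of creating, listing, and removing a directory. The directory name is hypothetical and made unique with current-milliseconds; the file is deleted first on the assumption that the platform refuses to delete a non-empty directory.

```scheme
;; Hypothetical scratch directory under the standard temporary directory.
(define dir
  (build-path (find-system-path 'temp-dir)
              (string-append "mz-example-dir-"
                             (number->string (current-milliseconds)))))
(make-directory dir)

;; Put one file in it so that directory-list has something to report.
(define file (build-path dir "a.txt"))
(let ([out (open-output-file file)])
  (display "x" out)
  (close-output-port out))

(define is-dir? (directory-exists? dir))
(define entries (directory-list dir)) ; one element, for a.txt

;; Empty the directory before deleting it.
(delete-file file)
(delete-directory dir)
(define gone? (not (directory-exists? dir)))
```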

11.4 Networking

MzScheme supports networking with the TCP and UDP protocols.

11.4.1 TCP

For information about TCP in general, see TCP/IP Illustrated, Volume 1 by W. Richard Stevens.

(tcp-listen port-k [max-allow-wait-k reuse? hostname-string-or-false]) creates a ``listening'' server on the local machine at the specified port number (where port-k is an exact integer between 1 and 65535 inclusive). The max-allow-wait-k argument determines the maximum number of client connections that can be waiting for acceptance. (When max-allow-wait-k clients are waiting for acceptance, no new client connections can be made.) The default value for the max-allow-wait-k argument is 4.

If the reuse? argument is true, then tcp-listen will create a listener even if the port is involved in a TIME_WAIT state. Such a use of reuse? defeats certain guarantees of the TCP protocol; see Stevens's book for details. Furthermore, on many modern platforms, a true value for reuse? overrides TIME_WAIT only if the listener was previously created with a true value for reuse?. The default for reuse? is #f.

If hostname-string-or-false is #f (the default), then the listener accepts connections to all of the listening machine's addresses.⁴¹ Otherwise, the listener accepts connections only at the interface(s) associated with the given hostname. For example, providing "127.0.0.1" as hostname-string-or-false creates a listener that accepts only connections to "127.0.0.1" (the loopback interface) from the local machine.

The return value of tcp-listen is a TCP listener value. This value can be used in future calls to tcp-accept, tcp-accept-ready?, and tcp-close. Each new TCP listener value is placed into the management of the current custodian (see section 9.2).

If the server cannot be started by tcp-listen, the exn:fail:network exception is raised.
(tcp-connect hostname-string port-k [local-hostname-string-or-false local-port-k-or-false]) attempts to connect as a client to a listening server. The hostname-string argument is the server host's Internet address name ⁴² (e.g., "www.plt-scheme.org"), and port-k (an exact integer between 1 and 65535) is the port where the server is listening.

The optional local-hostname-string-or-false and local-port-k-or-false specify the client's address and port. If both are #f (the default), the client's address and port are selected automatically. If local-hostname-string-or-false is not #f, then local-port-k-or-false must be non-#f. If local-port-k-or-false is non-#f and local-hostname-string-or-false is #f, then the given port is used but the address is selected automatically.

Two values (see section 2.2) are returned by tcp-connect: an input port and an output port. Data can be received from the server through the input port and sent to the server through the output port. If the server is a MzScheme process, it can obtain ports to communicate to the client with tcp-accept. These ports are placed into the management of the current custodian (see section 9.2).

Initially, both the returned input port and the returned output port are block-buffered. Change the buffer mode using file-stream-buffer-mode (see section 11.1.6).

Both of the returned ports must be closed to terminate the TCP connection. When both ports are still open, closing the output port with close-output-port sends a TCP close to the server (which is seen as an end-of-file if the server reads the connection through a port). In contrast, tcp-abandon-port (see below) closes the output port, but does not send a TCP close until the input port is also closed.

Note that the TCP protocol does not support a state where one end is willing to send but not read, nor does it include an automatic message when one end of a connection is fully closed. Instead, the other end of a connection discovers that one end is fully closed only as a response to sending data; in particular, some number of writes on the still-open end may appear to succeed, though writes will eventually produce an error.

If a connection cannot be established by tcp-connect, the exn:fail:network exception is raised.
(tcp-connect/enable-break hostname-string port-k [local-hostname-string local-port-k]) is like tcp-connect, but breaking is enabled (see section 6.7) while trying to connect. If breaking is disabled when tcp-connect/enable-break is called, then either ports are returned or the exn:break exception is raised, but not both.
(tcp-accept tcp-listener) accepts a client connection for the server associated with tcp-listener. The tcp-listener argument is a TCP listener value returned by tcp-listen. If no client connection is waiting on the listening port, the call to tcp-accept will block. (See also tcp-accept-ready?, below.)

Two values (see section 2.2) are returned by tcp-accept: an input port and an output port. Data can be received from the client through the input port and sent to the client through the output port. These ports are placed into the management of the current custodian (see section 9.2).

In terms of buffering and connection states, the ports act the same as ports from tcp-connect.

If a connection cannot be accepted by tcp-accept, or if the listener has been closed, the exn:fail:network exception is raised.
(tcp-accept-ready? tcp-listener) tests whether an unaccepted client has connected to the server associated with tcp-listener. The tcp-listener argument is a TCP listener value returned by tcp-listen. If a client is waiting, the return value is #t, otherwise it is #f. A client is accepted with the tcp-accept procedure, which returns ports for communicating with the client and removes the client from the list of unaccepted clients.

If the listener has been closed, the exn:fail:network exception is raised.
(tcp-accept/enable-break tcp-listener) is like tcp-accept, but breaking is enabled (see section 6.7) while trying to accept a connection. If breaking is disabled when tcp-accept/enable-break is called, then either ports are returned or the exn:break exception is raised, but not both.
(tcp-close tcp-listener) shuts down the server associated with tcp-listener. The tcp-listener argument is a TCP listener value returned by tcp-listen. All unaccepted clients receive an end-of-file from the server; connections to accepted clients are unaffected.

If the listener has already been closed, the exn:fail:network exception is raised.

The listener's port number may not become immediately available for new listeners (with the default reuse? argument of tcp-listen). For further information, see Stevens's explanation of the TIME_WAIT TCP state.
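The procedures described so far can be combined into a one-shot echo exchange within a single MzScheme process, as in the sketch below. The port number 40372 is an arbitrary assumption and must be free on the local machine; the server runs in its own thread so that the client can connect from the main thread.

```scheme
;; Hypothetical port number; any free port between 1 and 65535 will do.
(define port-no 40372)

;; Listen on the loopback interface only, allowing address reuse.
(define listener (tcp-listen port-no 4 #t "127.0.0.1"))

;; Server thread: accept one client, echo one line, close both ports.
(define server
  (thread
   (lambda ()
     (let-values ([(in out) (tcp-accept listener)])
       (let ([line (read-line in)])
         (fprintf out "echo: ~a\n" line)
         (close-output-port out)  ; flushes and sends a TCP close
         (close-input-port in))))))

;; Client: send one line, flush the block-buffered port, read the reply.
(define reply
  (let-values ([(in out) (tcp-connect "127.0.0.1" port-no)])
    (fprintf out "hello\n")
    (flush-output out)
    (let ([r (read-line in)])
      (close-input-port in)
      (close-output-port out)
      r)))

(thread-wait server)
(tcp-close listener)
;; reply is "echo: hello"
```

The explicit flush-output matters because the ports start out block-buffered; without it the server would wait for data that is still sitting in the client's buffer.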
(tcp-listener? v) returns #t if v is a TCP listener value created by tcp-listen, #f otherwise.
(tcp-accept-evt tcp-listener) returns a synchronizable event (see section 7.7) that is in a blocking state when tcp-accept on tcp-listener would block. If the event is chosen in a synchronization, the result is a list of two items, which correspond to the two results of tcp-accept. (If the event is not chosen, no connections are accepted.)
(tcp-abandon-port tcp-port) is like close-output-port or close-input-port (depending on whether tcp-port is an input or output port), but if tcp-port is an output port and its associated input port is not yet closed, then the other end of the TCP connection does not receive a TCP close message until the input port is also closed.⁴³
(tcp-addresses tcp-port [port-numbers?]) returns two strings when port-numbers? is #f (the default). The first string is the Internet address for the local machine as viewed by the given TCP port's connection.⁴⁴ The second string is the Internet address for the other end of the connection.

If port-numbers? is true, then four results are returned: a string for the local machine's address, an exact integer between 1 and 65535 for the local machine's port number, a string for the remote machine's address, and an exact integer between 1 and 65535 for the remote machine's port number.

If the given port has been closed, the exn:fail:network exception is raised.
(tcp-port? v) returns #t if v is a port returned by tcp-accept, tcp-connect, tcp-accept/enable-break, or tcp-connect/enable-break, #f otherwise.
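As a sketch of non-blocking acceptance, the following polls a listener with tcp-accept-evt and then inspects the connection with tcp-addresses. It assumes that sync/timeout (see section 7.7) is used for the synchronization; the port number 40374 is an arbitrary assumption and must be free.

```scheme
;; Hypothetical port number on the loopback interface.
(define port-no 40374)
(define listener (tcp-listen port-no 4 #t "127.0.0.1"))

;; No client has connected yet, so a zero-timeout poll produces #f.
(define before (sync/timeout 0 (tcp-accept-evt listener)))

;; Connect a client, then wait up to five seconds for the accept event.
(define-values (c-in c-out) (tcp-connect "127.0.0.1" port-no))
(define accepted (sync/timeout 5 (tcp-accept-evt listener)))
;; accepted is a list of an input port and an output port, or #f

;; Ask for addresses and port numbers as seen from the client's side.
(define-values (local-host local-port remote-host remote-port)
  (tcp-addresses c-in #t))

;; Close everything that was opened.
(when accepted
  (close-input-port (car accepted))
  (close-output-port (cadr accepted)))
(close-input-port c-in)
(close-output-port c-out)
(tcp-close listener)
```

From the client's side the remote port is the listener's port, while the local port is whatever the system selected automatically.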

11.4.2 UDP

For information about UDP in general, see TCP/IP Illustrated, Volume 1 by W. Richard Stevens (which discusses UDP in addition to TCP).

(udp-open-socket [family-hostname-string-or-false family-port-k-or-false]) creates and returns a UDP socket to send and receive datagrams (broadcasting is allowed). Initially, the socket is not bound or connected to any address or port.

If family-hostname-string-or-false or family-port-k-or-false is provided and not #f, then the socket's protocol family is determined from these arguments. The socket is not bound to the hostname or port number. For example, the arguments might be the hostname and port to which messages will be sent through the socket, which ensures that the socket's protocol family is consistent with the destination. Alternately, the arguments might be the same as for a future call to udp-bind!, which ensures that the socket's protocol family is consistent with the binding. If neither family-hostname-string-or-false nor family-port-k-or-false is provided as non-#f, then the socket's protocol family is IPv4.
(udp-bind! udp-socket hostname-string-or-false port-k) binds an unbound udp-socket to the local port number port-k (an exact integer between 1 and 65535). The result is always void.

If hostname-string-or-false is #f, then the socket accepts connections to all of the listening machine's IP addresses at port-k. Otherwise, the socket accepts connections only at the IP address associated with the given name. For example, providing "127.0.0.1" as hostname-string-or-false typically creates a listener that accepts only connections to "127.0.0.1" from the local machine.

A socket cannot receive datagrams until it is bound to a local address and port. If a socket is not bound before it is used with a sending procedure (udp-send, udp-send-to, etc.), the sending procedure binds the socket to a random local port. Similarly, if an event from udp-send-evt or udp-send-to-evt is chosen for a synchronization (see section 7.7), the socket is bound; if the event is not chosen, the socket may or may not become bound. The binding of a bound socket cannot be changed.

If udp-socket is already bound or closed, the exn:fail:network exception is raised.
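
As a minimal sketch of the open-then-bind sequence described above, the following creates a socket, binds it to the loopback address, and inspects its state with udp-bound?. The port number 40000 and the names example-port, s, bound-before?, and bound-after? are assumptions chosen for illustration; any free port would do.

```scheme
;; Hypothetical port number; assumed to be free on this machine.
(define example-port 40000)

;; Passing the future udp-bind! arguments keeps the protocol family
;; consistent with the binding. The socket is not bound yet.
(define s (udp-open-socket "127.0.0.1" example-port))
(define bound-before? (udp-bound? s))   ; #f: a new socket is unbound

(udp-bind! s "127.0.0.1" example-port)
(define bound-after? (udp-bound? s))    ; #t: the binding is now fixed

;; A second udp-bind! on s would raise exn:fail:network.
(udp-close s)
```
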
(udp-connect! udp-socket hostname-string-or-false port-k-or-false) connects the socket to the indicated remote address and port if hostname-string-or-false is a string and port-k-or-false is an exact integer between 1 and 65535. The result is always void.

If hostname-string-or-false is #f, then port-k-or-false also must be #f, and the socket is disconnected (if connected). If one of hostname-string-or-false or port-k-or-false is #f and the other is not, the exn:fail:contract exception is raised.

A connected socket can be used with udp-send (not udp-send-to), and it accepts datagrams only from the connected address and port. A socket need not be connected to receive datagrams. A socket can be connected, re-connected, and disconnected any number of times.

If udp-socket is closed, the exn:fail:network exception is raised.
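
The connect and disconnect cycle can be sketched as follows. The target port 40001 and the names s, connected?, and disconnected? are hypothetical; no datagram is sent here, so nothing needs to be listening at the target.

```scheme
;; With no arguments, the socket's protocol family is IPv4.
(define s (udp-open-socket))

;; Connect to a hypothetical remote address and port.
(udp-connect! s "127.0.0.1" 40001)
(define connected? (udp-connected? s))        ; #t

;; While connected, udp-send (not udp-send-to) is the sending procedure.

;; Passing #f for both arguments disconnects the socket.
(udp-connect! s #f #f)
(define disconnected? (not (udp-connected? s))) ; #t

(udp-close s)
```
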
(udp-send-to udp-socket hostname-address port-k bytes [start-k end-k]) sends (subbytes bytes start-k end-k) as a datagram from the unconnected udp-socket to the socket at the remote machine hostname-address on the port port-k. The udp-socket need not be bound or connected; if it is not bound, udp-send-to binds it to a random local port. If the socket's outgoing datagram queue is too full to support the send, udp-send-to blocks until the datagram can be queued. The result is always void.

The optional start-k argument defaults to 0, and the optional end-k argument defaults to the length of bytes. If start-k is greater than the length of bytes, or if end-k is less than start-k or greater than the length of bytes, the exn:fail:contract exception is raised.

If udp-socket is closed or connected, the exn:fail:network exception is raised.

(udp-send udp-socket bytes [start-k end-k]) is like udp-send-to, except that udp-socket must be connected, and the datagram goes to the connection target. If udp-socket is closed or unconnected, the exn:fail:network exception is raised.

(udp-send-to* udp-socket hostname-address port-k bytes [start-k end-k]) is like udp-send-to, except that it never blocks; if the socket's outgoing queue is too full to support the send, #f is returned, otherwise the datagram is queued and the result is #t.

(udp-send* udp-socket bytes [start-k end-k]) is like udp-send, except that (like udp-send-to*) it never blocks and returns #f or #t.

(udp-send-to/enable-break udp-socket hostname-address port-k bytes [start-k end-k]) is like udp-send-to, but breaking is enabled (see section 6.7) while trying to send the datagram. If breaking is disabled when udp-send-to/enable-break is called, then either the datagram is sent or the exn:break exception is raised, but not both.

(udp-send/enable-break udp-socket bytes [start-k end-k]) is like udp-send, except that breaks are enabled like udp-send-to/enable-break.
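
As a sketch of how the non-blocking variants might be used, the following attempts a single send with udp-send-to* and records whether the datagram was queued. The destination port 40002 and the names s and queued? are assumptions for illustration; what to do when the result is #f (retry, drop, or wait on udp-send-ready-evt) is left to the caller.

```scheme
(define s (udp-open-socket))

;; Never blocks: #t if the datagram was queued, #f if the socket's
;; outgoing queue was too full. The unbound socket is bound to a
;; random local port by the send.
(define queued? (udp-send-to* s "127.0.0.1" 40002 #"ping"))

(udp-close s)
```
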
(udp-receive! udp-socket mutable-bytes [start-k end-k]) accepts up to end-k - start-k bytes of udp-socket's next incoming datagram into mutable-bytes, writing the datagram bytes starting at position start-k within mutable-bytes. The udp-socket must be bound to a local address and port (but need not be connected). If no incoming datagram is immediately available, udp-receive! blocks until one is available.

Three values are returned: an exact integer that indicates the number of received bytes (between 0 and end-k - start-k), a hostname string indicating the source address of the datagram, and an exact integer between 1 and 65535 indicating the source port of the datagram. If the received datagram is longer than end-k - start-k bytes, the remainder is discarded.

The optional start-k argument defaults to 0, and the optional end-k argument defaults to the length of mutable-bytes. If start-k is greater than the length of mutable-bytes, or if end-k is less than start-k or greater than the length of mutable-bytes, the exn:fail:contract exception is raised.
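
A loopback round trip illustrates the three results of udp-receive!. This is a sketch under stated assumptions: port 40003 is assumed to be free, and the names receiver, sender, buffer, count, source-host, source-port, and message are hypothetical.

```scheme
(define example-port 40003)  ; hypothetical free port

(define receiver (udp-open-socket "127.0.0.1" example-port))
(define sender (udp-open-socket "127.0.0.1" example-port))

;; The receiver must be bound before it can receive.
(udp-bind! receiver "127.0.0.1" example-port)

;; The unbound sender is bound to a random local port by udp-send-to.
(udp-send-to sender "127.0.0.1" example-port #"hello")

;; Room for up to 16 bytes; any excess in the datagram is discarded.
(define buffer (make-bytes 16))
(define-values (count source-host source-port)
  (udp-receive! receiver buffer))

;; Only the first count bytes of buffer hold datagram data.
(define message (subbytes buffer 0 count))

(udp-close sender)
(udp-close receiver)
```

On a loopback interface, message is normally #"hello", and source-port is the random local port to which the sender was bound.
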
(udp-receive!* udp-socket mutable-bytes [start-k end-k]) is like udp-receive!, except that it never blocks. If no datagram is available, the three result values are all #f.

(udp-receive!/enable-break udp-socket mutable-bytes [start-k end-k]) is like udp-receive!, but breaking is enabled (see section 6.7) while trying to receive the datagram. If breaking is disabled when udp-receive!/enable-break is called, then either a datagram is received or the exn:break exception is raised, but not both.

(udp-close udp-socket) closes udp-socket, discarding unreceived datagrams. If the socket is already closed, the exn:fail:network exception is raised.

(udp? v) returns #t if v is a socket created by udp-open-socket, #f otherwise.

(udp-bound? udp-socket) returns #t if udp-socket is bound to a local address and port, #f otherwise.

(udp-connected? udp-socket) returns #t if udp-socket is connected to a remote address and port, #f otherwise.

(udp-send-ready-evt udp-socket) returns a synchronizable event (see section 7.7) that is in a blocking state when udp-send-to on udp-socket would block.

(udp-receive-ready-evt udp-socket) returns a synchronizable event (see section 7.7) that is in a blocking state when udp-receive! on udp-socket would block.

(udp-send-to-evt udp-socket hostname-address port-k bytes [start-k end-k]) returns a synchronizable event. The event is in a blocking state when udp-send on udp-socket would block. Otherwise, if the event is chosen in a synchronization, data is sent as for (udp-send-to udp-socket hostname-address port-k bytes start-k end-k), and the synchronization result is void. (No bytes are sent if the event is not chosen.)

(udp-send-evt udp-socket bytes [start-k end-k]) is like udp-send-to-evt, except that udp-socket must be connected when the event is synchronized, and if the event is chosen in a synchronization, the datagram goes to the connection target. If udp-socket is closed or unconnected, the exn:fail:network exception is raised during a synchronization attempt.

(udp-receive!-evt udp-socket bytes [start-k end-k]) returns a synchronizable event. The event is in a blocking state when udp-receive! on udp-socket would block. Otherwise, if the event is chosen in a synchronization, data is received into bytes as for (udp-receive! udp-socket bytes start-k end-k), and the synchronization result is a list of three values, corresponding to the three results from udp-receive!. (No bytes are received and the bytes content is not modified if the event is not chosen.)
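
The event procedures make it possible to wait for a datagram with a timeout. The sketch below assumes that sync/timeout (see section 7.7) takes a timeout in seconds followed by events and returns #f when the timeout elapses; port 40004 and the names receiver, buffer, result, and outcome are hypothetical.

```scheme
(define example-port 40004)  ; hypothetical free port

(define receiver (udp-open-socket "127.0.0.1" example-port))
(udp-bind! receiver "127.0.0.1" example-port)

(define buffer (make-bytes 16))

;; Wait at most one second. If the event is chosen, result is a list
;; of three values as from udp-receive!; on timeout it is #f and
;; buffer is left unmodified.
(define result
  (sync/timeout 1 (udp-receive!-evt receiver buffer)))

(define outcome
  (if result
      (subbytes buffer 0 (car result))  ; the received bytes
      'timed-out))

(udp-close receiver)
```

Because nothing is sent to the socket in this sketch, outcome is normally the symbol timed-out.
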

³⁰ 63 is the same as (char->integer #\?).

³¹ Flushing is performed by the default port read handler (see section 11.2.6) rather than by read itself.

³² This non-byte result is not intended to return a character or eof; in particular, read-char raises an exception if it encounters a non-byte from a port.

³³ More precisely, the procedure is used by the default port read handler; see also section 11.2.6.

³⁴ A temporary string of size k is allocated while reading the input, even if the size of the result is less than k characters.

³⁵ Only mid-stream eofs can be committed. An eof when the port is exhausted does not correspond to data in the stream.

³⁶ Assuming that the current port display and write handlers are the default ones; see section 11.2.7 for more information.

³⁷ The port read handler is not used for read/recursive or read-syntax/recursive.

³⁸ For MrEd, the executable path is the name of a MrEd executable.

³⁹ Under Windows, file-exists? reports #t for all variations of the special filenames (e.g., "LPT1", "x:/baddir/LPT1").

⁴⁰ For FAT filesystems under Windows, directories do not have modification dates. Therefore, the creation date is returned for a directory (but the modification date is returned for a file).

⁴¹ MzScheme implements a listener with multiple sockets, if necessary, to accommodate multiple addresses with different protocol families. Under Linux, if hostname-string-or-false maps to both IPv4 and IPv6 addresses, then the behavior depends on whether IPv6 is supported and IPv6 sockets can be configured to listen to only IPv6 connections: if IPv6 is not supported or IPv6 sockets are not configurable, then the IPv6 addresses are ignored; otherwise, each IPv6 listener accepts only IPv6 connections.

⁴² If hostname-string is associated with multiple addresses, they are tried one at a time until a connection succeeds. The name "localhost" generally specifies the local machine.

⁴³ The TCP protocol does not include a ``no longer reading'' state on connections, so tcp-abandon-port is equivalent to close-input-port on input TCP ports.

⁴⁴ For most machines, the answer corresponds to the current machine's only Internet address. But when a machine serves multiple addresses, the result is connection-specific.