Chapter 11

Input and Output

11.1  Ports

The global variable eof is bound to the end-of-file value. The standard Scheme predicate eof-object? returns #t only when applied to this value. The predicate port? returns #t only for values for which either input-port? or output-port? returns #t.
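A minimal sketch of these predicates, using an empty string port (see section 11.1.4) as a convenient source of the end-of-file value; the names empty-in and v are illustrative only:

```scheme
;; An empty string port is exhausted immediately,
;; so the first read produces the end-of-file value.
(define empty-in (open-input-string ""))
(define v (read-char empty-in))

(eof-object? v)      ; => #t
(eof-object? eof)    ; => #t
(eof-object? #\a)    ; => #f
(port? empty-in)     ; => #t
(port? "not a port") ; => #f
```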

11.1.1  Current Ports

The standard Scheme procedures current-input-port and current-output-port are implemented as parameters in MzScheme. See section 7.7.1.2 for more information.
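Because the current ports are parameters, output can be redirected for the dynamic extent of an expression with parameterize. The following sketch captures display output in a string port; the name captured is illustrative only:

```scheme
(define captured (open-output-string))

;; Within the body, display writes to the string port
;; because it consults the current-output-port parameter.
(parameterize ([current-output-port captured])
  (display "to the string port"))

(get-output-string captured) ; => "to the string port"
```

After the parameterize body returns, current-output-port has its previous value again.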

11.1.2  Opening File Ports

The open-input-file and open-output-file procedures accept an optional flag argument after the filename that specifies a mode for the file:

The open-output-file procedure can also take a flag argument that specifies how to proceed when a file with the specified name already exists:

The open-input-output-file procedure takes the same arguments as open-output-file, but it produces two values: an input port and an output port. The two ports are connected in that they share the underlying file device. See section 11.1.5 for more information.

Extra flag arguments are passed to open-output-file in any order. Appropriate flag arguments can also be passed as the last argument(s) to call-with-input-file, with-input-from-file, call-with-output-file, and with-output-to-file. When conflicting flag arguments (e.g., both 'error and 'replace) are provided to open-output-file, with-output-to-file, or call-with-output-file, the exn:application:mismatch exception is raised.

Both with-input-from-file and with-output-to-file close the port they create if control jumps out of the supplied thunk (either through a continuation or an exception), and the port remains closed if control jumps back into the thunk. The current input or output port is installed and restored with parameterize (see section 7.7.2).
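As a rough illustration of flag arguments, the sketch below passes 'replace both directly to open-output-file and as the last argument to with-output-to-file. The file name "flags-demo.txt" is a hypothetical example, assumed to be writable in the current directory:

```scheme
;; 'replace overwrites any existing file of the same name.
(define out (open-output-file "flags-demo.txt" 'replace))
(write '(1 2 3) out)
(close-output-port out)

;; The same flag is accepted as the last argument here;
;; the port is installed as the current output port for the thunk.
(with-output-to-file "flags-demo.txt"
  (lambda () (write '(4 5 6)))
  'replace)

(call-with-input-file "flags-demo.txt" read) ; => (4 5 6)
```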

See section 11.1.5 for more information on file ports. When an input or output file-stream port is created, it is placed into the management of the current custodian (see section 9.2).

11.1.3  Pipes

(make-pipe [limit-k]) returns two port values (see section 2.2): the first port is an input port and the second is an output port. Data written to the output port is read from the input port. The ports do not need to be explicitly closed.

The optional limit-k argument can be #f or a positive exact integer. If limit-k is omitted or #f, the new pipe holds an unlimited number of unread characters (i.e., limited only by the available memory). If limit-k is a positive number, then the pipe will hold at most limit-k unread characters; writing to the pipe's output port thereafter will block until a read from the input port makes more space available.
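A small sketch of an unlimited pipe, with illustrative names pipe-in and pipe-out. Characters written to the output end become available, in order, at the input end:

```scheme
;; make-pipe returns two values: the input end and the output end.
(define-values (pipe-in pipe-out) (make-pipe))

(display "abc" pipe-out)

(read-char pipe-in)     ; => #\a
(read-string 2 pipe-in) ; => "bc"
```

With a limit, such as (make-pipe 2), the display call above would block in a single-threaded program, since no reader could make space for the third character.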

11.1.4  String Ports

Scheme input and output can be read from or collected into a string:

String input and output ports do not need to be explicitly closed. The file-position procedure, described in section 11.1.5, works for string ports in position-setting mode.

Example:

(define i (open-input-string "hello world"))
(define o (open-output-string))
(write (read i) o)
(get-output-string o) ; => "hello"

11.1.5  File-Stream Ports

A port created by open-input-file, open-output-file, subprocess, and related functions is a file-stream port. The initial input, output, and error ports in stand-alone MzScheme are also file-stream ports.

(file-stream-port? port) returns #t if the given port is a file-stream port, #f otherwise.

Both input and output file-stream ports use a buffer. For an input port, a buffer is filled with immediately-available characters to speed up future reads. Thus, if a file is modified between a pair of reads to the file, the second read can produce stale data. Calling file-position to set an input port's file position flushes its buffer. For an output port, a buffer is filled with a sequence of written characters to be committed as a group, typically when a newline is written. An output port's buffer use can be controlled via file-stream-buffer-mode (described below). The two ports produced by open-input-output-file have independent buffers.
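The effect of output buffering can be sketched as follows. The file name "buffer-demo.txt" is a hypothetical example, and the sketch assumes the file may be opened for reading while the output port is still open:

```scheme
(define out (open-output-file "buffer-demo.txt" 'replace))
(file-stream-port? out)                      ; => #t
(file-stream-port? (open-input-string "x"))  ; => #f

;; No newline is written, so these characters may
;; still be sitting in the port's buffer...
(display "pending" out)
;; ...until the buffer is explicitly committed:
(flush-output out)

(define in (open-input-file "buffer-demo.txt"))
(read-string 7 in) ; => "pending"
(close-input-port in)
(close-output-port out)
```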

Three procedures work primarily on file-stream ports:

11.1.6  Custom Ports

The make-custom-input-port and make-custom-output-port procedures create ports with arbitrary control procedures.

11.1.6.1  Custom Input

(make-custom-input-port read-string-proc peek-string-proc-or-false close-proc) creates an input port. The port is immediately open for reading. If the close-proc procedure has no side effects, then the port need not be explicitly closed.

When read-string-proc returns a procedure, the procedure is called by read,18 read-syntax, or read-char-or-special to ``read'' non-character input from the port. The procedure is called exactly once before additional characters are read from the port, and the procedure must return two values: an arbitrary value and an exact, non-negative integer. The first return value is used as the read result, and the second is used as the width in characters of the result (for port position tracking). If read-string-proc or peek-string-proc returns a procedure when called by any reading procedure other than read, read-syntax, read-char-or-special, or peek-char-or-special, then the exn:application:mismatch exception is raised.

The four arguments to the procedure represent the source location of the non-character value, as much as it is known (see section 11.2.3). The first argument is an arbitrary value representing the source for read values -- the one passed to read-syntax -- or #f if read or read-char-or-special was called. The second argument is a line number (exact, positive integer) if known, or #f otherwise. The third is a column number (exact, non-negative integer) or #f, and the fourth is a position number (exact, positive integer) or #f.

When the procedure returns a syntax object, then the syntax object is used directly in the result of read-syntax, and converted with syntax-object->datum for the result of read. If the result is not a syntax object, then the result is used directly in the result for read, and converted with datum->syntax-object for the result of read-syntax. In either case, structure sharing that occurs only as the result of multiple non-character results is not preserved as syntax sharing.

Instead of returning two values, the procedure can raise the exn:special-comment exception to indicate that the special result is a comment, and therefore produces no read result. When called by read and read-syntax, the exception is caught. The exception's width field indicates the width of the special object in port positions, like the second return value for a non-comment result.

Examples:

(define /dev/null (make-custom-input-port (lambda (s) eof) #f void))
(read-char /dev/null) ; => eof

;; This port produces a stream of 1s:
(define infinite-ones 
  (make-custom-input-port
   (lambda (s) (string-set! s 0 #\1) 1)
   #f
   void))
(read-string 5 infinite-ones) ; => "11111"

;; To peek ahead 2^5000 characters, we have to
;;  buffer all 2^5000 preceding characters:
(peek-string 5 (expt 2 5000) infinite-ones)  ; Out of memory!

;; With a peek procedure, we can support arbitrary lookahead:
(define better-infinite-ones 
  (make-custom-input-port
   (lambda (s) (string-set! s 0 #\1) 1)
   (lambda (s offset) (string-set! s 0 #\1) 1)
   void))
(peek-string 5 (expt 2 5000) better-infinite-ones) ; => "11111"

;; This port produces 0, 1, 2, 0, 1, 2, etc:
(define mod3-cycle/not-thread-safe
  (let ([n 0])
    (make-custom-input-port
     (lambda (s) (string-set! s 0 (integer->char (+ 48 n)))
                 (set! n (modulo (add1 n) 3))
                 1)
     #f
     void)))
(read-string 5 mod3-cycle/not-thread-safe) ; => "01201"

;; Same thing, but safe for concurrent access:
(define mod3-cycle
  (let ([n 0]
	;; for guarding use of n:
	[lock (make-semaphore 1)])
    (make-custom-input-port
     (lambda (s)
       (if (semaphore-try-wait? lock)
	   (begin
	     (string-set! s 0 (integer->char (+ 48 n)))
	     (set! n (modulo (add1 n) 3))
	     (semaphore-post lock)
	     1)
	   ;; another thread must have the lock;
	   ;; instead of spinning, wait until the
	   ;; lock looks available:
	   (make-semaphore-peek lock)))
     #f
     void)))
(let ([result1 #f]
      [result2 #f])
  (let ([t1 (thread (lambda ()
		      (set! result1 (read-string 5 mod3-cycle))))]
	[t2 (thread (lambda ()
		      (set! result2 (read-string 5 mod3-cycle))))])
    (thread-wait t1)
    (thread-wait t2)
    (string-append result1 "," result2))) ; => "02120,10201", maybe

;; Non-character port results:
(define infinite-voids
  (make-custom-input-port
   (lambda (s)
     ;; Return a procedure that produces void:
     (lambda (where line col pos) (values (void) 1)))
   #f
   void))
(read-char infinite-voids) ; => error: non-char in an unsupported context
(read-char-or-special infinite-voids) ; => void

11.1.6.2  Custom Output

(make-custom-output-port waitable-or-false write-string-proc flush-proc close-proc) creates an output port. The port is immediately open for writing. If the close-proc procedure has no side effects, then the port need not be explicitly closed. The port can buffer data within its write-string-proc.

Examples:

(define /dev/null (make-custom-output-port 
		   #f 
		   (lambda (s start end buffer-ok?) (- end start))
		   void void))
(display "hello" /dev/null)


(define accum-list null)
(define accumulator/not-thread-safe
  (make-custom-output-port 
   #f
   (lambda (s start end buffer-ok?)
     (set! accum-list
	   (append accum-list
		   (string->list (substring s start end))))
     (- end start))
   void void))
(display "hello" accumulator/not-thread-safe)
accum-list ; => '(#\h #\e #\l #\l #\o)

(define accum-list null)
(define accumulator 
  (let ([lock (make-semaphore 1)])
    (make-custom-output-port
     (lambda () (make-semaphore-peek lock))
     (lambda (s start end buffer-ok?)
       (if (semaphore-try-wait? lock)
	   (begin
	     (set! accum-list
		   (append accum-list
			   (string->list (substring s start end))))
	     (semaphore-post lock)
	     (- end start))
	   0))
     void
     void)))
(display "hello" accumulator)
accum-list ; => '(#\h #\e #\l #\l #\o)

(define (make-capitalize port)
  (make-custom-output-port
   ;; This port is ready when the original is ready:
   (lambda () port)
   ;; Writing procedure:
   (lambda (s start end buffer-ok?)
     (let ([s (list->string
	       (map char-upcase
		    (string->list (substring s start end))))])
       (if buffer-ok?
	   (begin
	     (display s port)
	     (string-length s))
	   (write-string-avail* s port))))
   ;; Flush procedure --- flush original port:
   (lambda () (flush-output port))
   ;; Close procedure --- close original port:
   (lambda () (close-output-port port))))
(define orig-port (open-output-string))
(define cap-port (make-capitalize orig-port))
(display "Hello" cap-port)
(get-output-string orig-port) ; => "HELLO"

11.2  Reading and Writing

11.2.1  Reading

In addition to the standard reading procedures, MzScheme provides block reading procedures such as read-line, read-string, and peek-string:
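As a rough illustration of the block readers named above, the sketch below reads from a two-line string port. The names text and in are illustrative only, and the newline is built with (string #\newline) rather than a string escape:

```scheme
(define text
  (string-append "first line" (string #\newline) "second line"))
(define in (open-input-string text))

;; Peeking skips 0 characters and consumes nothing:
(peek-string 5 0 in) ; => "first"

;; read-line consumes through the line terminator,
;; which is not included in the result:
(read-line in)       ; => "first line"

(read-string 6 in)   ; => "second"
(read-line in)       ; => " line"
(eof-object? (read-line in)) ; => #t
```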

11.2.2  Writing

In addition to the standard printing procedures, MzScheme provides print, which outputs values to a port by calling the port's print handler (see section 11.2.5), plus the block-writing procedures such as write-string-avail:

The fprintf, printf, and format procedures create formatted output:

When an illegal format string is supplied to one of these procedures, the exn:application:type exception is raised. When the format string requires more additional arguments than are supplied, the exn:application:fprintf:mismatch exception is raised. When more additional arguments are supplied than are used by the format string, the exn:application:mismatch exception is raised.

For example,

(fprintf port "~a as a string is ~s.~n" '(3 4) "(3 4)")

prints this message to port:20

(3 4) as a string is "(3 4)".

followed by a newline.
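The difference between the two directives used above can also be seen with format, which returns the formatted text as a string instead of writing it to a port. A minimal sketch, assuming the default display and write handlers:

```scheme
;; ~a formats its argument as display would:
(format "~a" "hi")              ; => "hi"
;; ~s formats its argument as write would, quotes included:
(format "~s" "hi")              ; => "\"hi\""
(format "~a and ~s" 'sym "str") ; => "sym and \"str\""
```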

11.2.3  Counting Positions, Lines, and Columns

MzScheme keeps track of the position in a port as the number of characters that have been read from any input port (independent of the read/write position, which is accessed or changed with file-position). In addition, MzScheme can track line locations and column locations when specifically enabled for a port via port-count-lines! or the port-count-lines-enabled parameter (see section 7.7.1.2). Position, line, and column locations for a port are used by read-syntax (see section 12.2 for more information). Position and line locations are numbered from 1; column locations are numbered from 0.

When counting lines, MzScheme treats linefeed, return, and return-linefeed combinations as a line terminator and as a single position (on all platforms). Each tab advances the column count to one before the next multiple of 8.

A position is known for any port as long as its value can be expressed as a fixnum (which is more than enough tracking for realistic applications in, say, syntax-error reporting). If the position for a port exceeds the value of the largest fixnum, then the position for the port becomes unknown, and line and column tracking is disabled. Return-linefeed combinations are treated as a single character position only when line and column counting is enabled.
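The numbering conventions can be checked with a sketch like the following. It assumes that read-syntax accepts a source name followed by a port, and that the accessors syntax-line, syntax-column, and syntax-position (see section 12.2) report the start of the datum; the source name 'example is arbitrary:

```scheme
(define in (open-input-string "  (a b)"))
;; Line and column counting must be enabled explicitly:
(port-count-lines! in)

(define stx (read-syntax 'example in))

;; The datum starts after two spaces on the first line.
(syntax-line stx)     ; => 1  (lines count from 1)
(syntax-column stx)   ; => 2  (columns count from 0)
(syntax-position stx) ; => 3  (positions count from 1)
```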

11.2.4  Customizing Read

Each input port has its own port read handler. This handler is invoked to read S-expressions or syntax objects from the port when the built-in read or read-syntax procedure is applied to the port. A port read handler must accept both a single argument and three arguments:

A port's read handler is configured with port-read-handler:

The default port read handler reads standard Scheme expressions with MzScheme's built-in parser (see section 14.3).
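A hedged sketch of installing a read handler: the handler below ignores S-expression syntax and treats each line as one datum. It uses a rest argument so that it accepts both the one-argument and three-argument forms, and it assumes port-read-handler takes the port followed by the new handler:

```scheme
(define in (open-input-string "hello world"))

;; The handler is called with just the port when read is used;
;; any extra arguments (the read-syntax case) are ignored here.
(port-read-handler in
                   (lambda (port . rest)
                     (read-line port)))

(read in) ; => "hello world"
```

With the default handler, the same call would instead produce the symbol hello.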

11.2.5  Customizing Display, Write, and Print

Each output port has its own port display handler, port write handler, and port print handler. These handlers are invoked to output S-expressions to the port when the standard display, write, or print procedure is applied to the port. A port display/write/print handler takes two arguments: the value to be printed and the destination port. The handler's return value is ignored.

The default port display and write handlers print Scheme expressions with MzScheme's built-in printer (see section 14.4). The default print handler calls the global port print handler (the value of the global-port-print-handler parameter; see section 7.7.1.2); the default global port print handler is the same as the default write handler.
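As an illustration, the following sketch wraps everything displayed on a string port in angle brackets. It assumes port-display-handler returns the current handler when given only a port and installs a new one when given a port and a procedure; the name old is illustrative only:

```scheme
(define o (open-output-string))
;; Keep the default handler so the new one can delegate to it
;; (calling display here would re-enter the new handler):
(define old (port-display-handler o))

(port-display-handler o
                      (lambda (v port)
                        (old "<" port)
                        (old v port)
                        (old ">" port)))

(display 42 o)
(get-output-string o) ; => "<42>"
```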

11.3  Filesystem Utilities

Additional filesystem utilities are in MzLib; see Chapter 18 in PLT MzLib: Libraries Manual.

11.3.1  Pathnames

File and directory paths are specified as strings. Since the syntax for pathnames can vary across platforms (e.g., under Unix, directories are separated by ``/'' while Mac OS Classic uses ``:''), MzScheme provides tools for portably constructing and deconstructing pathnames.

Most MzScheme primitives that take pathnames perform an expansion on the pathname before using it. (Procedures that build pathnames or merely check the form of a pathname do not perform this expansion.) Under Unix and Mac OS X, a user directory specification using ``~'' is expanded.21 Under Mac OS Classic, file and folder aliases are resolved to real pathnames.22 Under Windows, multiple slashes are converted to single slashes (except at the beginning of a shared folder name), and a slash is inserted after the colon in a drive specification (if it is missing). In a Windows pathname, slash and backslash are always equivalent (and can be mixed together in the same pathname).

A pathname string cannot be empty or contain a null character (#\nul). When an empty string or a string containing a null character is provided as a pathname to any procedure except absolute-path?, relative-path?, complete-path?, or normal-case-path, the exn:i/o:filesystem exception is raised.
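A brief sketch of portable pathname handling, using the predicates named above together with build-path, which is assumed here to be among the pathname utilities. The directory and file names are hypothetical:

```scheme
(relative-path? "notes.txt") ; => #t
(absolute-path? "notes.txt") ; => #f

;; build-path joins elements with the platform's separator,
;; e.g. "docs/notes.txt" under Unix:
(define p (build-path "docs" "notes.txt"))
(relative-path? p)           ; => #t
```

Constructing paths this way avoids hard-coding a separator such as ``/'' or ``:'' into the program.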

The pathname utilities are:

11.3.2  Files

The file management utilities are:

11.3.3  Directories

The directory management utilities are:

11.4  Networking

MzScheme supports networking with the TCP and UDP protocols.

11.4.1  TCP

For information about TCP in general, see TCP/IP Illustrated, Volume 1 by W. Richard Stevens.

11.4.2  UDP

For information about UDP in general, see TCP/IP Illustrated, Volume 1 by W. Richard Stevens (which discusses UDP in addition to TCP).


17 Flushing is performed by the default port read handler (see section 11.2.4) rather than by read itself.

18 More precisely, the procedure is used by the default port read handler; see also section 11.2.4.

19 A temporary string of size k is allocated while reading the input, even if the size of the result is less than k characters.

20 Assuming that the current port display and write handlers are the default ones; see section 11.2.5 for more information.

21 Under Unix and Mac OS X, expansion does not convert multiple adjacent slashes to a single slash. However, extra slashes in a pathname are always ignored.

22 Mac OS X follows the Unix behavior in its treatment of links, and Mac OS Classic aliases are simply zero-length files.

23 The problem is that the procedure ``normalizes'' based on the platform, but case sensitivity is instead a property of an individual filesystem mount, not the platform that performs the mount.

24 For MrEd, the executable path is the name of a MrEd executable.

25 Under Windows, file-exists? reports #t for all variations of the special filenames (e.g., "LPT1", "x:/baddir/LPT1").

26 For FAT filesystems under Windows, directories do not have modification dates. Therefore, the creation date is returned for a directory (but the modification date is returned for a file).

27 The name "localhost" generally specifies the local machine.

28 The TCP protocol does not include a ``no longer reading'' state on connections, so tcp-abandon-port is equivalent to close-input-port on input TCP ports.

29 For most machines, the answer corresponds to the current machine's only internet address. But when a machine serves multiple addresses, the result is connection-specific.