C Types
C types are the main concept of the FFI; they are either primitive types or user-defined types. The FFI deals with primitive types internally, converting them to and from C types. A user type is defined in terms of existing primitive and user types, along with conversion functions to and from the existing types.
(make-ctype ctype scheme-to-C-proc C-to-scheme-proc) PROCEDURE
Creates a new C type object, with the given conversion functions.
The conversion functions can be #f meaning that there is no
conversion for the corresponding direction. If both functions are
#f, ctype is returned.
(ctype? v) PROCEDURE
Returns #t if v is a C type (primitive or
user-defined), #f otherwise.
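As a rough sketch of how make-ctype and ctype? fit together, the following defines a user type on top of _int. The name _flag and its two conversion procedures are invented for illustration, and the preamble assumes that the FFI is loaded with (require (lib "foreign.ss")) and enabled with (unsafe!).

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; _flag is a hypothetical user type built on the primitive _int:
;; Scheme booleans go to C as 1 or 0, and C integers come back as booleans.
(define _flag
  (make-ctype _int
              (lambda (b) (if b 1 0))         ; Scheme -> C
              (lambda (n) (not (zero? n)))))  ; C -> Scheme

(ctype? _flag)  ; a user-defined type is still a C type
(ctype? 42)     ; an ordinary Scheme value is not

;; With both conversions given as #f, the base type itself is returned.
(eq? _int (make-ctype _int #f #f))
```

Because _flag is defined in terms of _int, it has the same size and alignment as _int.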
(ctype-sizeof ctype) PROCEDURE
(ctype-alignof ctype) PROCEDURE
Return the size and alignment of a given ctype for the current
platform.
(compiler-sizeof symbol) PROCEDURE
Possible values for symbol are 'int, 'char,
'short, 'long, '*, 'void,
'float, 'double. The result is the size of the
corresponding type according to the C sizeof operator for the
current platform. The compiler-sizeof operation should be
used to gather information about the current platform, such as
defining an alias type like _int in terms of a known type like
_int32.
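The alias idea can be sketched as follows. This is only an illustration: my-int is a hypothetical name (the FFI already provides _int), and the preamble assumes the FFI is loaded with (require (lib "foreign.ss")) and enabled with (unsafe!).

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; Size of the platform's C int, as reported by the C compiler.
(define int-size (compiler-sizeof 'int))

;; Pick the fixed-size integer type whose size matches the platform's int.
(define my-int
  (cond [(= int-size (ctype-sizeof _int16)) _int16]
        [(= int-size (ctype-sizeof _int32)) _int32]
        [(= int-size (ctype-sizeof _int64)) _int64]
        [else (error 'my-int "unexpected int size: ~a" int-size)]))

;; The chosen alias has the same size as the platform's int.
(= (ctype-sizeof my-int) int-size)

;; Alignment is queried the same way as size.
(ctype-alignof my-int)
```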
3.1 Numeric Types
There are basic integer types at various sizes. These are: _int8,
_sint8, _uint8, _int16,
_sint16, _uint16, _int32,
_sint32, _uint32, _int64,
_sint64, and _uint64. The `s' or `u' prefix specifies
a signed or an unsigned integer respectively; the ones with no prefix are
signed.
In addition, there are several type `aliases' (extra bindings for some of the above types):
_byte, _ubyte and _sbyte: aliases for _uint8 and _sint8 (_byte is unsigned),
_word, _uword and _sword: aliases for _uint16 and _sint16 (_word is unsigned),
_short, _ushort and _sshort: aliases for the integer type that corresponds to the platform's short type (_short is signed),
_int, _uint and _sint: aliases for the integer type that corresponds to the platform's int type (_int is signed),
_long, _ulong and _slong: aliases for the integer type that corresponds to the platform's long type (_long is signed).
In cases where speed matters, and you know that the integer is small
enough, use the types _fixnum and _ufixnum,
which are similar to _long and _ulong but assume
that the quantities fit in MzScheme's immediate integers (not
bignums). If you need this capability, but you want to be sure that
the C level integer is a 32-bit size (as opposed to a long integer on
some platforms), then use _fixint and
_ufixint.
Finally, there are two floating point types, _float and
_double for the corresponding C types, and the type
_double* that implicitly coerces any non-complex number to
a C double.
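The difference between _double and _double* shows up when exact numbers are passed. The sketch below assumes that the C library's sqrt can be found through the library designator #f (the libraries already loaded into the process), which may not hold on every platform; c-sqrt is a name made up for this example.

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; Assumption: sqrt is visible through library #f.
(define c-sqrt
  (get-ffi-obj "sqrt" #f (_fun _double* -> _double)))

(c-sqrt 4)    ; the exact integer 4 is coerced to a C double
(c-sqrt 1/4)  ; so is the exact rational 1/4
(c-sqrt 9.0)  ; inexact reals pass through unchanged
```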
3.2 Other Atomic Types
_bool C TYPE
Translates #f to a 0 _int, and any other value to 1.
_void C TYPE
This type indicates a Scheme void return value, and it cannot be used to translate values to C (i.e., this type cannot be used for function inputs).
3.3 String Types
3.3.1 Primitive String Types
The five primitive string types correspond to cases where a C representation matches MzScheme's representation without encodings.
_bytes C TYPE
A type for Scheme byte strings, which corresponds to C's char*
type. In addition to translating byte strings, #f
corresponds to the NULL pointer. Note that this is also a
custom-type macro; see section 3.5.1 below.
_string/ucs-4 C TYPE
A type for MzScheme's native Unicode strings, which are in UCS-4 format. These correspond to the C mzchar* type used by MzScheme.
_string/utf-16 C TYPE
Unicode strings in UTF-16 format.
_path C TYPE
Simple char* strings, corresponding to MzScheme's paths.
_symbol C TYPE
Simple char* strings as Scheme symbols (encoded in UTF-8). Return values using this type are interned as symbols.
3.3.2 Fixed Auto-Converting String Types
Several additional string types correspond to encoding conversions. The
_string/utf-8, _string/locale, and
_string/latin-1 types all correspond to (character)
strings on the Scheme side and char* strings on the C side. The
bridge between the two requires a transformation on the content of the
string. As usual, the types treat #f as NULL and
vice-versa.
The _string*/utf-8, _string*/locale, and
_string*/latin-1 types are similar, but they accept a
wider range of values: Scheme byte strings are passed as is,
and Scheme paths are converted using path->bytes.
3.3.3 Variable Auto-Converting String Type
The _string/ucs-4 type is rarely useful when interacting
with foreign code. Using _bytes is somewhat unnatural, since
it forces Scheme programmers to use byte strings. Using
_string/utf-8, etc. may prematurely commit to a particular encoding
of strings as bytes. The _string type supports conversion
between Scheme strings and char* strings using a
parameter-determined conversion.
_string C TYPE
Expands to a use of the default-_string-type parameter. The
parameter's value is consulted when _string is evaluated, so
the parameter should be set before any interface definition that uses
_string.
(default-_string-type [ctype]) PROCEDURE
A parameter that determines the current meaning of _string.
It is initially set to _string*/utf-8. If you change it, do
so before interfaces are defined.
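The ordering requirement can be illustrated with a small sketch. It assumes that the C library's atoi can be found through the library designator #f; c-atoi is a hypothetical name, and the choice of _string/latin-1 is arbitrary.

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; Set the parameter first: _string expands to a use of the parameter,
;; so its value at this point decides what the type below means.
(default-_string-type _string/latin-1)

;; Assumption: atoi is visible through library #f.
(define c-atoi
  (get-ffi-obj "atoi" #f (_fun _string -> _int)))

;; The Scheme string is converted to a char* string on the way to C.
(c-atoi "42")
```

Changing the parameter after c-atoi is defined would not affect c-atoi, since its type has already been constructed.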
3.3.4 Other String Types
_file C TYPE
Like _path, but when values go from Scheme to C,
expand-path is used on the given value. As an output value,
it is identical to _path.
_bytes/eof C TYPE
Similar to the _bytes type, except that a foreign return
value of NULL is translated to a Scheme eof value.
_string/eof C TYPE
Similar to the _string type, except that a foreign return
value of NULL is translated to a Scheme eof value.
3.4 Pointer Types
_pointer C TYPE
Corresponds to Scheme ``C pointer'' objects. These pointers can have an arbitrary Scheme object attached as a type tag. The tag is ignored by built-in functionality; it is intended to be used by interfaces. See section 5.1 for creating pointer types that use these tags for safety.
_scheme C TYPE
This type can be used with any Scheme object; it corresponds to the Scheme_Object* type of MzScheme's C API (see Inside PLT MzScheme). It is useful only for libraries that are aware of MzScheme's C API.
_fpointer C TYPE
Similar to _pointer, except that it should be used with
function pointers. Using these pointers avoids one dereferencing,
which is the proper way of dealing with function pointers. This type
should be used only in rare situations where you need to pass a
foreign function pointer to a foreign function; using a
_cprocedure type is possible for such situations, but
inefficient, as every call will go through Scheme unnecessarily.
Otherwise, _cprocedure should be used (it is based on
_fpointer).
3.5 Function Types
(_cprocedure input-types output-type [wrapper-proc]) PROCEDURE
A type constructor that creates a new function type, which is
specified by the given input-types list and output-type.
Usually, the _fun syntax (described below) should be used
instead, since it manages a wide range of complicated cases.
The resulting type can be used to reference foreign functions (usually
ffi-objs, but any pointer object can be referenced with this type),
generating a matching foreign callout object. Such objects are new primitive
procedure objects that can be used like any other Scheme procedure.
A type created with _cprocedure can also be used for passing
Scheme procedures to foreign functions, which will generate a foreign
function pointer that calls the given Scheme procedure when it is
used. There are no restrictions on the Scheme procedure; in
particular, its lexical context is properly preserved.
The optional wrapper-proc, if provided, is expected to be a function that
can change a callout procedure: when a callout is generated, the wrapper is
applied on the newly created primitive procedure, and its result is used as the
new function. Thus, wrapper-proc is a hook that can perform various argument
manipulations before the foreign function is invoked, and return different
results (for example, grabbing a value stored in an `output' pointer and
returning multiple values). It can also be used for callbacks, as an
additional layer that tweaks arguments from the foreign code before they reach
the Scheme procedure, and possibly changes the result values too.
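A minimal use of _cprocedure as a callout type might look like the sketch below. It assumes that the C library's abs can be found through the library designator #f; _int->int and c-abs are names invented for the example, and the optional wrapper-proc is left out.

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; int abs(int) described as a function type: one _int input, _int output.
(define _int->int (_cprocedure (list _int) _int))

;; Assumption: abs is visible through library #f.
(define c-abs (get-ffi-obj "abs" #f _int->int))

;; The callout behaves like any other Scheme procedure.
(c-abs -5)
(map c-abs (list -1 2 -3))
```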
(_fun [args ::] input-type ··· -> output-type [-> output-expr]) SYNTAX
Creates a new function type. This is a convenient syntax for the
_cprocedure type constructor that can handle complicated
cases of argument handling. In its simplest form, only the
input-types and the output-type are specified and each one
is a simple expression, which creates a straightforward function type.
In its full form, the _fun syntax provides an IDL-like language that
can be used to create a wrapper function around the primitive foreign function.
These wrappers can implement complex foreign interfaces given simple
specifications. First, the full form of each of the types can include an
optional label and an expression:
type-spec is one of
  type
  (label : type)
  (type = expr)
  (label : type = expr)
If an expression is provided, then the resulting function will be a wrapper
that calculates the argument for that position itself, meaning that it does not
expect an argument for that position. The expression can use previous
arguments if they were labeled. In addition, the result of a function call
need not be the value returned from the foreign call: if the optional
output-expr is specified, or if an expression is provided for the output
type, then this specifies an expression that will be used as a return value.
This expression can use any of the previous labels, including a label given for
the output which can be used to access the actual foreign return value.
In rare cases where complete control over the input arguments is needed, the
wrapper's argument list can be specified as args, in any form (including
a `rest' argument). Identifiers in this place are related to type labels, so
if an argument is named there, its type needs no expression. For example:
(_fun (n s) :: (s : _string) (n : _int) -> _int)
specifies a function that receives an integer and a string, but the foreign function will get the string first.
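The labeled and computed forms can be sketched with type definitions alone. The C function checksum below is hypothetical, as are the names _checksum-type and _checksum+len-type; only the types are constructed, so no foreign library is needed to evaluate the definitions.

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; Type for a hypothetical C function
;;   int checksum(char *buf, int len);
;; The wrapper takes only buf; the second argument is computed from it.
(define _checksum-type
  (_fun (buf : _bytes)
        (_int = (bytes-length buf))
        -> _int))

;; Same hypothetical function, but the wrapper returns a list holding
;; the foreign result together with the computed length.
(define _checksum+len-type
  (_fun (buf : _bytes)
        (len : _int = (bytes-length buf))
        -> (sum : _int)
        -> (list sum len)))
```

Either type could then be passed to get-ffi-obj in place of a plain (_fun _bytes _int -> _int), and the resulting procedure would expect a single byte-string argument.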
3.5.1 Custom Function Types
The behavior of the _fun type can be customized via custom function
types. These are pieces of syntax that can behave as C types and C type
constructors, but they can interact with function calls in several ways that
are not possible otherwise. When the _fun form is expanded, it tries
to expand each of the given type expressions, and ones that expand to certain
keyword-value lists interact with the generation of the foreign function
wrapper. This makes it possible to construct a single wrapper function,
avoiding the costs involved in compositions of higher-order functions.
Custom function types are macros that expand to a list that looks like:
`(key: val ...)', where all of the `key:'s are from a short
list of known keys. Each key interacts with generated wrapper functions in a
different way, which affects how its corresponding argument is treated:
type: specifies the foreign type that should be used; if it is #f then this argument does not participate in the foreign call.
expr: specifies an expression to be used for arguments of this type, removing it from wrapper arguments.
bind: specifies a name that is bound to the original argument if it is required later (e.g., _box converts its associated value to a C pointer, and later needs to refer back to the original box).
1st-arg: specifies a name that can be used to refer to the first argument of the foreign call (good for common cases where the first argument has a special meaning, e.g., for method calls).
prev-arg: similar to 1st-arg:, but refers to the previous argument.
pre: a pre-foreign code chunk that is used to change the argument's value.
post: a similar post-foreign code chunk.
The pre: and post: bindings can be of the form
(id => expr) to use the existing value. Note that if the
pre: expression is not (id => expr), then it means that there
is no input for this argument to the _fun-generated procedure. Also
note that if a custom type is used as an output type of a function, then only
the post: code is used.
All of the special custom types that are described here are defined this way.
Most custom types are meaningful only in a _fun context, and will
raise a syntax error if used elsewhere. A few such types can be used in
non-_fun contexts: types which use only type:, pre:,
post:, and no others. Such custom types can be used outside a
_fun by expanding them into a usage of make-ctype; using
other keywords makes this impossible, because it means that the type has a specific
interaction with a function call.
(define-fun-syntax identifier transformer) SYNTAX
The result of expanding custom type macros is taken apart by the
_fun macro, which will lead to code certificate problems. To solve
this, use define-fun-syntax instead of define-syntax. It
is used in the same way, but will avoid such problems.
_? CUSTOM C TYPE
Not a conventional C type, but a marker for expressions that should not be sent to the ffi function. Use this to bind local values in a computation that is part of an ffi wrapper interface, or to specify wrapper arguments that are not sent to the foreign function (e.g., an argument that is used for processing the foreign output).
(_ptr mode type) CUSTOM C TYPE
For C pointers, where mode indicates input or output pointers (or
both). mode can be one of the following:
`i', indicating an input pointer argument: the wrapper will arrange for the function call to receive a value that can be used with the type and to send a pointer to this value to the foreign function. After the call the value is discarded.
`o', indicating an output pointer argument: the foreign function expects a pointer to a place where it will save some value, and this value is accessible after the call, to be used by an extra return expression. If _ptr is used in this mode, then the generated wrapper does not expect an argument since one will be freshly allocated before the call.
`io' combines the above into an input/output pointer argument: the wrapper will get the Scheme value, allocate and set a pointer using this value, and reference the value after the call. The `_ptr' can be confusing here: it means that the foreign function expects a pointer, but the generated wrapper uses an actual value. (Note that if this is used with structs, a struct is created when calling the function, and a copy of the return value is made too -- inefficient, but ensures that structs are not modified by C code.)
For example, the _ptr type can be used in output mode to create a
foreign function wrapper that returns more than a single argument. The
following type:
(_fun (i : (_ptr o _int))
      -> (d : _double)
      -> (values d i))
will create a function that calls the foreign function with a fresh integer pointer, and use the value that is placed there as a second return value.
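A concrete function with this shape is the C library's frexp, which returns a mantissa and stores an exponent through an int pointer. The sketch below assumes that frexp can be found through the library designator #f, which may not hold on every platform; c-frexp is a name made up for the example.

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; double frexp(double x, int *exp) splits x into a mantissa and a
;; power-of-two exponent; the exponent comes back through the pointer.
;; Assumption: frexp is visible through library #f.
(define c-frexp
  (get-ffi-obj "frexp" #f
    (_fun _double (e : (_ptr o _int))
          -> (m : _double)
          -> (values m e))))

;; The wrapper takes one argument and returns two values:
;; the mantissa and the exponent (8.0 = 0.5 * 2^4).
(c-frexp 8.0)
```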
(_box type) CUSTOM C TYPE
This is similar to a (_ptr io type) argument, where the input
is expected to be a box holding an appropriate value, which is unboxed on entry
and modified accordingly on exit.
(_list mode type [len]) CUSTOM C TYPE
Similar to _ptr, except that it is used for converting lists to/from
C vectors. The optional len argument is needed for output values where
it is used in the post code, and in the pre code of an output mode to allocate
the block. In any case it can refer to a previous binding for the length of
the list which the C function will most likely require.
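The two common shapes, an input list with a derived length and an output list whose length comes from an earlier argument, can be sketched as types. Both C functions named in the comments are hypothetical, as are the names _sum-type and _fill-type; only the types are built here.

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; Type for a hypothetical C function
;;   int sum(int *v, int n);
;; The wrapper takes a Scheme list; the length argument is derived from it.
(define _sum-type
  (_fun (v : (_list i _int))
        (_int = (length v))
        -> _int))

;; Type for a hypothetical C function
;;   void fill(int n, int *out);
;; The output block is allocated using the earlier binding n, and the
;; wrapper returns its contents as a Scheme list.
(define _fill-type
  (_fun (n : _int)
        (out : (_list o _int n))
        -> _void
        -> out))
```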
(_vector mode type [len]) CUSTOM C TYPE
Same as _list, except that it uses Scheme vectors instead of lists.
(_bytes o len) CUSTOM C TYPE
_bytes can be used by itself as a simple type that uses a byte string
as a C pointer. Alternatively, the `(_bytes o len)' form is for a pointer
return value, where the size should be explicitly specified. There is no
need for other modes: input or input/output would be just like _bytes, since
the string carries its size information (there is no real need for the `o'
part of the syntax, but it's there for consistency with the above macros).
(_cvector mode type [len]) CUSTOM C TYPE
Like _bytes, _cvector can be used as a simple type that
corresponds to a pointer that is managed as a safe C vector on the Scheme side
-- this is described in section 5.2. The syntax
specified here is an alternative that makes it behave similarly to the
_list and _vector custom types, except that this is more
efficient since no Scheme list or vector is needed. (It can be used with all
three modes.)
3.6 C Struct Types
(make-cstruct-type ctypes) PROCEDURE
The primitive type constructor for creating new C struct types. These
types are actually new primitive types -- they don't have any conversion
functions associated. The corresponding Scheme objects that are used for
structs are pointers, but when these types are used, the value that the pointer
refers to is used rather than the pointer itself. This value is basically
made of a number of bytes that is known according to the given list of
ctypes.
(_list-struct ctypes ···1) PROCEDURE
A type constructor that builds a struct type using the above
make-cstruct-type function and wraps it in a type that
marshals a struct as a list of its components. Note that space for
structs needs to be allocated; the converter for a _list-struct type immediately allocates and
uses a list from the allocated space, so it is
inefficient. Use define-cstruct below for a more efficient
approach.
(define-cstruct _id ((field-id ctype) ···)) SYNTAX
Defines a new C struct type, but unlike _list-struct,
the resulting type deals with C structs in binary form rather than marshaling them
to Scheme values. It uses a define-struct-like approach, providing
accessor functions for raw struct values (which are pointer objects). The new
type uses pointer tags to guarantee that only proper struct objects are used.
The name must have a form of _id. The form and the generated
bindings are intentionally similar to define-struct, only with type
specification for the fields -- the identifiers that will be bound as a result
are:
_id: the new C type for this struct
_id-pointer: a pointer type that should be used when a pointer to values of this struct is used
id?: a predicate for the new type
id-tag: the tag string object that is used with these values
make-id: a constructor, expects an argument for each field
id-field-id ...: an accessor function for each field-id
set-id-field-id! ...: a mutator function for each field-id
Objects of this new type are actually cpointers, with a type tag that is a list
that contains "id". Since structs are implemented as pointers, they
can be used for a _pointer input to a foreign function: their address
will be used. To make this a little safer, the corresponding cpointer type is
defined as _id-pointer. The _id type should not be used
when a pointer is expected, since it will cause the struct to be copied rather
than use the pointer value, leading to memory corruption.
If the first field is itself a cstruct type, its tag will be used in addition to the new tag. This feature supports common cases of object inheritance, where a sub-struct is made by having a first field that is its super-struct. Instances of the sub-struct can be considered as instances of the super-struct, since they share the same initial layout. Using the tag of an initial cstruct field means that the same behavior is implemented in Scheme; for example, accessors and mutators of the super-cstruct can be used with the new sub-cstruct. See Section 3.6.1 for an example.
Note that structs are allocated as atomic blocks, which means that the garbage collector ignores their content. Currently, there is no safe way to store pointers to GC-managed objects in structs (even if you keep a reference to avoid collecting the referenced objects, the 3m variant's GC will invalidate the pointer's value). Thus, only non-pointer values and pointers to memory that is outside the GC's control can be placed into struct fields.
(define-cstruct (_id _super-id) ((field-id ctype) ···)) SYNTAX
This alternative form of define-cstruct is shorthand for using an
initial field named super-id using _super-id as its type. Remember
that the new struct will use _super-id's tag in addition to its own tag,
meaning that instances of _id can be used as instances of
_super-id. Aside from the syntactic sugar, the constructor function will
be different when this syntax is used: instead of expecting a first argument
which is an instance of _super-id, it will expect arguments for each of
_super-id's fields in addition to the new fields. This is, again, in
analogy to using a super-struct with define-struct.
3.6.1 C Struct Examples
A few examples will help in understanding how to use structs. Assume the following C code:
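#include <stdlib.h>  /* for malloc */

typedef struct { int x; char y; } A;
typedef struct { A a; int z; } B;

/* Allocates an A with x = 1 and y = 2. */
A* makeA() {
  A *p = malloc(sizeof(A));
  p->x = 1;
  p->y = 2;
  return p;
}

/* Allocates a B whose embedded A is (1, 2) and whose z is 3. */
B* makeB() {
  B *p = malloc(sizeof(B));
  p->a.x = 1;
  p->a.y = 2;
  p->z = 3;
  return p;
}

/* Reads the y field through a pointer to A. */
char gety(A* a) {
  return a->y;
}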
First, using the simple _list-struct, you might expect this code to
work:
(define makeB
  (get-ffi-obj 'makeB "foo.so"
    (_fun -> (_list-struct (_list-struct _int _byte) _int))))
(makeB) ; should return ((1 2) 3)
The problem here is that makeB returns a pointer to the struct rather than the struct itself. The following works as expected:
(define makeB
  (get-ffi-obj 'makeB "foo.so" (_fun -> _pointer)))
(ptr-ref (makeB) (_list-struct (_list-struct _int _byte) _int))
As described above, _list-structs should be used in cases where
efficiency is not an issue. We continue using define-cstruct, first
defining a type for `A', which makes it possible to use `makeA':
(define-cstruct _A ([x _int] [y _byte]))
(define makeA
  (get-ffi-obj 'makeA "foo.so"
    (_fun -> _A-pointer))) ; using _A is a memory-corrupting bug!
(define a (makeA))
(list a (A-x a) (A-y a))
; -> (#<cpointer:A> 1 2)
`gety' is also simple to use from Scheme:
(define gety
  (get-ffi-obj 'gety "foo.so" (_fun _A-pointer -> _byte)))
(gety a) ; -> 2
We now define another C struct for `B', and expose `makeB' using it:
(define-cstruct _B ([a _A] [z _int]))
(define makeB
  (get-ffi-obj 'makeB "foo.so" (_fun -> _B-pointer)))
(define b (makeB))
We can access all values of b using a naive approach:
(list (A-x (B-a b)) (A-y (B-a b)) (B-z b))
but this is inefficient as it allocates and copies an instance of `A' on
every access. Inspecting the tags (cpointer-tag b) we can see that
A's tag is included, so we can simply use its accessors and mutators, as
well as any function that is defined to take an A pointer:
(list (A-x b) (A-y b) (B-z b))
(gety b)
Constructing a B instance in Scheme requires allocating a temporary A struct:
(define b (make-B (make-A 1 2) 3))
To make this more efficient, we switch to the alternative
define-cstruct syntax, which creates a constructor that expects
arguments for both the super fields and the new ones:
(define-cstruct (_B _A) ([z _int]))
(define b (make-B 1 2 3))
3.7 Enumerations and Masks
Although the constructors below are described as procedures, they are implemented as syntax, so that error messages can report a type name where the syntactic context implies one.
(_enum symbols [basetype]) PROCEDURE
Takes a list of symbols and generates an enumeration type. The
enumeration maps between the given symbols and integers,
counting from 0. The list of symbols can also set the values
of symbols, if a symbol is followed by '= and an integer. For
example, the list '(x y = 10 z) maps 'x to 0,
'y to 10, and 'z to 11.
The optional basetype argument specifies the base type to use,
defaulting to _ufixint.
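One way to observe the mapping without a foreign library is to round-trip a value through a small block of foreign memory. This is only a sketch: _color and its symbols are invented for the example, and it relies on the default base type being 32 bits wide, as described for _ufixint above.

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; Hypothetical enumeration: red = 0, green = 10, blue = 11.
(define _color (_enum '(red green = 10 blue)))

;; A 32-bit cell of foreign memory, matching the default base type.
(define p (malloc _uint32))

;; Store a symbol through the enumeration type ...
(ptr-set! p _color 'blue)

;; ... and read the cell back, first as the raw integer that C code
;; would see, then translated to a symbol again.
(ptr-ref p _uint32)
(ptr-ref p _color)
```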
(_bitmask symbols [basetype]) PROCEDURE
Similar to _enum, but the resulting mapping translates a
list of symbols to a number and back, using a logical or. A single
symbol can be given as an input to make things a little more
convenient. The default basetype is _uint, since high
bits are often used for flags.
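The same round-trip technique gives a rough picture of _bitmask. The type _perm and its flag names are hypothetical, and every flag is given an explicit value with '= so that the sketch does not rely on how default values are assigned.

```scheme
(require (lib "foreign.ss"))
(unsafe!)

;; Hypothetical permission flags with explicit bit values.
(define _perm (_bitmask '(read = 4 write = 2 exec = 1)))

;; A cell of foreign memory sized for the default base type, _uint.
(define p (malloc _uint))

;; A list of symbols is combined into a single integer.
(ptr-set! p _perm '(read exec))
(ptr-ref p _uint)

;; A single symbol is accepted as well; reading back yields a list.
(ptr-set! p _perm 'write)
(ptr-ref p _perm)
```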