% The Haskell 98 Foreign Function Interface
% [An Addendum to the Definition of Haskell 98]
%
% Editor: Manuel M T Chakravarty
%
% Copyright [2002..2003] Manuel M T Chakravarty
%
% The authors intend this Report to belong to the entire Haskell community, and
% so we grant permission to copy and distribute it for any purpose, provided
% that it is reproduced in its entirety, including this Notice.  Modified
% versions of this Report may also be copied and distributed for any purpose,
% provided that the modified version is clearly presented as such, and that it
% does not claim to be a definition of the Haskell 98 Foreign Function
% Interface. 

% Changes since RC15:
% * 6.3: Footnote regarding __STDC_ISO_10646__ added to text introducing
%        `CWString'.
%
% Changes since RC14:
% * 6.2: CWChar -> CWchar
% * 6.3: - CWChar -> CWchar
%        - Stated explicitly that memory allocated by `newCString' and friends
%          can be deallocated by `MarshalAlloc.free'
%        - Improved documentation
%
% Changes since RC13:
% * 5.3: Fixed typo
% * 5.7: Fixed a mistake in the type of `peekByteOff' and `pokeByteOff' (the
%        type variable constrained by `Storable' must be different from the
%        parameter of the `Ptr')
% * 6.3: Improved documentation
%
% Changes since RC12:
% * Acks : Added John Meacham
% * 4.1.5: Bug fix courtesy of Wolfgang Thaller
% * 5.5  : Added `FinalizerEnvPtr', `newForeignPtrEnv', and
%          `addForeignPtrFinalizerEnv'
% * 6.3  : Added John Meacham proposal for `wchar_t' support as well as localised
%          string marshalling; in particular, this adds `CWString' and
%          `CWStringLen' as well as the `CWString' and the `CAString' family
%          of marshalling routines.  In addition, `charIsRepresentable' was
%          added. 
%
% Changes since RC11:
% * 5.5: Swapped argument order of `newForeignPtr' and `addForeignPtrFinalizer'
%
% Changes since RC10:
% * 3.3  : Clarified use of foreign functions of pure type
% * 4.1.1: Clarified the meaning of foreign imports without a "&" that have a
%          non-functional type in Haskell
% * 5.1  : Clarified the scope of safe use of unsafePerformIO
% * 5.5  : "pre-emptive" dropped in footnote regarding finalizers; added
%          `newForeignPointer_' and renamed `foreignPtrToPtr' to
%          `unsafeForeignPtrToPtr'
% * Typos throughout
%
% Changes since RC9:
% * 1:     Mentioning interaction with foreign threads as an open problem.
% * 2 & 3: Removed `threadsafe' again, as the proposal for thread support is
%          still evolving and it is not yet clear whether a new safety level
%          is required.
% * 5.5:   Added the type synonym `FinalizerPtr' and rewrote the documentation
%          of finalizers.
% * 5.6:   Clarified the description of `StablePtr'
% * 5.8:   Added `finalizerFree'
% * 6.2:   All the types in CTypes must be newtypes that are exported
%          abstractly. 
%
% Changes since RC8:
% * 5.8: `MarshalAlloc.reallocBytes' is no longer permitted on memory
%        allocated with `alloca' or `allocaBytes'. 
% * 6.1: Deinitialisation of the RTS via `hs_exit()' followed by
%        (re)initialisation with `hs_init()' must be supported.
%
% Changes since RC7:
% * Clarified the lexis of C identifiers and C header file names
% * In `ForeignPtr', added `mallocForeignPtrArray' and `mallocForeignPtrArray0'
% * Clarified spec of allocation functions, adding constraints taken from the
%   corresponding C routines
% * `mallocBytes' and `allocaBytes' must align memory sufficiently for any
%   basic foreign type that fits into the allocated block
% * Removed typos in the description of the module `ForeignPtr'
% * Added Peter Gammie to the list of acknowledged people
% * `addForeignPtrFinalizer' guarantees that finalizers for a single foreign
%   pointer are executed in the reverse of the order in which they were added.
% * `Storable': Require that the size is divisible by the alignment
% * Added Ross Paterson to the list of acknowledged people
% * Added hs_free_fun_ptr() and hs_free_stable_ptr()
% * Changed order of arguments of `mkIOError' and `annotateIOError' to match
%   with the current implementation in GHC's FFI libraries.
%
% Changes since RC6:
% * Fixed typos
%
% Changes since RC5:
% * Author list: changed Alastair Reid's institution
% * 1.4:   Clarified the wording
% * 4.1:   Explicitly stated that access to pre-processor symbols is not
%          provided by the FFI
% * 4.1.1: Removed [lib] from impent syntax and discussion
% * 4.1.3: Added parentheses round FunPtr ft to make it easier to 
%          understand a tolerably complex type.
% * 4.1.4: Removed all mention of library objects; clarified that header files
%          do not impact the semantics of foreign calls, but may be required
%          for correct code generation by some systems
% * 5.2:   Clarified that all operations in Bits are member functions of the
%          type class.  Reverse the meaning of the sign of the second argument
%          for `rotate' and `shift' (this makes it the same as GHC used all
%          the time).  `bitSize' on `Integer' etc is now undefined.
% * 5.5:   Finalisers must be external functions to facilitate the
%          implementation on Haskell systems that do not support pre-emptive
%          concurrency.
%          Added mallocForeignPtr and mallocForeignPtrBytes.
% * 6:     Specified that HsBool==int in table2
%          Relabelled column 1 in table 3 (C symbol -> CPP symbol)
%          Replaced 0 and 1 with HS_BOOL_FALSE/TRUE
% * 6.1:   Clarified that nullPtr (nullFunPtr) coincides with (HsPtr) NULL and
%          (HsFunPtr) NULL, respectively.
%          Allowing multiple calls to hs_init() and clarified the constraints
%          on the relative timing between hs_set_argv() and
%          getProgName/getArgs. 
%          Added hs_perform_gc().
%
% Changes since RC4:
% * 5.6: Clarified documentation of `StablePtr's (RC 5)
%
% Changes between RC2 and RC4:
%
% * 5.8: Clarified documentation for `MarshalAlloc.free'.
% * 5.8: Added `MarshalAlloc.realloc'.
% * 3: Added the new safety level `threadsafe' with an explanation at the end
%   of 3.3.
% * 3: Replaced the nonterminal `entity' by `impent' and `expent' to
%   distinguish between import and export entities (as they are defined
%   differently in later sections).
% * 3.2: Clarified the description of foreign types; so far, `IO ()' was
%   strictly speaking not included as a valid return type.  Currently,
%   functions of type `a -> ()' are included.  Do we want this?  Their use
%   might not be portable if they include side effects.
% * 4.1.5: New section discussing the traps & pitfalls of type promotion with
%   C bindings.

% TODO:
% * Implement HTMLization.  (Malcolm suggests using
%   <http://pauillac.inria.fr/~maranget/hevea/>)

% TODO after Version 1.0:
%
% * Review suggestions by Antony Courtney <antony@apocalypse.org> re FFI
%   support for Java.

 
\documentclass[a4paper,twoside]{article}

\usepackage{a4wide}
\usepackage{grammar}  % Get it from 
                      %   http://www.cse.unsw.edu.au/~chak/haskell/grammar.sty
\usepackage{version}
\usepackage{url}
\usepackage[fleqn]{amsmath}


% version control
%
%\includeversion{DRAFT}
\excludeversion{DRAFT}
\excludeversion{FUTURE}  % material for future extensions

\def\Version{\relax}
%\def\Version{\\(Release Candidate 16)}
\begin{DRAFT}%
{
  \gdef\Version{%
    \\
    \textbf{--- DRAFT ---}\\[1ex]
    \ttfamily\scriptsize
    $\relax$Id: ffi.tex,v 1.55 2005/04/23 14:27:04 chak Exp $\relax$%
    \ignorespaces}
  }
\end{DRAFT}

% setting of code
%
\newcommand{\code}[1]{\texttt{#1}}      % inline code fragment
\makeatletter
\newenvironment{codedesc}{%             % description of code pieces
  \list{}{\labelwidth\z@
    \let\makelabel\codedesclabel}
  }{%
  \endlist
  }
\newcommand*{\codedesclabel}[1]{%
  \hspace{-\leftmargin}
  \parbox[b]{\labelwidth}{\makebox[0pt][l]{\code{#1}}\\}\hfil\relax
  }
\makeatother
\newcommand{\combineitems}{\vspace*{-\itemsep}\vspace*{-\parsep}\vspace*{-1em}}
\newcommand{\us}{\char"5F}

% general
%
\newcommand{\clearemptydoublepage}{%
  \newpage{\pagestyle{empty}\cleardoublepage}}


\begin{document}
\pagestyle{headings}

\title{%
  The Haskell 98 Foreign Function Interface 1.0\\
  An Addendum to the Haskell 98 Report%
  \Version}
\author{
  Manuel Chakravarty [editor], University of New South Wales\\
  Sigbjorn Finne, Galois Connections, Inc.\\
  Fergus Henderson, University of Melbourne\\
  Marcin Kowalczyk, Warsaw University\\
  Daan Leijen, University of Utrecht\\
  Simon Marlow, Microsoft Research, Cambridge\\
  Erik Meijer, Microsoft Corporation\\
  Sven Panne, BetaResearch GmbH\\
  Simon Peyton Jones, Microsoft Research, Cambridge\\
  Alastair Reid, Reid Consulting (UK) Ltd.\\
  Malcolm Wallace, University of York\\
  Michael Weber, University of Aachen
  }
\date{}
\maketitle
\par\vfill
\noindent
Copyright (c) [2002..2003] Manuel M. T. Chakravarty
\par\noindent
\emph{The authors intend this Report to belong to the entire Haskell
  community, and so we grant permission to copy and distribute it for any
  purpose, provided that it is reproduced in its entirety, including this
  Notice.  Modified versions of this Report may also be copied and distributed
  for any purpose, provided that the modified version is clearly presented as
  such, and that it does not claim to be a definition of the Haskell 98
  Foreign Function Interface.}
\par\bigskip\noindent
The master version of the Haskell FFI Report is at \url{haskell.org}.  Any
corrections or changes to the report can be found there.
\thispagestyle{empty}


\clearemptydoublepage
\pagenumbering{roman}
\tableofcontents

\clearemptydoublepage
\section*{Preface}

The definition of Haskell 98~\cite{haskell98}, while being comprehensive with
respect to the functional core language, lacks a range of features of a more
operational flavour, such as a foreign language interface, concurrency
support, and fully fledged exception handling.  As these features are of
central importance to many real-world applications of the language, there is a
danger that different implementations become de facto incompatible for such
applications due to system-specific extensions of the core language.  The
present FFI specification is aimed at reducing this risk by defining a simple,
yet comprehensive extension to Haskell 98 for the purpose of interfacing to
program components implemented in a language other than Haskell.

The goal behind this foreign function interface (FFI) specification is
twofold: it enables (1) describing in Haskell the interface to foreign
functionality and (2) using Haskell routines from foreign code.  More
precisely, its aim is to support the implementation of programs in a mixture
of Haskell and other languages such that the source code is portable across
different implementations of Haskell and non-Haskell systems as well as
independent of the architecture and operating system.

The design as presented in this report builds on experiences with a number of
foreign function interfaces that, over time, have been provided by the major
Haskell implementations.  Central to the final design was the goal of being
comprehensive while remaining simple and minimising changes with respect to
Haskell 98; the latter includes avoiding pollution of the name space with new
keywords.  Consequently, as much as possible of the FFI functionality is
realised in the form of libraries.  Simplicity generally overruled maximum
convenience for the programmer as a design goal.  Thus, support for more
convenient interface specifications is the domain of system-independent tools
that generate code following the present specification.

\subsection*{Acknowledgements}

We heartily thank the kind people who assisted us with their comments and
suggestions on the \code{ffi@haskell.org} and \code{haskell@haskell.org}
mailing lists as well as all the users of previous versions of the FFI who
helped to shape the development by their feedback.  We thank Olaf Chitil,
Peter Gammie, Wolfram Kahl, Martin D. Kealey, Ian Lynagh, John Meacham, Ross
Paterson, George Russell, and Wolfgang Thaller for errata and additions to
previous versions of this report.


\clearemptydoublepage
\pagenumbering{arabic}
\section{Introduction}

The extension of Haskell 98 defined in this report facilitates the use of
non-Haskell code from Haskell and vice versa in a portable manner.  Intrusion
into Haskell 98 has been kept to a minimum and the defined facilities have
been extensively tested with large libraries.

The present Version 1.0 of the FFI report fully specifies only the
interaction between Haskell code and code that follows the C calling
convention.  However, the design of the FFI is such that it enables the
modular extension of the present definition to include the calling conventions
of other programming languages, such as C++ and Java.  A precise definition of
the support for those languages is expected to be included in later versions
of this report.  The second major omission from the present report is the
definition of the interaction with multithreading in the foreign language and,
in particular, the treatment of thread-local state.  Work on this problem is
not sufficiently mature to be included in Version 1.0 of the report.

\subsection{Embedding Into Haskell 98}

The present report is to be regarded as an addendum to the Haskell 98
Report~\cite{haskell98}.  As such, syntactic and semantic definitions refer to
names and definitions in the Haskell 98 Report where appropriate without
further explanation.  Care has been taken to invalidate as few legal
Haskell 98 programs as possible in the process of adding FFI support.  In
particular, only a single addition to the set of reserved identifiers, namely
\code{foreign}, has been made.

Moreover, it is expected that the present FFI specification will be considered
for inclusion in future revisions of the Haskell standard.

\subsection{Language-Specific FFI Support}

The core of the present specification is independent of the foreign language
that is used in conjunction with Haskell.  However, there are two areas where
FFI specifications must become language specific: (1) the specification of
external names and (2) the marshalling of the basic types of a foreign
language.  As an example of the former, consider that in C~\cite{C} a simple
identifier is sufficient to identify an object, while
Java~\cite{gosling-etal:Java}, in general, requires a qualified name in
conjunction with argument and result types to resolve possible overloading.
Regarding the second point, consider that many languages do not specify the
exact representation of some basic types.  For example, the type \code{int} in
C may be 16, 32, or 64 bits wide.  Similarly, the Haskell report guarantees
only that \code{Int} covers at least the range \([-2^{29}, 2^{29} - 1]\).  As
a consequence, to reliably represent values of C's \code{int} in Haskell, we
have to introduce a new type \code{CInt}, which is guaranteed to match the
representation of \code{int}.
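
As a minimal sketch of this consequence, the following program converts
between \code{Int} and \code{CInt} with an explicit range check.  It assumes
the hierarchical module name \code{Foreign.C.Types} used by current Haskell
systems; the helper names \code{intToCInt} and \code{cIntToInteger} are purely
illustrative and are not defined by this report.
%
\begin{quote}
\begin{verbatim}
-- Assumption: CInt is exported by Foreign.C.Types (hierarchical libraries).
import Foreign.C.Types (CInt)

-- Widening a C int to Integer never loses information.
cIntToInteger :: CInt -> Integer
cIntToInteger = toInteger

-- Narrowing a Haskell Int to CInt fails when the value does not fit.
intToCInt :: Int -> Maybe CInt
intToCInt n
  | lo <= i && i <= hi = Just (fromIntegral n)
  | otherwise          = Nothing
  where
    i  = toInteger n
    lo = toInteger (minBound :: CInt)
    hi = toInteger (maxBound :: CInt)

main :: IO ()
main = do
  print (intToCInt 42)
  print (cIntToInteger maxBound)
\end{verbatim}
\end{quote}
%
The range check matters because neither type is guaranteed to be at least as
wide as the other on every platform.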

The specification of external names, which depends on the calling convention,
is described in Section~\ref{sec:extent}, whereas the language-dependent
marshalling of the basic types is described in
Section~\ref{sec:marshalling}.

\subsection{Contexts}

For a given Haskell system, we define the \emph{Haskell context} to be the
execution context of the abstract machine on which the Haskell system is
based.  This includes the heap, stacks, and the registers of the abstract
machine and their mapping onto a concrete architecture.  We call any other
execution context an \emph{external context.}  Generally, we cannot assume any
compatibility of the data formats and calling conventions between the
Haskell context and a given external context, except where the Haskell 98
report explicitly prescribes a specific data format.

The principal goal of a foreign function interface is to provide a
programmable interface between the Haskell context and external contexts.  As
a result, Haskell threads can access data in external contexts and invoke
functions that are executed in an external context, and vice versa.  In
the rest of this report, external contexts are usually identified by a calling
convention. 

\subsection{Cross Language Type Consistency}

Given that many external languages support static types, the question arises
whether the consistency of Haskell types with the types of the external
language can be enforced for foreign functions.  Unfortunately, this is, in
general, not possible without a significant investment on the part of the
implementor of the Haskell system (i.e., without implementing a dedicated type
checker).  For example, in the case of the C calling convention, the only
other approach would be to generate a C prototype from the Haskell type and
leave it to the C compiler to match this prototype with the prototype that is
specified in a C header file for the imported function.  However, the Haskell
type is lacking some information that would be required to pursue this route.
In particular, the Haskell type does not contain any information as to when
\code{const} modifiers have to be emitted.  

As a consequence, this report does not require the Haskell system to check
consistency with foreign types.  Nevertheless, Haskell systems are encouraged
to provide any cross language consistency checks that can be implemented with
reasonable effort.


\newpage
\section{Lexical Structure}

In the following, all formal grammatical definitions are based on the same
notation as that defined in the Haskell 98
Report~\cite[Section~2.1]{haskell98} and we make free use of all nonterminals
defined in the Haskell 98 Report.  The only addition to the lexical structure
of Haskell 98~\cite[Section~2]{haskell98} is a single new reserved identifier
(namely, \code{foreign}) and a set of special identifiers.  The latter have a
special meaning only within foreign declarations, but may be used as ordinary
identifiers elsewhere.

The following productions are added:
%
\begin{grammar}
  \grule{reservedid}{%
    foreign}
  \grule{specialid}{%
    export \galt\ safe \galt\ unsafe \galt\ ccall}
  \gor{%
     cplusplus \galt\ dotnet \galt\ jvm \galt\ stdcall}
  \gor{%
    \gverbal{system-specific calling conventions}}
\end{grammar}
%
The special identifiers \code{ccall}, \code{cplusplus}, \code{dotnet},
\code{jvm}, and \code{stdcall} are defined to denote calling conventions.
However, a concrete implementation of the FFI is free to support additional,
system-specific calling conventions whose names are not explicitly listed here.
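
The following sketch illustrates that the special identifiers have a special
meaning only within foreign declarations.  It assumes a Haskell system that
enables the FFI through the language pragma shown (as GHC does) and the module
name \code{Foreign.C.Types}; the names \code{cCos}, \code{safe}, and
\code{ccall} as used here are illustrative only.
%
\begin{quote}
\begin{verbatim}
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

-- Assumption: hierarchical module name as shipped with GHC.
import Foreign.C.Types (CDouble(..))

-- Inside a foreign declaration, `ccall' and `unsafe' are special.
foreign import ccall unsafe "math.h cos" cCos :: CDouble -> CDouble

-- Elsewhere, the same lexemes are ordinary variable names.
safe :: Bool
safe = True

ccall :: Int -> Int
ccall n = n + 1

main :: IO ()
main = print (safe, ccall 1, cCos 0)
\end{verbatim}
\end{quote}
%
By contrast, \code{foreign} is reserved and cannot be used as a variable name
anywhere in the module.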

To refer to objects of an external C context, we introduce the following
phrases:
%
\begin{grammar}
  \grule[C header filename]{chname}{%
    \grepeat{\gnterm{chchar}} .\ h}
  \grule[C identifier]{cid}{%
    \gnterm{letter} \grepeat{\gnterm{letter} \galt\ \gnterm{ascDigit}}}
  \grule{chchar}{%
    \gnterm{letter} \galt\ \gnterm{ascSymbol}\gminus{\&}}
  \grule{letter}{%
    \gnterm{ascSmall} \galt\ \gnterm{ascLarge} \galt\ \_}
\end{grammar}
%
The range of lexemes that are admissible for \gnterm{chname} is a subset of
those permitted as arguments to the \code{\#{}include} directive in C.  In
particular, a file name \gnterm{chname} must end in the suffix \code{.h}.  The
lexemes produced by \gnterm{cid} coincide with those allowed as C identifiers,
as specified in~\cite{C}.
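
The productions above can be transcribed directly into a small recogniser.
The sketch below is only an illustration: the function names are hypothetical,
and the set of \gnterm{ascSymbol} characters is copied from the lexical syntax
of the Haskell 98 Report.
%
\begin{quote}
\begin{verbatim}
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)
import Data.List (isSuffixOf)

-- letter -> ascSmall | ascLarge | _
isLetterC :: Char -> Bool
isLetterC c = isAsciiLower c || isAsciiUpper c || c == '_'

-- cid -> letter {letter | ascDigit}
isCid :: String -> Bool
isCid []     = False
isCid (c:cs) = isLetterC c && all (\x -> isLetterC x || isDigit x) cs

-- chchar -> letter | ascSymbol without &
isChchar :: Char -> Bool
isChchar c = isLetterC c || c `elem` "!#$%*+./<=>?@\\^|-~"

-- chname -> {chchar} . h
-- Both '.' and 'h' are themselves chchars, so checking the whole
-- string is equivalent to checking only the part before the suffix.
isChname :: String -> Bool
isChname s = ".h" `isSuffixOf` s && all isChchar s

main :: IO ()
main = mapM_ print
  [ isCid "strlen"          -- True
  , isCid "2fast"           -- False: must start with a letter
  , isChname "sys/types.h"  -- True
  , isChname "a&b.h"        -- False: & is excluded from chchar
  ]
\end{verbatim}
\end{quote}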

\newpage
\section{Foreign Declarations}

This section describes the extension of Haskell 98 by foreign declarations.
The following production for the nonterminal \gnterm{topdecl} extends the same
nonterminal from the Haskell 98 Report.  All other nonterminals are new.
%
\begin{grammar}
  \grule{topdecl}{%
    foreign \gnterm{fdecl}}
  \grule[define variable]{fdecl}{%
    import \gnterm{callconv} \gopt{\gnterm{safety}} \gnterm{impent}
    \gnterm{var} {::}\ \gnterm{ftype}}
  \gor[expose variable]{%
    export \gnterm{callconv} \gnterm{expent}
    \gnterm{var} {::}\ \gnterm{ftype}}
  \grule[calling convention]{callconv}{%
    ccall \galt\ stdcall \galt\ cplusplus \galt\ jvm \galt\ dotnet}
  \gor{%
    \gverbal{system-specific calling conventions}}
  \grule[imported external entity]{impent}{%
    \gopt{\gnterm{string}}}
  \grule[exported entity]{expent}{%
    \gopt{\gnterm{string}}}
  \grule{safety}{%
    unsafe \galt\ safe}
\end{grammar}
%
There are two flavours of foreign declarations: import and export
declarations.  An import declaration makes an \emph{external entity,} i.e., a
function or memory location defined in an external context, available in the
Haskell context.  Conversely, an export declaration defines a function of the
Haskell context as an external entity in an external context.  Consequently,
the two types of declarations differ in that an import declaration defines a
new variable, whereas an export declaration uses a variable that is already
defined in the Haskell module.

The external context that contains the external entity is determined by the
calling convention given in the foreign declaration.  Consequently, the exact
form of the specification of the external entity is dependent on both the
calling convention and on whether it appears in an import declaration (as
\gnterm{impent}) or in an export declaration (as \gnterm{expent}).  To provide
syntactic uniformity in the presence of different calling conventions, it is
guaranteed that the description of an external entity lexically appears as a
Haskell string lexeme.  The only exception is where this string would be the
empty string (i.e., be of the form \code{""}); in this case, the string may be
omitted in its entirety.
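
The two flavours of declaration, and the omission of an empty entity string,
can be sketched as follows.  The example assumes the \code{ccall} calling
convention, a system that enables the FFI through the pragma shown, and the
module name \code{Foreign.C.Types}; the names \code{cAbs} and \code{triple}
are illustrative only.
%
\begin{quote}
\begin{verbatim}
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

-- Assumption: hierarchical module name as shipped with GHC.
import Foreign.C.Types (CInt(..))

-- Import with an explicit entity string: defines the new variable cAbs.
foreign import ccall "stdlib.h abs" cAbs :: CInt -> CInt

-- Import with the entity string omitted, which is equivalent to "".
-- Under ccall, the C name then defaults to the Haskell name, toupper.
foreign import ccall toupper :: CInt -> CInt

-- Export: exposes the already defined variable triple to C code.
triple :: CInt -> CInt
triple n = 3 * n

foreign export ccall triple :: CInt -> CInt

main :: IO ()
main = print (cAbs (-7), toupper 97, triple 14)
\end{verbatim}
\end{quote}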

\subsection{Calling Conventions}
\label{sec:call-conv}

The binary interface to an external entity on a given architecture is
determined by a calling convention.  It often depends on the programming
language in which the external entity is implemented, but usually is more
dependent on the system for which the external entity has been compiled.

As an example of how the calling convention is dominated by the system rather
than the programming language, consider that an entity compiled to byte code
for the Java Virtual Machine (JVM)~\cite{lindholm-etal:JVM} needs to be
invoked by the rules of the JVM rather than those of the source language in
which it is implemented (the entity might be implemented in Oberon, for
example).

Any implementation of the Haskell 98 FFI must at least implement the C calling
convention denoted by \code{ccall}.  All other calling conventions are
optional.  Generally, the set of calling conventions is open, i.e., individual
implementations may elect to support additional calling conventions.  In
addition to \code{ccall}, Table~\ref{tab:callconv} specifies a range of
identifiers for common calling conventions.
%
\begin{table}[tbp]
  \begin{center}
    \begin{tabular}{|l|l|}
      \hline
      Identifier & Represented calling convention\\
      \hline\hline
      \code{ccall} 
      & Calling convention of the standard C compiler on a system\\
      \code{cplusplus}
      & Calling convention of the standard C{+}{+} compiler on a system\\
      \code{dotnet}
      & Calling convention of the \textsc{.net} platform\\
      \code{jvm} 
      & Calling convention of the Java Virtual Machine\\
      \code{stdcall}
      & Calling convention of the Win32 API (matches Pascal conventions)\\
      \hline
    \end{tabular}
    \caption{Calling conventions}
    \label{tab:callconv}
  \end{center}
\end{table}
%
Implementations need not implement all of these conventions, but if any is
implemented, it must use the listed name.  For any other calling convention,
implementations are free to choose a suitable name.

The present report defines the semantics of only the calling conventions
\code{ccall} and \code{stdcall}.  Later versions of the report are expected to
cover more calling conventions.

It should be noted that the code generated by a Haskell system to implement a
particular calling convention may vary widely with the target code of that
system.  For example, the calling convention \code{jvm} will be trivial to
implement for a Haskell compiler generating Java code, whereas for a Haskell
compiler generating C code, the Java Native Interface (JNI)~\cite{liang:JNI}
has to be targeted.

\subsection{Foreign Types}
\label{sec:foreign-types}

The following types constitute the set of \emph{basic foreign types}:
%
\begin{itemize}
\item \code{Char}, \code{Int}, \code{Double}, \code{Float}, and \code{Bool} as
  exported by the Haskell 98 \code{Prelude} as well as
\item \code{Int8}, \code{Int16}, \code{Int32}, \code{Int64}, \code{Word8},
  \code{Word16}, \code{Word32}, \code{Word64}, \code{Ptr a}, \code{FunPtr a},
  and \code{StablePtr a}, for any type \code{a}, as exported by \code{Foreign}
  (Section~\ref{sec:Foreign}).
\end{itemize}
%
A Haskell system that implements the FFI needs to be able to pass these types
between the Haskell and the external context as function arguments and
results.

Foreign types are produced according to the following grammar:
%
\begin{grammar}
  \grule{ftype}{%
    \gnterm{frtype}}
  \gor{%
    \gnterm{fatype} -> \gnterm{ftype}}
  \grule{frtype}{%
    \gnterm{fatype}}
  \gor{%
    ()}
  \grule[$k\geq0$]{fatype}{%
    \gnterm{qtycon} \gnterm[1]{atype} \gellipse\ \gnterm[k]{atype}}
\end{grammar}
%
A foreign type is the Haskell type of an external entity.  Only a subset of
Haskell's types are permissible as foreign types, as only a restricted set of
types can be canonically transferred between the Haskell context and an
external context.  A foreign type generally has the form
\[
\textit{at}_1\code{ -> }\cdots\code{ -> }\textit{at}_n\code{ -> }\textit{rt}
\]
where \(n\geq0\).  It implies that the arity of the external entity is $n$.

The argument types \(\textit{at}_i\) produced by \gnterm{fatype} must be
\emph{marshallable foreign types;} that is, each \(\textit{at}_i\) is either
(1) a basic foreign type or (2) a type synonym or renamed datatype of a
marshallable foreign type.  Moreover, the result type \textit{rt} produced by
\gnterm{frtype} must be a \emph{marshallable foreign result type;} that is, it
is either a marshallable foreign type, the type \code{()}, or a type matching
\code{Prelude.IO }$t$, where $t$ is a marshallable foreign type or \code{()}.

External functions are generally strict in all arguments.
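
As a hedged illustration of these rules, the declarations below show foreign
types of arity 0, 1, and 2 with the three kinds of result type.  The sketch
assumes that \code{CInt}, \code{CUInt}, and \code{CDouble} are renamed
datatypes of basic foreign types exported by \code{Foreign.C.Types}; the
Haskell names are illustrative only.
%
\begin{quote}
\begin{verbatim}
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

-- Assumption: hierarchical module name as shipped with GHC.
import Foreign.C.Types (CDouble(..), CInt(..), CUInt(..))

-- A type synonym of a marshallable foreign type is itself marshallable.
type Seed = CUInt

-- Arity 1, result type IO ().
foreign import ccall "stdlib.h srand" cSrand :: Seed -> IO ()

-- Arity 0, result type IO t.
foreign import ccall "stdlib.h rand" cRand :: IO CInt

-- Arity 2, pure marshallable result type.
foreign import ccall "math.h pow" cPow :: CDouble -> CDouble -> CDouble

main :: IO ()
main = do
  cSrand 1
  r <- cRand
  print (r >= 0, cPow 2 10)
\end{verbatim}
\end{quote}
%
A type such as \code{[CInt] -> CInt} would be rejected, because a list is
neither a basic foreign type nor a synonym or renamed datatype of one.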

\subsection{Import Declarations}
\label{sec:import}

Generally, an import declaration has the form
%
\[
\code{foreign}~\code{import}~c~e~v~\code{{::}}~t
\]
%
which declares the variable $v$ of type $t$ to be defined externally.
Moreover, it specifies that $v$ is evaluated by executing the external entity
identified by the string $e$ using calling convention $c$.  The precise form
of $e$ depends on the calling convention and is detailed in
Section~\ref{sec:extent}.  If a variable $v$ is defined by an import
declaration, no other top-level declaration for $v$ is allowed in the same
module.  For example, the declaration
%
\begin{quote}
\begin{verbatim}
foreign import ccall "string.h strlen" cstrlen :: Ptr CChar -> IO CSize
\end{verbatim}
\end{quote}
%
introduces the function \code{cstrlen}, which invokes the external function
\code{strlen} using the standard C calling convention.  Some external entities
can be imported as pure functions; for example,
%
\begin{quote}
\begin{verbatim}
foreign import ccall "math.h sin" sin :: CDouble -> CDouble
\end{verbatim}
\end{quote}
%
Such a declaration asserts that the external entity is a true function; i.e.,
when applied to the same argument values, it always produces the same result.
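
A complete program using both kinds of import might look as follows.  This is
a sketch under the assumption that the marshalling helper \code{withCString}
and the C types are available from the hierarchical modules named in the
import list; the pure import is called \code{cSin} here merely to avoid a
clash with \code{Prelude.sin} at the use site.
%
\begin{quote}
\begin{verbatim}
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

-- Assumption: hierarchical module names as shipped with GHC.
import Foreign.C.String (withCString)
import Foreign.C.Types (CChar, CDouble(..), CSize(..))
import Foreign.Ptr (Ptr)

-- Reads memory through a pointer, so the result lives in IO.
foreign import ccall "string.h strlen" cstrlen :: Ptr CChar -> IO CSize

-- A true function of its argument, so it may be imported as pure.
foreign import ccall "math.h sin" cSin :: CDouble -> CDouble

main :: IO ()
main = do
  -- withCString marshals the string into a temporary C buffer.
  n <- withCString "hello" cstrlen
  print n            -- strlen counts the 5 bytes before the NUL
  print (cSin 0)
\end{verbatim}
\end{quote}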

Whether a particular form of external entity places a constraint on the
Haskell type with which it can be imported is defined in
Section~\ref{sec:extent}.  Although some forms of external entities restrict
the set of Haskell types that are permissible, the system generally cannot
guarantee the consistency between the Haskell type given in an import
declaration and the argument and result types of the external entity.  It is
the responsibility of the programmer to ensure this consistency.

Optionally, an import declaration can specify, after the calling convention,
the safety level that should be used when invoking an external entity.  A
\code{safe} call is less efficient, but guarantees to leave the Haskell system
in a state that allows callbacks from the external code.  In contrast, an
\code{unsafe} call, while carrying less overhead, must not trigger a callback
into the Haskell system.  If it does, the system behaviour is undefined.  The
default for an invocation is to be \code{safe}.  Note that a callback into
the Haskell system implies that a garbage collection might be triggered after
an external entity was called, but before this call returns.  Consequently,
objects other than stable pointers (cf.\ Section~\ref{sec:StablePtr}) may be
moved or garbage collected by the storage manager.
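
The difference between the two safety levels can be sketched with the C
library routine \code{qsort}, which calls back into Haskell through a
comparison function and therefore must be imported as \code{safe}.  The
example assumes the \code{wrapper} form of import entity under \code{ccall}
and the marshalling utilities exported by \code{Foreign}; all Haskell names
are illustrative.
%
\begin{quote}
\begin{verbatim}
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

-- Assumption: hierarchical module names as shipped with GHC.
import Foreign
import Foreign.C.Types

type Compare = Ptr CInt -> Ptr CInt -> IO CInt

-- Turns a Haskell function into a C function pointer.
foreign import ccall "wrapper"
  mkCompare :: Compare -> IO (FunPtr Compare)

-- qsort invokes the comparison callback, so the call must be safe.
foreign import ccall safe "stdlib.h qsort"
  cQsort :: Ptr CInt -> CSize -> CSize -> FunPtr Compare -> IO ()

-- sin never calls back into Haskell, so unsafe is permissible.
foreign import ccall unsafe "math.h sin" cSin :: CDouble -> CDouble

-- Maps LT, EQ, GT to -1, 0, 1 as qsort expects.
compareInts :: Compare
compareInts pa pb = do
  a <- peek pa
  b <- peek pb
  return (fromIntegral (fromEnum (compare a b)) - 1)

main :: IO ()
main = do
  cmp <- mkCompare compareInts
  sorted <- withArrayLen [3, 1, 2] $ \n p -> do
    cQsort p (fromIntegral n) (fromIntegral (sizeOf (0 :: CInt))) cmp
    peekArray n p
  freeHaskellFunPtr cmp
  print sorted       -- [1,2,3]
  print (cSin 0)
\end{verbatim}
\end{quote}
%
Declaring \code{cQsort} as \code{unsafe} would make the callback into
\code{compareInts} an instance of the undefined behaviour described above.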

\subsection{Export Declarations}

The general form of export declarations is
%
\[
\code{foreign}~\code{export}~c~e~v~\code{{::}}~t
\]
%
Such a declaration enables external access to $v$, which may be a value, field
name, or class method that is declared at the top level of the same module or
imported.  Moreover, the Haskell system defines the external entity described
by the string $e$, which may be used by external code using the calling
convention $c$; an external invocation of the external entity $e$ is
translated into evaluation of $v$.  The type $t$ must be an instance of the
type of $v$.  For example, we may have
%
\begin{quote}
\begin{verbatim}
foreign export ccall "addInt"   (+) :: Int   -> Int   -> Int
foreign export ccall "addFloat" (+) :: Float -> Float -> Float
\end{verbatim}
\end{quote}

If an evaluation triggered by an external invocation of an exported Haskell
value returns with an exception, the system behaviour is undefined.  Thus,
Haskell exceptions have to be caught within Haskell and explicitly marshalled
to the foreign code.
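
One way of marshalling exceptions explicitly is to return a status code and
pass the result through a pointer.  The following sketch assumes the exception
library \code{Control.Exception}, which is not part of Haskell 98, as well as
the hierarchical FFI modules; the exported name \code{safe\us{}div} and the
status code convention are hypothetical.
%
\begin{quote}
\begin{verbatim}
{-# LANGUAGE ForeignFunctionInterface #-}
module SafeDiv where

-- Assumption: Control.Exception is an extension beyond Haskell 98.
import Control.Exception (SomeException, evaluate, try)
import Foreign.C.Types (CInt(..))
import Foreign.Ptr (Ptr)
import Foreign.Storable (poke)

-- Returns 0 and stores the quotient on success; returns -1 if the
-- division raised an exception (for example, division by zero).
safeDiv :: CInt -> CInt -> Ptr CInt -> IO CInt
safeDiv x y out = do
  r <- try (evaluate (x `div` y)) :: IO (Either SomeException CInt)
  case r of
    Left _  -> return (-1)
    Right q -> poke out q >> return 0

foreign export ccall "safe_div"
  safeDiv :: CInt -> CInt -> Ptr CInt -> IO CInt
\end{verbatim}
\end{quote}
%
The use of \code{evaluate} forces the division inside the scope of
\code{try}, so that no exception can escape into the external caller.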


\section{Specification of External Entities}
\label{sec:extent}

Each foreign declaration has to specify the external entity that is accessed
or provided by that declaration.  The syntax and semantics of the notation
that is required to uniquely determine an external entity depend heavily on
the calling convention by which this entity is accessed.  For example, for the
calling convention \code{ccall}, a global label is sufficient.  However, to
uniquely identify a method in the calling convention \code{jvm}, type
information has to be provided.  For the latter, there is a choice between the
Java source-level syntax of types and the syntax expected by JNI---but,
clearly, the syntax of the specification of an external entity depends on the
calling convention and may be non-trivial.

Consequently, the FFI does not fix a general syntax for denoting external
entities, but requires both \gnterm{impent} and \gnterm{expent} to take the
form of a Haskell \gnterm{string} literal.  The formation rules for the values
of these strings depend on the calling convention and a Haskell system
implementing a particular calling convention will have to parse these strings
in accordance with the calling convention.

Defining \gnterm{impent} and \gnterm{expent} to take the form of a
\gnterm{string} implies that all information that is needed to statically
analyse the Haskell program is separated from the information needed to
generate the code interacting with the foreign language.  This is, in
particular, helpful for tools processing Haskell source code.  When ignoring
the entity information provided by \gnterm{impent} or \gnterm{expent}, foreign
import and export declarations are still sufficient to infer identifier
definition and use information as well as type information.

For more complex calling conventions, there is a choice between the user-level
syntax for identifying entities (e.g., Java or C{+}{+}) and the system-level
syntax (e.g., the type syntax of JNI or mangled C{+}{+}, respectively).  If
such a choice exists, the user-level syntax is preferred, not only because it
is more user-friendly, but also because the system-level syntax may not be
entirely independent of the particular implementation of the foreign language.

The following defines the syntax for specifying external entities and their
semantics for the calling conventions \code{ccall} and \code{stdcall}.  Other
calling conventions from Table~\ref{tab:callconv} are expected to be defined
in future versions of this report.


\subsection{Standard C Calls}
\label{sec:ccall}

The following defines the structure of external entities for foreign
declarations under the \code{ccall} calling convention for both import and
export declarations separately.  Afterwards, additional constraints on the type
of foreign functions are defined.

The FFI covers only access to C functions and global variables.  There are no
mechanisms to access other entities of C programs.  In particular, there is no
support for accessing pre-processor symbols from Haskell, which includes
\code{\#define}d constants.  Access from Haskell to such entities is the
domain of language-specific tools, which provide added convenience over the
plain FFI as defined in this report.

\subsubsection{Import Declarations}

For import declarations, the syntax for the specification of external entities
under the \code{ccall} calling convention is as follows:
%
\begin{grammar}
  \grule[static function or address]{impent}{%
    " \gopt{static} \gopt{\gnterm{chname}} \gopt{\&} 
    \gopt{\gnterm{cid}} "}
  \gor[stub factory importing addresses]{%
    " dynamic "}
  \gor[stub factory exporting thunks]{%
    " wrapper "}
\end{grammar}
%
The first alternative either imports a static function \gnterm{cid} or, if
\gterm\& precedes the identifier, a static address.  If \gnterm{cid} is
omitted, it defaults to the name of the imported Haskell variable.  The
optional filename \gnterm{chname} specifies a C header file, where the
intended meaning is that the header file declares the C entity identified by
\gnterm{cid}.  In particular, when the Haskell system compiles Haskell to C
code, the directive
%
\begin{quote}
  \gterm{\#include "\gnterm{chname}"}
\end{quote}
%
needs to be placed into any generated C file that refers to the foreign entity
before the first occurrence of that entity in the generated C file.

The second and third alternatives, identified by the keywords \gterm{dynamic}
and \gterm{wrapper}, respectively, import stub functions that have to be
generated by the Haskell system.  In the case of \gterm{dynamic}, the stub
converts C function pointers into Haskell functions; and conversely, in the
case of \gterm{wrapper}, the stub converts Haskell thunks to C function
pointers.  If none of the specifiers \code{static}, \code{dynamic}, or
\code{wrapper} is given, \code{static} is assumed.  The specifier
\code{static} is nevertheless needed to import C routines that are named
\code{dynamic} or \code{wrapper}.

It should be noted that a static foreign declaration that does not import an
address (i.e., where \gterm\& is not used in the specification of the external
entity) always refers to a C function, even if the Haskell type is
non-functional.  For example, 
%
\begin{quote}
\begin{verbatim}
foreign import ccall foo :: CInt
\end{verbatim}
\end{quote}
%
refers to a pure C function \code{foo} with no arguments that returns an
integer value.  Similarly, if the type is \code{IO CInt}, the declaration
refers to an impure nullary function.  If a Haskell program needs to access a
C variable \code{bar} of integer type,
%
\begin{quote}
\begin{verbatim}
foreign import ccall "&" bar :: Ptr CInt
\end{verbatim}
\end{quote}
%
must be used to obtain a pointer referring to the variable.  The variable can
be read and updated using the routines provided by the module \code{Storable}
(cf.\ Section~\ref{sec:Storable}).

\subsubsection{Export Declarations}

External entities in \code{ccall} export declarations are of the form
%
\begin{grammar}
  \grule{expent}{%
    " \gopt{\gnterm{cid}} "}
\end{grammar}
%
The optional C identifier \gnterm{cid} defines the external name by which the
exported Haskell variable is accessible in C.  If it is omitted, the external
name defaults to the name of the exported Haskell variable.
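
As an illustration, the following sketch exports two Haskell functions, one
with an explicit external name and one relying on the default.  The names
\code{triple}, \code{square}, and \code{hs\_triple} are hypothetical, and the
import of \code{CInt} uses the hierarchical module name
\code{Foreign.C.Types} found in current implementations, which is an
assumption relative to this report (where the type is provided by
\code{CForeign}).
%
\begin{quote}
\begin{verbatim}
module Main where

-- Assumption: hierarchical module name; this report uses CForeign.
import Foreign.C.Types (CInt)

-- Explicit external name: C code sees this function as hs_triple.
foreign export ccall "hs_triple" triple :: CInt -> CInt

-- No entity string: the external name defaults to the Haskell name.
foreign export ccall square :: CInt -> CInt

triple :: CInt -> CInt
triple x = 3 * x

square :: CInt -> CInt
square x = x * x

main :: IO ()
main = print (triple 14, square 5)
\end{verbatim}
\end{quote}
%
From C, the two functions are then accessible under the names
\code{hs\_triple} and \code{square}, respectively.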

\subsubsection{Constraints on Foreign Function Types}

For import declarations, the admissible Haskell type of the variable defined
in the import is constrained, depending on the kind of import declaration.
These constraints are specified in the following.
%
\begin{description}
\item[Static Functions.]  A static function can be of any foreign type; in
  particular, the result type may or may not be in the IO monad.  If a
  function that is not pure is not imported in the IO monad, the system
  behaviour is undefined.  Generally, no check for consistency with the C type
  of the imported label is performed.

  As an example, consider
  %
  \begin{quote}
\begin{verbatim}
foreign import ccall "static stdlib.h" system :: Ptr CChar -> IO CInt
\end{verbatim}
  \end{quote}
  %
  This declaration imports the \code{system()} function whose prototype is
  available from \code{stdlib.h}.

\item[Static addresses.]  The type of an imported address is constrained to be
  of the form \code{Ptr }\textit{a} or \code{FunPtr }\textit{a}, where
  \textit{a} can be any type.

  As an example, consider
  %
  \begin{quote}
\begin{verbatim}
foreign import ccall "errno.h &errno" errno :: Ptr CInt
\end{verbatim}
  \end{quote}
  %
  It imports the address of the variable \code{errno}, which is of the C type
  \code{int}.

\item[Dynamic import.]  The type of a \gterm{dynamic} stub has to be of the
  form \code{(FunPtr }\textit{ft}\code{) -> }\textit{ft}, where \textit{ft} may
  be any foreign type.

  As an example, consider
  %
  \begin{quote}
\begin{verbatim}
foreign import ccall "dynamic" 
  mkFun :: FunPtr (CInt -> IO ()) -> (CInt -> IO ())
\end{verbatim}
  \end{quote}
  %
  The stub factory \code{mkFun} converts any pointer to a C function that gets
  an integer value as its only argument and does not have a return value into
  a corresponding Haskell function.

\item[Dynamic wrapper.]  The type of a \gterm{wrapper} stub has to be of the
  form \textit{ft}\code{ -> }\code{IO (FunPtr }\textit{ft}\code), where
  \textit{ft} may be any foreign type.

  As an example, consider
  %
  \begin{quote}
\begin{verbatim}
foreign import ccall "wrapper" 
  mkCallback :: IO () -> IO (FunPtr (IO ()))
\end{verbatim}
  \end{quote}
  %
  The stub factory \code{mkCallback} turns any Haskell computation of type
  \code{IO ()} into a C function pointer that can be passed to C routines,
  which can call back into the Haskell context by invoking the referenced
  function.

\end{description}
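
The four kinds of import can be exercised together in one short program.  The
following is a minimal sketch rather than a definitive usage pattern: the
Haskell names \code{c\_abs}, \code{p\_free}, \code{Callback},
\code{mkCallback}, and \code{callCallback} are hypothetical, and the imports
use the hierarchical module names of current implementations, which is an
assumption relative to this report.  The program wraps a Haskell function as a
C function pointer and converts that pointer back with a dynamic stub, so that
no additional C code is needed.
%
\begin{quote}
\begin{verbatim}
module Main where

-- Assumption: hierarchical module names; this report uses
-- CForeign and Ptr.
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, Ptr, freeHaskellFunPtr, nullFunPtr)

-- Static function: the C library function abs(), imported as pure.
foreign import ccall "stdlib.h abs" c_abs :: CInt -> CInt

-- Static address: the address of the C library function free().
foreign import ccall "stdlib.h &free"
  p_free :: FunPtr (Ptr () -> IO ())

type Callback = CInt -> IO CInt

-- Wrapper stub: turns a Haskell function into a C function pointer.
foreign import ccall "wrapper"
  mkCallback :: Callback -> IO (FunPtr Callback)

-- Dynamic stub: turns a C function pointer into a Haskell function.
foreign import ccall "dynamic"
  callCallback :: FunPtr Callback -> Callback

main :: IO ()
main = do
  print (c_abs (-5))
  print (p_free /= nullFunPtr)
  fp <- mkCallback (\x -> return (x + 1))
  r  <- callCallback fp 41
  print r
  freeHaskellFunPtr fp  -- release the storage held by the wrapper stub
\end{verbatim}
\end{quote}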

\subsubsection{Specification of Header Files}

A C header specified in an import declaration is always included by
\gterm{\#include "\gnterm{chname}"}.  There is no explicit support for
\gterm{\#include <\gnterm{chname}>} style inclusion.  The ISO C99~\cite{C99}
standard guarantees that any search path that would be used for a
\gterm{\#include <\gnterm{chname}>} is also used for \gterm{\#include
  "\gnterm{chname}"} and it is guaranteed that these paths are searched after
all paths that are unique to \gterm{\#include "\gnterm{chname}"}.  Furthermore,
we require that \gnterm{chname} ends in \code{.h} to make parsing of the
specification of external entities unambiguous.
  
The specification of include files has been kept to a minimum on purpose.
Libraries often require a multitude of include directives, some of which may
be system-dependent.  Any design that attempts to cover all possible
configurations would introduce significant complexity.  Moreover, in the
current design, a custom include file can be specified that uses the standard
C preprocessor features to include all relevant headers.

Header files have no impact on the semantics of a foreign call, and whether an
implementation uses the header file or not is implementation-defined.
However, as some implementations may require a header file that supplies a
correct prototype for external functions in order to generate correct code,
portable FFI code must include suitable header files.

\subsubsection{C Argument Promotion}

The argument passing conventions of C are dependent on whether a function
prototype for the called function is in scope at a call site.  In particular,
if no function prototype is in scope, \emph{default argument promotion} is
applied to integral and floating types.  In general, a Haskell system cannot
be expected to know whether a given C function was
compiled with or without a function prototype being in scope.  For the sake of
portability, we thus require that a Haskell system generally implements calls
to C functions as well as C stubs for Haskell functions as if a function
prototype for the called function is in scope.

This convention implies that the onus for ensuring the match between C and
Haskell code is placed on the FFI user.  In particular, when a C function that
was compiled without a prototype is called from Haskell, the Haskell signature
at the corresponding \code{foreign import} declaration must use the types
\emph{after} argument promotion.  For example, consider the following C
function definition, which lacks a prototype:
%
\begin{quote}
\begin{verbatim}
void foo (a)
float a;
{
  ...
}
\end{verbatim}
\end{quote}
%
The lack of a prototype implies that a C compiler will apply default argument
promotion to the parameter \code{a}, and thus, \code{foo} will expect to
receive a value of type \code{double}, \emph{not} \code{float}.  Hence, the
correct \code{foreign import} declaration is
%
\begin{quote}
\begin{verbatim}
foreign import ccall foo :: Double -> IO ()
\end{verbatim}
\end{quote}

In contrast, a C function compiled with the prototype
%
\begin{quote}
\begin{verbatim}
void foo (float a);
\end{verbatim}
\end{quote}
%
requires
%
\begin{quote}
\begin{verbatim}
foreign import ccall foo :: Float -> IO ()
\end{verbatim}
\end{quote}

A similar situation arises in the case of \code{foreign export} declarations
that use types that would be altered under the C default argument promotion
rules.  When calling such Haskell functions from C, a function prototype
matching the signature provided in the \code{foreign export} declaration must
be in scope; otherwise, the C compiler will erroneously apply the promotion
rules to all function arguments.

Note that for a C function defined to accept a variable number of arguments,
all arguments beyond the explicitly typed arguments suffer argument promotion.
However, because C permits the calling convention to be different for such
functions, a Haskell system will, in general, not be able to make use of
variable argument functions.  Hence, their use is deprecated in portable code.


\subsection{Win32 API Calls}

The specification of external entities under the \code{stdcall} calling
convention is identical to that for standard C calls.  The two calling
conventions only differ in the generated code.


\begin{FUTURE} % ===== Material for future extension =======================

\subsection{C{+}{+} Calls}

The syntax for the specification of external entities under the
\code{cplusplus} calling convention is

\subsection{JVM Calls}

The syntax for the specification of external entities under the \code{jvm}
calling convention is 
%
\begin{grammar}
  \grule{impent}{%
    "\gnterm{jtype} \gnterm{jqid}(\gnterm{jtypes})"}
  \gor[constructor call]{%
    "new \gnterm{jqid}(\gnterm{jtypes})"}
  \grule[$n\geq0$]{jtypes}{%
    \gnterm[1]{jtype},\gellipse,\gnterm[n]{jtype}}
\end{grammar}
%
where \gnterm{jqid} is a qualified Java identifier and \gnterm{jtype} a Java
type as defined in~\cite{gosling-etal:Java}.

\begin{verbatim}
FIXME: 
- force the inclusion of the return type in case of "new"?
\end{verbatim}

\subsection{.NET Calls}

The syntax for the specification of external entities under the \code{dotnet}
calling convention is

\end{FUTURE}% =============================================================


\newpage
\section{Marshalling}
\label{sec:marshalling}

In addition to the language extension discussed in previous sections, the FFI
includes a set of standard libraries, which ease portable use of foreign
functions as well as marshalling of compound structures.  Generally, the
marshalling of Haskell structures into a foreign representation and vice versa
can be implemented in either Haskell or the foreign language.  At least where
the foreign language is at a significantly lower level, e.g.\ C, there are
good reasons for doing the marshalling in Haskell:
%
\begin{itemize}
\item Haskell's lazy evaluation strategy would require any foreign code that
  attempts to access Haskell structures to force the evaluation of these
  structures before accessing them. This would lead to complicated code in the
  foreign language, but does not need any extra consideration when coding the
  marshalling in Haskell.
\item Despite the fact that marshalling code in Haskell tends to look like C
  in Haskell syntax, the strong type system still catches many errors that
  would otherwise lead to difficult-to-debug runtime faults.
\item Direct access to Haskell heap structures from a language like
  C---especially, when marshalling from C to Haskell, i.e., when Haskell
  structures are created---carries the risk of corrupting the heap, which
  usually leads to faults that are very hard to debug.
\end{itemize}
%
Consequently, the Haskell FFI emphasises Haskell-side marshalling.

The interface to the marshalling libraries is provided by the module
\code{Foreign} plus a language-dependent module per supported language.  In
particular, the standard requires the availability of the module
\code{CForeign}, which simplifies portable interfacing with external C code.
Language-dependent modules, such as \code{CForeign}, generally provide Haskell
types representing the basic types of the foreign language using a
representation that is compatible with the foreign types as implemented by the
default implementation of the foreign language on the present architecture.
This is especially important for languages where the standard leaves some
aspects of the implementation of basic types open.  For example, in C, the
size of the various integral types is not fixed.  Thus, to represent C
interfaces faithfully in Haskell, for each integral type in C, we need to have
an integral type in Haskell that is guaranteed to have the same size as the
corresponding C type.
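
The platform dependence of the C integral types can be observed directly.  The
following sketch prints the sizes, in bytes, of some of the types provided for
C; the values printed vary between architectures, so no particular output
should be expected.  The hierarchical module names are those of current
implementations and are an assumption relative to this report, which provides
these entities through \code{CForeign} and \code{Storable}.
%
\begin{quote}
\begin{verbatim}
module Main where

-- Assumption: hierarchical module names; this report uses
-- CForeign and Storable.
import Foreign.C.Types (CChar, CInt, CLong, CShort)
import Foreign.Storable (sizeOf)

-- sizeOf never evaluates its argument, so undefined is safe here.
main :: IO ()
main = do
  putStrLn ("char:  " ++ show (sizeOf (undefined :: CChar)))
  putStrLn ("short: " ++ show (sizeOf (undefined :: CShort)))
  putStrLn ("int:   " ++ show (sizeOf (undefined :: CInt)))
  putStrLn ("long:  " ++ show (sizeOf (undefined :: CLong)))
\end{verbatim}
\end{quote}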

In the following, the interface of the language-independent support is
defined.  The interface for C-specific support is discussed in
Section~\ref{sec:c-marshalling}. 

\subsection{\code{Foreign}}
\label{sec:Foreign}

The module \code{Foreign} combines the interfaces of all modules providing
language-independent marshalling support.  These modules are \code{Bits},
\code{Int}, \code{Word}, \code{Ptr}, \code{ForeignPtr}, \code{StablePtr},
\code{Storable}, \code{MarshalAlloc}, \code{MarshalArray},
\code{MarshalError}, and \code{MarshalUtils}.

Sometimes an external entity is a pure function, except that it passes
arguments and/or results via pointers.  To permit the packaging of
such entities as pure functions, \code{Foreign} provides the following
primitive:
%
\begin{codedesc}
\item[unsafePerformIO ::\ IO a -> a] Return the value resulting from executing
  the \code{IO} action.  This value should be independent of the environment;
  otherwise, the system behaviour is undefined.
  
  If the \code{IO} computation wrapped in \code{unsafePerformIO} performs side
  effects, then the relative order in which those side effects take place
  (relative to the main \code{IO} trunk, or other calls to
  \code{unsafePerformIO}) is indeterminate.  Moreover, the side effects may be
  performed several times or not at all, depending on lazy evaluation and
  whether the compiler unfolds an enclosing definition.
  
  Great care should be exercised in the use of this primitive, not only
  because of the danger of introducing side effects, but also because
  \code{unsafePerformIO} may compromise typing; to avoid this, the programmer
  should ensure that the result of \code{unsafePerformIO} has a monomorphic
  type.
\end{codedesc}
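
As a sketch of the intended use, consider the C library function
\code{frexp()}, which is pure except that it returns the exponent through a
pointer argument.  The names \code{c\_frexp} and \code{frexpPure} are
hypothetical.  The example imports \code{unsafePerformIO} from
\code{System.IO.Unsafe} and uses hierarchical module names, as current
implementations do; both are assumptions relative to this report, in which
\code{Foreign} exports the primitive.
%
\begin{quote}
\begin{verbatim}
module Main where

-- Assumption: hierarchical module names of current implementations.
import Foreign.C.Types (CDouble, CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek)
import System.IO.Unsafe (unsafePerformIO)

-- double frexp(double x, int *exp): the exponent is returned
-- through the pointer argument.
foreign import ccall "math.h frexp"
  c_frexp :: CDouble -> Ptr CInt -> IO CDouble

-- Pure interface: the result depends only on the argument, and
-- the result type is monomorphic.
frexpPure :: Double -> (Double, Int)
frexpPure x = unsafePerformIO $
  alloca $ \pExp -> do
    m <- c_frexp (realToFrac x) pExp
    e <- peek pExp
    return (realToFrac m, fromIntegral e)

main :: IO ()
main = print (frexpPure 8.0)  -- prints (0.5,4), as 8 = 0.5 * 2^4
\end{verbatim}
\end{quote}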

\subsection{\code{Bits}}

This module provides functions implementing typical bit operations overloaded
for the standard integral types \code{Int} and \code{Integer} as well as the
types provided by the modules \code{Int} and \code{Word} in
Section~\ref{sec:Int-Word}.  The overloading is implemented via a new type
class \code{Bits}, which is a subclass of \code{Num} and has the following
member functions:
%
\begin{codedesc}
\item[(.\&.), (.|.), xor ::\ Bits a => a -> a -> a]  Implement bitwise
  conjunction, disjunction, and exclusive or.  The infix operators have the
  following precedences:
  %
  \begin{quote}
\begin{verbatim}
infixl 7 .&.
infixl 6 `xor`
infixl 5 .|.
\end{verbatim}
  \end{quote}
  
\item[complement ::\ Bits a => a -> a] Calculate the bitwise complement of the
  argument.

\item[shift, rotate ::\ Bits a => a -> Int -> a] Shift or rotate the bit
  pattern to the left for a positive second argument and to the right for a
  negative argument.  The function \code{shift} performs sign extension on
  signed number types; i.e., right shifts fill the top bits with 1 if the
  number is negative and with 0 otherwise.  These operators have the following
  precedences as infix operators:
  %
  \begin{quote}
\begin{verbatim}
infixl 8 `shift`, `rotate`
\end{verbatim}
  \end{quote}
  %
  For unbounded types (e.g., \code{Integer}), \code{rotate} is equivalent to
  \code{shift}.  An instance can define either this unified \code{rotate} or
  \code{rotateL} and \code{rotateR}, depending on which is more convenient for
  the type in question.

\item[bit ::\ Bits a => Int -> a] Obtain a value where only the $n$th bit
  is set.
  
\item[setBit, clearBit, complementBit ::\ Bits a => a -> Int -> a] Set, clear, or
  complement the bit at the given position.
  
\item[testBit ::\ Bits a => a -> Int -> Bool] Check whether the $n$th bit of
  the first argument is set.

\item[bitSize~~::\ Bits a => a -> Int]
\item[isSigned~::\ Bits a => a -> Bool]\combineitems Respectively, query the
  number of bits of values of type \code{a} and whether these values are
  signed.  These functions never evaluate their argument.  The function
  \code{bitSize} is undefined for unbounded types (e.g., \code{Integer}).

\item[shiftL,~~shiftR~~::\ Bits a => a -> Int -> a]
\item[rotateL,~rotateR~::\ Bits a => a -> Int -> a]\combineitems The functions
  \code{shiftL} and \code{rotateL} are synonyms for \code{shift} and
  \code{rotate}; \code{shiftR} and \code{rotateR} negate the second argument,
  so that they shift and rotate to the right for a positive argument.
  These operators have the following precedences as infix operators:
  %
  \begin{quote}
\begin{verbatim}
infixl 8 `shiftL`, `shiftR`, `rotateL`, `rotateR`
\end{verbatim}
  \end{quote}

\end{codedesc}
%
Bits are numbered from 0 with bit 0 being the least significant bit.  A
minimal complete definition of the type class \code{Bits} must include
definitions for the following functions: \code{(.\&.)}, \code{(.|.)},
\code{xor}, \code{complement}, \code{shift}, \code{rotate}, \code{bitSize},
and \code{isSigned}. 
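
The following sketch applies several of these operations to an 8-bit value.
It imports the class from \code{Data.Bits} and the fixed-size types from
\code{Data.Int} and \code{Data.Word}, the hierarchical module names of current
implementations; this is an assumption relative to this report, which names
the modules \code{Bits}, \code{Int}, and \code{Word}.
%
\begin{quote}
\begin{verbatim}
module Main where

-- Assumption: hierarchical module names of current implementations.
import Data.Bits
import Data.Int (Int8)
import Data.Word (Word8)

main :: IO ()
main = do
  let x = 0xB4 :: Word8              -- bit pattern 10110100
  print (x .&. 0x0F)                 -- 4
  print (x .|. 0x0F)                 -- 191
  print (x `xor` 0xFF)               -- 75
  print (complement x)               -- 75
  print (x `shift` 1)                -- 104 (left shift, top bit lost)
  print (x `shift` (-2))             -- 45  (right shift)
  print (x `rotate` 1)               -- 105 (top bit re-enters at bit 0)
  print (testBit x 2, testBit x 0)   -- (True,False)
  print (bit 3 :: Word8)             -- 8
  print ((-8 :: Int8) `shift` (-1))  -- -4  (sign extension)
\end{verbatim}
\end{quote}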

\subsection{\code{Int} and \code{Word}}
\label{sec:Int-Word}

The two modules \code{Int} and \code{Word} provide the following signed and
unsigned integral types of fixed size:
%
\begin{quote}
  \begin{tabular}{|l|l|l|}
    \hline
    Size in bits & Signed       & Unsigned\\\hline\hline
    8            & \code{Int8}  & \code{Word8}\\
    16           & \code{Int16} & \code{Word16}\\
    32           & \code{Int32} & \code{Word32}\\
    64           & \code{Int64} & \code{Word64}\\
    \hline
  \end{tabular}
\end{quote}
%
For these integral types, the modules \code{Int} and \code{Word} export class
instances for the class \code{Bits} and all type classes for which \code{Int}
has an instance in the Haskell 98 Prelude and standard libraries.  The
constraints on the implementation of these instances are also the same as
those outlined for \code{Int} in the Haskell Report.  There is, however, the
additional constraint that all arithmetic on the fixed-sized types is
performed modulo \(2^n\), where \(n\) is the size of the type in bits.
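
The effect of modular arithmetic can be seen in a few one-line computations on
the 8-bit types.  This is a minimal sketch; the hierarchical module names
\code{Data.Int} and \code{Data.Word} are those of current implementations and
are an assumption relative to this report.
%
\begin{quote}
\begin{verbatim}
module Main where

-- Assumption: hierarchical module names of current implementations.
import Data.Int (Int8)
import Data.Word (Word8)

main :: IO ()
main = do
  print ((255 :: Word8) + 1)             -- 0
  print ((0 :: Word8) - 1)               -- 255
  print ((16 :: Word8) * 17)             -- 16, as 272 mod 256 = 16
  print ((127 :: Int8) + 1)              -- -128
  print (fromIntegral (300 :: Int) :: Word8)  -- 44
\end{verbatim}
\end{quote}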

\subsection{\code{Ptr}}
\label{sec:Ptr}

The module \code{Ptr} provides typed pointers to foreign entities.  We
distinguish two kinds of pointers: pointers to data and pointers to functions.
It is understood that these two kinds of pointers may be represented
differently as they may be references to data and text segments, respectively.

\subsubsection{Data Pointers}

The interface defining data pointers and associated operations is as follows:
%
\begin{codedesc}
\item[data Ptr a] A value of type \code{Ptr a} represents a pointer to an
  object, or an array of objects, which may be marshalled to or from Haskell
  values of type \code{a}.  The type \code{a} will normally be an instance of
  class \code{Storable} (see Section~\ref{sec:Storable}), which provides the
  necessary marshalling operations.

  Instances for the classes \code{Eq}, \code{Ord}, and \code{Show} are
  provided. 
\item[nullPtr ::\ Ptr a] The constant \code{nullPtr} contains a distinguished
  value of \code{Ptr} that is not associated with a valid memory location.
\item[castPtr ::\ Ptr a -> Ptr b] The \code{castPtr} function casts a pointer
  from one type to another.
\item[plusPtr ::\ Ptr a -> Int -> Ptr b] Advances the given address by the
  given offset in bytes.
\item[alignPtr ::\ Ptr a -> Int -> Ptr a] Given an arbitrary address and an
  alignment constraint, \code{alignPtr} yields an address, the same or next
  higher, that fulfills the alignment constraint. An alignment constraint
  \code{x} is fulfilled by any address divisible by \code{x}. This operation
  is idempotent.
\item[minusPtr ::\ Ptr a -> Ptr b -> Int] Compute the offset required to get
  from the second to the first argument.  We have
  %
  \begin{quote}
\begin{verbatim}
p2 == p1 `plusPtr` (p2 `minusPtr` p1)
\end{verbatim}
  \end{quote}
\end{codedesc}
%
It should be noted that the use of \code{Int} for pointer differences
essentially forces any implementation to represent \code{Int} in as many bits
as used in the representation of pointer values.
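
The interplay of \code{plusPtr}, \code{alignPtr}, and \code{minusPtr} can be
checked on a temporary buffer.  The following is a sketch under the assumption
that \code{allocaBytes} from \code{MarshalAlloc} is available under the
hierarchical module name \code{Foreign.Marshal.Alloc}, as in current
implementations.  Since concrete addresses are system dependent, the program
prints only the truth values of the properties stated above.
%
\begin{quote}
\begin{verbatim}
module Main where

-- Assumption: hierarchical module names of current implementations.
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, alignPtr, minusPtr, plusPtr)

main :: IO ()
main =
  allocaBytes 64 $ \base -> do
    let p1  = (base :: Ptr Word8) `plusPtr` 1 :: Ptr Word8
        p2  = p1 `alignPtr` 8      -- same or next higher aligned address
        off = p2 `minusPtr` p1     -- offset from p1 to p2
    print (off >= 0 && off < 8)    -- alignment moves by fewer than 8 bytes
    print (p2 == p1 `plusPtr` off) -- the law relating plusPtr and minusPtr
    print (alignPtr p2 8 == p2)    -- alignPtr is idempotent
\end{verbatim}
\end{quote}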

\subsubsection{Function Pointers}

The interface defining function pointers and associated operations is as
follows:
%
\begin{codedesc}
\item[data FunPtr a] A value of type \code{FunPtr a} is a pointer to a piece
  of code.  It may be the pointer to a C function or to a Haskell function
  created using a wrapper stub as outlined in Section~\ref{sec:ccall}. For
  example,
  %
  \begin{quote}
\begin{verbatim}
type Compare = Int -> Int -> Bool
foreign import ccall "wrapper" 
  mkCompare :: Compare -> IO (FunPtr Compare)
\end{verbatim}
  \end{quote}
  
  Instances for the classes \code{Eq}, \code{Ord}, and \code{Show} are
  provided.
\item[nullFunPtr ::\ FunPtr a] The constant \code{nullFunPtr} contains a
  distinguished value of \code{FunPtr} that is not associated with a valid
  memory location.
\item[castFunPtr ::\ FunPtr a -> FunPtr b] Cast a \code{FunPtr} to a
  \code{FunPtr} of a different type.
\item[freeHaskellFunPtr ::\ FunPtr a -> IO ()] Release the storage associated
  with the given \code{FunPtr}, which must have been obtained from a wrapper
  stub.  This should be called whenever the return value from a foreign import
  wrapper function is no longer required; otherwise, the storage it uses will
  leak.
\end{codedesc}

Moreover, there are two functions that are only valid on architectures where
data and function pointers range over the same set of addresses.  Only where
bindings to external libraries are made whose interface already relies on this
assumption, should the use of \code{castFunPtrToPtr} and
\code{castPtrToFunPtr} be considered; otherwise, it is recommended to avoid
using these functions.
%
\begin{codedesc}
\item[castFunPtrToPtr ::\ FunPtr a -> Ptr b]
\item[castPtrToFunPtr ::\ Ptr a -> FunPtr b] \combineitems These two functions
  cast \code{Ptr}s to \code{FunPtr}s and vice versa.
\end{codedesc}

\subsection{\code{ForeignPtr}}
\label{sec:ForeignPtr}

The type \code{ForeignPtr} represents references to objects that are
maintained in a foreign language, i.e., objects that are not part of the data
structures usually managed by the Haskell storage manager.  The type
\code{ForeignPtr} is parameterised in the same way as \code{Ptr} (cf.\ 
Section~\ref{sec:Ptr}), but in contrast to vanilla memory references of type
\code{Ptr}, \code{ForeignPtr}s may be associated with finalizers.  A finalizer
is a routine that is invoked when the Haskell storage manager detects
that---within the Haskell heap and stack---there are no more references left
that are pointing to the \code{ForeignPtr}.  Typically, the finalizer will
free the resources bound by the foreign object.  Finalizers are generally
implemented in the foreign language\footnote{Finalizers in Haskell cannot be
  safely realised without requiring support for
  concurrency~\cite{boehm:finalizers}.} and have either of the following two
Haskell types:
%
\begin{quote}
\begin{verbatim}
type FinalizerPtr        a = FunPtr (           Ptr a -> IO ())
type FinalizerEnvPtr env a = FunPtr (Ptr env -> Ptr a -> IO ())
\end{verbatim}
\end{quote}
%
A foreign finalizer is represented as a pointer to a C function of type
\code{Ptr a -> IO ()} or a C function of type \code{Ptr env -> Ptr a -> IO
  ()}, where \code{Ptr env} represents an optional environment passed to the
finalizer on invocation.  That is, a foreign finalizer attached to a finalized
pointer \code{ForeignPtr a} gets the finalized pointer in the form of a raw
pointer of type \code{Ptr a} as an argument when it is invoked.  In addition,
a foreign finalizer of type \code{FinalizerEnvPtr env a} also gets an
environment pointer of type \code{Ptr env}.  There is no guarantee on how soon
the finalizer is executed after the last reference to the associated foreign
pointer was dropped; this depends on the details of the Haskell storage
manager.  The only guarantee is that the finalizer runs before the program
terminates.  Whether a finalizer may call back into the Haskell system is
system dependent.  Portable code may not rely on such callbacks.

Foreign finalizers that expect an environment are a means to model closures in
languages that do not support them natively, such as C.  They recover part of
the convenience lost by requiring finalizers to be defined in the foreign
languages rather than in Haskell.

The data type \code{ForeignPtr} and associated operations have the following
signature and purpose:
%
\begin{codedesc}
\item[data ForeignPtr a] A value of type \code{ForeignPtr a} represents a
  pointer to an object, or an array of objects, which may be marshalled to or
  from Haskell values of type \code{a}.  The type \code{a} will normally be an
  instance of class \code{Storable} (see Section~\ref{sec:Storable}), which
  provides the marshalling operations.
  
  Instances for the classes \code{Eq}, \code{Ord}, and \code{Show} are
  provided.  Equality and ordering of two foreign pointers are the same as for
  the plain pointers obtained with \code{unsafeForeignPtrToPtr} from those
  foreign pointers.
  
\item[newForeignPtr\_ ::\ Ptr a -> IO (ForeignPtr a)]
  Turn a plain memory reference into a foreign pointer that may be associated
  with finalizers by using \code{addForeignPtrFinalizer}.
  
\item[newForeignPtr ::\ FinalizerPtr a -> Ptr a -> IO (ForeignPtr a)] This is
  a convenience function that turns a plain memory reference into a foreign
  pointer and immediately adds a finalizer.  It is defined as
  %
  \begin{quote}
\begin{verbatim}
newForeignPtr finalizer ptr = 
  do
    fp <- newForeignPtr_ ptr
    addForeignPtrFinalizer finalizer fp
    return fp
\end{verbatim}
  \end{quote}
   
\item[newForeignPtrEnv ::\ FinalizerEnvPtr env a -> Ptr env -> Ptr a -> IO
  (ForeignPtr a)] This variant of \code{newForeignPtr} adds a finalizer that
  expects an environment in addition to the finalized pointer.  The
  environment that will be passed to the finalizer is fixed by the second
  argument to \code{newForeignPtrEnv}.

\item[addForeignPtrFinalizer ::\ FinalizerPtr a -> ForeignPtr a -> IO
  ()] Add a finalizer to the given foreign pointer.  All finalizers
  associated with a single foreign pointer are executed in the opposite order
  of their addition---i.e., the finalizer added last will be executed first.
  
\item[addForeignPtrFinalizerEnv ::\ FinalizerEnvPtr env a -> Ptr env ->
  ForeignPtr a]
\item[~~~~~~~~~~~~~~~~~~~~~~~~~~-> IO ()]\combineitems Add a finalizer that
  expects an environment to an existing foreign pointer.

\item[mallocForeignPtr ::\ Storable a => IO (ForeignPtr a)] Allocate a block
  of memory that is sufficient to hold values of type \code{a}.  The size of
  the memory area is determined by the function \code{Storable.sizeOf}
  (Section~\ref{sec:Storable}).  This corresponds to
  \code{MarshalAlloc.malloc} (Section~\ref{sec:MarshalAlloc}), but
  automatically attaches a finalizer that frees the block of memory as soon as
  all references to that block of memory have been dropped.  It is
  \emph{not} guaranteed that the block of memory was allocated by
  \code{MarshalAlloc.malloc}; so, \code{MarshalAlloc.realloc} must not be
  applied to the resulting pointer.

\item[mallocForeignPtrBytes ::\ Int -> IO (ForeignPtr a)] Allocate a block of
  memory of the given number of bytes with a finalizer attached that frees the
  block of memory as soon as all references to that block of memory have
  been dropped.  As for \code{mallocForeignPtr}, \code{MarshalAlloc.realloc}
  must not be applied to the resulting pointer.

\item[mallocForeignPtrArray~ ::\ Storable a => Int -> IO (ForeignPtr a)]
\item[mallocForeignPtrArray0 ::\ Storable a => Int -> IO (ForeignPtr a)]%
  \combineitems These functions correspond to \code{MarshalArray}'s
  \code{mallocArray} and \code{mallocArray0}, respectively, but yield a memory
  area that has a finalizer attached that releases the memory area.  As with
  the previous two functions, it is not guaranteed that the block of memory
  was allocated by \code{MarshalAlloc.malloc}.
  
\item[withForeignPtr ::\ ForeignPtr a -> (Ptr a -> IO b) -> IO b]
  This is a way to obtain the pointer living inside a foreign pointer. This
  function takes a function which is applied to that pointer. The resulting
  \code{IO} action is then executed. The foreign pointer is kept alive at least
  during the whole action, even if it is not used directly inside. Note that
  it is not safe to return the pointer from the action and use it after the
  action completes.  All uses of the pointer should be inside the
  \code{withForeignPtr} bracket.

  More precisely, the foreign pointer may be finalized after
  \code{withForeignPtr} is finished if the first argument was the last
  occurrence of that foreign pointer.  Finalisation of the foreign pointer
  might render the pointer that is passed to the function useless.
  Consequently, this pointer cannot be used safely anymore after the
  \code{withForeignPtr} is finished, unless the function
  \code{touchForeignPtr} is used to explicitly keep the foreign pointer alive.
  
  This function is normally used for marshalling data to or from the object
  pointed to by the \code{ForeignPtr}, using the operations from the
  \code{Storable} class.

\item[unsafeForeignPtrToPtr ::\ ForeignPtr a -> Ptr a]
  Extract the pointer component of a foreign pointer. This is a potentially
  dangerous operation.  If the argument to \code{unsafeForeignPtrToPtr} is the
  last usage occurrence of the given foreign pointer, then its finalizer(s)
  will be run, which potentially invalidates the plain pointer just obtained.
  Hence, \code{touchForeignPtr} must be used wherever it has to be guaranteed
  that the pointer lives on---i.e., has another usage occurrence.
  
  Note that this function does not need to be monadic when
  used in combination with \code{touchForeignPtr}.  Until the
  \code{unsafeForeignPtrToPtr} is executed, the thunk representing the
  suspended call keeps the foreign pointer alive.  Afterwards, the
  \code{touchForeignPtr} keeps the pointer alive.
  
  To avoid subtle coding errors, hand-written marshalling code should
  preferably use the function \code{withForeignPtr} rather than
  \code{unsafeForeignPtrToPtr} and \code{touchForeignPtr}. However, the latter
  routines are occasionally preferred in tool-generated marshalling code.
  
\item[touchForeignPtr ::\ ForeignPtr a -> IO ()] Ensure that the foreign
  pointer in question is alive at the given place in the sequence of \code{IO}
  actions. In particular, \code{withForeignPtr} does a \code{touchForeignPtr}
  after it executes the user action.
  
  This function can be used to express liveness dependencies between
  \code{ForeignPtr}s: For example, if the finalizer for one \code{ForeignPtr}
  touches a second \code{ForeignPtr}, then it is ensured that the second
  \code{ForeignPtr} will stay alive at least as long as the first. This can be
  useful when you want to manipulate interior pointers to a foreign structure:
  You can use \code{touchForeignPtr} to express the requirement that the
  exterior pointer must not be finalized until the interior pointer is no
  longer referenced.
    
\item[castForeignPtr ::\ ForeignPtr a -> ForeignPtr b] Cast a
  \code{ForeignPtr} parameterised by one type into another type.
\end{codedesc}
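
As an illustration of the recommended bracket style, the following minimal
sketch allocates a finalized memory area and accesses it only inside
\code{withForeignPtr}.  It assumes the hierarchical module names
\code{Foreign.ForeignPtr} and \code{Foreign.Storable} used by current
implementations instead of the flat module names of this report; the element
type \code{Int32} is chosen arbitrarily.
%
\begin{quote}
\begin{verbatim}
-- Assumption: hierarchical module names of current implementations.
import Data.Int (Int32)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtr, withForeignPtr)
import Foreign.Storable (peek, poke)

main :: IO ()
main = do
  -- The memory area is released by the attached finalizer.
  fp <- mallocForeignPtr :: IO (ForeignPtr Int32)
  -- All uses of the plain pointer stay inside the bracket; only the
  -- marshalled value, not the pointer, is returned from the action.
  x <- withForeignPtr fp $ \p -> do
    poke p 41
    v <- peek p
    return (v + 1)
  print x
\end{verbatim}
\end{quote}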

\subsection{\code{StablePtr}}
\label{sec:StablePtr}

A \emph{stable pointer} is a reference to a Haskell expression that is
guaranteed not to be affected by garbage collection, i.e., it will neither be
deallocated nor will the value of the stable pointer itself change during
garbage collection (ordinary references may be relocated during garbage
collection).  Consequently, stable pointers can be passed to foreign code,
which can treat them as opaque references to Haskell values.

The data type and associated operations have the following signature and
purpose:
%
\begin{codedesc}
\item[data StablePtr a] Values of this type represent a stable reference to a
  Haskell value of type \code{a}.
  
\item[newStablePtr ::\ a -> IO (StablePtr a)] Create a stable pointer
  referring to the given Haskell value.
  
\item[deRefStablePtr ::\ StablePtr a -> IO a] Obtain the Haskell value
  referenced by a stable pointer, i.e., the same value that was passed to the
  corresponding call to \code{newStablePtr}.  If the argument to
  \code{deRefStablePtr} has already been freed using \code{freeStablePtr}, the
  behaviour of \code{deRefStablePtr} is undefined.
  
\item[freeStablePtr ::\ StablePtr a -> IO ()] Dissolve the association between
  the stable pointer and the Haskell value. Afterwards, if the stable pointer
  is passed to \code{deRefStablePtr} or \code{freeStablePtr}, the behaviour is
  undefined.  However, the stable pointer may still be passed to
  \code{castStablePtrToPtr}, but the \code{Ptr ()} value returned by
  \code{castStablePtrToPtr}, in this case, is undefined (in particular, it may
  be \code{Ptr.nullPtr}).  Nevertheless, the call to \code{castStablePtrToPtr}
  is guaranteed not to diverge.
  
\item[castStablePtrToPtr ::\ StablePtr a -> Ptr ()] Coerce a stable pointer to
  an address. No guarantees are made about the resulting value, except that
  the original stable pointer can be recovered by \code{castPtrToStablePtr}.
  In particular, the address may not refer to an accessible memory location and
  any attempt to pass it to the member functions of the class \code{Storable}
  (Section~\ref{sec:Storable}) leads to undefined behaviour.
  
\item[castPtrToStablePtr ::\ Ptr () -> StablePtr a] The inverse of
  \code{castStablePtrToPtr}, i.e., we have the identity
  %
  \begin{quote}
\begin{verbatim}
sp == castPtrToStablePtr (castStablePtrToPtr sp)
\end{verbatim}
  \end{quote}
  %
  for any stable pointer \code{sp} on which \code{freeStablePtr} has not been
  executed yet.  Moreover, \code{castPtrToStablePtr} may only be applied to
  pointers that have been produced by \code{castStablePtrToPtr}.
\end{codedesc}

It is important to free stable pointers that are no longer required by using
\code{freeStablePtr}.  Otherwise, the object referenced by the stable pointer
will be retained in the heap.
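
The life cycle of a stable pointer might be sketched as follows.  The example
assumes the hierarchical module name \code{Foreign.StablePtr} of current
implementations; the address \code{addr} merely stands in for the opaque value
that foreign code would store and hand back later.
%
\begin{quote}
\begin{verbatim}
-- Assumption: hierarchical module name of current implementations.
import Foreign.StablePtr
  ( castPtrToStablePtr, castStablePtrToPtr
  , deRefStablePtr, freeStablePtr, newStablePtr )

main :: IO ()
main = do
  sp <- newStablePtr (["stable", "value"] :: [String])
  -- The opaque address is what foreign code would hold on to.
  let addr = castStablePtrToPtr sp
  -- Recover the stable pointer from the address and dereference it.
  xs <- deRefStablePtr (castPtrToStablePtr addr)
  print (xs :: [String])
  -- Free the stable pointer so that the value can be reclaimed.
  freeStablePtr sp
\end{verbatim}
\end{quote}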


\subsection{\code{Storable}}
\label{sec:Storable}

To code marshalling in Haskell, Haskell data structures need to be translated
into the binary representation of a corresponding data structure of the
foreign language and vice versa.  To this end, the module \code{Storable}
provides routines that manipulate primitive data types stored in unstructured
memory blocks.  The class \code{Storable} is instantiated for all primitive
types that can be stored in raw memory.  Reading and writing these types to
arbitrary memory locations is implemented by the member functions of the
class.  The member functions, furthermore, encompass support for computing the
storage requirements and alignment restrictions of storable types.

Memory addresses are represented as values of type \code{Ptr a}
(Section~\ref{sec:Ptr}), where \code{a} is a storable type.  The type argument
to \code{Ptr} provides some type safety in marshalling code, as pointers to
different types cannot be mixed without an explicit cast.  Moreover, it
assists in resolving overloading.

The class \code{Storable} is instantiated for all standard basic types of
Haskell, the fixed size integral types of the modules \code{Int} and
\code{Word} (Section~\ref{sec:Int-Word}), data and function pointers
(Section~\ref{sec:Ptr}), and stable pointers (Section~\ref{sec:StablePtr}).
There is no instance of \code{Storable} for foreign pointers.  The intention
is to ensure that storing a foreign pointer requires an explicit cast to a
plain \code{Ptr}, which makes it obvious that the finalizers of the foreign
pointer may be invoked at this point if no other reference to the pointer
exists anymore.

The signatures and behaviour of the member functions of the class
\code{Storable} are as follows:
%
\begin{codedesc}
\item[sizeOf~~~~::\ Storable a => a -> Int]
\item[alignment~::\ Storable a => a -> Int]\combineitems The function
  \code{sizeOf} computes the storage requirements (in bytes) of the argument,
  and \code{alignment} computes the alignment constraint of the argument.  An
  alignment constraint \code{x} is fulfilled by any address divisible by
  \code{x}. Neither function evaluates its argument; both compute the
  result on the basis of the type of the argument alone.  We require that the
  size is divisible by the alignment.  (Thus each element of a contiguous
  array of storable values will be properly aligned if the first one is.)

\item[peekElemOff ::\ Storable a => Ptr a -> Int -> IO a] Read a value from a
  memory area regarded as an array of values of the same kind. The first
  argument specifies the start address of the array and the second the index
  into the array (the first element of the array has index 0).
  
\item[pokeElemOff ::\ Storable a => Ptr a -> Int -> a -> IO ()] Write a value
  to a memory area regarded as an array of values of the same kind.  The first
  and second argument are as for \code{peekElemOff}.
  
\item[peekByteOff ::\ Storable a => Ptr b -> Int -> IO a] Read a value from a
  memory location given by a base address and byte offset from that base
  address.
  
\item[pokeByteOff ::\ Storable a => Ptr b -> Int -> a -> IO ()] Write a value
  to a memory location given by a base address and byte offset from that base
  address.
  
\item[peek ::\ Storable a => Ptr a -> IO a] Read a value from the given memory
  location.
  
\item[poke ::\ Storable a => Ptr a -> a -> IO ()] Write the given value to the
  given memory location.
\end{codedesc}
%
On some architectures, the \code{peek} and \code{poke} functions might require
properly aligned addresses to function correctly.  Thus, portable code should
ensure that when peeking or poking values of some type \code{a}, the alignment
constraint for \code{a}, as given by the function \code{alignment}, is
fulfilled.

A minimal complete definition of \code{Storable} needs to define
\code{sizeOf}, \code{alignment}, one of \code{peek}, \code{peekElemOff}, or
\code{peekByteOff}, and one of \code{poke}, \code{pokeElemOff}, or
\code{pokeByteOff}.
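
To make the notion of a minimal complete definition concrete, the following
sketch gives an instance for a hypothetical type \code{Point}, which is meant
to mirror a C structure of two 32 bit integers.  The type, its layout, and the
hierarchical module names are assumptions made for the example only.
%
\begin{quote}
\begin{verbatim}
-- Assumption: hierarchical module names of current implementations.
import Data.Int (Int32)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (Storable (..))

-- Hypothetical type mirroring a C structure of two 32 bit integers.
data Point = Point Int32 Int32 deriving (Eq, Show)

instance Storable Point where
  -- Size and alignment depend on the type only; the argument is ignored.
  sizeOf    _ = 2 * sizeOf (undefined :: Int32)
  alignment _ = alignment (undefined :: Int32)
  peek p = do
    x <- peekByteOff p 0
    y <- peekByteOff p (sizeOf (undefined :: Int32))
    return (Point x y)
  poke p (Point x y) = do
    pokeByteOff p 0 x
    pokeByteOff p (sizeOf (undefined :: Int32)) y

main :: IO ()
main = alloca $ \p -> do
  poke p (Point 3 4)
  q <- peek p
  print (q == Point 3 4)
\end{verbatim}
\end{quote}
%
Here, the size is twice the alignment, so the requirement that the size be
divisible by the alignment is met.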

\subsection{\code{MarshalAlloc}}
\label{sec:MarshalAlloc}

The module \code{MarshalAlloc} provides operations to allocate and deallocate
blocks of raw memory (i.e., unstructured chunks of memory outside of the area
maintained by the Haskell storage manager).  These memory blocks are commonly
used to pass compound data structures to foreign functions or to provide space
in which compound result values are obtained from foreign functions.  For
example, Haskell lists are typically passed as C arrays to C functions; the
storage space for such an array can be allocated by the following functions:
%
\begin{codedesc}
\item[malloc ::\ Storable a => IO (Ptr a)] Allocate a block of memory that is
  sufficient to hold values of type \code{a}.  The size of the memory area is
  determined by the function \code{Storable.sizeOf}
  (Section~\ref{sec:Storable}).

\item[mallocBytes ::\ Int -> IO (Ptr a)] Allocate a block of memory of the
  given number of bytes.  The block of memory is sufficiently aligned for any
  of the basic foreign types (see Section~\ref{sec:foreign-types}) that fits
  into a memory block of the allocated size.
  
\item[alloca ::\ Storable a => (Ptr a -> IO b) -> IO b] Allocate a block of
  memory of the same size as \code{malloc}, but the reference is passed to a
  computation instead of being returned.  When the computation terminates,
  free the memory area again.  The operation is exception-safe; i.e.,
  allocated memory is freed if an exception is thrown in the marshalling
  computation.

\item[allocaBytes ::\ Int -> (Ptr a -> IO b) -> IO b] As \code{alloca}, but
  allocate a memory area of the given size.  The same alignment constraint as
  for \code{mallocBytes} holds.
  
\item[realloc ::\ Storable b => Ptr a -> IO (Ptr b)] Resize a memory area that
  was allocated with \code{malloc} or \code{mallocBytes} to the size needed to
  store values of type \code{b}.  The returned pointer may refer to an
  entirely different memory area, but will be suitably aligned to hold values
  of type \code{b}.  The contents of the referenced memory area will be the
  same as of the original pointer up to the minimum of the size of values of
  type \code{a} and \code{b}.  If the argument to \code{realloc} is
  \code{Ptr.nullPtr}, \code{realloc} behaves like \code{malloc}.
  
\item[reallocBytes ::\ Ptr a -> Int -> IO (Ptr a)] As \code{realloc}, but
  allocate a memory area of the given size.  In addition, if the requested size
  is 0, \code{reallocBytes} behaves like \code{free}.
  
\item[free ::\ Ptr a -> IO ()] Free a block of memory that was allocated with
  \code{malloc}, \code{mallocBytes}, \code{realloc}, \code{reallocBytes}, or
  any of the allocation functions from \code{MarshalArray} (see
  Section~\ref{sec:MarshalArray}).

\item[finalizerFree ::\ FinalizerPtr a] Foreign finalizer that performs the
  same operation as \code{free}.
\end{codedesc}
%
If any of the allocation functions fails, a value of \code{Ptr.nullPtr} is
produced.  If \code{free} or \code{reallocBytes} is applied to a memory area
that has been allocated with \code{alloca} or \code{allocaBytes}, the
behaviour is undefined.  Any further access to memory areas allocated with
\code{alloca} or \code{allocaBytes}, after the computation that was passed to
the allocation function has terminated, leads to undefined behaviour.  Any
further access to the memory area referenced by a pointer passed to
\code{realloc}, \code{reallocBytes}, or \code{free} entails undefined
behaviour.
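
The difference between the two allocation styles can be sketched as follows,
assuming the hierarchical module names of current implementations.  The check
against \code{nullPtr} follows the failure convention described above; an
implementation may signal allocation failure differently.
%
\begin{quote}
\begin{verbatim}
-- Assumption: hierarchical module names of current implementations.
import Foreign.Marshal.Alloc (alloca, free, malloc)
import Foreign.Ptr (Ptr, nullPtr)
import Foreign.Storable (peek, poke)

main :: IO ()
main = do
  -- Temporary storage: released when the computation terminates.
  a <- alloca $ \p -> do
    poke p (7 :: Int)
    peek p
  -- Longer-lived storage: must be released explicitly with free.
  q <- malloc :: IO (Ptr Int)
  if q == nullPtr
    then putStrLn "allocation failed"
    else do
      poke q (a * 6)
      b <- peek q
      free q
      print (a, b)
\end{verbatim}
\end{quote}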

\subsection{\code{MarshalArray}}
\label{sec:MarshalArray}

The module \code{MarshalArray} provides operations for marshalling Haskell
lists into monolithic arrays and vice versa.  Most functions come in two
flavours: one for arrays terminated by a special termination element and one
where an explicit length parameter is used to determine the extent of an
array.  A typical example of the former case is a NUL-terminated C string.
However, please note that C strings should usually be marshalled
using the functions provided by \code{CString} (Section~\ref{sec:CString}) as
the Unicode encoding has to be taken into account.  All functions specifically
operating on arrays that are terminated by a special termination element have
a name ending in \code{0}; e.g., \code{mallocArray} allocates space for an
array of the given size, whereas \code{mallocArray0} allocates space for one
more element to ensure that there is room for the terminator.

The following functions are provided by the module:
%
\begin{codedesc}
\item[mallocArray~~::\ Storable a => Int -> IO (Ptr a)]
\item[allocaArray~~::\ Storable a => Int -> (Ptr a -> IO b) -> IO b]
  \combineitems
\item[reallocArray~::\ Storable a => Ptr a -> Int -> IO (Ptr a)]\combineitems
  The functions behave like the functions \code{malloc}, \code{alloca}, and
  \code{realloc} provided by the module \code{MarshalAlloc}
  (Section~\ref{sec:MarshalAlloc}), respectively, except that they allocate a
  memory area that can hold an array of elements of the given length, instead
  of storage for just a single element.

\item[mallocArray0~~::\ Storable a => Int -> IO (Ptr a)]
\item[allocaArray0~~::\ Storable a => Int -> (Ptr a -> IO b) -> IO b]\combineitems
\item[reallocArray0~::\ Storable a => Ptr a -> Int -> IO (Ptr a)]\combineitems
  These functions are like the previous three functions, but reserve storage
  space for one additional array element to allow for a termination indicator.
  
\item[peekArray ::\ Storable a => Int -> Ptr a -> IO {[a]}] Marshal an array of
  the given length and starting at the address indicated by the pointer
  argument into a Haskell list using \code{Storable.peekElemOff} to obtain the
  individual elements.  The order of elements in the list matches the order in
  the array.
  
\item[pokeArray ::\ Storable a => Ptr a -> {[a]} -> IO ()] Marshal the elements
  of the given list into an array whose start address is determined by the
  first argument using \code{Storable.pokeElemOff} to write the individual
  elements.  The order of elements in the array matches that in the list.
  
\item[peekArray0 ::\ (Storable a, Eq a) => a -> Ptr a -> IO {[a]}] Marshal an
  array like \code{peekArray}, but instead of the length of the array a
  terminator element is specified by the first argument.  All elements of the
  array, starting with the first element, up to, but excluding the first
  occurrence of an element that is equal (as determined by \code{==}) to the
  terminator are marshalled.

\item[pokeArray0 ::\ Storable a => a -> Ptr a -> {[a]} -> IO ()]
  Marshal an array like \code{pokeArray}, but write a terminator value
  (determined by the first argument) after the last element of the list.  Note
  that the terminator must not occur in the marshalled list if it should be
  possible to extract the list with \code{peekArray0}.

\item[newArray~~::\ Storable a => {[a]} -> IO (Ptr a)]
\item[withArray~::\ Storable a => {[a]} -> (Ptr a -> IO b) -> IO b]\combineitems
  These two functions combine \code{mallocArray} and \code{allocaArray},
  respectively, with \code{pokeArray}; i.e., they allocate a memory area for
  an array whose length matches that of the list, and then marshal the list
  into that memory area.

\item[newArray0~~::\ Storable a => a -> {[a]} -> IO (Ptr a)]
\item[withArray0~::\ Storable a => a -> {[a]} -> (Ptr a -> IO b) -> IO b]\combineitems
  These two functions combine \code{mallocArray0} and \code{allocaArray0},
  respectively, with the function \code{pokeArray0}; i.e., they allocate a
  memory area for an array with room for the elements of the list plus the
  terminator, and then marshal the list followed by the terminator into that
  memory area.  The first argument determines the terminator.

\item[copyArray ::\ Storable a => Ptr a -> Ptr a -> Int -> IO ()]
\item[moveArray ::\ Storable a => Ptr a -> Ptr a -> Int -> IO ()]\combineitems
  These two functions copy entire arrays and behave like the routines
  \code{MarshalUtils.copyBytes} and \code{MarshalUtils.moveBytes},
  respectively (Section~\ref{sec:MarshalUtils}).  In particular,
  \code{moveArray} allows the source and destination array to overlap, whereas
  \code{copyArray} does not allow overlapping arrays.  Both functions take a
  reference to the destination array as their first, and a reference to the
  source as their second argument.  However, in contrast to the routines from
  \code{MarshalUtils} the third argument here specifies the number of array
  elements (whose type is specified by the parametrised pointer arguments)
  instead of the number of bytes to copy.
  
\item[lengthArray0 ::\ (Storable a, Eq a) => a -> Ptr a -> IO Int] Determine
  the length of an array whose end is marked by the first occurrence of the
  given terminator (first argument). The length is measured in array elements
  (not bytes) and does not include the terminator.
  
\item[advancePtr ::\ Storable a => Ptr a -> Int -> Ptr a] Advance a reference
  to an array by as many array elements (not bytes) as specified.
\end{codedesc}
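
The two flavours can be contrasted in a short sketch, which assumes the
hierarchical module names of current implementations and uses \code{0} as an
arbitrarily chosen terminator.
%
\begin{quote}
\begin{verbatim}
-- Assumption: hierarchical module names of current implementations.
import Foreign.Marshal.Alloc (free)
import Foreign.Marshal.Array
  ( advancePtr, lengthArray0, newArray0
  , peekArray, peekArray0, withArray )
import Foreign.Storable (peek)

main :: IO ()
main = do
  -- Explicit length flavour: the caller supplies the element count.
  r <- withArray [1, 2, 3 :: Int] $ \p -> do
    second <- peek (p `advancePtr` 1)   -- advance by elements, not bytes
    xs     <- peekArray 3 p
    return (second, xs)
  print r
  -- Terminated flavour: the element 0 marks the end of the array.
  q  <- newArray0 0 [10, 20, 30 :: Int]
  n  <- lengthArray0 0 q                -- excludes the terminator
  zs <- peekArray0 0 q
  free q                                -- newArray0 uses malloc-style storage
  print (n, zs)
\end{verbatim}
\end{quote}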

\subsection{\code{MarshalError}}
\label{sec:MarshalError}

The module \code{MarshalError} provides language independent routines for
converting error conditions of external functions into Haskell \code{IO} monad
exceptions.  It consists of two parts.  The first part extends the I/O
error facilities of the \code{IO} module of the Haskell 98 Library Report with
functionality to construct I/O errors.  The second part provides a set of
functions that ease turning exceptional result values into I/O errors.

\subsubsection{I/O Errors}
%
The following functions can be used to construct values of type
\code{IOError}.
%
\begin{codedesc}
\item[data IOErrorType] This is an abstract type that contains a value for
  each variant of \code{IOError}.

\item[mkIOError ::\ IOErrorType -> String -> Maybe Handle -> Maybe FilePath
  -> IOError] Construct an \code{IOError} of the given type where the second
  argument describes the error location and the third and fourth argument
  contain the file handle and file path of the file involved in the error if
  applicable. 
  
\item[alreadyExistsErrorType ::\ IOErrorType] I/O error where the operation
  failed because one of its arguments already exists.
  
\item[doesNotExistErrorType ::\ IOErrorType] I/O error where the operation
  failed because one of its arguments does not exist.
  
\item[alreadyInUseErrorType ::\ IOErrorType] I/O error where the operation
  failed because one of its arguments is a single-use resource, which is
  already being used.
  
\item[fullErrorType ::\ IOErrorType] I/O error where the operation failed
  because the device is full.
  
\item[eofErrorType ::\ IOErrorType] I/O error where the operation failed
  because the end of file has been reached.
  
\item[illegalOperationType ::\ IOErrorType] I/O error where the operation is
  not possible.
  
\item[permissionErrorType ::\ IOErrorType] I/O error where the operation failed
  because the user does not have sufficient operating system privilege to
  perform that operation.

\item[userErrorType ::\ IOErrorType] I/O error that is programmer-defined.

\item[annotateIOError ::\ IOError -> String -> Maybe Handle -> Maybe
  FilePath -> IOError] Add a location description and, optionally, a file
  handle and file path to an I/O error.  If either the file handle or the file
  path is not given, the corresponding value in the I/O error remains unaltered.
\end{codedesc}
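
A minimal sketch of constructing and annotating an I/O error is given below.
It assumes that the functions are exported by the hierarchical module
\code{System.IO.Error}, as in current implementations; the location string and
the file path are made up for the example.
%
\begin{quote}
\begin{verbatim}
-- Assumption: current implementations export these functions from
-- System.IO.Error.
import System.IO.Error
  ( annotateIOError, doesNotExistErrorType, ioeGetFileName
  , isDoesNotExistError, mkIOError )

main :: IO ()
main = do
  -- An error without handle and path information.
  let e  = mkIOError doesNotExistErrorType "openDevice" Nothing Nothing
      -- Attach a (hypothetical) file path; the handle is left unaltered.
      e' = annotateIOError e "openDevice" Nothing (Just "/dev/hypothetical")
  print (isDoesNotExistError e')
  print (ioeGetFileName e')
\end{verbatim}
\end{quote}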

\subsubsection{Result Value Checks}

The following routines are useful for testing return values and raising an I/O
exception in case of values indicating an error state.
%
\begin{codedesc}
\item[throwIf ::\ (a -> Bool) -> (a -> String) -> IO a -> IO a] Execute the
  computation determined by the third argument.  If the predicate provided in
  the first argument yields \code{True} when applied to the result of that
  computation, raise an \code{IO} exception that includes an error message
  obtained by applying the second argument to the result of the computation.
  If no exception is raised, the result of the computation is the result of
  the whole operation.

\item[throwIf\us ::\ (a -> Bool) -> (a -> String) -> IO a -> IO ()]
  Operate as \code{throwIf} does, but discard the result of the computation
  in any case.

\item[throwIfNeg~~::\ (Ord a, Num a) => (a -> String) -> IO a -> IO a]
\item[throwIfNeg\us~::\ (Ord a, Num a) => (a -> String) -> IO a -> IO ()]\combineitems
  These two functions are instances of \code{throwIf} and \code{throwIf\us},
  respectively, where the predicate is \code{(< 0)}.
  
\item[throwIfNull ::\ String -> IO (Ptr a) -> IO (Ptr a)] This is an instance
  of \code{throwIf}, where the predicate is \code{(== Ptr.nullPtr)} and the
  error message is constant.

\item[void ::\ IO a -> IO ()]
  Discard the result of a computation.
\end{codedesc}
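
The following sketch shows how such a check might wrap a routine that signals
failure by a negative result.  The function \code{fakeForeignCall} is a
hypothetical stand-in for a \code{foreign} \code{import}ed routine, and the
hierarchical module names are those of current implementations.
%
\begin{quote}
\begin{verbatim}
-- Assumption: hierarchical module names of current implementations.
import Foreign.Marshal.Error (throwIfNeg)
import System.IO.Error (tryIOError)

-- Hypothetical stand-in for a foreign routine that reports failure by
-- returning a negative value.
fakeForeignCall :: Int -> IO Int
fakeForeignCall = return

-- Turn a negative result into an I/O error with a descriptive message.
checked :: Int -> IO Int
checked n =
  throwIfNeg (\r -> "call failed with code " ++ show r) (fakeForeignCall n)

main :: IO ()
main = do
  ok <- checked 5
  print ok
  r <- tryIOError (checked (-1))
  putStrLn (either (const "I/O error raised") show r)
\end{verbatim}
\end{quote}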

\subsection{\code{MarshalUtils}}
\label{sec:MarshalUtils}

Finally, the module \code{MarshalUtils} provides a set of useful auxiliary
routines. 
%
\begin{codedesc}
\item[new ::\ Storable a => a -> IO (Ptr a)] This function first uses
  \code{MarshalAlloc.malloc} (Section~\ref{sec:MarshalAlloc}) to allocate a
  memory area sufficient to hold its argument, and then stores the argument in
  the newly allocated memory area using \code{Storable.poke}
  (Section~\ref{sec:Storable}).
  
\item[with ::\ Storable a => a -> (Ptr a -> IO b) -> IO b] This function is
  like \code{new}, but uses \code{MarshalAlloc.alloca} instead of
  \code{MarshalAlloc.malloc}.

\item[fromBool~::\ Num a => Bool -> a]
\item[toBool~~~::\ Num a => a -> Bool]\combineitems These two functions
  implement conversions between Haskell Boolean values and numeric
  representations of Boolean values, where \code{False} is represented by
  \code{0} and \code{True} by any non-zero value.

\item[maybeNew ::\ (a -> IO (Ptr a)) -> (Maybe a -> IO (Ptr a))]
  Lift a function that marshals a value of type \code{a} to a function that
  marshals a value of type \code{Maybe a}.  If the latter is
  \code{Nothing}, return \code{Ptr.nullPtr} (Section~\ref{sec:Ptr}).

\item[maybeWith ::\ (a -> (Ptr b -> IO c) -> IO c)%
  -> (Maybe a -> (Ptr b -> IO c) -> IO c)] This function lifts a
  \code{MarshalAlloc.alloca} based marshalling function for \code{a} to
  \code{Maybe a}.  It marshals the value \code{Nothing} in the same way as
  \code{maybeNew}. 
  
\item[maybePeek ::\ (Ptr a -> IO b) -> (Ptr a -> IO (Maybe b))] Given a
  function that marshals a value stored in the referenced memory area to a
  value of type \code{b}, lift it to producing a value of type \code{Maybe b}.
  If the pointer is \code{Ptr.nullPtr}, produce \code{Nothing}.
  
% Move to `Data.List.withMany' in new library spec.
%\item[withMany ::\ (a -> (b -> res) -> res) -> {[a]} -> ({[b]} -> res) -> res]
%  Lift a marshalling function of the \code{with} family to operate on a list
%  of values.

\item[copyBytes ::\ Ptr a -> Ptr a -> Int -> IO ()]
\item[moveBytes ::\ Ptr a -> Ptr a -> Int -> IO ()]\combineitems These two
  functions are Haskell variants of the standard C library routines
  \code{memcpy()} and \code{memmove()}, respectively.  As with their C
  counterparts, \code{moveBytes} allows the source and destination areas to
  overlap, whereas \code{copyBytes} does not allow overlapping areas.  Both
  functions take a reference to the destination area as their first, and a
  reference to the source as their second argument---i.e., the argument order
  is as in an assignment.
\end{codedesc}
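
A few of these routines are combined in the following sketch, which assumes
the hierarchical module names of current implementations.  Note the argument
order of \code{copyBytes}: destination first, then source.
%
\begin{quote}
\begin{verbatim}
-- Assumption: hierarchical module names of current implementations.
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Utils
  ( copyBytes, fromBool, maybePeek, maybeWith, toBool, with )
import Foreign.Ptr (Ptr, nullPtr)
import Foreign.Storable (peek, sizeOf)

main :: IO ()
main = do
  -- with: temporary storage initialised from a Haskell value.
  copied <- with (42 :: Int) $ \src ->
    alloca $ \dst -> do
      copyBytes dst src (sizeOf (42 :: Int))  -- count is in bytes
      peek dst
  print copied
  -- Nothing is marshalled as nullPtr; nullPtr is read back as Nothing.
  isNull <- maybeWith with (Nothing :: Maybe Int) $ \p ->
    return (p == nullPtr)
  back <- maybePeek peek (nullPtr :: Ptr Int)
  print (isNull, back)
  -- Numeric representation of Boolean values.
  print (fromBool True :: Int, toBool (0 :: Int))
\end{verbatim}
\end{quote}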

\newpage
\section{C-Specific Marshalling}
\label{sec:c-marshalling}

\subsection{\code{CForeign}}
\label{sec:CForeign}

The module \code{CForeign} combines the interfaces of all modules providing
C-specific marshalling support.  The modules are \code{CTypes},
\code{CString}, and \code{CError}.

\begin{table}
  \begin{center}
    \begin{tabular}{|l|l|l|}
      \hline
      C symbol          & Haskell symbol & Constraint on concrete C type\\
      \hline\hline
      \code{HsChar}     & \code{Char}    
      & integral type\\
      \hline
      \code{HsInt}      & \code{Int}
      & signed integral type, $\geq30$ bit\\
      \hline
      \code{HsInt8}     & \code{Int8}
      & signed integral type, 8 bit; \code{int8\_t} if available\\
      \hline
      \code{HsInt16}    & \code{Int16}
      & signed integral type, 16 bit; \code{int16\_t} if available\\
      \hline
      \code{HsInt32}    & \code{Int32}
      & signed integral type, 32 bit; \code{int32\_t} if available\\
      \hline
      \code{HsInt64}    & \code{Int64}
      & signed integral type, 64 bit; \code{int64\_t} if available\\ 
      \hline
      \code{HsWord8}    & \code{Word8}
      & unsigned integral type, 8 bit; \code{uint8\_t} if available\\
      \hline
      \code{HsWord16}   & \code{Word16}
      & unsigned integral type, 16 bit; \code{uint16\_t} if available\\
      \hline
      \code{HsWord32}   & \code{Word32}
      & unsigned integral type, 32 bit; \code{uint32\_t} if available\\
      \hline
      \code{HsWord64}   & \code{Word64}
      & unsigned integral type, 64 bit; \code{uint64\_t} if available\\
      \hline
      \code{HsFloat}    & \code{Float}
      & floating point type\\
     \hline
      \code{HsDouble}   & \code{Double}
      & floating point type\\
     \hline
      \code{HsBool}     & \code{Bool}
      & \code{int}\\
     \hline
      \code{HsPtr}      & \code{Ptr a}
      & \code{(void *)}\\
     \hline
      \code{HsFunPtr}   & \code{FunPtr a}
      & \code{(void (*)(void))}\\
     \hline
      \code{HsStablePtr}& \code{StablePtr a}
      & \code{(void *)}\\
     \hline
    \end{tabular}
    \caption{C Interface to Basic Haskell Types}
    \label{tab:c-haskell-types}
  \end{center}
\end{table}
%
\begin{table}
  \begin{center}
%    \begin{tabular}{|l|l|l|}
    \begin{tabular}{|l|l|p{30ex}|}
      \hline
      CPP symbol           & Haskell value & Description\\
      \hline\hline
      \code{HS\_CHAR\_MIN} & \code{minBound ::\ Char}
      & \\
      \hline
      \code{HS\_CHAR\_MAX} & \code{maxBound ::\ Char}
      & \\
      \hline
      \code{HS\_INT\_MIN} & \code{minBound ::\ Int}
      & \\
      \hline
      \code{HS\_INT\_MAX} & \code{maxBound ::\ Int}
      & \\
      \hline
      \code{HS\_INT8\_MIN} & \code{minBound ::\ Int8}
      & \\
      \hline
      \code{HS\_INT8\_MAX} & \code{maxBound ::\ Int8}
      & \\
      \hline
      \code{HS\_INT16\_MIN} & \code{minBound ::\ Int16}
      & \\
      \hline
      \code{HS\_INT16\_MAX} & \code{maxBound ::\ Int16}
      & \\
      \hline
      \code{HS\_INT32\_MIN} & \code{minBound ::\ Int32}
      & \\
      \hline
      \code{HS\_INT32\_MAX} & \code{maxBound ::\ Int32}
      & \\
      \hline
      \code{HS\_INT64\_MIN} & \code{minBound ::\ Int64}
      & \\
      \hline
      \code{HS\_INT64\_MAX} & \code{maxBound ::\ Int64}
      & \\
      \hline
      \code{HS\_WORD8\_MAX} & \code{maxBound ::\ Word8}
      & \\
      \hline
      \code{HS\_WORD16\_MAX} & \code{maxBound ::\ Word16}
      & \\
      \hline
      \code{HS\_WORD32\_MAX} & \code{maxBound ::\ Word32}
      & \\
      \hline
      \code{HS\_WORD64\_MAX} & \code{maxBound ::\ Word64}
      & \\
      \hline
      \code{HS\_FLOAT\_RADIX} & \code{floatRadix ::\ Float}
      & \\
      \hline
      \code{HS\_FLOAT\_ROUND} & n/a
      & rounding style as per~\cite{C99}\\
      \hline
      \code{HS\_FLOAT\_EPSILON} & n/a
      & difference between 1 and the least value greater
      than 1 as per~\cite{C99}\\
      \hline
      \code{HS\_DOUBLE\_EPSILON} & n/a
      & (as above)\\
      \hline
      \code{HS\_FLOAT\_DIG} & n/a
      & number of decimal digits as per~\cite{C99}\\
      \hline
      \code{HS\_DOUBLE\_DIG} & n/a
      & (as above)\\
      \hline
      \code{HS\_FLOAT\_MANT\_DIG} & \code{floatDigits ::\ Float}
      & \\
      \hline
      \code{HS\_DOUBLE\_MANT\_DIG} & \code{floatDigits ::\ Double}
      & \\
      \hline
      \code{HS\_FLOAT\_MIN} & n/a
      & minimum floating point number as per~\cite{C99}\\
      \hline
      \code{HS\_DOUBLE\_MIN} & n/a
      & (as above)\\
      \hline
      \code{HS\_FLOAT\_MIN\_EXP} & \code{fst .\ floatRange ::\ Float}
      & \\
      \hline
      \code{HS\_DOUBLE\_MIN\_EXP} & \code{fst .\ floatRange ::\ Double}
      & \\
      \hline
      \code{HS\_FLOAT\_MIN\_10\_EXP} & n/a
      & minimum decimal exponent as per~\cite{C99}\\
      \hline
      \code{HS\_DOUBLE\_MIN\_10\_EXP} & n/a
      & (as above)\\
      \hline
      \code{HS\_FLOAT\_MAX} & n/a
      & maximum floating point number as per~\cite{C99}\\
      \hline
      \code{HS\_DOUBLE\_MAX} & n/a
      & (as above)\\
      \hline
      \code{HS\_FLOAT\_MAX\_EXP} & \code{snd .\ floatRange ::\ Float}
      & \\
      \hline
      \code{HS\_DOUBLE\_MAX\_EXP} & \code{snd .\ floatRange ::\ Double}
      & \\
      \hline
      \code{HS\_FLOAT\_MAX\_10\_EXP} & n/a
      & maximum decimal exponent as per~\cite{C99}\\
      \hline
      \code{HS\_DOUBLE\_MAX\_10\_EXP} & n/a
      & (as above)\\
      \hline
      \code{HS\_BOOL\_FALSE} & False
      & \\
      \hline
      \code{HS\_BOOL\_TRUE} & True
      & \\
      \hline
    \end{tabular}
    \caption{C Interface to Range and Precision of Basic Types}
    \label{tab:c-haskell-values}
  \end{center}
\end{table}
%
Every Haskell system that implements the FFI needs to provide a C header file
named \code{HsFFI.h} that defines the C symbols listed in
Tables~\ref{tab:c-haskell-types} and~\ref{tab:c-haskell-values}.
Table~\ref{tab:c-haskell-types} lists symbols that represent types
together with the Haskell type that they represent and any constraints that
are placed on the concrete C types that implement these symbols.  When a C
type \code{HsT} represents a Haskell type \code{T}, the occurrence of \code{T}
in a foreign function declaration should be matched by \code{HsT} in the
corresponding C function prototype.  Indeed, where the Haskell system
translates Haskell to C code that invokes \code{foreign} \code{import}ed C
routines, such prototypes need to be provided and included via the header that
can be specified in external entity strings for foreign C functions (cf.\ 
Section~\ref{sec:ccall}); otherwise, the system behaviour is undefined.  It is
guaranteed that the Haskell value \code{nullPtr} is mapped to \code{(HsPtr)
  NULL} in C and \code{nullFunPtr} is mapped to \code{(HsFunPtr) NULL} and
vice versa.

Table~\ref{tab:c-haskell-values} contains symbols characterising the range and
precision of the types from Table~\ref{tab:c-haskell-types}.  Where available,
the table states the corresponding Haskell values.  All C symbols, with the
exception of \code{HS\_FLOAT\_ROUND}, are constants that are suitable for use in
\code{\#if} preprocessing directives.  Note that there is only one rounding
style (\code{HS\_FLOAT\_ROUND}) and one radix (\code{HS\_FLOAT\_RADIX}), as
this is all that is supported by ISO C~\cite{C99}.

Moreover, an implementation that does not support 64 bit integral types on the
C side should implement \code{HsInt64} and \code{HsWord64} as a structure.  In
this case, the bounds \code{HS\_INT64\_MIN}, \code{HS\_INT64\_MAX}, and
\code{HS\_WORD64\_MAX} are undefined.

In addition to the symbols from Tables~\ref{tab:c-haskell-types}
and~\ref{tab:c-haskell-values}, the header \code{HsFFI.h} must also contain
the following prototypes:
%
\begin{quote}
\begin{verbatim}
void hs_init     (int *argc, char **argv[]);
void hs_exit     (void);
void hs_set_argv (int argc, char *argv[]);

void hs_perform_gc (void);

void hs_free_stable_ptr (HsStablePtr sp);
void hs_free_fun_ptr    (HsFunPtr fp);
\end{verbatim}
\end{quote}
%
These routines are useful for mixed language programs, where the main
application is implemented in a foreign language that accesses routines
implemented in Haskell.  The function \code{hs\_init()} initialises the
Haskell system and provides it with the available command line arguments.
Upon return, the arguments solely intended for the Haskell runtime system are
removed (i.e., the values that \code{argc} and \code{argv} point to may have
changed).  This function must be called during program startup before any
Haskell function is invoked; otherwise, the system behaviour is undefined.
Conversely, the Haskell system is deinitialised by a call to
\code{hs\_exit()}.  Multiple invocations of \code{hs\_init()} are permitted,
provided that they are followed by an equal number of calls to
\code{hs\_exit()} and that the first call to \code{hs\_exit()} is after the
last call to \code{hs\_init()}.  In addition to nested calls to
\code{hs\_init()}, the Haskell system may be de-initialised with
\code{hs\_exit()} and be re-initialised with \code{hs\_init()} at a later
point in time.  This ensures that repeated initialisation due to multiple
libraries being implemented in Haskell is covered.

The Haskell system will ignore the command line arguments passed to the second
and any following calls to \code{hs\_init()}.  Moreover, \code{hs\_init()} may
be called with \code{NULL} for both \code{argc} and \code{argv}, signalling
the absence of command line arguments.

The function \code{hs\_set\_argv()} sets the values returned by the functions
\code{getProgName} and \code{getArgs} of the module \code{System} defined in
the Haskell 98 Library Report.  This function may only be invoked after
\code{hs\_init()}.  Moreover, if \code{hs\_set\_argv()} is called at all, this
call must precede the first invocation of \code{getProgName} and
\code{getArgs}.  Note that the separation of \code{hs\_init()} and
\code{hs\_set\_argv()} is essential in cases where, in addition to the Haskell
system, other libraries that process command line arguments during
initialisation are used.

The function \code{hs\_perform\_gc()} advises the Haskell storage manager to
perform a garbage collection, where the storage manager makes an effort to
release all unreachable objects.  This function must not be invoked from C
functions that are imported \code{unsafe} into Haskell code nor may it be used
from a finalizer.

Finally, \code{hs\_free\_stable\_ptr()} and \code{hs\_free\_fun\_ptr()} are
the C counterparts of the Haskell functions \code{freeStablePtr} and
\code{freeHaskellFunPtr}.

\subsection{\code{CTypes}}
\label{sec:CTypes}

The module \code{CTypes} provides Haskell types that represent basic C types.
They are needed to accurately represent C function prototypes, and so, to
access C library interfaces in Haskell.  The Haskell system is not required to
represent those types exactly as C does, but the following guarantees are
provided concerning a Haskell type \code{CT} representing a C type \code{t}:
%
\begin{itemize}
\item If a C function prototype has \code{t} as an argument or result type,
  the use of \code{CT} in the corresponding position in a foreign declaration
  permits the Haskell program to access the full range of values encoded by
  the C type; and conversely, any Haskell value for \code{CT} has a valid
  representation in C.
\item \code{Storable.sizeOf (undefined ::\ CT)} will yield the same value as
  \code{sizeof (t)} in C.
\item \code{Storable.alignment (undefined ::\ CT)} matches the alignment
  constraint enforced by the C implementation for \code{t}.
\item \code{Storable.peek} and \code{Storable.poke} map all values of
  \code{CT} to the corresponding value of \code{t} and vice versa.
\item When an instance of \code{Bounded} is defined for \code{CT}, the values
  of \code{minBound} and \code{maxBound} coincide with \code{t\_MIN} and
  \code{t\_MAX} in C.
\item When an instance of \code{Eq} or \code{Ord} is defined for \code{CT},
  the predicates defined by the type class implement the same relation as the
  corresponding predicate in C on \code{t}.
\item When an instance of \code{Num}, \code{Read}, \code{Integral},
  \code{Fractional}, \code{Floating}, \code{RealFrac}, or \code{RealFloat} is
  defined for \code{CT}, the arithmetic operations defined by the type class
  implement the same function as the corresponding arithmetic operations (if
  available) in C on \code{t}.
\item When an instance of \code{Bits} is defined for \code{CT}, the bitwise
  operations defined by the type class implement the same function as the
  corresponding bitwise operations in C on \code{t}.
\end{itemize}
%
All types exported by \code{CTypes} must be represented as \code{newtype}s of
basic foreign types as defined in Section~\ref{sec:foreign-types} and the
export must be abstract.

The module \code{CTypes} provides the following integral types, including
instances for \code{Eq}, \code{Ord}, \code{Num}, \code{Read}, \code{Show},
\code{Enum}, \code{Storable}, \code{Bounded}, \code{Real}, \code{Integral},
and \code{Bits}:
%
\begin{quote}
  \begin{tabular}{|l|l|}
    \hline
    Haskell type     & Represented C type\\\hline\hline
    \code{CChar}     & \code{char}\\\hline
    \code{CSChar}    & \code{signed char}\\\hline
    \code{CUChar}    & \code{unsigned char}\\\hline
    \code{CShort}    & \code{short}\\\hline
    \code{CUShort}   & \code{unsigned short}\\\hline
    \code{CInt}      & \code{int}\\\hline
    \code{CUInt}     & \code{unsigned int}\\\hline
    \code{CLong}     & \code{long}\\\hline
    \code{CULong}    & \code{unsigned long}\\\hline
    \code{CLLong}    & \code{long long}\\\hline
    \code{CULLong}   & \code{unsigned long long}\\\hline
  \end{tabular}
\end{quote}
%
Moreover, it provides the following floating point types, including instances
for \code{Eq}, \code{Ord}, \code{Num}, \code{Read}, \code{Show}, \code{Enum},
\code{Storable}, \code{Real}, \code{Fractional}, \code{Floating},
\code{RealFrac}, and \code{RealFloat}:
%
\begin{quote}
  \begin{tabular}{|l|l|}
    \hline
    Haskell type     & Represented C type\\\hline\hline
    \code{CFloat}    & \code{float}\\\hline
    \code{CDouble}   & \code{double}\\\hline
    \code{CLDouble}  & \code{long double}\\\hline
  \end{tabular}
\end{quote}
%
The module provides the following integral types, including instances for
\code{Eq}, \code{Ord}, \code{Num}, \code{Read}, \code{Show}, \code{Enum},
\code{Storable}, \code{Bounded}, \code{Real}, \code{Integral}, and
\code{Bits}:
%
\begin{quote}
  \begin{tabular}{|l|l|}
    \hline
    Haskell type     & Represented C type\\\hline\hline
    \code{CPtrdiff}  & \code{ptrdiff\_t}\\\hline
    \code{CSize}     & \code{size\_t}\\\hline
    \code{CWchar}    & \code{wchar\_t}\\\hline
    \code{CSigAtomic}& \code{sig\_atomic\_t}\\\hline
  \end{tabular}
\end{quote}
%
Moreover, it provides the following numeric types, including instances for
\code{Eq}, \code{Ord}, \code{Num}, \code{Read}, \code{Show}, \code{Enum}, and
\code{Storable}:
%
\begin{quote}
  \begin{tabular}{|l|l|}
    \hline
    Haskell type     & Represented C type\\\hline\hline
    \code{CClock}    & \code{clock\_t}\\\hline
    \code{CTime}     & \code{time\_t}\\\hline
  \end{tabular}
\end{quote}
%
And finally, the following types, including instances for \code{Eq} and
\code{Storable}, are provided:
%
\begin{quote}
  \begin{tabular}{|l|l|}
    \hline
    Haskell type     & Represented C type\\\hline\hline
    \code{CFile}     & \code{FILE}\\\hline
    \code{CFpos}     & \code{fpos\_t}\\\hline
    \code{CJmpBuf}   & \code{jmp\_buf}\\\hline
  \end{tabular}
\end{quote}
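
As a minimal sketch of how these types are meant to be used, the following
program imports two ISO C routines whose prototypes are expressed through
\code{CTypes}.  It assumes a GHC-style system in which the modules of this
report are available under the hierarchical names \code{Foreign.C.Types} and
\code{Foreign.Storable}; the names \code{c\_abs} and \code{c\_cos} are chosen
for illustration only.
%
\begin{quote}
\begin{verbatim}
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

-- Assumption: hierarchical module names as provided by GHC,
-- standing in for the flat names CTypes and Storable.
import Foreign.C.Types  (CDouble, CInt)
import Foreign.Storable (sizeOf)

-- C prototype: int abs(int);
foreign import ccall unsafe "stdlib.h abs"
  c_abs :: CInt -> CInt

-- C prototype: double cos(double);
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> CDouble

main :: IO ()
main = do
  -- Same value as sizeof (int) in C on this platform.
  print (sizeOf (undefined :: CInt))
  -- Coincide with INT_MIN and INT_MAX in C.
  print (minBound :: CInt, maxBound :: CInt)
  -- Arguments and results use the C types directly.
  print (c_abs (-7))
  print (c_cos 0)
\end{verbatim}
\end{quote}
%
Using \code{CInt} and \code{CDouble} rather than \code{Int} and \code{Double}
keeps the Haskell signatures in step with the C prototypes regardless of the
widths chosen by the C implementation.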

\subsection{\code{CString}}
\label{sec:CString}

The module \code{CString} provides routines for marshalling Haskell strings
into C strings and vice versa.  The marshalling converts each Haskell
character, representing
a Unicode code point, to one or more bytes in a manner that, by default, is
determined by the current locale.  As a consequence, no guarantees can be made
about the relative length of a Haskell string and its corresponding C string,
and therefore, all routines provided by \code{CString} combine memory
allocation and marshalling.  The translation between Unicode and the encoding
of the current locale may be lossy.  The function \code{charIsRepresentable}
identifies the characters that can be accurately translated; unrepresentable
characters are converted to `?'.
%
\begin{codedesc}
\item[type CString = Ptr CChar] A C string is a reference to an array of C
  characters terminated by NUL.
  
\item[type CStringLen = (Ptr CChar, Int)] In addition to NUL-terminated
  strings, the module \code{CString} also supports strings with explicit
  length information in bytes.

\item[peekCString~~~~::\ CString~~~~-> IO String]
\item[peekCStringLen~::\ CStringLen~-> IO String]\combineitems
  Marshal a C string to Haskell.  There are two variants of the routine, one
  for each supported string representation.

\item[newCString~~~~::\ String -> IO CString]
\item[newCStringLen~::\ String -> IO CStringLen] \combineitems Allocate a
  memory area for a Haskell string and marshal the string into its C
  representation.  There are two variants of the routine, one for each
  supported string representation.  The memory area allocated by these
  routines may be deallocated using \code{MarshalAlloc.free}.

\item[withCString~~~~::\ String -> (CString~~~~-> IO a) -> IO a]
\item[withCStringLen~::\ String -> (CStringLen~-> IO a) -> IO a] \combineitems
  These two routines operate as \code{newCString} and \code{newCStringLen},
  respectively, but handle memory allocation and deallocation like
  \code{MarshalAlloc.alloca} (Section~\ref{sec:MarshalAlloc}).
  
\item[charIsRepresentable ::\ Char -> IO Bool] Determine whether the argument
  can be represented in the current locale.

\end{codedesc}
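
The difference between the \code{with} and the \code{new} variants can be
sketched as follows.  The example assumes the hierarchical module names
\code{Foreign.C.String}, \code{Foreign.C.Types}, and
\code{Foreign.Marshal.Alloc} of a GHC-style system; \code{c\_strlen},
\code{cLength}, and \code{roundTrip} are hypothetical names introduced for
illustration.
%
\begin{quote}
\begin{verbatim}
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

-- Assumption: hierarchical module names as provided by GHC.
import Foreign.C.String      (CString, newCString, peekCString,
                              withCString)
import Foreign.C.Types       (CSize)
import Foreign.Marshal.Alloc (free)

-- C prototype: size_t strlen(const char *s);
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- The C copy exists only while the action runs.
cLength :: String -> IO CSize
cLength s = withCString s c_strlen

-- The caller owns the memory returned by newCString
-- and has to release it explicitly.
roundTrip :: String -> IO String
roundTrip s = do
  cs <- newCString s
  r  <- peekCString cs
  free cs
  return r

main :: IO ()
main = do
  cLength "hello" >>= print
  roundTrip "world" >>= putStrLn
\end{verbatim}
\end{quote}
%
The pointer handed out by \code{withCString} must not be used after the
action has returned, which is why \code{cLength} consumes it inside the
action.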

Some C libraries require that the Unicode capabilities of Haskell be ignored
and that values of type \code{Char} be treated as single-byte characters.
Hence, the module
\code{CString} provides a variant of the above marshalling routines that
truncates character sets correspondingly.  These functions should be used with
care, as a loss of information can occur.
%
\begin{codedesc}
\item[castCharToCChar ::\ Char -> CChar]
\item[castCCharToChar ::\ CChar -> Char] \combineitems These two functions cast
  Haskell characters to C characters and vice versa while ignoring the Unicode
  encoding of the Haskell character.  More precisely, only the first 256
  code points are preserved.

\item[peekCAString~~~~::\ CString~~~~-> IO String]
\item[peekCAStringLen~::\ CStringLen~-> IO String]\combineitems
\item[newCAString~~~~~::\ String -> IO CString]\combineitems
\item[newCAStringLen~~::\ String -> IO CStringLen] \combineitems
\item[withCAString~~~~::\ String -> (CString~~~~-> IO a) -> IO a]\combineitems
\item[withCAStringLen~::\ String -> (CStringLen~-> IO a) -> IO a]
  \combineitems These functions for whole-string marshalling cast Haskell
  characters to C characters and vice versa while ignoring the Unicode
  encoding of Haskell characters.
\end{codedesc}
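
The loss of information mentioned above can be observed with a short
experiment.  This is only a sketch, assuming the GHC-style module name
\code{Foreign.C.String}; the helper \code{viaCChar} is a hypothetical name.
%
\begin{quote}
\begin{verbatim}
module Main (main) where

-- Assumption: hierarchical module name as provided by GHC.
import Foreign.C.String (castCCharToChar, castCharToCChar)

-- Cast a Haskell character to a C character and back.
viaCChar :: Char -> Char
viaCChar = castCCharToChar . castCharToCChar

main :: IO ()
main = do
  -- A character among the first 256 code points is preserved.
  print (viaCChar 'A' == 'A')
  -- The euro sign (U+20AC) lies outside that range and is truncated.
  print (viaCChar '\x20AC' == '\x20AC')
\end{verbatim}
\end{quote}
%
The first comparison yields \code{True} and the second \code{False}, as only
the low-order byte of the code point survives the cast.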

To simplify bindings to C libraries that use \code{wchar\_t} for character
sets that cannot be encoded in byte strings, the module \code{CString} also
exports a variant of the above string marshalling routines for wide
characters---i.e., for the C \code{wchar\_t} type.\footnote{Note that if the
  platform defines \code{\_\_STDC\_ISO\_10646\_\_} then \code{wchar\_t}
  characters are Unicode code points, and thus, the conversion between Haskell
  \code{Char} and \code{CWchar} is a simple cast.  On other platforms, the
  translation is locale-dependent, just as for \code{CChar}.}
%
\begin{codedesc}
\item[type CWString~~~~= Ptr CWchar]
\item[type CWStringLen~= (Ptr CWchar, Int)] \combineitems
  Wide character strings in a NUL-terminated version and a variant with
  explicit length information in number of wide characters.

\item[peekCWString~~~~::\ CWString~~~~-> IO String]
\item[peekCWStringLen~::\ CWStringLen~-> IO String]\combineitems
\item[newCWString~~~~~::\ String -> IO CWString]\combineitems
\item[newCWStringLen~~::\ String -> IO CWStringLen] \combineitems
\item[withCWString~~~~::\ String -> (CWString~~~~-> IO a) -> IO a]\combineitems
\item[withCWStringLen~::\ String -> (CWStringLen~-> IO a) -> IO a]
  \combineitems String marshalling for wide character strings.  The interface
  is the same as for byte strings.
\end{codedesc}
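
A round trip through a wide character string may look as follows.  The sketch
assumes the GHC-style module name \code{Foreign.C.String}; the helpers
\code{roundTripWide} and \code{wideLength} are hypothetical names.
%
\begin{quote}
\begin{verbatim}
module Main (main) where

-- Assumption: hierarchical module name as provided by GHC.
import Foreign.C.String (peekCWString, withCWString,
                         withCWStringLen)

-- Marshal to a NUL-terminated wchar_t array and back again.
roundTripWide :: String -> IO String
roundTripWide s = withCWString s peekCWString

-- The length component counts wide characters, not bytes.
wideLength :: String -> IO Int
wideLength s = withCWStringLen s (\(_, len) -> return len)

main :: IO ()
main = do
  s <- roundTripWide "Gr\252\223e"
  print (s == "Gr\252\223e")
  wideLength "abc" >>= print
\end{verbatim}
\end{quote}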

\subsection{\code{CError}}
\label{sec:CError}

The module \code{CError} facilitates C-specific error handling of \code{errno}.  In
Haskell, we represent values of \code{errno} by
%
\begin{quote}
\begin{verbatim}
newtype Errno = Errno CInt
\end{verbatim}
\end{quote}
%
which has an instance for the type class \code{Eq}.  The implementation of
\code{Errno} is disclosed on purpose.  Different operating systems and/or C
libraries often support different values of \code{errno}.  This module defines
the common values, but due to the open definition of \code{Errno} users may
add definitions which are not predefined.  The predefined values are the
following:
%
\begin{quote}
\begin{verbatim}
eOK, e2BIG, eACCES, eADDRINUSE, eADDRNOTAVAIL, eADV, eAFNOSUPPORT, eAGAIN, 
  eALREADY, eBADF, eBADMSG, eBADRPC, eBUSY, eCHILD, eCOMM, eCONNABORTED, 
  eCONNREFUSED, eCONNRESET, eDEADLK, eDESTADDRREQ, eDIRTY, eDOM, eDQUOT, 
  eEXIST, eFAULT, eFBIG, eFTYPE, eHOSTDOWN, eHOSTUNREACH, eIDRM, eILSEQ, 
  eINPROGRESS, eINTR, eINVAL, eIO, eISCONN, eISDIR, eLOOP, eMFILE, eMLINK, 
  eMSGSIZE, eMULTIHOP, eNAMETOOLONG, eNETDOWN, eNETRESET, eNETUNREACH, 
  eNFILE, eNOBUFS, eNODATA, eNODEV, eNOENT, eNOEXEC, eNOLCK, eNOLINK, 
  eNOMEM, eNOMSG, eNONET, eNOPROTOOPT, eNOSPC, eNOSR, eNOSTR, eNOSYS, 
  eNOTBLK, eNOTCONN, eNOTDIR, eNOTEMPTY, eNOTSOCK, eNOTTY, eNXIO, 
  eOPNOTSUPP, ePERM, ePFNOSUPPORT, ePIPE, ePROCLIM, ePROCUNAVAIL, 
  ePROGMISMATCH, ePROGUNAVAIL, ePROTO, ePROTONOSUPPORT, ePROTOTYPE, 
  eRANGE, eREMCHG, eREMOTE, eROFS, eRPCMISMATCH, eRREMOTE, eSHUTDOWN, 
  eSOCKTNOSUPPORT, eSPIPE, eSRCH, eSRMNT, eSTALE, eTIME, eTIMEDOUT, 
  eTOOMANYREFS, eTXTBSY, eUSERS, eWOULDBLOCK, eXDEV
  :: Errno
\end{verbatim}
\end{quote}
%
The meaning of these values corresponds to that of the C constants of the same
name with the leading ``e'' converted to upper-case.

The module \code{CError} provides the following functions:
%
\begin{codedesc}
\item[isValidErrno ::\ Errno -> Bool] Yield \code{True} if the given
  \code{Errno} value is valid on the system.  This implies that the \code{Eq}
  instance of \code{Errno} is also system dependent as it is only defined for
  valid values of \code{Errno}.

\item[getErrno ::\ IO Errno] Get the current value of \code{errno}.

\item[resetErrno ::\ IO ()] Reset \code{errno} to \code{eOK}.
  
\item[errnoToIOError ::\ String -> Errno -> Maybe Handle -> Maybe String ->
  IOError] Compute a Haskell 98 I/O error based on the given \code{Errno}
  value.  The first argument to the function should specify the location where
  the error occurred and the third and fourth can be used to specify a file
  handle and filename in the course of whose manipulation the error occurred.
  This is optional information, which can be used to improve the accuracy of
  error messages.
  
\item[throwErrno ::\ String -> IO a] Apply \code{errnoToIOError} to the value
  currently returned by \code{getErrno}.  Its first argument specifies the
  location---no extra information about a file handle or filename can be
  provided in this case.

\item[throwErrnoIf~~:: (a -> Bool) -> String -> IO a -> IO a]
\item[throwErrnoIf\us~:: (a -> Bool) -> String -> IO a -> IO ()]\combineitems
  Behave like \code{throwErrno} if the result of the \code{IO}
  action fulfils the predicate passed as the first argument.  The second variant
  discards the result after error handling.

\item[throwErrnoIfRetry~~:: (a -> Bool) -> String -> IO a -> IO a]
\item[throwErrnoIfRetry\us~:: (a -> Bool) -> String -> IO a -> IO ()]%
\combineitems Like \code{throwErrnoIf} and \code{throwErrnoIf\us}, but retry
the \code{IO} action when it yields the error code \code{eINTR}---this amounts
to the standard retry loop for interrupted POSIX system calls.

\item[throwErrnoIfMinus1~~:: Num a => String -> IO a -> IO a]
\item[throwErrnoIfMinus1\us~:: Num a => String -> IO a -> IO ()]\combineitems
  Instantiate \code{throwErrnoIf} and \code{throwErrnoIf\us} with the predicate
  \code{(== -1)}.

\item[throwErrnoIfMinus1Retry~~:: Num a => String -> IO a -> IO a]
\item[throwErrnoIfMinus1Retry\us~:: Num a => String -> IO a -> IO ()]%
  \combineitems Instantiate \code{throwErrnoIfRetry} and
  \code{throwErrnoIfRetry\us} with the predicate \code{(== -1)}.

\item[throwErrnoIfNull~~~~~~:: String -> IO (Ptr a) -> IO (Ptr a)]
\item[throwErrnoIfNullRetry~:: String -> IO (Ptr a) -> IO (Ptr a)]%
  \combineitems Instantiate \code{throwErrnoIf} and \code{throwErrnoIfRetry}
  with the predicate \code{(== Ptr.nullPtr)}.
\end{codedesc}
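
The following sketch shows how a failing C call can be turned into a Haskell
I/O error.  It rests on several assumptions that are not part of this report:
a POSIX system providing \code{close()}, the GHC-style module names
\code{Foreign.C.Error} and \code{Foreign.C.Types}, and the function
\code{try} from GHC's \code{Control.Exception}.  The wrapper \code{closeFd}
is a hypothetical name.
%
\begin{quote}
\begin{verbatim}
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

-- Assumption: Control.Exception is a GHC library module,
-- used here only to observe the raised I/O error.
import Control.Exception (IOException, try)
import Foreign.C.Error   (throwErrnoIfMinus1_)
import Foreign.C.Types   (CInt)

-- POSIX prototype: int close(int fd);
foreign import ccall unsafe "unistd.h close"
  c_close :: CInt -> IO CInt

-- Raise an I/O error derived from errno if close() returns -1.
closeFd :: CInt -> IO ()
closeFd fd = throwErrnoIfMinus1_ "closeFd" (c_close fd)

main :: IO ()
main = do
  -- -1 is never a valid descriptor, so close() fails.
  r <- try (closeFd (-1)) :: IO (Either IOException ())
  case r of
    Left e   -> putStrLn ("caught: " ++ show e)
    Right () -> putStrLn "closed"
\end{verbatim}
\end{quote}
%
The location string passed to \code{throwErrnoIfMinus1\us} becomes part of
the resulting error, so that a failure can be attributed to the wrapper that
raised it.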


\bibliographystyle{plain}
\bibliography{ffi}

\end{document}
