******************** Using Python as glue ******************** | There is no conversation more boring than the one where everybody | agrees. | --- *Michel de Montaigne* | Duct tape is like the force. It has a light side, and a dark side, and | it holds the universe together. | --- *Carl Zwanzig* Many people like to say that Python is a fantastic glue language. Hopefully, this Chapter will convince you that this is true. The first adopters of Python for science were typically people who used it to glue together large applicaton codes running on super-computers. Not only was it much nicer to code in Python than in a shell script or Perl, in addition, the ability to easily extend Python made it relatively easy to create new classes and types specifically adapted to the problems being solved. From the interactions of these early contributors, Numeric emerged as an array-like object that could be used to pass data between these applications. As Numeric has matured and developed into NumPy, people have been able to write more code directly in NumPy. Often this code is fast-enough for production use, but there are still times that there is a need to access compiled code. Either to get that last bit of efficiency out of the algorithm or to make it easier to access widely-available codes written in C/C++ or Fortran. This chapter will review many of the tools that are available for the purpose of accessing code written in other compiled languages. There are many resources available for learning to call other compiled libraries from Python and the purpose of this Chapter is not to make you an expert. The main goal is to make you aware of some of the possibilities so that you will know what to "Google" in order to learn more. The http://www.scipy.org website also contains a great deal of useful information about many of these tools. For example, there is a nice description of using several of the tools explained in this chapter at http://www.scipy.org/PerformancePython. This link provides several ways to solve the same problem showing how to use and connect with compiled code to get the best performance. In the process you can get a taste for several of the approaches that will be discussed in this chapter. Calling other compiled libraries from Python ============================================ While Python is a great language and a pleasure to code in, its dynamic nature results in overhead that can cause some code ( *i.e.* raw computations inside of for loops) to be up 10-100 times slower than equivalent code written in a static compiled language. In addition, it can cause memory usage to be larger than necessary as temporary arrays are created and destroyed during computation. For many types of computing needs the extra slow-down and memory consumption can often not be spared (at least for time- or memory- critical portions of your code). Therefore one of the most common needs is to call out from Python code to a fast, machine-code routine (e.g. compiled using C/C++ or Fortran). The fact that this is relatively easy to do is a big reason why Python is such an excellent high-level language for scientific and engineering programming. Their are two basic approaches to calling compiled code: writing an extension module that is then imported to Python using the import command, or calling a shared-library subroutine directly from Python using the ctypes module (included in the standard distribution with Python 2.5). The first method is the most common (but with the inclusion of ctypes into Python 2.5 this status may change). .. warning:: Calling C-code from Python can result in Python crashes if you are not careful. None of the approaches in this chapter are immune. You have to know something about the way data is handled by both NumPy and by the third-party library being used. Hand-generated wrappers ======================= Extension modules were discussed in Chapter `1 <#sec-writing-an-extension>`__ . The most basic way to interface with compiled code is to write an extension module and construct a module method that calls the compiled code. For improved readability, your method should take advantage of the PyArg_ParseTuple call to convert between Python objects and C data-types. For standard C data-types there is probably already a built-in converter. For others you may need to write your own converter and use the "O&" format string which allows you to specify a function that will be used to perform the conversion from the Python object to whatever C-structures are needed. Once the conversions to the appropriate C-structures and C data-types have been performed, the next step in the wrapper is to call the underlying function. This is straightforward if the underlying function is in C or C++. However, in order to call Fortran code you must be familiar with how Fortran subroutines are called from C/C++ using your compiler and platform. This can vary somewhat platforms and compilers (which is another reason f2py makes life much simpler for interfacing Fortran code) but generally involves underscore mangling of the name and the fact that all variables are passed by reference (i.e. all arguments are pointers). The advantage of the hand-generated wrapper is that you have complete control over how the C-library gets used and called which can lead to a lean and tight interface with minimal over-head. The disadvantage is that you have to write, debug, and maintain C-code, although most of it can be adapted using the time-honored technique of "cutting-pasting-and-modifying" from other extension modules. Because, the procedure of calling out to additional C-code is fairly regimented, code-generation procedures have been developed to make this process easier. One of these code- generation techniques is distributed with NumPy and allows easy integration with Fortran and (simple) C code. This package, f2py, will be covered briefly in the next session. f2py ==== F2py allows you to automatically construct an extension module that interfaces to routines in Fortran 77/90/95 code. It has the ability to parse Fortran 77/90/95 code and automatically generate Python signatures for the subroutines it encounters, or you can guide how the subroutine interfaces with Python by constructing an interface- defintion-file (or modifying the f2py-produced one). .. index:: single: f2py Creating source for a basic extension module -------------------------------------------- Probably the easiest way to introduce f2py is to offer a simple example. Here is one of the subroutines contained in a file named :file:`add.f`: .. code-block:: none C SUBROUTINE ZADD(A,B,C,N) C DOUBLE COMPLEX A(*) DOUBLE COMPLEX B(*) DOUBLE COMPLEX C(*) INTEGER N DO 20 J = 1, N C(J) = A(J)+B(J) 20 CONTINUE END This routine simply adds the elements in two contiguous arrays and places the result in a third. The memory for all three arrays must be provided by the calling routine. A very basic interface to this routine can be automatically generated by f2py:: f2py -m add add.f You should be able to run this command assuming your search-path is set-up properly. This command will produce an extension module named addmodule.c in the current directory. This extension module can now be compiled and used from Python just like any other extension module. Creating a compiled extension module ------------------------------------ You can also get f2py to compile add.f and also compile its produced extension module leaving only a shared-library extension file that can be imported from Python:: f2py -c -m add add.f This command leaves a file named add.{ext} in the current directory (where {ext} is the appropriate extension for a python extension module on your platform --- so, pyd, *etc.* ). This module may then be imported from Python. It will contain a method for each subroutin in add (zadd, cadd, dadd, sadd). The docstring of each method contains information about how the module method may be called: >>> import add >>> print add.zadd.__doc__ zadd - Function signature: zadd(a,b,c,n) Required arguments: a : input rank-1 array('D') with bounds (*) b : input rank-1 array('D') with bounds (*) c : input rank-1 array('D') with bounds (*) n : input int Improving the basic interface ----------------------------- The default interface is a very literal translation of the fortran code into Python. The Fortran array arguments must now be NumPy arrays and the integer argument should be an integer. The interface will attempt to convert all arguments to their required types (and shapes) and issue an error if unsuccessful. However, because it knows nothing about the semantics of the arguments (such that C is an output and n should really match the array sizes), it is possible to abuse this function in ways that can cause Python to crash. For example: >>> add.zadd([1,2,3],[1,2],[3,4],1000) will cause a program crash on most systems. Under the covers, the lists are being converted to proper arrays but then the underlying add loop is told to cycle way beyond the borders of the allocated memory. In order to improve the interface, directives should be provided. This is accomplished by constructing an interface definition file. It is usually best to start from the interface file that f2py can produce (where it gets its default behavior from). To get f2py to generate the interface file use the -h option:: f2py -h add.pyf -m add add.f This command leaves the file add.pyf in the current directory. The section of this file corresponding to zadd is: .. code-block:: none subroutine zadd(a,b,c,n) ! in :add:add.f double complex dimension(*) :: a double complex dimension(*) :: b double complex dimension(*) :: c integer :: n end subroutine zadd By placing intent directives and checking code, the interface can be cleaned up quite a bit until the Python module method is both easier to use and more robust. .. code-block:: none subroutine zadd(a,b,c,n) ! in :add:add.f double complex dimension(n) :: a double complex dimension(n) :: b double complex intent(out),dimension(n) :: c integer intent(hide),depend(a) :: n=len(a) end subroutine zadd The intent directive, intent(out) is used to tell f2py that ``c`` is an output variable and should be created by the interface before being passed to the underlying code. The intent(hide) directive tells f2py to not allow the user to specify the variable, ``n``, but instead to get it from the size of ``a``. The depend( ``a`` ) directive is necessary to tell f2py that the value of n depends on the input a (so that it won't try to create the variable n until the variable a is created). The new interface has docstring: >>> print add.zadd.__doc__ zadd - Function signature: c = zadd(a,b) Required arguments: a : input rank-1 array('D') with bounds (n) b : input rank-1 array('D') with bounds (n) Return objects: c : rank-1 array('D') with bounds (n) Now, the function can be called in a much more robust way: >>> add.zadd([1,2,3],[4,5,6]) array([ 5.+0.j, 7.+0.j, 9.+0.j]) Notice the automatic conversion to the correct format that occurred. Inserting directives in Fortran source -------------------------------------- The nice interface can also be generated automatically by placing the variable directives as special comments in the original fortran code. Thus, if I modify the source code to contain: .. code-block:: none C SUBROUTINE ZADD(A,B,C,N) C CF2PY INTENT(OUT) :: C CF2PY INTENT(HIDE) :: N CF2PY DOUBLE COMPLEX :: A(N) CF2PY DOUBLE COMPLEX :: B(N) CF2PY DOUBLE COMPLEX :: C(N) DOUBLE COMPLEX A(*) DOUBLE COMPLEX B(*) DOUBLE COMPLEX C(*) INTEGER N DO 20 J = 1, N C(J) = A(J) + B(J) 20 CONTINUE END Then, I can compile the extension module using:: f2py -c -m add add.f The resulting signature for the function add.zadd is exactly the same one that was created previously. If the original source code had contained A(N) instead of A(\*) and so forth with B and C, then I could obtain (nearly) the same interface simply by placing the INTENT(OUT) :: C comment line in the source code. The only difference is that N would be an optional input that would default to the length of A. A filtering example ------------------- For comparison with the other methods to be discussed. Here is another example of a function that filters a two-dimensional array of double precision floating-point numbers using a fixed averaging filter. The advantage of using Fortran to index into multi-dimensional arrays should be clear from this example. .. code-block:: none SUBROUTINE DFILTER2D(A,B,M,N) C DOUBLE PRECISION A(M,N) DOUBLE PRECISION B(M,N) INTEGER N, M CF2PY INTENT(OUT) :: B CF2PY INTENT(HIDE) :: N CF2PY INTENT(HIDE) :: M DO 20 I = 2,M-1 DO 40 J=2,N-1 B(I,J) = A(I,J) + $ (A(I-1,J)+A(I+1,J) + $ A(I,J-1)+A(I,J+1) )*0.5D0 + $ (A(I-1,J-1) + A(I-1,J+1) + $ A(I+1,J-1) + A(I+1,J+1))*0.25D0 40 CONTINUE 20 CONTINUE END This code can be compiled and linked into an extension module named filter using:: f2py -c -m filter filter.f This will produce an extension module named filter.so in the current directory with a method named dfilter2d that returns a filtered version of the input. Calling f2py from Python ------------------------ The f2py program is written in Python and can be run from inside your module. This provides a facility that is somewhat similar to the use of weave.ext_tools described below. An example of the final interface executed using Python code is: .. code-block:: python import numpy.f2py as f2py fid = open('add.f') source = fid.read() fid.close() f2py.compile(source, modulename='add') import add The source string can be any valid Fortran code. If you want to save the extension-module source code then a suitable file-name can be provided by the source_fn keyword to the compile function. Automatic extension module generation ------------------------------------- If you want to distribute your f2py extension module, then you only need to include the .pyf file and the Fortran code. The distutils extensions in NumPy allow you to define an extension module entirely in terms of this interface file. A valid setup.py file allowing distribution of the add.f module (as part of the package f2py_examples so that it would be loaded as f2py_examples.add) is: .. code-block:: python def configuration(parent_package='', top_path=None) from numpy.distutils.misc_util import Configuration config = Configuration('f2py_examples',parent_package, top_path) config.add_extension('add', sources=['add.pyf','add.f']) return config if __name__ == '__main__': from numpy.distutils.core import setup setup(**configuration(top_path='').todict()) Installation of the new package is easy using:: python setup.py install assuming you have the proper permissions to write to the main site- packages directory for the version of Python you are using. For the resulting package to work, you need to create a file named __init__.py (in the same directory as add.pyf). Notice the extension module is defined entirely in terms of the "add.pyf" and "add.f" files. The conversion of the .pyf file to a .c file is handled by numpy.disutils. Conclusion ---------- The interface definition file (.pyf) is how you can fine-tune the interface between Python and Fortran. There is decent documentation for f2py found in the numpy/f2py/docs directory where-ever NumPy is installed on your system (usually under site-packages). There is also more information on using f2py (including how to use it to wrap C codes) at http://www.scipy.org/Cookbook under the "Using NumPy with Other Languages" heading. The f2py method of linking compiled code is currently the most sophisticated and integrated approach. It allows clean separation of Python with compiled code while still allowing for separate distribution of the extension module. The only draw-back is that it requires the existence of a Fortran compiler in order for a user to install the code. However, with the existence of the free-compilers g77, gfortran, and g95, as well as high-quality commerical compilers, this restriction is not particularly onerous. In my opinion, Fortran is still the easiest way to write fast and clear code for scientific computing. It handles complex numbers, and multi-dimensional indexing in the most straightforward way. Be aware, however, that some Fortran compilers will not be able to optimize code as well as good hand- written C-code. .. index:: single: f2py weave ===== Weave is a scipy package that can be used to automate the process of extending Python with C/C++ code. It can be used to speed up evaluation of an array expression that would otherwise create temporary variables, to directly "inline" C/C++ code into Python, or to create a fully-named extension module. You must either install scipy or get the weave package separately and install it using the standard python setup.py install. You must also have a C/C++-compiler installed and useable by Python distutils in order to use weave. .. index:: single: weave Somewhat dated, but still useful documentation for weave can be found at the link http://www.scipy/Weave. There are also many examples found in the examples directory which is installed under the weave directory in the place where weave is installed on your system. Speed up code involving arrays (also see scipy.numexpr) ------------------------------------------------------- This is the easiest way to use weave and requires minimal changes to your Python code. It involves placing quotes around the expression of interest and calling weave.blitz. Weave will parse the code and generate C++ code using Blitz C++ arrays. It will then compile the code and catalog the shared library so that the next time this exact string is asked for (and the array types are the same), the already- compiled shared library will be loaded and used. Because Blitz makes extensive use of C++ templating, it can take a long time to compile the first time. After that, however, the code should evaluate more quickly than the equivalent NumPy expression. This is especially true if your array sizes are large and the expression would require NumPy to create several temporaries. Only expressions involving basic arithmetic operations and basic array slicing can be converted to Blitz C++ code. For example, consider the expression:: d = 4*a + 5*a*b + 6*b*c where a, b, and c are all arrays of the same type and shape. When the data-type is double-precision and the size is 1000x1000, this expression takes about 0.5 seconds to compute on an 1.1Ghz AMD Athlon machine. When this expression is executed instead using blitz: .. code-block:: python d = empty(a.shape, 'd'); weave.blitz(expr) execution time is only about 0.20 seconds (about 0.14 seconds spent in weave and the rest in allocating space for d). Thus, we've sped up the code by a factor of 2 using only a simnple command (weave.blitz). Your mileage may vary, but factors of 2-8 speed-ups are possible with this very simple technique. If you are interested in using weave in this way, then you should also look at scipy.numexpr which is another similar way to speed up expressions by eliminating the need for temporary variables. Using numexpr does not require a C/C++ compiler. Inline C-code ------------- Probably the most widely-used method of employing weave is to "in-line" C/C++ code into Python in order to speed up a time-critical section of Python code. In this method of using weave, you define a string containing useful C-code and then pass it to the function **weave.inline** ( ``code_string``, ``variables`` ), where code_string is a string of valid C/C++ code and variables is a list of variables that should be passed in from Python. The C/C++ code should refer to the variables with the same names as they are defined with in Python. If weave.line should return anything the the special value return_val should be set to whatever object should be returned. The following example shows how to use weave on basic Python objects: .. code-block:: python code = r""" int i; py::tuple results(2); for (i=0; ic.data)[i].real = \ (a.data)[i].real + \ (b.data)[i].real (c.data)[i].imag = \ (a.data)[i].imag + \ (b.data)[i].imag return c This module shows use of the ``cimport`` statement to load the definitions from the c_numpy.pxd file. As shown, both versions of the import statement are supported. It also shows use of the NumPy C-API to construct NumPy arrays from arbitrary input objects. The array c is created using PyArray_SimpleNew. Then the c-array is filled by addition. Casting to a particiular data-type is accomplished using . Pointers are de-referenced with bracket notation and members of structures are accessed using '.' notation even if the object is techinically a pointer to a structure. The use of the special for loop construct ensures that the underlying code will have a similar C-loop so the addition calculation will proceed quickly. Notice that we have not checked for NULL after calling to the C-API --- a cardinal sin when writing C-code. For routines that return Python objects, Pyrex inserts the checks for NULL into the C-code for you and returns with failure if need be. There is also a way to get Pyrex to automatically check for exceptions when you call functions that don't return Python objects. See the documentation of Pyrex for details. Pyrex-filter ------------ The two-dimensional example we created using weave is a bit uglierto implement in Pyrex because two-dimensional indexing using Pyrex is not as simple. But, it is straightforward (and possibly faster because of pre-computed indices). Here is the Pyrex-file I named image.pyx. .. code-block:: none cimport c_numpy from c_numpy cimport import_array, ndarray, npy_intp,\ NPY_DOUBLE, NPY_CDOUBLE, \ NPY_FLOAT, NPY_CFLOAT, NPY_ALIGNED \ #We need to initialize NumPy import_array() def filter(object ao): cdef ndarray a, b cdef npy_intp i, j, M, N, oS cdef npy_intp r,rm1,rp1,c,cm1,cp1 cdef double value # Require an ALIGNED array # (but not necessarily contiguous) # We will use strides to access the elements. a = c_numpy.PyArray_FROMANY(ao, NPY_DOUBLE, \ 2, 2, NPY_ALIGNED) b = c_numpy.PyArray_SimpleNew(a.nd,a.dimensions, \ a.descr.type_num) M = a.dimensions[0] N = a.dimensions[1] S0 = a.strides[0] S1 = a.strides[1] for i from 1 <= i < M-1: r = i*S0 rm1 = r-S0 rp1 = r+S0 oS = i*N for j from 1 <= j < N-1: c = j*S1 cm1 = c-S1 cp1 = c+S1 (b.data)[oS+j] = \ ((a.data+r+c))[0] + \ (((a.data+rm1+c))[0] + \ ((a.data+rp1+c))[0] + \ ((a.data+r+cm1))[0] + \ ((a.data+r+cp1))[0])*0.5 + \ (((a.data+rm1+cm1))[0] + \ ((a.data+rp1+cm1))[0] + \ ((a.data+rp1+cp1))[0] + \ ((a.data+rm1+cp1))[0])*0.25 return b This 2-d averaging filter runs quickly because the loop is in C and the pointer computations are done only as needed. However, it is not particularly easy to understand what is happening. A 2-d image, ``in`` , can be filtered using this code very quickly using: .. code-block:: python import image out = image.filter(in) Conclusion ---------- There are several disadvantages of using Pyrex: 1. The syntax for Pyrex can get a bit bulky, and it can be confusing at first to understand what kind of objects you are getting and how to interface them with C-like constructs. 2. Inappropriate Pyrex syntax or incorrect calls to C-code or type- mismatches can result in failures such as 1. Pyrex failing to generate the extension module source code, 2. Compiler failure while generating the extension module binary due to incorrect C syntax, 3. Python failure when trying to use the module. 3. It is easy to lose a clean separation between Python and C which makes re-using your C-code for other non-Python-related projects more difficult. 4. Multi-dimensional arrays are "bulky" to index (appropriate macros may be able to fix this). 5. The C-code generated by Prex is hard to read and modify (and typically compiles with annoying but harmless warnings). Writing a good Pyrex extension module still takes a bit of effort because not only does it require (a little) familiarity with C, but also with Pyrex's brand of Python-mixed-with C. One big advantage of Pyrex-generated extension modules is that they are easy to distribute using distutils. In summary, Pyrex is a very capable tool for either gluing C-code or generating an extension module quickly and should not be over-looked. It is especially useful for people that can't or won't write C-code or Fortran code. But, if you are already able to write simple subroutines in C or Fortran, then I would use one of the other approaches such as f2py (for Fortran), ctypes (for C shared- libraries), or weave (for inline C-code). .. index:: single: pyrex ctypes ====== Ctypes is a python extension module (downloaded separately for Python <2.5 and included with Python 2.5) that allows you to call an arbitrary function in a shared library directly from Python. This approach allows you to interface with C-code directly from Python. This opens up an enormous number of libraries for use from Python. The drawback, however, is that coding mistakes can lead to ugly program crashes very easily (just as can happen in C) because there is little type or bounds checking done on the parameters. This is especially true when array data is passed in as a pointer to a raw memory location. The responsibility is then on you that the subroutine will not access memory outside the actual array area. But, if you don't mind living a little dangerously ctypes can be an effective tool for quickly taking advantage of a large shared library (or writing extended functionality in your own shared library). .. index:: single: ctypes Because the ctypes approach exposes a raw interface to the compiled code it is not always tolerant of user mistakes. Robust use of the ctypes module typically involves an additional layer of Python code in order to check the data types and array bounds of objects passed to the underlying subroutine. This additional layer of checking (not to mention the conversion from ctypes objects to C-data-types that ctypes itself performs), will make the interface slower than a hand-written extension-module interface. However, this overhead should be neglible if the C-routine being called is doing any significant amount of work. If you are a great Python programmer with weak C-skills, ctypes is an easy way to write a useful interface to a (shared) library of compiled code. To use c-types you must 1. Have a shared library. 2. Load the shared library. 3. Convert the python objects to ctypes-understood arguments. 4. Call the function from the library with the ctypes arguments. Having a shared library ----------------------- There are several requirements for a shared library that can be used with c-types that are platform specific. This guide assumes you have some familiarity with making a shared library on your system (or simply have a shared library available to you). Items to remember are: - A shared library must be compiled in a special way ( *e.g.* using the -shared flag with gcc). - On some platforms (*e.g.* Windows) , a shared library requires a .def file that specifies the functions to be exported. For example a mylib.def file might contain. :: LIBRARY mylib.dll EXPORTS cool_function1 cool_function2 Alternatively, you may be able to use the storage-class specifier __declspec(dllexport) in the C-definition of the function to avoid the need for this .def file. There is no standard way in Python distutils to create a standard shared library (an extension module is a "special" shared library Python understands) in a cross-platform manner. Thus, a big disadvantage of ctypes at the time of writing this book is that it is difficult to distribute in a cross-platform manner a Python extension that uses c-types and includes your own code which should be compiled as a shared library on the users system. Loading the shared library -------------------------- A simple, but robust way to load the shared library is to get the absolute path name and load it using the cdll object of ctypes.: .. code-block:: python lib = ctypes.cdll[] However, on Windows accessing an attribute of the cdll method will load the first DLL by that name found in the current directory or on the PATH. Loading the absolute path name requires a little finesse for cross-platform work since the extension of shared libraries varies. There is a ``ctypes.util.find_library`` utility available that can simplify the process of finding the library to load but it is not foolproof. Complicating matters, different platforms have different default extensions used by shared libraries (e.g. .dll -- Windows, .so -- Linux, .dylib -- Mac OS X). This must also be taken into account if you are using c-types to wrap code that needs to work on several platforms. NumPy provides a convenience function called :func:`ctypeslib.load_library` (name, path). This function takes the name of the shared library (including any prefix like 'lib' but excluding the extension) and a path where the shared library can be located. It returns a ctypes library object or raises an OSError if the library cannot be found or raises an ImportError if the ctypes module is not available. (Windows users: the ctypes library object loaded using :func:`load_library` is always loaded assuming cdecl calling convention. See the ctypes documentation under ctypes.windll and/or ctypes.oledll for ways to load libraries under other calling conventions). The functions in the shared library are available as attributes of the ctypes library object (returned from :func:`ctypeslib.load_library`) or as items using ``lib['func_name']`` syntax. The latter method for retrieving a function name is particularly useful if the function name contains characters that are not allowable in Python variable names. Converting arguments -------------------- Python ints/longs, strings, and unicode objects are automatically converted as needed to equivalent c-types arguments The None object is also converted automatically to a NULL pointer. All other Python objects must be converted to ctypes-specific types. There are two ways around this restriction that allow c-types to integrate with other objects. 1. Don't set the argtypes attribute of the function object and define an :obj:`_as_parameter_` method for the object you want to pass in. The :obj:`_as_parameter_` method must return a Python int which will be passed directly to the function. 2. Set the argtypes attribute to a list whose entries contain objects with a classmethod named from_param that knows how to convert your object to an object that ctypes can understand (an int/long, string, unicode, or object with the :obj:`_as_parameter_` attribute). NumPy uses both methods with a preference for the second method because it can be safer. The ctypes attribute of the ndarray returns an object that has an _as_parameter\_ attribute which returns an integer representing the address of the ndarray to which it is associated. As a result, one can pass this ctypes attribute object directly to a function expecting a pointer to the data in your ndarray. The caller must be sure that the ndarray object is of the correct type, shape, and has the correct flags set or risk nasty crashes if the data-pointer to inappropriate arrays are passsed in. To implement the second method, NumPy provides the class-factory function :func:`ndpointer` in the :mod:`ctypeslib` module. This class-factory function produces an appropriate class that can be placed in an argtypes attribute entry of a ctypes function. The class will contain a from_param method which ctypes will use to convert any ndarray passed in to the function to a ctypes-recognized object. In the process, the conversion will perform checking on any properties of the ndarray that were specified by the user in the call to :func:`ndpointer`. Aspects of the ndarray that can be checked include the data-type, the number-of-dimensions, the shape, and/or the state of the flags on any array passed. The return value of the from_param method is the ctypes attribute of the array which (because it contains the _as_parameter\_ attribute pointing to the array data area) can be used by ctypes directly. The ctypes attribute of an ndarray is also endowed with additional attributes that may be convenient when passing additional information about the array into a ctypes function. The attributes **data**, **shape**, and **strides** can provide c-types compatible types corresponding to the data-area, the shape, and the strides of the array. The data attribute reutrns a ``c_void_p`` representing a pointer to the data area. The shape and strides attributes each return an array of ctypes integers (or None representing a NULL pointer, if a 0-d array). The base ctype of the array is a ctype integer of the same size as a pointer on the platform. There are also methods data_as({ctype}), shape_as(), and strides_as(). These return the data as a ctype object of your choice and the shape/strides arrays using an underlying base type of your choice. For convenience, the **ctypeslib** module also contains **c_intp** as a ctypes integer data-type whose size is the same as the size of ``c_void_p`` on the platform (it's value is None if ctypes is not installed). Calling the function -------------------- The function is accessed as an attribute of or an item from the loaded shared-library. Thus, if "./mylib.so" has a function named "cool_function1" , I could access this function either as: .. code-block:: python lib = numpy.ctypeslib.load_library('mylib','.') func1 = lib.cool_function1 # or equivalently func1 = lib['cool_function1'] In ctypes, the return-value of a function is set to be 'int' by default. This behavior can be changed by setting the restype attribute of the function. Use None for the restype if the function has no return value ('void'): .. code-block:: python func1.restype = None As previously discussed, you can also set the argtypes attribute of the function in order to have ctypes check the types of the input arguments when the function is called. Use the :func:`ndpointer` factory function to generate a ready-made class for data-type, shape, and flags checking on your new function. The :func:`ndpointer` function has the signature .. function:: ndpointer(dtype=None, ndim=None, shape=None, flags=None) Keyword arguments with the value ``None`` are not checked. Specifying a keyword enforces checking of that aspect of the ndarray on conversion to a ctypes-compatible object. The dtype keyword can be any object understood as a data-type object. The ndim keyword should be an integer, and the shape keyword should be an integer or a sequence of integers. The flags keyword specifies the minimal flags that are required on any array passed in. This can be specified as a string of comma separated requirements, an integer indicating the requirement bits OR'd together, or a flags object returned from the flags attribute of an array with the necessary requirements. Using an ndpointer class in the argtypes method can make it significantly safer to call a C-function using ctypes and the data- area of an ndarray. You may still want to wrap the function in an additional Python wrapper to make it user-friendly (hiding some obvious arguments and making some arguments output arguments). In this process, the **requires** function in NumPy may be useful to return the right kind of array from a given input. Complete example ---------------- In this example, I will show how the addition function and the filter function implemented previously using the other approaches can be implemented using ctypes. First, the C-code which implements the algorithms contains the functions zadd, dadd, sadd, cadd, and dfilter2d. The zadd function is: .. code-block:: c /* Add arrays of contiguous data */ typedef struct {double real; double imag;} cdouble; typedef struct {float real; float imag;} cfloat; void zadd(cdouble *a, cdouble *b, cdouble *c, long n) { while (n--) { c->real = a->real + b->real; c->imag = a->imag + b->imag; a++; b++; c++; } } with similar code for cadd, dadd, and sadd that handles complex float, double, and float data-types, respectively: .. code-block:: c void cadd(cfloat *a, cfloat *b, cfloat *c, long n) { while (n--) { c->real = a->real + b->real; c->imag = a->imag + b->imag; a++; b++; c++; } } void dadd(double *a, double *b, double *c, long n) { while (n--) { *c++ = *a++ + *b++; } } void sadd(float *a, float *b, float *c, long n) { while (n--) { *c++ = *a++ + *b++; } } The code.c file also contains the function dfilter2d: .. code-block:: c /* Assumes b is contiguous and a has strides that are multiples of sizeof(double) */ void dfilter2d(double *a, double *b, int *astrides, int *dims) { int i, j, M, N, S0, S1; int r, c, rm1, rp1, cp1, cm1; M = dims[0]; N = dims[1]; S0 = astrides[0]/sizeof(double); S1=astrides[1]/sizeof(double); for (i=1; idimensions[0]; int dims[1]; dims[0] = n; PyArrayObject* ret; ret = (PyArrayObject*) PyArray_FromDims(1, dims, NPY_DOUBLE); int i; char *aj=a->data; char *bj=b->data; double *retj = (double *)ret->data; for (i=0; i < n; i++) { *retj++ = *((double *)aj) + *((double *)bj); aj += a->strides[0]; bj += b->strides[0]; } return (PyObject *)ret; } """ import Instant, numpy ext = Instant.Instant() ext.create_extension(code=s, headers=["numpy/arrayobject.h"], include_dirs=[numpy.get_include()], init_code='import_array();', module="test2b_ext") import test2b_ext a = numpy.arange(1000) b = numpy.arange(1000) d = test2b_ext.add(a,b) Except perhaps for the dependence on SWIG, Instant is a straightforward utility for writing extension modules. PyInline -------- This is a much older module that allows automatic building of extension modules so that C-code can be included with Python code. It's latest release (version 0.03) was in 2001, and it appears that it is not being updated. PyFort ------ PyFort is a nice tool for wrapping Fortran and Fortran-like C-code into Python with support for Numeric arrays. It was written by Paul Dubois, a distinguished computer scientist and the very first maintainer of Numeric (now retired). It is worth mentioning in the hopes that somebody will update PyFort to work with NumPy arrays as well which now support either Fortran or C-style contiguous arrays.