NumPy C Style Guide

The NumPy C coding conventions are based on Python PEP-0007 by Guido van Rossum with a few added strictures. There are many C coding conventions and it must be emphasized that the primary goal of the NumPy conventions isn’t to choose the ‘best’, about which there is certain to be disagreement, but to achieve uniformity. Because the NumPy conventions are very close to those in PEP-0007, that PEP is used as a template below with the NumPy additions and variations in the appropriate spots.

NumPy modified PEP-0007

Introduction

This document gives coding conventions for the C code comprising the C implementation of NumPy. Note that rules are there to be broken. There are two good reasons to break a particular rule:

  1. When applying the rule would make the code less readable, even for someone who is used to reading code that follows the rules.
  2. To be consistent with surrounding code that also breaks it (maybe for historic reasons) – although this is also an opportunity to clean up someone else’s mess.

C dialect

  • Use ANSI/ISO standard C (the 1989 version of the standard). This means, amongst many other things, that all declarations must be at the top of a block (not necessarily at the top of the function).

  • Don’t use GCC extensions (e.g. don’t write multi-line strings without trailing backslashes). Preferably break long strings up onto separate lines like so:

    "blah blah"
    "blah blah"
    

    This will work with MSVC, which otherwise chokes on very long strings.

  • All function declarations and definitions must use full prototypes (i.e. specify the types of all arguments).

  • Do not use C++ style // one-line comments; they aren’t portable. Note: this will change with the proposed transition to C++.

  • No compiler warnings with major compilers (gcc, VC++, a few others). Note: NumPy still produces compiler warnings that need to be addressed.
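
The dialect rules above can be illustrated with a short sketch. It is not taken from the NumPy sources: usage_text and count_char are hypothetical names chosen for the example, and the code only assumes a C89 compiler and the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Adjacent string literals are concatenated by the compiler, so a long
 * message can be split across lines without a trailing backslash.
 * (usage_text is a hypothetical name used only in this sketch.)
 */
static const char usage_text[] =
    "usage: example [options] "
    "file";

/* Full prototype: the type of every argument is spelled out. */
static size_t count_char(const char *text, char wanted);

static size_t
count_char(const char *text, char wanted)
{
    /* C89: declarations come first in the block. */
    size_t count = 0;
    size_t i;

    assert(text != NULL);
    for (i = 0; text[i] != '\0'; i++) {
        if (text[i] == wanted) {
            /* A nested block may open with its own declarations. */
            size_t next = count + 1;

            count = next;
        }
    }
    return count;
}
```

Calling count_char(usage_text, ' ') counts the spaces in the concatenated message. Only /* ... */ comments are used, and the declaration of next sits at the top of the inner block rather than at the top of the function.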

Code lay-out

  • Use 4-space indents and no tabs at all.

  • No line should be longer than 80 characters. If this and the previous rule together don’t give you enough room to code, your code is too complicated; consider using subroutines.

  • No line should end in whitespace. If you think you need significant trailing whitespace, think again: somebody’s editor might delete it as a matter of routine.

  • Function definition style: function name in column 1, outermost curly braces in column 1, blank line after local variable declarations:

    static int
    extra_ivars(PyTypeObject *type, PyTypeObject *base)
    {
        int t_size = PyType_BASICSIZE(type);
        int b_size = PyType_BASICSIZE(base);
    
        assert(t_size >= b_size); /* type smaller than base! */
        ...
        return 1;
    }
    

    If the transition to C++ goes through it is possible that this form will be relaxed so that short class methods meant to be inlined can have the return type on the same line as the function name. However, that is yet to be determined.

  • Code structure: one space between keywords like if, for and the following left parenthesis; no spaces inside the parentheses; braces around all if branches and no statements on the same line as the if. They should be formatted as shown:

    if (mro != NULL) {
        one_line_statement;
    }
    else {
        ...
    }
    
    
    for (i = 0; i < n; i++) {
        one_line_statement;
    }
    
    
    while (isstuff) {
        dostuff;
    }
    
    
    do {
        stuff;
    } while (isstuff);
    
    
    switch (kind) {
        /* Boolean kind */
        case 'b':
            return 0;
        /* Unsigned int kind */
        case 'u':
            ...
        /* Anything else */
        default:
            return 3;
    }
    
  • The return statement should not get redundant parentheses:

    return Py_None; /* correct */
    return(Py_None); /* incorrect */
    
  • Function and macro call style: foo(a, b, c), no space before the open paren, no spaces inside the parens, no spaces before commas, one space after each comma.

  • Always put spaces around assignment, Boolean and comparison operators. In expressions using a lot of operators, add spaces around the outermost (lowest priority) operators.

  • Breaking long lines: if you can, break after commas in the outermost argument list. Always indent continuation lines appropriately, e.g.,

    PyErr_SetString(PyExc_TypeError,
            "Oh dear, you messed up.");
    

    Here appropriately means at least two indentation levels (8 spaces, since tabs are not allowed). It isn’t necessary to line everything up with the opening parenthesis of the function call.

  • When you break a long expression at a binary operator, the operator goes at the end of the previous line, e.g.,

    if (type->tp_dictoffset != 0 &&
            base->tp_dictoffset == 0 &&
            type->tp_dictoffset == b_size &&
            (size_t)t_size == b_size + sizeof(PyObject *)) {
        return 0;
    }
    

    Note that the terms in the multi-line Boolean expression are indented so as to make the beginning of the code block clearly visible.

  • Put blank lines around functions, structure definitions, and major sections inside functions.

  • Comments go before the code they describe. Multi-line comments should be like so:

    /*
     * This would be a long
     * explanatory comment.
     */
    

    Trailing comments should be used sparingly. Instead of

    if (yes) {/* Success! */
    

    do

    if (yes) {
        /* Success! */
    
  • All functions and global variables should be declared static when they aren’t needed outside the current compilation unit.

  • Declare external functions and variables in a header file.
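
As a rough illustration, several of the lay-out rules above can be combined in one small compilation unit. The sketch below is not NumPy code: kind_rank and count_kinds_in_range are hypothetical helpers, and the rank values are arbitrary.

```c
#include <stddef.h>

/*
 * Return a small integer rank for a type kind character.
 * Hypothetical helper; the rank values are arbitrary.
 */
static int
kind_rank(char kind)
{
    switch (kind) {
        /* Boolean kind */
        case 'b':
            return 0;
        /* Unsigned int kind */
        case 'u':
            return 1;
        /* Signed int kind */
        case 'i':
            return 2;
        /* Anything else */
        default:
            return 3;
    }
}

/*
 * Count the entries of kinds whose rank lies in [low, high].
 * Returns -1 on invalid arguments.
 */
static int
count_kinds_in_range(const char *kinds, size_t n, int low, int high)
{
    int count = 0;
    size_t i;

    if (kinds == NULL || low > high) {
        return -1;
    }

    for (i = 0; i < n; i++) {
        int rank = kind_rank(kinds[i]);

        /* Operator ends the line; continuation indented twice. */
        if (rank >= low &&
                rank <= high) {
            count++;
        }
    }
    return count;
}
```

Both functions are static because nothing outside this unit needs them. The return type sits on its own line, the function name and outermost braces start in column 1, every if has braces, and the continuation of the condition is indented by eight spaces so that the body stays visually distinct.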

Naming conventions

  • There has been no consistent prefix for NumPy public functions, but they all begin with a prefix of some sort, followed by an underscore, and are in camel case: PyArray_DescrAlignConverter, NpyIter_GetIterNext. In the future the names should be of the form Npy*_PublicFunction, where the star is something appropriate.
  • Public Macros should have a NPY_ prefix and then use upper case, for example, NPY_DOUBLE.
  • Private functions should be lower case with underscores, for example: array_real_get. Single leading underscores should not be used, but some current function names violate that rule due to historical accident. Those functions should be renamed at some point.
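
A minimal sketch of how these naming rules might look together follows. Every name in it (NPY_EXAMPLE_MAXDIMS, NpyExample_ClampNdim, clamp_ndim) is invented for illustration and does not exist in NumPy.

```c
/* Public macro: NPY_ prefix, then upper case (hypothetical name). */
#define NPY_EXAMPLE_MAXDIMS 32

/*
 * Public function of the form Npy*_PublicFunction (hypothetical name).
 * In real code this prototype would be declared in a header file.
 */
int NpyExample_ClampNdim(int ndim);

/*
 * Private helper: lower case with underscores, declared static,
 * and no leading underscore.
 */
static int
clamp_ndim(int ndim)
{
    if (ndim < 0) {
        return 0;
    }
    if (ndim > NPY_EXAMPLE_MAXDIMS) {
        return NPY_EXAMPLE_MAXDIMS;
    }
    return ndim;
}

int
NpyExample_ClampNdim(int ndim)
{
    return clamp_ndim(ndim);
}
```

The public entry point is a thin wrapper around the private helper, so the prefixed camel-case name is the only symbol visible outside the compilation unit.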

Function documentation

NumPy doesn’t have a C function documentation standard at this time, but needs one. Most NumPy functions are not documented in the code, and that should change. One possibility is Doxygen with a plugin so that the same NumPy style used for Python functions can also be used for documenting C functions; see the files in doc/cdoc/.