Math 480: Lecture 9: Cython Language Constructions

In this lecture, we will systematically go through the most important standard Cython language constructions.

We will not talk about using numpy from Cython, dynamic memory allocation, or subtleties of the C language in this lecture.  


Declaring Cython Variables Using "cdef"

cdef type_name variable_name1, variable_name2, ...

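The single most important statement that Cython adds to Python is cdef type_name. This allows you to declare a variable to have a type. The possibilities for the type include C types (int, long int, float, double, char, char*) and Python types (list, dict), as in the following examples: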

{{{id=23|
%cython
def C_type_example():
    cdef int n=5/3, x=2^3    # ^ = exclusive or -- no preparser in Cython!
    cdef long int m=908230948239489394
    cdef float y=4.5969
    cdef double z=2.13
    cdef char c='c'
    cdef char* s="a C string"
    print n, x, m, y, z, c, s
///
}}}

{{{id=22|
C_type_example()
///
1 1 908230948239489394 4.59689998627 2.13 99 a C string
}}}

{{{id=24|
%cython
def type_example2(x, y):
    cdef list v
    cdef dict z
    v = x
    z = y
///
}}}

{{{id=25|
type_example2([1,2], {'a':5})
///
}}}

{{{id=26|
type_example2(17, {'a':5})
///
Traceback (most recent call last):
  ...
    v = x
TypeError: Expected list, got sage.rings.integer.Integer
}}}

{{{id=27|
type_example2([1,2], 17)
///
Traceback (most recent call last):
  ...
    z = y
TypeError: Expected dict, got sage.rings.integer.Integer
}}}

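The values printed by C_type_example can be reproduced without a compiler. The following is a minimal sketch in plain Python 3 (not Cython), and the function name c_type_example_values is made up for illustration. It mimics three C behaviours: integer division truncates, ^ is bitwise exclusive or (in Sage proper, the preparser would turn 2^3 into a power), and a C float keeps only about 7 significant decimal digits, so 4.5969 comes back slightly changed.

```python
import struct


def c_type_example_values():
    """Plain-Python analogue of some values computed in C_type_example."""
    n = 5 // 3   # C integer division of positive ints truncates, like //
    x = 2 ^ 3    # bitwise exclusive or: 0b10 ^ 0b11 == 0b01
    # Round-trip 4.5969 through a 4-byte IEEE float, as a C float stores it.
    y = struct.unpack("f", struct.pack("f", 4.5969))[0]
    c = ord("c")  # a C char is a small integer, here the ASCII code of 'c'
    return n, x, y, c


if __name__ == "__main__":
    print(c_type_example_values())
```

The struct round trip is only a stand-in for the float declaration; Cython itself performs the conversion in the generated C code.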
For the Cython source code of Sage integers, see /src/rings/integer.pxd and /src/rings/integer.pyx in the notebook.  Also, browse /src/libs/gmp/ for the definition of functions such as mpz_set below.

{{{id=29|
%cython
from sage.rings.integer cimport Integer   # note the cimport!
def unsafe_mutate(Integer n, Integer m):
    mpz_set(n.value, m.value)
///
}}}

{{{id=28|
n = 15
print n, id(n)
unsafe_mutate(n, 2011)
print n, id(n)
///
15 54852752
2011 54852752
}}}

Explicit casts

  <data_type> foo

If you need to "brutally" force the compiler to treat a variable of one data type as another, you have to use an explicit cast. In Java and C/C++ you would use parentheses around a type name, as follows:

int i = 1;
long j = 3;
i = (int)j;

But in Cython, you use angle brackets (note: in Cython this particular cast isn't strictly necessary, but in Java it is):

{{{id=37|
%cython
cdef int i = 1
cdef long j = 3
i = <int> j
print i
///
3
}}}
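A narrowing cast such as long to int is harmless for a value like 3, but for larger values the high bits are silently discarded. As a rough illustration in plain Python 3 (not Cython), the sketch below uses ctypes.c_int32, which does no overflow checking, to mimic a cast to a 32-bit signed int. The helper name cast_to_int32 is hypothetical, and the assumption that int is 32 bits wide holds on common platforms but is not guaranteed by the C standard.

```python
import ctypes


def cast_to_int32(value):
    """Mimic a C cast to a 32-bit signed int: keep only the low 32 bits."""
    # ctypes integer types do no overflow checking; out-of-range values wrap.
    return ctypes.c_int32(value).value


if __name__ == "__main__":
    for j in (3, 2**32 + 3, 2**31):
        print(j, "->", cast_to_int32(j))
```

Small values survive unchanged, which is why the cast in the cell above prints 3.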

Here's an example where we convert a Python string to a char* (i.e., a pointer to an array of characters), then change one of the characters, thus mutating an immutable string.

{{{id=34|
%cython
def unsafe_mutate_str(bytes s, n, c):
    cdef char* t = s
    t[n] = ord(c)
///
}}}

{{{id=33|
s = 'This is an immutable string.'
print s, id(s), hash(s)
unsafe_mutate_str(s, 9, ' ')
unsafe_mutate_str(s, 11, ' ')
unsafe_mutate_str(s, 12, ' ')
print s, id(s), hash(s)
///
This is an immutable string. 72268152 -5654925717092887818
This is a mutable string. 72268152 -5654925717092887818
}}}

{{{id=21|
hash('This is a mutable string.')
///
-7476166060485806082
}}}
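Note that the hash printed after the mutation is unchanged, which is consistent with CPython caching a string's hash inside the object; this is one reason the trick is unsafe. For contrast, here is a small sketch in plain Python 3 of the safe way to get the same edits: copy the bytes into a bytearray, modify the copy, and leave the original alone. The function name mutate_copy and the (index, character) edit format are invented for this illustration.

```python
def mutate_copy(s, edits):
    """Return a new bytes object with the given (index, character) edits applied."""
    buf = bytearray(s)          # mutable copy; the immutable input is untouched
    for index, char in edits:
        buf[index] = ord(char)  # same assignment as t[n] = ord(c), but on the copy
    return bytes(buf)


if __name__ == "__main__":
    original = b"This is an immutable string."
    changed = mutate_copy(original, [(9, " "), (11, " "), (12, " ")])
    print(original)
    print(changed)
```

Because the result is a new object, its hash is computed from the new contents, and any dictionary keyed on the original string keeps working.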

Declaring External Data Types, Functions, Etc.

In order for Cython to make use of a function or data type defined in an external C/C++ library, Cython has to be told explicitly what the input and output types are for that function and what the function should be called. Cython will then generate appropriate C/C++ code and conversions based on these assumptions. There are a large number of files in Sage and Cython itself that declare all the functions provided by various standard libraries, but sometimes you want to make use of a function defined elsewhere, e.g., in your own C/C++ library, so you have to declare things yourself. The purpose of the following examples is to illustrate how to do this. It is also extremely useful to look at the Sage library source code for thousands of additional nontrivial working examples.

cdef extern from "filename.h":
     declarations ...    

The following examples illustrate several different possible declarations.  We'll describe each line in detail.

This first example declares a single type of round function on doubles -- it's as straightforward as it gets.

{{{id=54|
%cython
cdef extern from "math.h":
    double round(double)

def f(double n):
    return round(n)
///
}}}

{{{id=53|
f(10.53595)
///
11.0
}}}

Now suppose we want a version of round that returns a long.  By consulting the man page for round, we find that there is a round function declared as follows:

long int lround(double x);

We can declare it exactly like the above, or we can use a C name specifier, which lets us tell Cython we want to call the function "round" in our Cython code, but when Cython generates code it should actually emit "lround". This is what we do below.

{{{id=56|
%cython
cdef extern from "math.h":
    long int round "lround"(double)

def f(double n):
    return round(n)
///
}}}

{{{id=51|
f(10.53595)
///
11
}}}

Another case where C name specifiers are useful is when you want to be able to call both a C library version of a function and a builtin Python function with the same name.

{{{id=61|
%cython
cdef extern from "stdlib.h":
    int c_abs "abs"(int i)

def myabs(n):
    print abs(n)
    print c_abs(n)
///
}}}

{{{id=62|
myabs(-10)
///
10
10
}}}

We can also declare data types and variables using extern.  To write the code below, I used man several times on each referenced function.  I knew the relevant functions because I read a book on the C programming language when I was a freshman; learning the basics of the C programming language and standard libraries is a very good idea if you want to be able to make effective use of Cython (or computers in general, since most systems programming is done in C).  

Coming up with the declarations below is a little bit of an art form, in that they are not exactly what is given from the man pages, though they are close.   Just realize that the declarations you give here do exactly one thing: they inform Cython about what C code it should generate, e.g., it will convert the "w" below to a char* before calling the fopen function.  That's it -- that's all the declarations do.  They do not have to be perfect.    Click on the .html file below, and look at the corresponding C code, to see what I mean. 

{{{id=63|
%cython
cdef extern from "stdio.h":
    ctypedef void FILE    # use void if you don't care about internal structure
    FILE* fopen(char* filename, char* mode)
    int fclose(FILE *stream)
    int fprintf(FILE *stream, char *format, ...)

def f(filename):
    cdef FILE* file
    file = fopen(filename, "w")
    if file == NULL:      # fopen returns NULL on failure
        raise IOError("could not open %s" % filename)
    fprintf(file, "Hi Mom!")
    fclose(file)
///
}}}

{{{id=60|
f('foo.txt')
print open('foo.txt').read()
///
Hi Mom!
}}}

Defining New Cython Functions

In addition to using the cdef keyword to define variables as above, we can also define functions. These are like Python functions, but you can declare the input types and the return type explicitly, and calling them is then blazingly fast, as compared to calling regular Python functions. (Remember, most of the point of Cython is speed.)

cdef return_type function_name(type1 input1, type2 input2...):

{{{id=1|
%cython
cdef int add_cython(int a, int b):
    return a + b

def add_python(int a, int b):
    return a + b

def f(int n):
    cdef int i, s=0
    for i in range(n):
        s += add_cython(s, i)
    return s

def g(int n):
    cdef int i, s=0
    for i in range(n):
        s += add_python(s, i)
    return s
///
}}}

{{{id=66|
timeit('f(10^6)')
///
625 loops, best of 3: 595 µs per loop
}}}

{{{id=65|
timeit('g(10^6)')
///
5 loops, best of 3: 94.6 ms per loop
}}}

{{{id=68|
94.6/.595
///
158.991596638655
}}}
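Much of the gap between f and g comes from the cost of a Python-level function call, which a cdef function avoids. That overhead can be seen even in plain Python 3 by timing a loop that calls a tiny function against the same loop with the addition written inline. The sketch below is only an illustration with made-up names (add, sum_with_calls, sum_inline, compare); it uses a running sum rather than the recurrence in the cell above, because Python integers never overflow and would grow very large. Timings vary by machine, so none are quoted here.

```python
import timeit


def add(a, b):
    return a + b


def sum_with_calls(n):
    """Sum 0..n-1, paying for one Python function call per iteration."""
    s = 0
    for i in range(n):
        s = add(s, i)
    return s


def sum_inline(n):
    """Same sum with the addition written inline, so no call overhead."""
    s = 0
    for i in range(n):
        s = s + i
    return s


def compare(n=10000, repeats=20):
    """Return (seconds with calls, seconds inline) for `repeats` runs each."""
    with_calls = timeit.timeit(lambda: sum_with_calls(n), number=repeats)
    inline = timeit.timeit(lambda: sum_inline(n), number=repeats)
    return with_calls, inline


if __name__ == "__main__":
    print(compare())
```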

Notice that add_python is callable from the interpreter, but add_cython isn't:

{{{id=71|
add_python(2,8)
///
10
}}}

{{{id=69|
add_cython(2,8)
///
Traceback (most recent call last):
  ...
NameError: name 'add_cython' is not defined
}}}

If we use cpdef instead of cdef, then everything is almost identical, except that the cpdef'd function can also be called from Python.  This is often extremely useful for testing and general usability.  The cpdef'd function is slower to call, though; in this example, by a factor of about 4.

cpdef return_type function_name(type1 input1, type2 input2...):

{{{id=73|
%cython
cpdef int add_cython2(int a, int b):
    return a + b

def f2(int n):
    cdef int i, s=0
    for i in range(n):
        s += add_cython2(s, i)
    return s
///
}}}

{{{id=74|
timeit('f2(10^6)')
///
125 loops, best of 3: 2.63 ms per loop
}}}

{{{id=75|
2.63/.595
///
4.42016806722689
}}}

{{{id=76|
add_cython2(2,8)
///
10
}}}

Defining New Cython Classes

One of the most powerful features of Cython is that you can define new classes that have C-level attributes and cdef'd functions. The cdef'd attributes and function calls are very, very fast to use. Note that cdef'd classes in Cython can have at most one base class; there is no support for multiple inheritance.

cdef class ClassName(base_class):  
     cdef type_name variable
     # ...
     # Then functions mostly like a Python class, except 
     # you can include cdef'd methods with input and output
     # types as in the previous section.  
     # ...
     # There are some subtleties with special methods such
     # as __add__ and __hash__; see the Cython documentation.

Here is an example in which we create a Cython class that wraps a Python string and provides the ability to change the characters of the string:

{{{id=20|
%cython
cdef class StringMutator:
    cdef bytes s    # cdef'd attribute
    def __init__(self, bytes s):
        self.s = s
    def __setitem__(self, int i, bytes a):
        if i < 0 or i >= len(self.s): raise IndexError
        if len(a) != 1: raise ValueError
        (<char*> self.s)[i] = (<char*> a)[0]
    def __repr__(self):
        return self.s
    def __str__(self):
        return "%s"%self.s
///
}}}

{{{id=30|
s = "Hello World"
t = StringMutator(s)
t[4] = 'X'
print s
print t
///
HellX World
HellX World
}}}

{{{id=13|
# setting an entry is really, really fast:
timeit("t[4]='X'", number=10^5)
///
100000 loops, best of 3: 226 ns per loop
}}}

{{{id=77|
t[100] = 'x'
///
Traceback (most recent call last):
  ...
    if i < 0 or i >= len(self.s): raise IndexError
IndexError
}}}

{{{id=78|
m = str(t); m
///
'HellX World'
}}}

{{{id=79|
t[0] = 'X'; t
///
XellX World
}}}

{{{id=80|
m
///
'HellX World'
}}}
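For comparison, the same interface can be written in plain Python 3 on top of a bytearray, which is mutable by design. This is only a sketch under that assumption; the class name ByteArrayMutator is hypothetical. It keeps the same bounds and length checks as StringMutator but works on a private copy, so the caller's original string is never modified. It will also be far slower than the cdef class, since every step goes through Python objects.

```python
class ByteArrayMutator:
    """Pure-Python stand-in for StringMutator, backed by a mutable bytearray."""

    def __init__(self, s):
        self._buf = bytearray(s)  # private copy; the caller's bytes stay intact

    def __setitem__(self, i, a):
        if i < 0 or i >= len(self._buf):
            raise IndexError("index out of range")
        if len(a) != 1:
            raise ValueError("expected exactly one byte")
        self._buf[i] = a[0]       # indexing bytes yields an int in Python 3

    def __repr__(self):
        return self._buf.decode("ascii")


if __name__ == "__main__":
    s = b"Hello World"
    t = ByteArrayMutator(s)
    t[4] = b"X"
    print(s, t)
```

As with m = str(t) in the cells above, a string taken from repr(t) is a snapshot: later assignments through t do not change it.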