Cython...

Prelude: "Best Practices"

Here is an example of "best practices". Sometimes people strongly suggest that one should "write code, then later optimize just the parts that really need it, as proven by profiling, etc." In fact, extremely experienced people often make many claims about how to write computer software. (See, e.g., http://docs.cython.org/src/tutorial/profiling_tutorial.html, which says exactly this repeatedly.) My advice is to listen, learn something, but never ever, ever believe anything you hear! Be especially suspicious of the phrase "best practices".

I spent the last week successfully writing some truly blazingly fast code (for computing Hilbert modular forms). I did it as follows:

1. I wrote some incredibly slow code in a few toy cases to get a basic understanding of the algorithms involved.
2. I debugged the code and got a much, much better understanding of the relevant algorithms. This mostly entailed doing mathematical exercises, proving numerous little results, making things more explicit than in any papers in order to make sure I understood every detail correctly, etc., and finally spending hours tracking down a sign inconsistency between a paper and a book.
3. Starting mostly from scratch again, I carefully thought through (often with pencil and paper) how to structure my data and corresponding algorithms for maximum speed, using C-level data types and avoiding malloc and Python for anything performance critical. I tried as much as possible to think through all the data structures and algorithms before writing much code, and to design and implement them to be optimal from the start. The results dramatically exceeded my expectations.
I'm mostly pleased with the results, though some of the code is ugly and hard to maintain due to the lack of operator overloading in Cython/C. Thanks to recent improvements in Cython, it will be possible to fix this problem by rewriting some of the code to use Cython's support for C++ operator overloading. I wish I had known about this a week ago!

A point of the above story is that despite my having written a large amount of code in Pari, C, C++, Magma, and Python over the last 15 years, and having seen even more code by other people while putting together Sage, I absolutely do not know the best way to implement anything. I try to always approach implementing an algorithm as an experience that could potentially change how I approach coding. I hope you do the same. Try new ideas and approaches that you make up, based somewhat on the advice of sage programmers, but with your own spin.

Catchy phrases like "premature optimization is the root of all evil" are cute. Don't believe things just because they are cute or catchy.

Writing New Code Using Cython

Declaring C Data Types

Creating "Cdef'd" Extension Classes

Cdef'd classes are similar to Python classes, but instances can have fields of arbitrary C data types and C-level methods, hence they can be extremely fast. Only single inheritance is supported.

Making Use of Existing C/C++ Libraries Using Cython

Wrapping C Code

Wrapping C++ Classes

Exceptions and Crashes

Handling Exceptions

Using the Gnu Debugger (GDB) to Track Down Crashes

Larger Projects: Putting Code in Multiple Files (pxd Files)

Misc: Profiling Cython Code

If you use

sage: %prun call_some_function(...)

and that function involves lots of Cython code, you'll likely be disappointed at how little information you get -- most of the Cython calls are missing. To get around this, you can rebuild your Cython module with profiling enabled. The Cython profiling tutorial (http://docs.cython.org/src/tutorial/profiling_tutorial.html) is an excellent guide that explains exactly how to do this and discusses several surprising caveats.
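
To make the profiling discussion concrete, here is a minimal sketch in plain Python, using only the standard library's cProfile and pstats modules, of the kind of per-function report a Python-level profiler produces. The function bodies and the helper profile_report are hypothetical stand-ins invented for illustration, not code from any real module. In this sketch every function is pure Python, so every call shows up in the report; functions in a compiled Cython module appear this way only if the module was built with profiling enabled, which the Cython profiling tutorial describes doing with the directive comment "# cython: profile=True" at the top of the .pyx file.

```python
import cProfile
import io
import pstats


def inner_loop(n):
    """Hypothetical hot spot: sum of the squares of 0, 1, ..., n-1."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def call_some_function(n=10000):
    """Hypothetical stand-in for the function being profiled."""
    # Call the hot spot several times so it dominates the report.
    return sum(inner_loop(n) for _ in range(5))


def profile_report(func, *args):
    """Run func(*args) under cProfile.

    Returns the function's result together with a text report
    sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


result, report = profile_report(call_some_function, 10000)
print(report)
```

The report lists one row per function, including inner_loop. If a function you expected is missing from such a report for your own Cython code, that is a hint the module was not compiled with profiling support.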