Numpy is an awesome Python library included in Sage. It's been under development for a long time by many people in the numerical Python community -- mainly Travis Oliphant -- and grew out of other libraries such as Numarray and Numeric, which were developed for industrial applications. It is the canonical numerical data manipulation library for Python.
Think of numpy as:
Resources
Sage Supports Diverse Perspectives on Matrices
Numpy arrays are vastly different than Sage matrices, to put it mildly. Essentially every rule, constraint, convention, etc., is different. They server a very different purpose. They are meant to be the ultimate Python general $n$-dimensional dense array object. In contrast, Sage matrices are meant to be very mathematical objects with clear mathematically meaningful semantics and very fast algorithms mainly for algebraic and exact computations. That said, there is a strong interest by some people in numerical computation to make Sage's matrix more science/engineering friendly, especially for structured and sparse matrices.
{{{id=19| /// }}} {{{id=18| /// }}} {{{id=15| /// }}} {{{id=6| /// }}}Example Image Compression Using the Singular Value Decomposition:
This example was part of a math 480 student project two years ago. It illustrates using something in numerical linear algebra called "Singular Value Decomposition" to compress an image. The basic idea is that the SVD algorithm can be used to approximate a matrix of real numbers. So we take an image, greyscale it, then use SVD find three much smaller matrices whose matrix product is a compressed version of the original greyscale matrix. Instead of storing the original matrix, we store these three other matrices, which takes much less space. To display the image, we have to multiply the three matrices (i.e., that's the decompression algorithm).
Mathematically, the Singular Value Decomposition of a real matrix $A$ is a decomposition
$$A = U S V$$
where $U$ and $V$ are unitary, so $U^t U = U U^t = I$ and same for $V$, and $S$ is a diagonal (but not necessarily square) matrix. The diagonal entries of $S$ are called the singular values of $A$. They are the square roots of the eigenvalues of the real symmetric matrix $AA^t.$ (SVD also makes sense for complex matrices.)
{{{id=12| import pylab, numpy # Read in a png image as a *numpy array*. UW = pylab.imread(DATA + 'uw.png') # Convert to greyscale A_image = numpy.mean(UW, 2) # Use numpy to compute the singular value decomposition of the image u,s,v = numpy.linalg.svd(A_image) # SVD is a big deal in numerical linear algebra. # (See http://trac.sagemath.org/sage_trac/ticket/9011) S = numpy.zeros( A_image.shape ) S[:len(s),:len(s)] = numpy.diag(s) # The SVD is a factorization of A_image as a product: U * S * V # We verify that this is indeed the case. # WARNING u * S is *NOT* matrix multiplication in numpy, # it's componentwise multiplication. Instead use "numpy.dot" to multiply. check = numpy.dot(u, numpy.dot(S, v)) print check == A_image # Of course, nothing is ever exactly right with floating point matrices... # so we check for closeness. print numpy.allclose(check, A_image, 1e-4, 1e-4) /// [[False False False ..., False False False] [False False False ..., False False False] [False False False ..., False False False] ..., [False False False ..., False False False] [False False False ..., False False False] [False False False ..., False False False]] True }}} {{{id=5| A_image = numpy.mean(UW, 2) u,s,v = numpy.linalg.svd(A_image) S = numpy.zeros( A_image.shape ) S[:len(s),:len(s)] = numpy.diag(s) @interact def svd_image(i = ("Eigenvalues (quality)",(20,(1..A_image.shape[0])))): # Thus the following expression involves only the first i eigenvalues, # and the first i rows of v and the first i columns of u. # Thus we use i*m + i*n + i = i*(m+n+1) storage, compared to the # original image that uses m*n storage. A_approx = numpy.dot(numpy.dot(u[:,:i], S[:i,:i]), v[:i,:]) # Now plot both matrices using Sage's matrix_plot command. g = graphics_array([matrix_plot(A_approx), matrix_plot(A_image)]) # And show the plot, with no axes: show(g, axes=False, figsize=(8,3)) # And add a caption html("
|
Create a numpy array as follows. For numeric types, one can and should explicitly give the datatype (dtype), and it can be any numerical type or generic Python objects. This is already much different than Sage's very mathematical matrices.
{{{id=21| Z = numpy.array([[sqrt(2) + 3, random_matrix(QQ, 2)], [1.5, 'sage']]) Z /// array([[sqrt(2) + 3, [ -1 -1] [1/2 -1]], [1.50000000000000, sage]], dtype=object) }}} {{{id=1| Z + Z /// array([[2*sqrt(2) + 6, [-2 -2] [ 1 -2]], [3.00000000000000, sagesage]], dtype=object) }}} {{{id=26| /// }}}Note: It might be possible to accomplish something similar (matrix whose entries are arbitrary) using Sage matrices by defining a new ring, as illustrated below.
{{{id=27| class ObjRing(sage.rings.ring.Field): def __init__(self): ParentWithGens.__init__(self, self, ('x',), normalize=False, category = EuclideanDomains()) def __call__(self, x): return x def _coerce_map_from_(self, S): return True def __repr__(self): return "'Field' of all Python objects" /// }}} {{{id=32| R = ObjRing() /// }}} {{{id=28| Z = matrix(R, 2, [[sqrt(2) + 3, random_matrix(QQ, 2)], [1.5, 'sage']]); Z /// [ sqrt(2) + 3 [-1 1] [-1 1]] [1.50000000000000 'sage'] }}} {{{id=25| Z + Z /// [ 2*sqrt(2) + 6 [-2 2] [-2 2]] [3.00000000000000 'sagesage'] }}}But this doesn't quite work yet...
{{{id=24| Z*Z /// Traceback (most recent call last): File "What the heck does that mean?
Numpy arrays can have any number of dimensions! Think of these higher dimensional arrays as arrays of arrays (of arrays...). They are useful, e.g., in image processing, where there is one array for each color channel.
{{{id=38| M = numpy.array([[[1,2], [3,4]], [[1/3,5], [10,2]], [[8,1], [-5,2]]], dtype=int) M /// array([[[ 1, 2], [ 3, 4]], [[ 0, 5], [10, 2]], [[ 8, 1], [-5, 2]]]) }}}The actual array entries are NOT Python int's. This means that manipulations with Numpy arrays can be implemented in such a way that they are blazingly fast.
{{{id=37| type(M[0,0,0]) ///It's a 3-dimensional array:
{{{id=176| M.ndim /// 3 }}}M.flat gives an iterator over all elements in the array:
{{{id=177| for x in M.flat: print x, /// 1 2 3 4 0 5 10 2 8 1 -5 2 }}} {{{id=173| /// }}}dtype: "A data-type object that fully describes each fixed-length item in the array. The dtype attribute can be set to anything that can be interpreted as a data type (numpy includes 21 of these). Setting this attribute allows you to change the interpretation of the data in the array. The new data type must be compatible with the array's current data type."
{{{id=36| M.dtype /// dtype('int64') }}} {{{id=175| /// }}}Changing the dtype means that the underlying raw data in the array is unchanged -- it's just interpreted differently. This is not like Sage's change_ring command.
{{{id=44| M.dtype = numpy.dtype('float64') M /// array([[[ 4.94065646e-324, 9.88131292e-324], [ 1.48219694e-323, 1.97626258e-323]], [[ 0.00000000e+000, 2.47032823e-323], [ 4.94065646e-323, 9.88131292e-324]], [[ 3.95252517e-323, 4.94065646e-324], [ NaN, 9.88131292e-324]]]) }}} {{{id=47| M.dtype = numpy.dtype('int64') M /// array([[[ 1, 2], [ 3, 4]], [[ 0, 5], [10, 2]], [[ 8, 1], [-5, 2]]]) }}}For coercion that makes sense, use astype (which does not change M):
{{{id=48| M.astype(numpy.dtype('float64')) /// array([[[ 1., 2.], [ 3., 4.]], [[ 0., 5.], [ 10., 2.]], [[ 8., 1.], [ -5., 2.]]]) }}} {{{id=168| /// }}}There is excellent support in Cython for gaining direct high-speed access to Numpy arrays.
{{{id=170| %cython cimport numpy as cnumpy def mysum(cnumpy.ndarray a): cdef double *p =We create a numpy array:
{{{id=4| A = numpy.array([[1/3,2,3], [4,5,6], [7,8,9]], dtype=float); A /// array([[ 0.33333333, 2. , 3. ], [ 4. , 5. , 6. ], [ 7. , 8. , 9. ]]) }}} {{{id=8| type(A[0,0]) ///Mutability (writeability) is not strictly enforced once set, since you can trivially change it to true, make a write, then change the flag back.
{{{id=13| before = A.flags['WRITEABLE'] A.flags['WRITEABLE'] = True A[0,0] = 20 A.flags['WRITEABLE'] = before A /// array([[ 20., 2., 3.], [ 4., 5., 6.], [ 7., 8., 9.]]) }}}This is different than Sage matrix mutability!
{{{id=183| B = matrix(RDF, 3, 3, [1..9]) B.set_immutable() /// }}}Now B *can't* (without evil hackery) be changed back to being mutable...
{{{id=136| B[0,0] = 20 /// Traceback (most recent call last): File "Generally speaking, Sage is much more about enforcing rigor, etc., since a lot of subtle mathematical errors in complicated algebraic algorithms can arise from the lax (but more user-friendly?) numpy approach.
{{{id=185| /// }}} {{{id=69| /// }}}NOTE: All numpy array have exactly the same Python data type numpy.ndarray, whereas Sage's matrices all have their own types and their is an object inheritance scheme.
{{{id=92| B = matrix(A); B /// [1 2 3] [4 5 6] [7 8 9] }}} {{{id=94| type(B) ///ptp -- Difference of the largest to the smallest value of A
{{{id=110| A.ptp() /// 0.75807179063843888 }}}clip: Return a new array where any element in A less than min is set to min and any element less than max is set to max.
{{{id=111| A.clip(min=-0.5, max=0.5) /// array([[ 0.05684831, 0.19929462, 0.5 ], [ 0.34850321, 0.3547917 , 0.23568246], [ 0.07878773, 0.36703638, 0.0933386 ]]) }}} {{{id=98| A.round() /// array([[ 0., 0., 1.], [ 0., 0., 0.], [ 0., 0., 0.]]) }}} {{{id=107| A.trace() /// 0.50497860643082049 }}} {{{id=128| A.mean() /// 0.28324479016293708 }}}var -- return the variance of the entries of A
{{{id=129| A.var() /// 0.04861216778805854 }}} {{{id=130| A.std() /// 0.22048167222710041 }}} {{{id=131| sqrt(A.var()) /// 0.22048167222710041 }}}prod -- product of all entries of self.
{{{id=132| A.prod() /// 7.2621406387616591e-07 }}}Lots of other functions that aren't methods! E.g.,
numpy.cov -- compute the covariance matrix of data
{{{id=142| import numpy numpy.cov(A) /// array([[ 0.16232642, -0.02632619, -0.01940079], [-0.02632619, 0.00449251, 0.00546182], [-0.01940079, 0.00546182, 0.02636825]]) }}} {{{id=143| numpy.corrcoef(A) /// array([[ 1. , -0.97487456, -0.29654068], [-0.97487456, 1. , 0.50182508], [-0.29654068, 0.50182508, 1. ]]) }}} {{{id=144| hist, bins = numpy.histogram(A, bins=4) print bins print hist /// [ 0.05684831 0.24636626 0.43588421 0.62540215 0.8149201 ] [5 3 0 1] }}} {{{id=146| /// }}} {{{id=147| /// }}}Numpy (and Sage) both have sophisticated array slicing...
{{{id=141| A = numpy.random.rand(3, 3); A /// array([[ 0.38184404, 0.52621461, 0.63089128], [ 0.89864138, 0.69864541, 0.40820973], [ 0.07656772, 0.74912548, 0.83850891]]) }}} {{{id=149| A[:2,] /// array([[ 0.38184404, 0.52621461, 0.63089128], [ 0.89864138, 0.69864541, 0.40820973]]) }}} {{{id=150| A[:3,0:2] /// array([[ 0.38184404, 0.52621461], [ 0.89864138, 0.69864541], [ 0.07656772, 0.74912548]]) }}} {{{id=151| A[1:3,0:2] /// array([[ 0.89864138, 0.69864541], [ 0.07656772, 0.74912548]]) }}} {{{id=152| A[1:3,1:3] /// array([[ 0.69864541, 0.40820973], [ 0.74912548, 0.83850891]]) }}} {{{id=153| B = matrix(A); B /// [ 0.381844041764 0.526214606105 0.630891282192] [ 0.898641376518 0.698645412576 0.408209732941] [0.0765677178583 0.749125484197 0.838508906693] }}} {{{id=154| B[1:3, 1:3] /// [0.698645412576 0.408209732941] [0.749125484197 0.838508906693] }}}A key difference between numpy and sage slicing, is that in Sage the slices are new copies, but in numpy they are references into the original matrix!
{{{id=157| print A B = A[0:2,0:2] /// [[ 0.38184404 0.52621461 0.63089128] [ 0.89864138 0.69864541 0.40820973] [ 0.07656772 0.74912548 0.83850891]] }}} {{{id=158| B[0,0] = 100 B /// array([[ 100. , 0.52621461], [ 0.89864138, 0.69864541]]) }}} {{{id=159| A /// array([[ 1.00000000e+02, 5.26214606e-01, 6.30891282e-01], [ 8.98641377e-01, 6.98645413e-01, 4.08209733e-01], [ 7.65677179e-02, 7.49125484e-01, 8.38508907e-01]]) }}}But in Sage...
{{{id=160| C = matrix(A) # sage matrix C /// [ 100.0 0.526214606105 0.630891282192] [ 0.898641376518 0.698645412576 0.408209732941] [0.0765677178583 0.749125484197 0.838508906693] }}} {{{id=161| B = C[0:2, 0:2] B[0,0] = -999; B /// [ -999.0 0.526214606105] [0.898641376518 0.698645412576] }}} {{{id=162| C /// [ 100.0 0.526214606105 0.630891282192] [ 0.898641376518 0.698645412576 0.408209732941] [0.0765677178583 0.749125484197 0.838508906693] }}} {{{id=164| /// }}} {{{id=165| /// }}}