SORTING MESH
Daniel Bernstein, 2001
In the linear algebra step of NFS, the block Wiedemann algorithm is
extensively used. For this purpose, we need to obtain products of the form
where
is a huge (100 million to 10 billion) but
sparse (about 100 1's in each row) matrix and
is a vector of the same size. Bernstein
observed the following:
He proposed the following scheme for matrix-vector multiplication using Schimmler sorting.
Mesh Sorting Algorithm
Step 0: Build an mesh
such that it exceeds the number of the nonzero elements in
plus twice the
number of rows.
Step 1: Store all the non-zero entries of
in the mesh as pairs
such that
and nonzero entries of
as
.
Fill the rest of the mesh with 0's.
Step 2: Sort all the values in the mesh in increasing order
of in a snakelike fashion, to obtain a
configuration as follows.
Step 3: Communicate each
to the cells
, as they are going to be
multiplied. Then replace the
values by the
values of the cells which contain both
and
.
Set the other places 0.
Step 4: Sort once again in the same snakelike fashion as before to obtain the following.
Step 5: As we are working in GF(2), compare each cell with its neighbors and cancel both if the values are the same. This apply the GF(2) addition along rows.
The mesh now contains the non-zero elements of
in place of
and also the original matrix
. Hence, it is
again ready for the next multiplication.
Complexity: For a mesh of size
, it requires about
steps.