PIPELINED LINEAR ALGEBRA CIRCUIT
Willi Geiselmann, Adi Shamir, Rainer Steinwandt and Eran Tromer, 2005
This is a highly sophisticated pipelined approach to matrix-vector multiplication in the block Wiedemann method. A highly simplified description will be as follows.
Basic Pipelined Idea
This simplified device consists of several stations connected in a pipeline. The i-th station is in charge of the i-th matrix row, and contains a compressed representation of the non-zero entries in that row. It is also in charge of the i-th entry of the output vector, and contains a corresponding accumulator. In each multiplication, the input vector is fed into the top of the pipeline, and moves down as in a shift register. As the entries of the vector pass by, the i-th station looks at all vector entries passing through it, identifies the ones corresponding to the non-zero matrix entries in row i, and for those entries adds 1 to its accumulator. Once the input vector has passed all stations in the pipeline, the accumulators contain the entries of the product vector. These can now be off-loaded and fed back to the top of the pipeline in order to compute the next multiplication.
Optimizations and Improvements
There are several ways to tweak the set up and optimize the performance of this device. But, by far, this is the best device proposed for the linear algebra system in NFS.