[top] [TitleIndex] [WordIndex]


20090408: Bill Hart: Toom 7


Summary of last time

Toom 3: 5 mults Toom 4 2:

Toom 4: 7 mults Toom 5 3: Toom 6 2:

Odd/Even: $A_1(x^2) + x A_2(x^2)$ and $B_1(x^2) + x B_2(x^2)$ To date, nobody has ever beat ordinary or unbalanced Tooms using odd/even.

Question from last time: Why was one of the two matrices (one with [1,2,4,8,16]) better than the other?


Schoolboy $O(n^2)$

Karatsuba: $K(2n) = 3 K(n)$, so $K(n) = n^k$, where $k=\log(3)/\log(2)=1.585$.

Toom 3: $K(3n) = 5 K(n)$, so $K(n) = O(n^{\log(5)/\log(3)}) = O(n^{1.465})$

Toom $k$: $K(k n) = (2k-1) K(n)$, so $K(n) = O(n^{\log(2k-1)/\log(k)})$. As $k\infty$, this goes to $1$.

FFT Multiplication

Evaluate at roots of unity.

FFT (Fast Fourier Transform, Cooley-Tkey). Asymptotically fast method for evaluating a polynomial of length $n$ at $2n$ roots of unity.

Schoenhage-Strassen: work in a ring $\mathbf{Z} / p\mathbf{Z}$, where $p=2^{2^{\ell}} + 1$. Note that $2$ is a $2^{\ell+1}$th root of unity. This gives an algorithm that is $M(n)=O(n \log(n)\log(\log(n)))$ to multiply two integers.

However, the cutoff is about 7500 limbs in GMP. I.e. for smaller numbers this isn't worth it.

But what about smaller than 7500?

Toom 4/5

Bodrato/Zanoni (2007): Published optimal evaluation and interpolation sequences.

There is almost no point in going from Toom 4 to Toom 5, since the cutoff is about the same (about 190 limbs on an Opteron, say; less on a core 2).

Toom 7

For Toom 7, cutoff is abut 190 limbs, and it always beats Toom 4/5.

No optimal sequence known. However,

Bill then shows us a projector of code and Bodrato's steps.

Heuristic A: After row operation, row $i$ has at least one more zero.


r1 = 1 1 0
r2 = 1 2 3
r2 = r2 - r1 = 0 1 2


Heuristic B: After row operation, there exists row $i_2\neq i_1$ such that $$

$$ occurs for more entries than before.

r1 = r1 + r3
r1 0 2 2 2 0
r2 0 4 4 6 1
r3 0 0 0 0 1

William: Why is Toom 7 hard to implement in GMP, but easy in MPIR?


For example, to do a subtraction, $a-b$ have to do maybe $b-a$ and keep of sign separately.

Dan argues that one can keep track of barrows... etc., and Bill tends to agree. Lots of subtlety involving data structures.

Start discussing quadratic sieve

One word of caution: the quadratic and number field sieves are enormous black holes that claims a lot of academic lives. Make sure that if you start a project on this, that you have a goal in mind and reasonable time frame. This will be a multi-year effort, probably about 6 years, and will be a minimum 9000 line program. In effect, aiming for an optimal number field sieve is a complete waste of time.

My first project: 60 decimal digit numbers:

Code looks very similar! You would barely be able to tell the difference between "12 minutes" and "9.3 seconds". The code becomes organic and takes on a life of its own. Small changes can make it way slower.

up to 90 decimal digits: always use quadratic sieve, if suitably random...

2013-05-11 18:32