Special Purpose Factoring Devices

   A journey from Bicycle to Quantum Computers

Computational Number Theory Project
Math 583e, University of Washington
Spring 2009

Kraitchik's Idea (1920s) 

In the 1920s, Kraitchik came up with the following enhanced variant of Fermat's method, which can be termed the congruence of squares method.

Congruence of Squares Method

Idea: To factor n, try to find integers x and y such that x² ≡ y² (mod n) but x ≢ ±y (mod n). Then either gcd(x − y, n) or gcd(x + y, n) yields a nontrivial factor of n.
Practicality: Computing a gcd is easy, and if n is a product of two odd primes, the probability of success for a random such pair is about 1/2. The only difficulty is to find such x and y.
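As a minimal illustration, the gcd step can be sketched in a few lines of Python (the pair x = 10, y = 3 for n = 91 is a hand-picked toy example):

```python
from math import gcd

def factor_from_squares(x, y, n):
    """Given x^2 ≡ y^2 (mod n) with x ≢ ±y (mod n),
    recover a nontrivial factor of n via a gcd."""
    assert (x * x - y * y) % n == 0
    d = gcd(x - y, n)
    if 1 < d < n:
        return d
    return gcd(x + y, n)

# Toy example: 10^2 - 3^2 = 91, so 10^2 ≡ 3^2 (mod 91) and 10 ≢ ±3.
print(factor_from_squares(10, 3, 91))
```

Here gcd(10 − 3, 91) = 7 already splits 91 = 7 · 13.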

 

Morrison and Brillhart's Method (1975)

In 1975, Morrison and Brillhart proposed a systematic scheme for finding such numbers x and y. An outline of the method is as follows.

  1. Choose a factor base of primes p_1, p_2, …, p_k ≤ B for some suitable bound B.
  2. Collect a set of integers x_i such that x_i² mod n = p_1^(e_i1) p_2^(e_i2) ⋯ p_k^(e_ik) for positive integers x_i and non-negative exponents e_ij.
  3. Find linear dependencies over GF(2) among the k-dimensional vectors (e_i1 mod 2, …, e_ik mod 2).
  4. Each such dependency gives a congruence of squares x² ≡ y² (mod n) and hence provides a 50% chance of factoring n.
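The whole outline fits in a short toy Python sketch, with trial division standing in for the continued-fraction machinery Morrison and Brillhart actually used (the tiny modulus and factor base below are illustrative assumptions):

```python
from math import gcd, isqrt

def smooth_exponents(m, primes):
    """Exponent vector of m over the factor base, or None if m is not smooth."""
    if m == 0:
        return None
    exps = [0] * len(primes)
    for i, p in enumerate(primes):
        while m % p == 0:
            m //= p
            exps[i] += 1
    return exps if m == 1 else None

def dixon_factor(n, primes):
    relations = []   # (x, exponent vector of x^2 mod n)
    rows = []        # (exponent vector mod 2 as bits, history mask, pivot bit)
    for x in range(isqrt(n) + 1, n):
        e = smooth_exponents(x * x % n, primes)       # step 2: collect relations
        if e is None:
            continue
        relations.append((x, e))
        vec = sum((c & 1) << i for i, c in enumerate(e))
        hist = 1 << (len(relations) - 1)              # which relations combined
        for pvec, phist, pbit in rows:                # step 3: GF(2) elimination
            if vec & pbit:
                vec ^= pvec
                hist ^= phist
        if vec:
            rows.append((vec, hist, vec & -vec))
            continue
        # Step 4: dependency found -> the chosen x^2 mod n multiply to a square.
        idx = [j for j in range(len(relations)) if hist >> j & 1]
        xprod = 1
        for j in idx:
            xprod = xprod * relations[j][0] % n
        y = 1
        for i, p in enumerate(primes):
            y = y * pow(p, sum(relations[j][1][i] for j in idx) // 2, n) % n
        for d in (gcd(xprod - y, n), gcd(xprod + y, n)):
            if 1 < d < n:
                return d    # each dependency succeeds with probability ~1/2
    return None

print(dixon_factor(1649, [2, 3, 5]))
```

On n = 1649 this finds 41² ≡ 2⁵ and 43² ≡ 2³·5² (mod 1649), combines them into 114² ≡ 80², and extracts the factor 17.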

In this scheme, the main computational bottlenecks are steps 2 and 3: step 2 is known as the relation collection stage, and step 3 is a linear algebra computation over GF(2). After this approach to factoring emerged in the 1970s, special purpose hardware was proposed to speed up the relation collection stage. Let us take a look at a couple of those devices.

 

Georgia Cracker (1982) 

The first successful hardware configuration for collecting relations was the Extended Precision Operand Computer (aka Georgia Cracker), a 128-bit, highly parallel special-purpose computer for factoring with the Continued Fraction method, built by Jeffrey Smith and Samuel Wagstaff in 1982. For further reference, see An extended precision operand computer by Jeffrey W. Smith and Samuel S. Wagstaff, Jr., proc. 21st Southeast Region ACM Conference, 1983.

It was not anything particularly spectacular, nor did it have any strikingly new features. It was used to factor numbers from the Cunningham tables, and its greatest achievement was a factorization completed in 1986. The machine is currently held by J. Smith.

 

Quadratic Sieve (1981)

In 1981, Dixon modified the Morrison–Brillhart method to obtain a faster algorithm, and later that year Pomerance refined Dixon's idea into the Quadratic Sieve (QS).

  1. Choose a factor base of primes p ≤ B for some suitable bound B and choose the polynomial Q(x) = (⌈√n⌉ + x)² − n.
  2. Collect a set of relations Q(x) ≡ (⌈√n⌉ + x)² (mod n) for which Q(x) is B-smooth.
  3. Find linear dependencies over GF(2) among the exponent vectors of the Q(x) values, taken mod 2.
  4. Use these dependencies to construct congruences of squares and thus factor n.
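The sieving step, which gives QS its name, replaces per-value trial division with divisions along arithmetic progressions: if p divides Q(x), it divides Q(x + p), Q(x + 2p), and so on. A minimal Python sketch of the relation collection (toy parameters; roots mod p are found by brute force here, whereas real implementations use a modular square-root algorithm):

```python
from math import isqrt

def qs_sieve(n, primes, interval):
    """Toy QS relation collection: sieve Q(x) = (s + x)^2 - n, s = ceil(sqrt(n)),
    over 0 <= x < interval, dividing out factor-base primes along arithmetic
    progressions instead of trial-dividing each value separately."""
    s = isqrt(n) + 1
    q = [(s + x) ** 2 - n for x in range(interval)]
    rem = q[:]                      # residues after dividing out the primes
    for p in primes:
        for r in range(p):          # roots of (s + r)^2 ≡ n (mod p), brute force
            if (s + r) ** 2 % p == n % p:
                for x in range(r, interval, p):
                    while rem[x] % p == 0:
                        rem[x] //= p
    # x values whose Q(x) is completely smooth over the factor base
    return [(x, q[x]) for x in range(interval) if rem[x] == 1]

print(qs_sieve(1649, [2, 3, 5], 10))
```

For n = 1649 this reports the smooth values Q(0) = 32 and Q(2) = 200, exactly the two relations combined in the congruence-of-squares step.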

Once again, the main bottleneck of QS is the relation collection stage. Pomerance, Smith and Tuler designed a hardware device, named Quasimodo, to implement the QS sieving efficiently.

 

Quasimodo (1988) 

Quasimodo stands for "quadratic sieve motor" and was designed by Pomerance, Smith and Tuler in 1988. It was special purpose hardware for collecting relations for the QS and thereby factoring integers. It had an interesting pipelined architecture and, on paper, was analyzed to be very fast. In practice, after the machine was built, it failed to deliver satisfactory performance and was never fully debugged. As it was superseded by newer generations of computers, the project was abandoned, and the machine was never used to factor anything. It has since been dismantled and its whereabouts are unknown.

For further information, please refer to A pipeline architecture for factoring large integers with the quadratic sieve algorithm by Carl Pomerance, Jeffrey W. Smith and Randy Tuler in SIAM Journal on Computing, vol. 17, 1988.

 

Generation Gap

From the late 1980s and early 1990s, general purpose computers took over, and special purpose devices had to step back: they did not stand a chance against efficient software implementations on commodity hardware. The computational power for the relation collection step could easily be assembled from general purpose computers when factoring medium sized integers, larger than those factored previously but not as large as the RSA challenge numbers. Relation collection thus became 'easier', and the focus shifted to linear algebra, with its huge hardware requirements and the need for efficient matrix handling. In these circumstances, special purpose hardware was no longer worth the trouble unless it provided something ingeniously unique, or "lead to interesting funding possibilities" (A. Lenstra).

In the 1990s, the major developments in factoring centered on improvements to the quadratic sieve, which led to the Number Field Sieve (NFS) in 1993. Algorithms and complexity analyses for NFS on general purpose computers were the main focus of this decade. Let us take a look at the most efficient factoring algorithm to date.

 

Number Field Sieve

Reference: Lecture notes by William B. Hart

Basic Idea of NFS

  1. Pick a monic irreducible polynomial f of degree d and an integer m such that f(m) ≡ 0 (mod n). Also let α be a root of f in C, and work in the ring Z[α].
  2. Choose suitable factor bases: primes in Z less than some bound B, and prime ideals in Z[α] whose norms are less than some bound B′.
  3. Find coprime pairs (a, b) such that a + bm is smooth over the rational factor base and a + bα is smooth over the algebraic factor base. Find sufficiently many of these relations.
  4. Multiply the relations using linear dependency checks over GF(2) so as to obtain a congruence of squares as before.
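Step 1 is often seeded with the base-m method: writing n in base m ≈ n^(1/d) yields a degree-d polynomial f with f(m) = n, hence f(m) ≡ 0 (mod n). A minimal Python sketch (toy only; real NFS polynomial selection searches much harder for small coefficients and good root properties, and the float-based choice of m below is only adequate for small n):

```python
def base_m_polynomial(n, d):
    """Base-m NFS polynomial selection sketch: the base-m digits of n
    are the coefficients of a degree-d polynomial f with f(m) = n."""
    m = round(n ** (1.0 / d))   # m ≈ n^(1/d); float rounding is toy-only
    coeffs = []                 # coeffs[i] is the coefficient of x^i
    t = n
    for _ in range(d + 1):
        coeffs.append(t % m)
        t //= m
    return coeffs, m

coeffs, m = base_m_polynomial(2 ** 31 - 1, 3)
print(coeffs, m)                          # f(x) = x^3 + 616x + 7, m = 1290
assert sum(c * m ** i for i, c in enumerate(coeffs)) == 2 ** 31 - 1
```

For n = 2³¹ − 1 this gives f(x) = x³ + 616x + 7 with f(1290) = n, a monic cubic as required.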

 Advantage over QS 

We can choose a suitable polynomial of degree higher than 2, and thus reduce the size of the values that must be smooth. In practice, quartic and quintic polynomials are used quite often.

Main Steps

  • Choosing the polynomial suitably and efficiently [Not very hard and there are several practical ways to do this]
  • Relation collection over the integers and the ring of integers of an algebraic number field [Bottleneck 1]
  • Linear algebra over GF(2) with huge sparse matrices [Bottleneck 2]
  • Taking square roots modulo n [efficient using Montgomery's algorithm or an FFT-based polynomial algorithm]
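The object being attacked in Bottleneck 2 is a nullspace computation over GF(2): finding a set of relation rows that XOR to zero. A dense Python sketch of that underlying operation, with rows bit-packed into integers (real NFS codes use sparse iterative methods such as block Lanczos or block Wiedemann, which is precisely what the linear-algebra devices below aim to accelerate):

```python
def gf2_nullspace_vector(rows):
    """Find a nonzero GF(2) combination of the given bit-packed rows that
    XORs to zero, returned as a bitmask over the original rows
    (0 if the rows are linearly independent)."""
    pivots = []                       # (reduced row, combination bitmask)
    for j, row in enumerate(rows):
        comb = 1 << j                 # which original rows are combined
        for pvec, pcomb in pivots:    # eliminate existing pivot bits
            if row & (pvec & -pvec):
                row ^= pvec
                comb ^= pcomb
        if row == 0:
            return comb               # dependency found
        pivots.append((row, comb))
    return 0

print(gf2_nullspace_vector([0b011, 0b110, 0b101]))   # all three rows: 0b111
```

Here 011 ⊕ 110 ⊕ 101 = 000, so the returned mask 0b111 selects all three rows; the cost of this step on huge sparse matrices is what motivates the mesh and pipelined designs below.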

 

Resurrection of SPDs 

The main approach to speeding up NFS is to remove the bottlenecks mentioned above, and this is where special purpose hardware entered the picture once again. We will first look at Bottleneck 2, linear algebra, and the SPDs dedicated to this problem, mainly because this bottleneck has essentially been removed by proposed and prototyped hardware configurations for handling sparse matrices over GF(2). Next, we will examine the state-of-the-art SPD configurations proposed (none built) for Bottleneck 1, relation collection, a problem that has not been solved yet.

 

SPDs for Linear Algebra

Mesh-based SPDs

  • Sorting Mesh by Daniel J. Bernstein
    Circuits for Integer Factorization: a Proposal, manuscript, 2001

    Bernstein, in his proposal, observed that the NFS linear algebra step has a huge input that is accessed repeatedly. On a traditional computer, very few processors access that huge input, which is inherently inefficient because most of the storage is idle most of the time. Bernstein proposed an implementation using mesh sorting, which reduces the asymptotic cost, and put forward a new efficiency measure: time × cost of components. Later analysis of this proposal, however, revealed that the cost of implementation is prohibitive for 1024-bit integers.

    The idea of the Sorting Mesh [Illustrated brief of the idea]


  • Routing Mesh by Arjen K. Lenstra, Adi Shamir, Jim Tomlinson, Eran Tromer
    Analysis of Bernstein's factorization circuit, proc. Asiacrypt 2002, LNCS 2501, 1-26, Springer, 2002

    This paper analysed the above mentioned proposal by Bernstein, both asymptotically and concretely, and proposed an alternative concrete design that improves efficiency by various algorithmic and technological means. The proposed hardware is a highly-parallel, wafer-scale electronic device consisting of an array of cells executing mesh routing. This proposal reduces the implementation cost to a feasible level for 1024-bit composites, but remains quite challenging from a technological point of view.

    The idea of the Routing Mesh [Illustrated brief of the idea]


  • Reconfigurable Hardware by Sashisu Bajracharya, Deapesh Misra, Kris Gaj, Tarek El-Ghazawi
    Reconfigurable hardware implementation of mesh routing in number field sieve factorization, SHARCS 2005, February 2005

    This paper improves the above mentioned routing mesh by splitting it into smaller, semi-independent chips. This variant was prototyped on an FPGA.

    Implementation idea of the FPGA prototype [Talk by the authors at SHARCS'05]

Pipelined SPDs

  • Scalable Pipelined Circuits by Willi Geiselmann, Adi Shamir, Rainer Steinwandt and Eran Tromer
    Scalable Hardware for Sparse Systems of Linear Equations, with Applications to Integer Factorization, March 2005

    This paper proposes a highly-parallel electronic device based on a pipelined architecture for linear algebra. It is more efficient than any of the above mesh-based devices, and more technologically feasible thanks to smaller chips and better scalability. It solves the linear algebra problem at a feasible cost for 1024-bit composites, but the device has not been built (to public knowledge).

    Idea of the Pipelined Architecture [Illustrated brief of the idea]

 

SPDs for Relation Collection (Sieving)

Opto-Electronic Sieve

  • TWINKLE ("The Weizmann Institute Key Locating Engine") by Adi Shamir and Arjen K. Lenstra
    Factoring large numbers with the TWINKLE device, Cryptographic Hardware and Embedded Systems (CHES) 1999, LNCS 1717, 2-12, Springer, 1999

    The original proposal was published in the context of the Quadratic Sieve; it later turned out to be the first modern NFS sieving device. The device uses non-conventional electro-optical components for analog summation. The approach is not very efficient for large (≥768-bit) integers, owing to the large number of support PCs needed for the computation, and the device was never built in practice.

    The idea of TWINKLE [Talk by Shamir and Tromer at SHARCS'05]

Mesh-based Sieve

  • Dedicated Sieving Mesh by Willi Geiselmann and Rainer Steinwandt
    A dedicated sieving hardware, proc. Public Key Cryptography (PKC) 2003, LNCS 2567, 254-266, Springer, 2003

    In this paper, the authors propose a class of highly-parallel electronic sieving devices based on mesh sorting or mesh routing, an ingenious adaptation of ideas from Bernstein's NFS linear-algebra sorting mesh to sieving. It promised a feasible cost for 1024-bit composites, but was never built due to technological challenges.

Pipelined Sieve

  • TWIRL ("The Weizmann Institute Relation Locator") by Adi Shamir and Eran Tromer
    Factoring large numbers with the TWIRL device, proc. Crypto 2003, LNCS 2729, 1-26, Springer, 2003

    TWIRL is a class of highly-parallel electronic sieving devices based on unidirectional information flow through a pipelined architecture, quite different from mesh-based sieving. It offers a lower cost than mesh-based sieving, and hence a feasible cost for 1024-bit composites, but was never built in practice.

    The idea of TWIRL [Talk by Shamir and Tromer at SHARCS'05]

Routing Network Sieve

  • SHARK by Jens Franke, Thorsten Kleinjung, Christof Paar, Jan Pelzl, Christine Priplata and Colin Stahlke
    SHARK — a realizable special hardware sieving device for factoring 1024-bit integers, proc. SHARCS 2005, February 2005

    This is a highly-parallel electronic sieving device. It differs from TWIRL in its use of a butterfly routing network, its use of standard-sized chips, and its tailoring to special-q sieving. The cost of the machine has been evaluated for 1024-bit composites: though the predicted cost is higher than TWIRL's ($200M vs. $1.1M for 1024-bit composites in 1 year), the technological challenges are different, and it might hence provide a better route to sieving.

    The idea of SHARK [Talk by the authors at SHARCS'05]

 

Though the recent advances in analyzing special purpose devices have stirred considerable interest in the research community, most of these devices have never been built and face tremendous technological challenges for 1024-bit integers. Apart from the SPDs dedicated to NFS, ongoing research is also considering alternative factoring methods based on proposed hardware of the future. Let us take a look at the other options that might shape the future of SPDs in factoring.

 
