In 1920, Kraitchik came up with the following enhanced variant
of Fermat's method, which can be termed the congruence of
squares method.
Congruence of Squares Method
Idea: To factor $n$, try to find integers $x, y$ such that $x^2 \equiv y^2 \pmod{n}$ but $x \not\equiv \pm y \pmod{n}$. Then either $\gcd(x - y, n)$ or $\gcd(x + y, n)$ yields a non-trivial factor of $n$.
Practicality: Computing a gcd is easy, and if $n$ is a product of two distinct odd primes, the probability of success for a random such pair is about 1/2. The only difficulty is to find such $x$ and $y$.
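To make this concrete, here is a minimal Python sketch with illustrative numbers (not from the original text): modulo $n = 1649$ we have $114^2 \equiv 80^2 \equiv 1453$, and the two gcds expose the factors 17 and 97.

```python
from math import gcd

# Illustrative congruence of squares modulo n = 1649 = 17 * 97:
# 114^2 = 12996 ≡ 1453 (mod 1649) and 80^2 = 6400 ≡ 1453 (mod 1649).
n, x, y = 1649, 114, 80
assert pow(x, 2, n) == pow(y, 2, n)        # x^2 ≡ y^2 (mod n)
assert x % n not in (y % n, (-y) % n)      # x ≢ ±y (mod n)

print(gcd(x - y, n), gcd(x + y, n))        # 17 97
```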
In 1975, Morrison and Brillhart proposed a systematic scheme to find such numbers $x$ and $y$. An outline of the method is as follows.
- Choose a factor base of primes $p_1, p_2, \ldots, p_k \le B$ for some suitable bound $B$.
- Collect a set $\{x_1, x_2, \ldots, x_m\}$ of integers such that $x_i^2 \equiv \prod_{j=1}^{k} p_j^{e_{ij}} \pmod{n}$ for $1 \le i \le m$, with non-negative integer exponents $e_{ij}$ and $m > k$.
- Find linear dependencies over GF(2) among the $k$-dimensional vectors $(e_{i1} \bmod 2, \ldots, e_{ik} \bmod 2)$.
- Each such dependency gives a congruence $x^2 \equiv y^2 \pmod{n}$ and hence provides a 50% chance of factoring $n$.
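A compact Python sketch of the whole pipeline (illustrative only: it generates the $x_i$ by random squaring, as in Dixon's later variant, rather than by the continued-fraction expansion Morrison and Brillhart actually used; the tiny factor base and helper names are my own choices):

```python
from math import gcd, isqrt
import random

def exponents(v, base):
    """Trial-divide v over the factor base; return the exponent vector
    (e_1, ..., e_k), or None if v is not B-smooth."""
    exps = [0] * len(base)
    for i, p in enumerate(base):
        while v % p == 0:
            v //= p
            exps[i] += 1
    return exps if v == 1 else None

def find_dependency(vectors):
    """Gaussian elimination over GF(2) on integer bitmasks. Returns the
    indices of a subset of input vectors that XORs to zero, or None."""
    pivots = {}                            # pivot bit -> (row, history)
    for i, v in enumerate(vectors):
        h = 1 << i                         # remembers which inputs built this row
        while v:
            b = v.bit_length() - 1
            if b not in pivots:
                pivots[b] = (v, h)
                break
            pv, ph = pivots[b]
            v, h = v ^ pv, h ^ ph          # eliminate the leading bit
        if v == 0:
            return [j for j in range(len(vectors)) if h >> j & 1]
    return None

def congruence_of_squares(n, base=(2, 3, 5, 7, 11, 13), seed=1):
    rng = random.Random(seed)
    rels = []                              # (x_i, exponent vector) pairs
    while len(rels) <= len(base):          # m > k relations force a dependency
        x = rng.randrange(isqrt(n) + 1, n)
        e = exponents(pow(x, 2, n), base)
        if e is not None:
            rels.append((x, e))
    dep = find_dependency([sum((e[j] & 1) << j for j in range(len(base)))
                           for _, e in rels])
    x, y = 1, 1
    total = [sum(rels[i][1][j] for i in dep) for j in range(len(base))]
    for i in dep:
        x = x * rels[i][0] % n
    for p, e in zip(base, total):          # every total exponent is even
        y = y * pow(p, e // 2, n) % n
    return gcd(x - y, n)                   # may be trivial (1 or n)

print(congruence_of_squares(84923))        # 84923 = 163 * 521
```

A dependency occasionally yields $x \equiv \pm y \pmod{n}$ and hence a trivial gcd of 1 or $n$; real implementations simply collect more relations and retry.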
In this scheme, the main computational bottlenecks are steps 2
and 3. Step 2 is known as the relation collection process
and step 3 is a linear algebra operation over GF(2). After
this approach to factoring came out in the 1970s, special purpose hardware was proposed to handle the relation collection stage. Let us take a look at a couple of those machines.
The first successful hardware configuration to collect relations
was the Extended Precision Operand Computer (aka Georgia Cracker).
This was a 128-bit, highly parallel special-purpose computer for
factoring using the Continued Fraction method, built by Jeffrey
Smith and Samuel Wagstaff in 1982. For further reference, please check
An extended precision operand computer by Jeffrey W. Smith and Samuel S. Wagstaff, Jr., proc. 21st Southeast Region ACM Conference, 1983.
It was not anything particularly spectacular, nor did it have any strikingly new features. It was used to factor numbers from the Cunningham tables, and the greatest achievement of this machine was a Cunningham factorization completed in 1986. The machine is currently held by J. Smith.
In 1981, Dixon modified the Morrison-Brillhart idea to obtain a faster algorithm, and thereafter Pomerance optimized Dixon's idea to produce the Quadratic Sieve (QS) in the same year.
- Choose a factor base of primes $p_1, p_2, \ldots, p_k \le B$ for some suitable bound $B$, and choose the polynomial $Q(x) = (x + \lceil\sqrt{n}\rceil)^2 - n$.
- Collect a set of relations $(x_i + \lceil\sqrt{n}\rceil)^2 \equiv Q(x_i) \pmod{n}$ for values $Q(x_i)$ which are $B$-smooth.
- Find linear dependencies over GF(2) among the $k$-dimensional exponent vectors of the $Q(x_i)$.
- Use these dependencies to construct congruences $x^2 \equiv y^2 \pmod{n}$ and thus factor $n$.
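What makes this a sieve is how the smooth $Q(x_i)$ are found: for each factor-base prime $p$, the $x$ with $p \mid Q(x)$ lie in arithmetic progressions mod $p$, so one can walk those progressions instead of trial-dividing every candidate. A toy Python rendition (parameters and names are mine; real implementations accumulate logarithmic approximations and use Tonelli-Shanks to find the roots mod $p$, where this sketch divides exactly and finds roots by brute force):

```python
from math import isqrt

def qs_sieve(n, primes, M):
    """Toy quadratic sieve over Q(x) = (x + m)^2 - n, m = ceil(sqrt(n)),
    for x in [0, M). Returns the x whose Q(x) is smooth over `primes`."""
    m = isqrt(n) + 1
    residual = [(x + m) ** 2 - n for x in range(M)]
    for p in primes:
        # roots of (r + m)^2 ≡ n (mod p), brute-forced in this toy
        for r in (r for r in range(p) if (r + m) ** 2 % p == n % p):
            for x in range(r, M, p):            # walk the progression
                while residual[x] % p == 0:     # divide out prime powers too
                    residual[x] //= p
    return [x for x in range(M) if residual[x] == 1]

# Each reported x gives a relation (x + m)^2 ≡ Q(x) (mod n)
# with Q(x) fully factored over the base.
print(qs_sieve(84923, (2, 3, 5, 7, 11, 13, 17, 19), 2000))
```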
Once again, the main bottleneck for QS is the relation collection stage. Pomerance, Smith and Tuler designed hardware to implement QS efficiently; it was named Quasimodo.
Quasimodo (1988)
Quasimodo stands for "quadratic sieve motor" and was designed by Pomerance, Smith and Tuler in 1988. It was special purpose hardware for collecting relations with the QS and thereby factoring integers. It had an interesting pipelined architecture and, when designed, was analyzed to be very fast. In practice, however, the machine failed to provide satisfactory performance after it was built, and it was never fully debugged. Rather, as it was superseded by a newer generation of computers, the project was abandoned, and the machine was never used to factor anything. It has since been dismantled and its whereabouts are unknown.
For further information, please refer to A pipeline
architecture for factoring large integers with the quadratic sieve
algorithm by Carl Pomerance, Jeffrey W. Smith and Randy Tuler
in SIAM Journal on Computing, vol.
17, 1988.
Generation Gap
From the late 1980s and early 1990s, general purpose computers took over and special purpose devices had to step back: they could not compete with efficient software implementations on commodity hardware. The computational power for the relation collection step could be arranged quite easily using general purpose computers, at least for medium-sized integers, larger than those factored previously but not as large as the RSA challenge numbers. Relation collection thus became 'easier', and the focus shifted to linear algebra, with its huge hardware requirements and the need for efficient matrix handling. In such circumstances, special purpose hardware was no longer worth the trouble unless it provided something ingeniously unique, or would "lead to interesting funding possibilities" (A. Lenstra).
In the 1990s, the major developments in factoring centered around improvements of the quadratic sieve, which led to the Number Field Sieve (NFS) in 1993. Algorithms and complexity analyses for NFS on general purpose computers were the main focus during this decade. Let us take a look at the most efficient factoring algorithm to date.
Reference: Lecture notes by William B. Hart
Basic Idea of NFS
- Pick a monic irreducible polynomial $f(x) \in \mathbb{Z}[x]$ of degree $d$ and an integer $m$ such that $f(m) \equiv 0 \pmod{n}$. Also let $\alpha$ be a root of $f$ in $\mathbb{C}$ and $K = \mathbb{Q}(\alpha)$.
- Choose suitable factor bases: primes in $\mathbb{Z}$ less than some bound $B_1$, and prime ideals in $\mathbb{Z}[\alpha]$ whose norms are less than some bound $B_2$.
- Find pairs $(a, b)$ with $\gcd(a, b) = 1$ such that $a + bm$ is $B_1$-smooth and the norm $N(a + b\alpha)$ is $B_2$-smooth. Find sufficiently many of these relations.
- Multiply the relations using linear dependency checks over GF(2) so as to obtain a congruence of squares as before.
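As a concrete toy instance of the relation search (using the classic teaching parameters $n = 45113$, $f(x) = x^3 + 15x^2 + 29x + 8$, $m = 31$, for which $f(31) = 45113$; the bounds and helper names are my own, and a real implementation sieves both sides and tracks prime ideals rather than bare norms):

```python
from math import gcd

def is_smooth(v, primes):
    """True if |v| factors completely over the given primes."""
    v = abs(v)
    for p in primes:
        while v % p == 0:
            v //= p
    return v == 1

def nfs_relations(f, m, rat_base, alg_base, A, B):
    """Toy NFS relation search. f = [f_0, ..., f_d] is monic with
    f(m) ≡ 0 (mod n). A coprime pair (a, b) is a relation when a + b*m
    is smooth over rat_base and the norm of a + b*alpha, computed (up
    to sign) from the homogenized polynomial, is smooth over alg_base."""
    d = len(f) - 1
    rels = []
    for b in range(1, B + 1):
        for a in range(-A, A + 1):
            if gcd(a, b) != 1:
                continue
            rational = a + b * m
            norm = sum(c * (-a) ** i * b ** (d - i) for i, c in enumerate(f))
            if rational and norm and \
               is_smooth(rational, rat_base) and is_smooth(norm, alg_base):
                rels.append((a, b))
    return rels

f = [8, 29, 15, 1]                 # x^3 + 15x^2 + 29x + 8, f(31) = 45113
print(nfs_relations(f, 31, rat_base=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29),
                    alg_base=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37),
                    A=50, B=5))
```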
Advantage over QS
We can choose a suitable polynomial with degree higher than 2 and thus reduce the size of the values that must be tested for smoothness. In practice, quartic and quintic polynomials are used quite often.
Main Steps
- Choosing the polynomial suitably and efficiently [Not very hard, and there are several practical ways to do this]
- Relation collection over the integers and over the ring of integers of an algebraic number field [Bottleneck 1]
- Linear algebra over GF(2) with huge sparse matrices [Bottleneck 2]
- Taking square roots modulo $n$ [Efficient using Montgomery's Algorithm or an FFT Polynomial Algorithm]
Resurrection of SPDs
The main approach to speeding up NFS is to remove the bottlenecks mentioned above. This is where special purpose hardware came into the picture once again. We will first take a look at Bottleneck 2: Linear Algebra and the SPDs dedicated to this problem, mainly because this bottleneck has effectively been removed by proposed and prototyped hardware configurations for handling sparse matrices over GF(2). Next, we will examine the state-of-the-art SPD configurations proposed (none built) for Bottleneck 1: Relation Collection, although this problem has not been solved yet.
Mesh-based SPDs
- Sorting Mesh by Daniel J.
Bernstein
Circuits for Integer Factorization: a Proposal,
manuscript, 2001
Bernstein, in his proposal, observed that the NFS linear algebra step has a huge input that is accessed repeatedly. On traditional computers, very few processors access that huge input, which is inherently inefficient because most of the storage sits idle most of the time. Bernstein therefore proposed an implementation using mesh sorting, which reduces the asymptotic cost, and he put forward a new efficiency measure: time × cost of components. Later analysis of this proposal, however, revealed that the cost of implementation is quite prohibitive for 1024-bit integers.
The idea of the Sorting Mesh
[Illustrated brief of the idea]
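As a toy illustration of how a mesh can sort with only local operations, here is shearsort, a standard mesh-sorting algorithm (not necessarily the exact sorting scheme of Bernstein's circuit; the grid below is made up). Alternating row and column phases sort an $r \times c$ mesh into snake order in $O(\log r)$ phases, each of which real mesh hardware performs with nearest-neighbour compare-exchange steps:

```python
from math import ceil, log2

def shearsort(grid):
    """Sort an r x c mesh into snake order by alternating row sorts
    (even rows ascending, odd rows descending) with column sorts.
    Python's sort stands in for parallel compare-exchange steps."""
    r = len(grid)
    for _ in range(ceil(log2(r)) + 1):
        for i, row in enumerate(grid):
            row.sort(reverse=(i % 2 == 1))       # snake-order row phase
        for j in range(len(grid[0])):            # column phase
            col = sorted(g[j] for g in grid)
            for i in range(r):
                grid[i][j] = col[i]
    for i, row in enumerate(grid):               # final row phase
        row.sort(reverse=(i % 2 == 1))
    return grid

print(shearsort([[9, 1, 5], [4, 8, 2], [7, 3, 6]]))
# [[1, 2, 3], [6, 5, 4], [7, 8, 9]]  (snake order)
```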
- Routing Mesh by Arjen K.
Lenstra, Adi Shamir, Jim Tomlinson, Eran Tromer
Analysis of Bernstein's factorization circuit, proc.
Asiacrypt 2002, LNCS 2501, 1-26, Springer, 2002
This paper analysed the above-mentioned proposal by Bernstein, both asymptotically and concretely, and proposed an alternative concrete design that improves efficiency by various algorithmic and technological means. The proposed hardware is a highly-parallel, wafer-scale electronic device consisting of an array of cells executing mesh routing. This proposal reduces the cost of implementation to a feasible level for 1024-bit composites, but remains quite challenging from a technological point of view.
The idea of the Routing Mesh
[Illustrated brief of the idea]
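For basic intuition about routing on a mesh (this is plain dimension-order/XY routing of a single packet, not the specific routing algorithm analyzed in the paper; the coordinates are made up), each cell forwards a packet one hop per clock tick, first along the row and then along the column:

```python
def route_step(pos, dest):
    """One clock tick of XY routing: move one hop toward the
    destination, column first, then row."""
    (r, c), (dr, dc) = pos, dest
    if c != dc:
        return (r, c + (1 if dc > c else -1))
    if r != dr:
        return (r + (1 if dr > r else -1), c)
    return pos

pos, dest = (0, 0), (3, 2)        # route one packet across a 4x4 mesh
trace = [pos]
while pos != dest:
    pos = route_step(pos, dest)
    trace.append(pos)
print(trace)   # (0,0) (0,1) (0,2) (1,2) (2,2) (3,2): 5 hops
```

The hard part, which the paper's design addresses, is doing this for a huge number of packets simultaneously without congestion.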
- Reconfigurable Hardware
by Sashisu Bajracharya, Deapesh Misra, Kris
Gaj, Tarek El-Ghazawi
Reconfigurable hardware implementation
of mesh routing in number field sieve factorization,
SHARCS 2005, February 2005
This paper improves the above-mentioned routing mesh by splitting it into smaller, semi-independent chips. This variant was prototyped on an FPGA.
Implementation idea of the FPGA prototype
[Talk by the authors at SHARCS'05]
Pipelined SPDs
- Scalable
Pipelined Circuits by Willi Geiselmann, Adi Shamir, Rainer
Steinwandt and Eran Tromer
Scalable Hardware for Sparse
Systems of Linear Equations, with Applications to Integer
Factorization, March 2005
This paper proposes a highly-parallel electronic device based on a pipelined architecture for linear algebra. It is more efficient than any of the above mesh-based devices, and more technologically feasible due to smaller chips and better scalability. It solves the linear algebra problem by offering efficiency within a feasible cost for 1024-bit composites, but the device has not been built (to public knowledge).
Idea of the Pipelined Architecture
[Illustrated brief of the idea]
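The single operation all of these linear-algebra devices must accelerate is the repeated multiplication of a huge sparse GF(2) matrix by a vector, the workhorse of Wiedemann- and Lanczos-style solvers. A minimal software rendition of that operation (the data layout and names are mine):

```python
def spmv_gf2(rows, v):
    """Sparse matrix-vector product over GF(2): rows[i] lists the
    column indices of the 1-entries in row i, and the vector v is a
    Python int used as a bit vector."""
    out = 0
    for i, cols in enumerate(rows):
        bit = 0
        for j in cols:
            bit ^= (v >> j) & 1            # dot product mod 2
        out |= bit << i
    return out

rows = [[0, 2], [1], [0, 1, 2]]            # a 3x3 sparse GF(2) matrix
print(bin(spmv_gf2(rows, 0b011)))          # 0b11
```

The NFS matrix is enormous and extremely sparse, and the solver needs roughly as many such products as the matrix has rows, which is what makes dedicated hardware attractive.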
Opto-Electronic Sieve
- TWINKLE ("The Weizmann Institute Key Locating
Engine") by Adi Shamir and Arjen K. Lenstra
Factoring large numbers with the TWINKLE device, Cryptographic Hardware and
Embedded Systems (CHES) 1999, LNCS 1717, 2-12, Springer, 1999
The original proposal was published in the context of the Quadratic Sieve; the device later turned out to be the first modern NFS sieving device. It uses non-conventional electro-optical components to perform analog summation. The approach is not very efficient for large (≥768-bit) integers, owing to the large number of support PCs needed for the computation. The device was never built in practice.
The idea of TWINKLE
[Talk by Shamir and Tromer at SHARCS'05]
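TWINKLE's principle is a time-space reversal of sieving: each electro-optical cell is assigned a prime $p$ and a root $r$, and flashes with brightness proportional to $\log p$ at every clock tick $t \equiv r \pmod{p}$; a photodetector sums the incident light, and ticks whose total crosses a threshold correspond to likely-smooth sieve locations. A software simulation of this idea (all parameters hypothetical):

```python
from math import log

def twinkle_sim(cells, T, threshold):
    """Toy TWINKLE: one sieve location per clock tick. Each cell (p, r)
    "flashes" with intensity log(p) when t ≡ r (mod p); the detector
    reports ticks whose summed intensity reaches the threshold."""
    reported = []
    for t in range(T):
        light = sum(log(p) for p, r in cells if t % p == r)
        if light >= threshold:
            reported.append(t)
    return reported

# Cells for primes 2, 3, 5, 7 with made-up roots; flag ticks receiving
# roughly log(20) or more total light.
cells = [(2, 1), (3, 2), (5, 0), (7, 3)]
print(twinkle_sim(cells, 40, threshold=log(20)))
```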
Mesh-based Sieve
- Dedicated Sieving Mesh by Willi Geiselmann and Rainer Steinwandt
A dedicated sieving hardware, proc. Public Key Cryptography (PKC) 2003, LNCS 2567, 254-266,
Springer, 2003
In this paper, the authors propose a class of highly-parallel electronic sieving devices based on mesh sorting or mesh routing: an ingenious adaptation of ideas from Bernstein's NFS linear algebra sorting mesh to sieving. It promised feasible cost for 1024-bit composites, but was never built due to technological challenges.
Pipelined Sieve
- TWIRL ("The Weizmann Institute Relation
Locator") by Adi Shamir and Eran Tromer
Factoring large numbers with the TWIRL device,
proc. Crypto 2003, LNCS 2729, 1-26, Springer, 2003
TWIRL is a class of highly-parallel electronic sieving devices based on unidirectional information flow and processing: a pipelined architecture quite different from mesh-based sieving. It offers a lower cost than mesh-based sieving, and hence a feasible cost for 1024-bit composites, but was never built in practice.
The idea of TWIRL [Talk
by Shamir and Tromer at SHARCS'05]
Routing Network Sieve
- SHARK by Jens Franke, Thorsten Kleinjung, Christof
Paar, Jan Pelzl, Christine Priplata and Colin Stahlke
SHARK — a realizable special hardware
sieving device for factoring 1024-bit integers, proc.
SHARCS 2005, February 2005
This is a highly-parallel electronic sieving device. It differs from TWIRL in its use of a butterfly routing network, its use of standard-sized chips, and in being tailored for special-q sieving. The cost of this machine has been evaluated for 1024-bit composites. Though the predicted cost is higher than TWIRL's ($200M vs. $1.1M for 1024-bit composites in 1 year), the technological challenges are different, and hence SHARK might provide a better route to sieving.
The idea of SHARK [Talk by
the authors at SHARCS'05]
Though the current advances in analyzing special purpose devices have stirred quite a lot of interest in the research community, most of these devices have never been built and face tremendous technological challenges for 1024-bit integers. Apart from the SPDs dedicated to NFS, ongoing research is also considering alternative factoring methods based on proposed hardware of the future. Let us take a look at the other options that might be the future of SPDs in factoring.