**Input description**: An integer *n*.

**Problem description**: Is *n* a prime number, and if not what are the factors of *n*?

**Discussion**: The dual problems of factoring integers and testing primality
have surprisingly many applications for a problem long suspected
of being only of mathematical interest.
The security of the RSA public-key cryptography system
(see Section ) is based on the computational
intractability of factoring large integers.
As a more modest application, hash table performance
typically improves when the table size is a prime number.
To get this benefit, an initialization routine must identify a prime
near the desired table size.
Finally, prime numbers are just interesting to play with.
It is no
coincidence that programs to generate large primes often reside in the
games directory of UNIX systems.

Although factoring and primality testing are
related problems,
algorithmically they are quite different.
There exist algorithms that can demonstrate that an
integer is *composite* (i.e. not prime) without actually giving the
factors.
To convince yourself of the plausibility of this, note that you can
demonstrate the compositeness of any nontrivial integer whose last digit is
0, 2, 4, 5, 6, or 8 without doing the actual division.

The simplest algorithm for both of these problems is brute-force trial
division.
To factor *n*, compute the remainder of *n*/*i* for all *i*
with 2 ≤ *i* ≤ √*n*.
The prime factorization of *n* will contain at least one instance of every such *i*
whose remainder is zero, unless *n* is prime.
Make sure you handle
the multiplicities correctly, and account for the at most one prime factor larger than
√*n* that can remain after all smaller factors are divided out.
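A minimal sketch of this brute-force method (the function name `trial_factor` and the Python rendering are ours, for illustration only):

```python
def trial_factor(n):
    """Factor n by trial division, returning its prime factors
    in nondecreasing order, with multiplicity."""
    factors = []
    i = 2
    while i * i <= n:          # only candidate divisors up to sqrt(n)
        while n % i == 0:      # divide out each factor as often as it occurs
            factors.append(i)
            n //= i
        i += 1
    if n > 1:                  # at most one prime factor > sqrt(n) can remain
        factors.append(n)
    return factors
```

Because each factor is divided out as soon as it is found, every *i* that survives to be appended is necessarily prime.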

Such algorithms can be sped up by using a precomputed
table of small primes to avoid
testing all possible *i*.
Surprisingly large numbers of primes can be represented in surprisingly
little space by using bit vectors (see Section ).
A bit vector of all odd numbers less than 1,000,000 fits in
under 64 kilobytes.
Even tighter encodings become possible by eliminating all multiples of three
and other small primes.
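One way to build such a table is a sieve of Eratosthenes restricted to the odd numbers. The sketch below spends one byte per flag for clarity; packing eight flags per byte, as described above, would cut the memory eight-fold and reach the quoted figure:

```python
def sieve_odd(limit):
    """Sieve of Eratosthenes over the odd numbers below `limit`.
    Entry i of the table represents the number 2*i + 1; a nonzero
    entry means prime.  One byte per flag here for clarity -- a
    packed bit vector would use one bit per flag instead."""
    size = limit // 2                  # count of odd numbers below limit
    is_prime = bytearray([1]) * size
    is_prime[0] = 0                    # 1 is not prime
    i = 1
    while (2 * i + 1) ** 2 < limit:
        if is_prime[i]:
            p = 2 * i + 1
            # mark the odd multiples p*p, p*(p+2), ... as composite;
            # index of odd number m is m // 2, so the stride is p
            for j in range((p * p) // 2, size, p):
                is_prime[j] = 0
        i += 1
    return is_prime
```

Note the table deliberately omits 2, the only even prime, which the caller must treat as a special case.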

Considerably faster factoring algorithms exist,
whose correctness depends upon more substantial number theory.
The fastest known algorithm, the *number field sieve*,
uses randomness to construct a system of congruences, the solution
of which usually gives a factor of the integer.
Integers with as many as 128 digits have been factored using this method,
although such feats require enormous amounts of computation.

Randomized algorithms make it much easier to test whether an integer is
prime.
Fermat's little theorem states that *a*^(*n*−1) ≡ 1 (mod *n*)
for all *a* not divisible by *n*, when *n* is prime.
Suppose we pick a random value of *a* with 1 ≤ *a* < *n* and compute the residue
of *a*^(*n*−1) mod *n*.
If this residue is not 1, we have just proven that *n* cannot be prime.
Such randomized primality tests are very efficient.
PGP (see Section ) finds 300+ digit primes using
hundreds of these tests in minutes, for use as cryptographic keys.
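A sketch of such a test, assuming nothing beyond the theorem just stated (Python's built-in three-argument `pow` performs fast modular exponentiation; the complication of Carmichael numbers is taken up in the Notes):

```python
import random

def fermat_test(n, trials=20):
    """Randomized compositeness test via Fermat's little theorem.
    False means n is certainly composite; True means n is probably
    prime (Carmichael numbers can fool every trial)."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:   # a witnesses that n is composite
            return False
    return True
```

Each successful trial only fails to disprove primality, which is why many independent random bases are used.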

Although the primes are scattered in a seemingly random way throughout
the integers, there is some regularity to their distribution.
The *prime number theorem* states that the number of primes
less than *n*, commonly denoted by π(*n*), is approximately *n*/ln *n*.
Further, there are never large gaps between primes, so
in general, one would expect to examine about ln *n* integers
if one wanted to find the first prime larger than *n*.
This distribution and the fast randomized primality test
explain how PGP can find such large primes so quickly.
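The search can be sketched as stepping through odd candidates above *n* until one survives repeated Fermat trials; the helper names below are ours, and a production system would use the stronger Miller-Rabin test mentioned in the Notes:

```python
import random

def probably_prime(n, trials=20):
    """Fermat-based test: False means certainly composite,
    True means probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    return all(pow(random.randrange(2, n - 1), n - 1, n) == 1
               for _ in range(trials))

def next_prime(n):
    """First (probable) prime strictly larger than n.  By the prime
    number theorem, roughly ln(n) candidates are examined on average."""
    if n < 2:
        return 2
    candidate = n + 1 if n % 2 == 0 else n + 2   # next odd number above n
    while not probably_prime(candidate):
        candidate += 2
    return candidate
```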

**Implementations**: My first choice for factoring or primality testing applications
would be
PARI, a system capable of handling complex number-theoretic
problems on integers with up to 300,000 decimal digits, as well as
reals, rationals, complex numbers, polynomials, and matrices.
It is written mainly in C, with assembly code for inner loops on
major architectures, and includes more
than 200 special predefined mathematical functions.
PARI can be used as a library, but it also possesses a calculator mode
that gives instant access to all the types and functions.
The main advantage of PARI is its speed.
On a Unix platform, it is between 5 and
100 times faster than Maple or Mathematica, depending on the application.
PARI is available for PC, Amiga, Macintosh, and most Unix platforms
by anonymous ftp at ftp://megrez.ceremab.u-bordeaux.fr/pub/pari/.

A Mathematica implementation by Ilan Vardi of Lenstra's elliptic curve method of factorization is available in Packages/NumberTheory/FactorIntegerECM.m of the standard Mathematica distribution and MathSource. It is designed to find prime factors of up to about 18 digits in reasonable time, extending Mathematica's ability to factor all numbers of up to 40 digits. It is faster when factoring the product of small primes.

**Notes**: Bach and Shallit's book [BS96] is the most comprehensive
reference on computational number theory,
while Adleman's excellent survey [Adl94a]
describes the state of the art, as well as open problems.
Good expositions on modern algorithms for factoring and primality
testing include [CLR90].

The Miller-Rabin [Mil76, Rab80] randomized primality testing algorithm eliminates problems with Carmichael numbers, which are composite integers that always satisfy Fermat's theorem. The best algorithms for integer factorization include the quadratic-sieve [Pom84] and the elliptic-curve methods [HWL87].
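A sketch of Miller-Rabin under its standard formulation (write *n* − 1 = 2^*s* · *d* with *d* odd, then check the resulting chain of squarings for each random base):

```python
import random

def miller_rabin(n, rounds=40):
    """Miller-Rabin compositeness test.  Unlike the plain Fermat test,
    no composite -- Carmichael numbers included -- passes a single
    round with probability better than 1/4."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # write n - 1 = 2^s * d with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a witnesses that n is composite
    return True
```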

Mechanical sieving devices provided the fastest way to factor integers surprisingly far into the computing era. See [SWM95] for a fascinating account of one such device, built during World War I. Hand-cranked, it proved the primality of in fifteen minutes of sieving time.

An important problem in computational complexity theory is whether
P = NP ∩ co-NP.
The decision problem "is *n* a composite number?"
is perhaps the best candidate
for a counterexample.
By exhibiting the factors of *n*, it is trivially in NP.
It can be shown to be
in co-NP, since every prime has a short
proof of its primality [Pra75].
However, there is no evidence it is in P.
For more information on complexity classes, see [GJ79, Joh90].

A group headed by Arjen Lenstra has regularly broken records for general-purpose integer factoring, using an Internet-distributed implementation of the quadratic sieve factoring method. The June 1993 factorization of RSA-120 took approximately 830 MIP-years of computation. The April 1994 factorization of RSA-129, famous for appearing in the original RSA paper [RSA78], was factored in eight months using over 1,600 computers. This was particularly noteworthy because in [RSA78] they had originally predicted such a factorization would take 40 quadrillion years using 1970s technology.

**Related Problems**: Cryptography (see page ),
high precision arithmetic (see page ).
