Input description: Two very large integers, x and y.
Problem description: What is x+y, x-y, , and x / y?
Discussion: Any programming language whose level rises above basic assembler supports single- and perhaps double-precision integer/real addition, subtraction, multiplication, and division. But what if we wanted to represent the national debt of the United States in pennies? One trillion dollars worth of pennies requires 15 decimal digits, which is far more than can fit into a 32-bit integer.
In other applications much larger integers are needed. The RSA algorithm for public-key cryptography requires integer keys of at least 100 digits to achieve any level of security, and 1000 digits are recommended. Experimenting with number-theoretic conjectures for fun or research always requires playing with large numbers. I once solved a minor open problem [GKP89] by performing an exact computation on the integer .
What should you do when you need large integers?
If instead you have an embedded application requiring high-precision arithmetic, you would be well advised to use an existing library. In addition to the four basic operations, you are likely to get additional functions for computing things like greatest common divisor in the bargain. See the implementations below for details.
Why? The higher the base, the fewer digits we need to represent the number (compare 64 decimal with 1000000 binary). Since hardware addition usually takes one clock cycle independent of the actual numbers, best performance is achieved using the highest base. The reason for limiting us to is that in performing high-precision multiplication, we will multiply two of these ``digits'' together and need to avoid overflow.
The only complication of using a larger base is that integers must be converted to and from base-10 for input and output, but the conversion is easily performed once all four high-precision arithmetical operations are supported.
The algorithm of choice for each of the five basic arithmetic operations is as follows:
In fact, integer division can be reduced to integer multiplication, although in a nontrivial way, so if you are implementing asymptotically fast multiplication, you can reuse that effort in long division. See the references below for details.
High- but not arbitrary-precision arithmetic can be conveniently performed using the Chinese remainder theorem and modular arithmetic. The Chinese remainder theorem states that an integer between 1 and is uniquely determined by its set of residues mod , where each are relatively prime integers. Addition, subtraction, and multiplication (but not division) can be supported using such residue systems, with the advantage that large integers can be manipulated without complicated data structures.
Many of these algorithms for computations on long integers can be directly applied to computations on polynomials. See the references for more details. A particularly useful algorithm is Horner's rule for fast polynomial evaluation. When is blindly evaluated term by term, multiplications will be performed. Much better is observing that , the evaluation of which uses only a linear number of operations.
Implementations: All major commercial computer algebra systems incorporate high-precision arithmetic, including Maple, Mathematica, Axiom, and Macsyma. If you have access to one of these, this is your best option for a quick, nonembedded application. The rest of this section focuses on source code available for embedded applications.
PARI, developed by Henri Cohen and his colleagues in France, is a system capable of handling complex number-theoretic problems on integers with up to 300,000 decimal digits, as well as reals, rationals, complex numbers, polynomials, and matrices. It is probably the most powerful free software available for number theory. Written mainly in C (with assembly-language code for speed-critical routines), it includes more than 200 special predefined mathematical functions. PARI can be used as a library, but it possesses also a powerful calculator mode that gives instant access to all the types and functions. The main advantage of PARI is its speed. On a Unix platform, it runs between 5 to 100 times faster than Maple or Mathematica, depending on the applications. PARI is available for PC, Amiga, Macintosh, and most Unix platforms by anonymous ftp at ftp://megrez.ceremab.u-bordeaux.fr/pub/pari/.
Algorithm 693 [Smi91] of the Collected Algorithms of the ACM is a Fortran implementation of floating-point, multiple-precision arithmetic. See Section .
An implementation of arbitrary-precision integer and rational arithmetic in C++ is embedded in LEDA (see Section ), including GCD, square roots, and logarithms as well as the basic four operations. Sparc assembler code is used for certain time-critical functions.
Implementations in C of a high-precision calculator with all four elementary operations appear in [BR95]. The authors use base 10 for arithmetic and arrays of digits to represent long integers, with short integers as array indices, thus limiting computations to 32,768 digits. The code for these algorithms is printed in the text and available on disk for a modest fee.
Bare bones implementations in C of high-precision multiplication and in Pascal of such special functions as logarithm and arctangent appear in [GBY91]. See Section for further details. Sedgewick [Sed92] provides a bare bones implementation of polynomial arithmetic in C++. See Section for details.
Notes: Knuth [Knu81] is the primary reference on algorithms for all basic arithmetic operations, including implementations of them in the MIX assembly language. Bach and Shallit [BS96] provide a more recent treatment of computational number theory.
Expositions on the -time divide-and-conquer algorithm for multiplication [KO63] include [AHU74, Man89]. An FFT-based algorithm multiplies two n-bit numbers in time and is due to Schönhage and Strassen [SS71]. Expositions include [AHU74, Knu81]. The reduction between integer division and multiplication is presented in [AHU74, Knu81].
Good expositions of algorithms for modular arithmetic and the Chinese remainder theorem include [AHU74, CLR90]. A good exposition of circuit-level algorithms for elementary arithmetic algorithms is [CLR90].
Euclid's algorithm for computing the greatest common divisor of two numbers is perhaps the oldest interesting algorithm. Expositions include [CLR90, Man89].
Related Problems: Factoring integers (see page ), cryptography (see page ).