        Next: Matching Up: Graph Problems: Polynomial-Time Previous: Shortest Path

## Transitive Closure and Reduction Input description: A directed graph G=(V,E).

Problem description: For transitive closure, construct a graph G'=(V,E') with edge iff there is a directed path from i to j in G. For transitive reduction, construct a small graph G'=(V,E') with a directed path from i to j in G' iff .

Discussion: Transitive closure can be thought of as establishing a data structure that makes it possible to solve reachability questions (can I get to x from y?) efficiently. After the preprocessing of constructing the transitive closure, all reachability queries can be answered in constant time by simply reporting a matrix entry.     Transitive closure is fundamental in propagating the consequences of modified attributes of a graph G. For example, consider the graph underlying any spreadsheet model, where the vertices are cells and there is an edge from cell i to cell j if the result of cell j depends on cell i. When the value of a given cell is modified, the values of all reachable cells must also be updated. The identity of these cells is revealed by the transitive closure of G. Many database problems reduce to computing transitive closures, for analogous reasons.

There are three basic algorithms for computing transitive closure:

• The simplest algorithm just performs a breadth-first or depth-first search from each vertex and keeps track of all vertices encountered.   Doing n such traversals gives an O(n (n+m) ) algorithm, which degenerates to cubic time if the graph is dense. This algorithm is easily implemented, runs well on sparse graphs, and is likely the right answer for your application.
• If the transitive closure of G will be dense, a better algorithm exploits the fact that the strongly connected components of G can be computed in linear time (see Section ).   All pairs of vertices in each strongly connected component are mutually reachable. Further, if (x,y) is an edge between two vertices in different strongly connected components, every vertex in y's component is reachable from each vertex in x's component. Thus the problem reduces to finding the transitive closure on a graph of strongly connected components, which should have considerably fewer edges and vertices than G.
• Warshall's algorithm constructs transitive closures in with a simple, slick algorithm that is identical to Floyd's all-pairs-shortest-path algorithm of Section .    If we are interested only in the transitive closure, and not the length of the resulting paths, we can reduce storage by retaining only one bit for each matrix element. Thus iff j is reachable from i using only vertices as intermediates.

Another related algorithm, discussed in the references, runs in the same time as matrix multiplication.   You might conceivably win for large n by using Strassen's fast matrix multiplication algorithm, although I for one wouldn't bother trying.   Since transitive closure is provably as hard as matrix multiplication, there is little hope for a significantly faster algorithm.

Transitive reduction (also known as minimum equivalent digraph) is essentially the inverse operation of transitive closure, namely reducing the number of edges while maintaining identical reachability properties.   The transitive closure of G is identical to the transitive closure of the transitive reduction of G.   The primary application of transitive reduction is space minimization, by eliminating redundant edges from G that do not effect reachability.   Transitive reduction also arises in graph drawing, where it is important to eliminate as many unnecessary edges as possible in order to reduce the visual clutter.

Although the transitive closure of G is uniquely defined, a graph may have many different transitive reductions, including G itself. We want the smallest such reduction, but there can be multiple formulations of the problem:

• A linear-time, quick-and-dirty transitive reduction algorithm identifies the strongly connected components of G, replaces each by a simple directed cycle, and adds these edges to those bridging the different components. Although this reduction is not provably minimal, it is likely to be pretty close on typical graphs.

One catch with this heuristic is that it can add edges to the transitive reduction of G that are not in G. This may or may not be a problem for your application.

• If, in fact, all edges of the transitive reduction of G must be in G, we must abandon hope of finding the minimum size reduction. To see why, consider a directed graph consisting of one strongly connected component, so that every vertex can reach every other vertex. The smallest possible transitive reduction will be a simple directed cycle, consisting of exactly n edges. This is possible as a subset of edges only if G is Hamiltonian, thus proving that finding the smallest subset of edges is NP-complete.

A heuristic for finding such a transitive reduction is to consider each edge successively and delete it if its removal does not change the transitive reduction. Implementing this efficiently means minimizing the time spent on reachability tests. Observe that directed edge (i,j) can be eliminated whenever there is another path from i to j avoiding this edge.

• If we are allowed to have arbitrary pairs of vertices as edges in the reduction and need the minimum size reduction, it can be found in time. See the references below for details. However, the quick-and-dirty algorithm above will likely suffice for most applications and will be easier to program as well as more efficient.

Implementations:   LEDA (see Section ) provides an implementation of transitive closure in C++ using O(n m) time [GK79].

Combinatorica [Ski90] provides Mathematica implementations of transitive closure and reduction, as well as the display of partial orders requiring transitive reduction. See Section .

Notes: Van Leeuwen [vL90a] provides an excellent survey on transitive closure and reduction, including 33 references. Good expositions of Warshall's algorithm [War62] include [Baa88, CLR90, Man89]. The equivalence between matrix multiplication and transitive closure was proven by Fischer and Meyer [FM71], with good expositions including [AHU74].

The equivalence between transitive closure and reduction, as well as the reduction algorithm, was established in [AGU72]. Empirical studies of transitive closure algorithms include [SD75].

Estimating the size of the transitive closure is important in database query optimization.     A linear-time algorithm for estimating the size of the closure is given by Cohen [Coh94].

Related Problems: Connected components (see page ), shortest path (see page ).        Next: Matching Up: Graph Problems: Polynomial-Time Previous: Shortest Path

Algorithms
Mon Jun 2 23:33:50 EDT 1997