Input description: A graph G = (V,E) with weighted edges.
Problem description: The subset of of minimum weight forming a tree on V.
Discussion: The minimum spanning tree (MST) of a graph defines the cheapest subset of edges that keeps the graph in one connected component. Telephone companies are particularly interested in minimum spanning trees, because the minimum spanning tree of a set of sites defines the wiring scheme that connects the sites using as little wire as possible. It is the mother of all network design problems.
Minimum spanning trees prove important for several reasons:
Two classical algorithms efficiently construct minimum spanning trees, namely Prim's and Kruskal's. Brief overviews of both algorithms are given below, with correctness arguments in Section . We refer the reader to the codes for implementation details.
Prim's algorithm starts with an arbitrary vertex v and ``grows'' a tree from it, repeatedly finding the lowest-cost edge that will link some new vertex into this tree. During execution we will label each vertex as either in the tree, fringe - meaning there exists an edge from a tree vertex, or unseen - meaning the vertex is more than one edge away from the tree.
Prim(G) Select an arbitrary vertex to start
While (there are fringe vertices)
select minimum-weight edge between tree and fringe
add the selected edge and vertex to the tree
This creates a spanning tree for any connected graph, since no cycle can be introduced via edges between tree and fringe vertices. That it is in fact a tree of minimum weight can be proven by contradiction, and the proof is in Section . With simple data structures, Prim's algorithm can be implemented in time.
Kruskal's algorithm is also greedy. It starts with each vertex as a separate tree and merges these trees together by repeatedly adding the lowest cost edge that merges two distinct subtrees (i.e. does not create a cycle).
Kruskal(G) Sort the edges in order of increasing weight
count=0
while (count < n-1) do
get next edge (v,w)
if (component (v) component(w))
add to T
component(v) = component(w)
The ``which component?'' tests are efficiently implemented using the union-find data structure of Section , to yield an algorithm.
Minimum spanning tree is only one of several spanning tree problems that arise in practice. The following questions will help you sort your way through them:
I personally find Kruskal's algorithm easier to understand and implement than Prim's, but that is just a matter of taste.
Implementations: Pascal implementations of Prim's, Kruskal's, and the Cheriton-Tarjan algorithm are provided in [MS91], along with extensive empirical analysis that shows that the implementation of Prim's algorithm with the appropriate priority queue is fastest on most graphs. See Section .
The Stanford GraphBase (see Section ) contains implementations of four different minimum spanning tree algorithms, and the result of timing experiments suggesting that Kruskal's algorithm is best. The results are reported in terms of memory accesses (mems) instead of seconds, to make them independent of processor speed.
A C++ implementation of Kruskal's algorithm is provided in LEDA (see Section ). Alternative implementations of Prim's and Kruskal's algorithms are provided in Pascal [SDK83] and C++ [Sed92]. See Section Section . XTango (see Section ) includes an animation of both Prim's and Kruskal's algorithms.
Algorithm 479 [Pag74] and Algorithm 613 [HJS84] of the Collected Algorithms of the ACM are Fortran codes for minimum spanning tree, the former in an implementation of a point clustering algorithm. They are available from Netlib (see Section ). A bare bones Fortran implementation is provided in [NW78], including the enumeration of all spanning trees. See Section .
Combinatorica [Ski90] provides Mathematica implementations of Kruskal's minimum spanning tree algorithm and quickly counting the number of spanning trees of a graph. See Section .
Notes: Good expositions on Prim's [Pri57] and Kruskal's [Kru56] algorithms will appear in any textbook on algorithms, but include [Baa88, CLR90, Man89, Tar83]. The fastest implementations of Prim's and Kruskal's algorithms use Fibonacci heaps [FT87]. Expositions of faster algorithms for geometric instances include [PS85].
A recent breakthrough on the minimum spanning tree problem is the linear-time randomized algorithm of Karger, Klein, and Tarjan [KKT95]. Simplifications will be needed before this becomes the algorithm of choice. The history of the minimum spanning tree problem dates back at least to Boruvka, in 1926, and is presented in [GH85]. Interestingly, it is Boruvka's algorithm that serves as the foundation to the new randomized one.
Fürer and Raghavachari [FR94] give an algorithm that constructs a spanning tree whose maximum degree is almost minimized, indeed is at most one more than the lowest-degree spanning tree. The situation is analogous to Vizing's theorem for edge coloring, which also gives an approximation algorithm to within additive factor one.
Minimum spanning tree algorithms have an interpretation in terms of matroids, which are systems of subsets closed under inclusion, for which the maximum weighted independent set can be found using a greedy algorithm. The connection between greedy algorithms and matroids was established by Edmonds [Edm71]. Expositions on the theory of matroids include [Law76, PS82].
Algorithms for generating spanning trees in order from minimum to maximum weight are presented in [Gab77]. Good expositions on the matrix-tree theorem, which counts the number of spanning trees of a graph, include [Eve79a].
Related Problems: Steiner tree (see page ), traveling salesman (see page ).