Input description: A (weighted) graph G=(V,E) and integers j, k, and m.
Problem description: Partition the vertices into m subsets such that each subset has size at most j, while the cost of the edges spanning the subsets is bounded by k.
Discussion: Graph partitioning arises as a preprocessing step to divide-and-conquer algorithms, where it is often a good idea to break things into roughly equal-sized pieces. It also arises when dealing with extremely large graphs, when we need to cluster the vertices into logical components for storage (to improve virtual memory performance) or for drawing purposes (to collapse dense subgraphs into single nodes in order to reduce cluttering).
Several different flavors of graph partitioning arise depending on the desired objective function:
Certain special graphs always have small separators, which partition the vertices into balanced pieces. For any tree, there always exists a single vertex whose deletion partitions the tree so that no component contains more than n/2 of the original n vertices. These components need not always be connected. For example, consider the separating vertex of a star-shaped tree. However, the separating vertex can be found in linear time using depth first-search. Similarly, every planar graph has a set of vertices whose deletion leaves no component with more than 2n/3 vertices. Such separators provide a particularly useful way to decompose planar graphs.
The basic approach to dealing with graph partitioning or max-cut problems is to construct an initial partition of the vertices (either randomly or according to some problem-specific strategy) and then sweep through the vertices, deciding whether the size of the cut would increase or decrease if we moved this vertex over to the other side. The decision to move v can be made in time proportional to its degree by simply counting whether more of v's neighbors are on the same team as v or not. Of course, the desirable side for v will change if many of its neighbors jump, so multiple passes are likely to be needed before the process converges on a local optimum. Even such a local optimum can be arbitrarily far away from the global max-cut.
There are many variations of this basic procedure, by changing the order we test the vertices in or moving clusters of vertices simultaneously. Using some form of randomization, particularly simulated annealing, is almost certain to be a good idea.
Implementations: Jon Berry's implementations of several graph partitioning heuristics, including Kernighan-Lin, simulated annealing, and path optimization are available from http://www.elon.edu/users/f/berryj/www/.
A non-network-flow-based implementation of minimum cut is included with LEDA (see Section ).
Notes: The fundamental heuristic for graph partitioning is due to Kernighan and Lin [KL70]. Empirical results on graph partitioning heuristics include [BG95, LR93].
The planar separator theorem and an efficient algorithm for finding such a separator are due to Lipton and Tarjan [LT79, LT80]. Although network flow can be used to find minimum cut sets in graphs, faster algorithms are available, including [SW94] and [Kar96a].
Expositions on the hardness of max-cut [Kar72] include [Eve79a]. Note that any random vertex partition will expect to cut half of the edges in the graph, since the probability that the two vertices defining an edge end up on different sides of the partition is 1/2. Goemans and Williamson [GW95] gave an 0.878-factor approximation algorithm for maximum-cut, based on semidefinite programming techniques. Tighter analysis of this algorithm followed by Karloff [Kar96b].
Related Problems: Edge/vertex connectivity (see page ), network flow (see page ).