Figure: A pretty bandwidth-4 layout of a binary tree atop an ugly bandwidth-3 layout
To better demonstrate the power of pruning and symmetry detection, let's apply these ideas to producing a search program that solves the bandwidth minimization problem, discussed in detail in catalog Section . I annually run competitions for the fastest bandwidth-minimization program for students in my algorithms courses; the timings below are drawn from these experiences.
The bandwidth problem takes as input a graph G, with n vertices and m edges. The goal is to find a permutation of the vertices on the line that minimizes the maximum length of any edge. Figure gives two distinct layouts of a complete binary tree on 15 vertices. The clean, neat layout on the top has a longest edge of length 4, but the seemingly cramped layout on the bottom realizes the optimal bandwidth of 3.
The bandwidth problem has a variety of applications, including circuit layout, linear algebra, and optimizing memory usage in hypertext documents. The problem is NP-complete, which implies that no polynomial time worst-case algorithm is known for the problem. It remains NP-complete even for very restricted classes of trees.
Since the bandwidth problem seeks a particular permutation, a backtracking program that iterates through all the n! possible permutations and computes the length of the longest edge for each gives a straightforward algorithm. Depending upon how well it is programmed, and how fast a machine it is running on, such an algorithm can expect to solve instances of approximately 8 to 12 vertices within one CPU minute.
To speed up this search, we can try to exploit symmetry. For any permutation p, its reverse permutation will realize the exact same bandwidth, since the length of each edge is the same. Hence we can immediately eliminate half of our search space. The reverse copies are easily removed by placing the leftmost and rightmost elements of the permutation as the first two elements of the vector and then pruning if . Because we are dealing with an exponential search, removing a single factor of two can only be of limited usefulness. Such symmetry elimination might add one to the size of the problem we can do within one CPU minute.
For more serious speedups, we need to prune partial solutions. Say we have found a permutation p that yields a longest edge of . By definition, . Suppose that among the elements in a partial layout , where k < n, there is an edge that is at least in length. Can this partial solution expand to provide a better bandwidth solution? Of course not! By pruning the search the instant we have created a long edge, typical instances of 15 to 20 vertices can be solved in one CPU minute, thus providing a substantial improvement.
Efforts to further improve the search algorithm must strive for even greater pruning. By using a heuristic method to find the best solution we can before starting to search, we save time by realizing early length cutoffs. In fact, most of the effort in a combinatorial search is typically spent after the optimal solution is found, in the course of proving that no better answer exists. By observing that the optimal bandwidth solution must always be at least half the degree of any vertex (think about the incident edges), we have a lower bound on the size of the optimal solution. We can terminate search soon as we find a solution matching the lower bound.
One limitation of this pruning strategy is that only partial solutions of length >b can be pruned, where b is the bandwidth of the best solution to date, since we must place b+1 vertices before we can generate any edges of length at least b. To achieve earlier cutoffs, we can alternately fill in the leftmost and rightmost slots of the configuration, instead of always proceeding from the left. This way, whenever there is an edge between a vertex on the left side and a vertex on the right side, this edge is likely long enough to achieve a cutoff. Pruning can easily occur while positioning the second vertex in the solution vector.
Using these enhancements, top-notch programs are capable of solving typical problems on up to 30 vertices consistently within one CPU minute, operating literally millions of times faster than unpruned, untuned efforts. The speed difference between the final and initial versions of the program dwarf the difference between a supercomputer and a microcomputer. Clever search algorithms can easily have a bigger impact on performance than expensive hardware.