next up previous contents index CD CD Algorithms
Next: Set Packing Up: Set and String Problems Previous: Set and String Problems

Set Cover



Input description: A set of subsets tex2html_wrap_inline30613 of the universal set tex2html_wrap_inline30615 .

Problem description: What is the smallest subset T of S such that tex2html_wrap_inline30617 ?

Discussion: Set cover arises when you try to efficiently acquire or represent items that have been packaged in a fixed set of lots. You want to obtain all the items, while buying as few lots as possible. Finding a cover is easy, because you can always buy one of each lot. However, by finding a small set cover you can do the same job for less money.   

An interesting application of set cover is Boolean logic minimization. We are given a particular Boolean function of k variables, which for each of the tex2html_wrap_inline30619 possible input vectors describes whether the desired output is 0 or 1. We seek the simplest circuit that exactly implements this function. One approach is to find a disjunctive normal form (DNF) formula on the variables and their complements, such as tex2html_wrap_inline30621 . We could build one ``and'' term for each input vector and then ``or'' them all together, but we might save considerably by factoring out common subsets of variables. Given a set of feasible ``and'' terms, each of which covers a subset of the vectors we need, we seek to ``or'' together the smallest number of terms that realize the function. This is exactly the set cover problem.     

There are several variations of set cover problems to be aware of:

Since the vertex cover problem is NP-complete, the set cover problem must be at least as hard. In fact, it is somewhat harder. Approximation algorithms do no worse than twice optimal for vertex cover, but only a tex2html_wrap_inline30641 times optimal approximation algorithm exists for set cover.

The greedy heuristic is the right approach for set cover. Begin by placing the largest subset in the set cover, and then mark all its elements as covered. We will repeatedly add the subset containing the largest number of uncovered elements, until all elements are covered. This heuristic always gives a set cover using at most tex2html_wrap_inline30643 times as many sets as optimal, and in practice it usually does a lot better.  

The simplest implementation of the greedy heuristic sweeps through the entire input instance of m subsets for each greedy step. However, by using such data structures as linked lists and a bounded-height priority queue (see Section gif), the greedy heuristic can be implemented in O(S) time, where tex2html_wrap_inline30647 is the size of the input representation.  

It pays to check whether there are certain elements that exist in only a few subsets, ideally only one. If so, we should select the biggest subsets containing these elements at the very beginning. We will have to take them eventually, and they carry with them extra elements that we might have to pay to cover by waiting until later.

Simulated annealing is likely to produce somewhat better set covers than these simple heuristics, if that is important for your application. Backtracking can be used to guarantee you an optimal solution, but typically it is not worth the computational expense.   

Implementations: Pascal implementations of an exhaustive search algorithm for set packing, as well as heuristics for set cover, appear in [SDK83]. See Section gif.  

Notes: Survey articles on set cover include [BP76]. See [CK75] for a computational study of set cover algorithms. An excellent exposition on algorithms and reduction rules for set cover is presented in [SDK83].  

Good expositions of the greedy heuristic for set cover include [CLR90]. An example demonstrating that the greedy heuristic for set cover can be as bad as tex2html_wrap_inline30649 is presented in [Joh74, PS82]. This is not a defect of the heuristic. Indeed, it is provably hard to approximate set cover to within an approximation factor better than tex2html_wrap_inline30651 [LY93].  

Related Problems: Matching (see page gif), vertex cover (see page gif), set packing (see page gif).      

next up previous contents index CD CD Algorithms
Next: Set Packing Up: Set and String Problems Previous: Set and String Problems

Mon Jun 2 23:33:50 EDT 1997