Input description: A directed acyclic graph G=(V,E), where the vertices represent jobs and the edge (u,v) implies that task u must be completed before task v.
Problem description: What schedule of tasks completes the job using the minimum amount of time or processors?
Discussion: Devising a proper schedule to satisfy a set of constraints is fundamental to many applications. A critical aspect of any parallel processing system is the algorithm mapping tasks to processors. Poor scheduling can leave most of the expensive machine sitting idle while one bottleneck task is performed. Assigning people to jobs, meetings to rooms, or courses to final exam periods are all different examples of scheduling problems.
Scheduling problems differ widely in the nature of the constraints that must be satisfied and the type of schedule desired. For this reason, several other catalog problems have a direct application to various kinds of scheduling:
In this section, we focus on precedence-constrained scheduling problems for directed acyclic graphs. These problems are often called PERT/CPM, for Program Evaluation and Review Technique/Critical Path Method. Suppose you have broken a big job into a large number of smaller tasks. For each task you know how long it should take (or perhaps an upper bound on how long it might take). Further, for each pair of tasks you know whether it is essential that one task be performed before another. The fewer constraints we have to enforce, the better our schedule can be. These constraints must define a directed acyclic graph, acyclic because a cycle in the precedence constraints represents a Catch-22 situation that can never be resolved.
We are interested in several problems on these networks:
The minimum completion time for a DAG can be easily computed in O(n+m) time. Initialize the completion time for all the vertices to 0, except the start vertex, which is initialized to the length of the start task. Perform a topological sort to order the vertices such that all precedences will have been considered by the time we evaluate each vertex. For each vertex u, consider all the edges (u,v) leaving u. The completion time of these vertices is the maximum of the current completion time for v plus the completion time of u plus the task time of v.
An even more general formulation seeks critical paths in networks where certain jobs are restricted to certain people. Such networks are known as disjunctive networks. Finding critical paths on such networks is more complicated than CPM/PERT, but implementations are described below.
In any sufficiently large and sufficiently real scheduling application, there will be combinations of constraints that are difficult or impossible to model using these techniques. There are two reasonable ways to deal with such problems. First, we can ignore enough constraints that the problem reduces to one of the types that we have described here, solve it, and then see how bad it is using the other constraints. Perhaps the schedule can be easily modified by hand to satisfy constraints like keeping Joe and Bob apart so they can't kill each other. Another approach is to formulate your scheduling problem via linear-integer programming (see Section ) in all its complexity. This method can be better only if you really know enough about your desired schedule to formulate the right linear program, and if you have the time to wait for the program to give you the solution. I would start out with something simpler and see what happens first.
Another fundamental scheduling problem takes a set of jobs without precedence constraints and assign them to identical machines so as to minimize the total elapsed time. Consider a copy shop with k Xerox machines and a stack of jobs to finish by the end of the day. Such tasks are called job-shop scheduling. They can be modeled as bin-packing problems (see Section ), where each job is assigned a number equal to the number of hours it will take and each machine is represented by a bin with space equal to the number of hours in a day.
More sophisticated variations of job-shop scheduling provide each task with allowable start and required finishing times. Effective heuristics are known, based on sorting the tasks by size and finishing time. We refer the reader to the references for more information. Note that these scheduling problems become hard only when the tasks cannot be broken up onto multiple machines or interrupted (preempted) and then rescheduled. If your application has these degrees of freedom, you should exploit them.
Implementations: Pascal implementations of Balas's algorithm for disjunctive network scheduling and Hu's algorithm for assigning jobs to processors with precedence constraints appear in [SDK83]. See Section .
Algorithm 520 [WBCS77] of the Collected Algorithms of the ACM is a Fortran code for multiple-resource network scheduling. It is available from Netlib (see Section ).
Notes: The literature on scheduling algorithms is a vast one. For a more detailed survey of the field, we refer the reader to [Cof76, LLK83].
Good expositions on CPM/PERT include [Eve79a, Law76, PGD82, SDK83]. Good expositions on job-shop scheduling include [AC91, SDK83].
Related Problems: Topological sorting (see page ), matching (see page ), vertex coloring (see page ), edge coloring (see page ), bin packing (see page ).