I'll confess that I'm always surprised to see graduate students stop by my office early in the morning. Something about working through the night makes that time inconvenient for them. But there they were, two future Ph.D.s working in the field of high-performance computer graphics. They studied new techniques for rendering pretty computer images. The picture they painted for me that morning was anything but pretty, however.
``You see, we want to build a program to morph one image into another,'' they explained.
``What do you mean by morph?'' I asked.
``For special effects in movies, we want to construct the intermediate stages in transforming one image into another. Suppose we want to turn you into Humphrey Bogart. For this to look realistic, we must construct a bunch of in-between frames that start out looking like you and end up looking like him.''
``If you can realistically turn me into Bogart, you have something,'' I agreed.
``But our problem is that it isn't very realistic.'' They showed me a dismal morph between two images. ``The trouble is that we must find the right correspondence between features in the two images. It looks real bad when we get the correspondence wrong and try to morph a lip into an ear.''
``I'll bet. So you want me to give you an algorithm for matching up lips?''
Figure: A successful alignment of two lines of pixels
``No, even simpler. We just morph each row of the initial image into the identical row of the final image. You can assume that we give you two lines of pixels, and you have to find the best possible match between the dark pixels in a row from object A to the dark pixels in the corresponding row of object B. Like this,'' they said, showing me images of successful matchings like those in Figure .
``I see,'' I said. ``You want to match big dark regions to big dark regions and small dark regions to small dark regions.''
``Yes, but only if the matching doesn't shift them too much to the left or the right. We might prefer to merge or break up regions rather than shift them too far away, since that might mean matching a chin to an eyebrow. What is the best way to do it?''
``One last question. Will you ever want to match two intervals to each other in such a way that they cross?''
``No, I guess not. Crossing intervals can't match. It would be like switching your left and right eyes.''
I scratched my chin and tried to look puzzled, but I'm just not as good an actor as Bogart. I'd had a hunch about what needed to be done the instant they started talking about lines of pixels. They want to transform one array of pixels into another array, with the minimum amount of changes. That sounded like editing one string of pixels into another string, which is a classic application of dynamic programming. See Sections and for discussions of approximate string matching.
The fact that the intervals couldn't cross just sealed things up. It meant that whenever a stretch of dark pixels from A was mapped to a stretch from B, the problem would be split into two smaller subproblems, i.e. the pixels to the left of the match and the pixels to the right of the match. The cost of the global match would ultimately be the cost of this match plus those of matching all the pixels to the left and of matching all the pixels to the right. Constructing the optimal match on the left side is a smaller problem and hence simpler. Further, there can be only possible left subproblems, since each is completely described by the pair of one of n top pixels and one of n bottom pixels.
``Your algorithm will be based on dynamic programming,'' I pronounced. ``However, there are several possible ways to do things, depending upon whether you want to edit pixels or runs. I would probably convert each row into a list of black pixel runs, with the runs sorted by right endpoint and each run labeled with its starting position and length. You will maintain the cost of the cheapest match between the leftmost i runs and the leftmost j runs for all i and j. The possible edit operations are:
Figure: Morphing a lobster into a head via dynamic programming
``For each pair of runs (i,j) and all the cases that apply, we compute the cost of the edit operation and add to the (already computed and stored) edit cost to the left of the start of the edit. The cheapest of these cases is what we will take for the cost of c[i,j].''
The pair of graduate students scribbled this down, then frowned. ``So we are going to have a cost measure for matching two runs that is a function of their lengths and positions. How do we decide what the relative costs should be?''
``That is your business. The dynamic programming serves to optimize the matchings once you know the cost functions. It is up to your aesthetic sense to decide what penalties there should be for line length changes or offsets. My recommendation is that you implement the dynamic programming and try different values for the constants effecting the relative penalties on each of several different images. Then pick the setting that seems to do what you want.''
They looked at each other and smiled, then ran back into the lab to implement it. Using dynamic programming to do their alignments, they completed their morphing system, which is described in [HWK94]. They produced the images shown in Figure , morphing a lobster into a man. Unfortunately, they never got around to turning me into Humphrey Bogart.