I'd been tipped off about a small printed-circuit board testing company nearby that was in need of some algorithmic consulting. And so I found myself inside a typically non-descript building in a typically non-descript industrial park, talking with the president of Integri-Test, along with one of his lead technical people.
``We're the leaders in robotic printed-circuit board testing devices. Our customers have very high reliability requirements for their PC-boards. They must check that each and every board has no wire breaks before filling it with components. This means testing that each and every pair of points on the board that are supposed to be connected are connected.''
``How do you do the testing?'' I asked.
``We have a robot with two arms, each with electric probes. To test whether two points are properly connected, the arms simultaneously contact both of the points. If the two points are propertly connected, then the probes will complete a circuit. For each net, we hold one arm fixed at one point and move the other to cover the rest of the points.''
``Wait!'' I cried. ``What is a net?''
Figure: An sample net showing (a) the metal connection layer, (b) the contact points, (c) their minimum spanning tree, and (d) the points partitioned into clusters
``On a circuit board there are certain sets of points that are all connected together with a metal layer. This is what we mean by a net. Sometimes a net consists of two points, i.e. is an isolated wire. Sometimes a net can have 100 to 200 points, like all the connections to power or ground.''
``I see. So you have a list of all the connections between pairs of points on the circuit board, and you want to trace out these wires.''
He shook his head. ``Not quite. The input for our testing program consists only of the net contact points, as shown in Figure (b). We don't know where the actual wires are, but we don't have to. All we have to do is verify that all the points in a net are connected together. We do this by putting the left robot arm on the leftmost point in the net, then having the right arm move around to all the other points in the net to test if they are connected to the left point. If they are all connected to the left point, it means that they must all be connected to each other.''
I thought for a moment about what this meant. ``OK. So your right arm has to visit all the other points in the net. How do you choose the order to visit them?''
The technical guy spoke up. ``Well, we sort the points from left to right and then go in that order. Is that a good thing to do?''
``Have you ever heard of the traveling salesman problem?'' I asked.
He was an electrical engineer, not a computer scientist. ``No, what's that?'' he asked.
``Traveling salesman is the name of the exact problem that you are trying to solve. Given a set of points you have to visit, how do you order them so as to minimize the travel time. Algorithms for the traveling salesman problem have been extensively studied. For small nets, by doing an exhaustive search you will be able to find the optimal tour. For big nets, there are heuristics that will get you very close to the optimal tour.'' I would have pointed them to Section if I had had this book handy.
The president scribbled down some notes and then frowned. ``Fine. Maybe you can order the points in a net better for us. But that is not our real problem. When you watch our robot in action, the right arm sometimes has to run all the way to the right side of the board on a given net, while the left arm just sits there. It seems we would benefit by breaking a net into smaller pieces to balance things out.''
I sat down and thought. The left and right arms were each going to have interlocking TSP problems to solve. The left arm would move between the leftmost points of each net, while the right arm was going to visit all the other points in each net as ordered by the left TSP tour. By breaking each net into smaller nets, so that each net occupies a small chunk of real estate, we would avoid making the right arm cross all the way across the board. Further, a lot of little nets meant there would be more points in the left TSP, so each left-arm movement was likely to be short, too.
``You are right. We should win if we can break big nets into small nets. We want the nets to be small, both in the number of points and in the area of the net. But we must be sure that if we validate the connectivity of each small net, we will have confirmed that the big net is connected. Whenever there is one point in common between two little nets, that is enough to show that the bigger net formed by the two little nets is connected, since current can flow between any pair of points.''
Now we had to break each net into overlapping pieces, where each piece was small. This is a clustering problem. Minimum spanning trees are often used for clustering, as discussed in Section . In fact, that was the answer! We could find the minimum spanning tree of the net points and break it into little clusters whenever a spanning tree edge got too long. As shown in Figure (d), each cluster would share exactly one point in common with another cluster, with connectivity ensured because we are covering the edges of a spanning tree. The shape of the clusters would reflect the points in the net, exactly as we would want. If the points lay along a line across the board, the minimum spanning tree would be a path, and the clusters would be pairs of points. If the points all fell in a tight region, there would be one nice fat cluster that the right arm would just scoot around.
So I explained the idea of constructing the minimum spanning tree of a graph. The boss listened, scribbled more notes, and frowned again.
``I like your clustering idea. But these minimum spanning trees you talk about are defined on graphs. All you got are points. Where do the weights of the edges come from?''
``Oh, we can think of it as a complete graph, where every pair of points are connected. The weight of the edge defined by the two points is simply the distance. Or is it...?''
I went back to thinking. The edge cost between two points should reflect the travel time between them. While the distance was related to the travel time, it wasn't necessarily exactly the same thing.
``Hey. I have a question about your robot. Does it take the same amount of time to move the arm left-right as it does up-down?''
They thought a minute. ``Yeah, it does. We use the same type of motors to control horizontal and vertical movements. Further, since the two motors for each arm are independent, we can simultaneously move each arm both horizontally and vertically.''
``That so? The time to move both one foot left and one foot up is exactly the same as just moving one foot left? This means that the weight cost for each edge in the graph should not be the Euclidean distance between the two points, but the biggest difference between either the x- or y-coordinate. This is something we call the metric, but we can capture it by changing the edge weights in the graph. Anything else funny about your robots?'' I asked.
``Well, it takes some time for the robot to come up to speed. We should probably also factor in acceleration and deceleration of the arms.''
``Darn right. The more accurately you can model the time your arm takes to move between two points, the better our solution will be. But now we have a very clean formulation. Let's code it up and let's see how well it works!''
They were somewhat skeptical whether this approach would do any good, but they agreed to think about it. A few weeks later they called me back and reported that the new algorithm reduced testing time by about 30% over their previous approach, at a cost of a little more computational preprocessing. However, since their testing machine costs $200,000 a pop and a PC costs $2,000, this is an excellent tradeoff. It is particularly advantageous since the preprocessing need only be done once when testing multiple instances of the same board.
The key idea leading to the successful solution was knowing how to model the job in terms of classical algorithmic graph problems. I smelled TSP the instant they started talking about minimizing robot motion. Once I realized that they were implicitly forming a star-shaped spanning tree to ensure connectivity, it was natural to ask whether a minimum spanning tree would perform any better. This idea led to a natural way to think about clustering, and thus partitioning each net into smaller nets. Finally, by carefully constructing our distance metric to accurately model the costs of the robot itself, we get to incorporate quite complicated properties (such as acceleration and differences between horizontal and vertical speeds) without changing our fundamental graph model or algorithm design.