IPSC 2005 Solution to Problem F – Find The Right Words

C++ code that solves both data sets

This was one of the harder tasks. We first build a complete weighted bipartite graph. Its first partition consists of the letters chosen by one player. Its second partition consists of the words from the wordlist that contain the 2 letters chosen at the beginning of the game. The weight of the edge between a letter c and a word w is |M|+1-|w| if w contains c and 0 otherwise, where M is the longest word and |w| denotes the length of the word w. An edge between a word and a letter it contains therefore always has weight at least 1. A matching in a graph is a set of pairwise disjoint edges. A perfect matching is a matching that covers all vertices of one (the smaller) partition. The weight of a matching is the sum of the weights of its edges. We then find a perfect matching with maximum possible weight. If all its edges have positive weight, its weight equals the number of letters times |M|+1, minus the total length of the matched words. So the words in the maximum matching are distinct, contain all the characters, and have minimal total length, because otherwise the matching formed by the words with a smaller total length would have a higher weight. If any of the characters is matched to a word by an edge of weight 0, then the task has no solution and we output -1.
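The weight rule above can be sketched as follows. This is an illustration, not the contest's reference code: the function name `buildWeights` and the row/column layout (rows are letters, columns are candidate words) are assumptions, and the word list is assumed to be already filtered to the words containing the 2 starting letters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: wt[i][j] is the weight of the edge between
// letters[i] and words[j], i.e. |M| + 1 - |w| if the word contains
// the letter and 0 otherwise (M is the longest word).
std::vector<std::vector<int>> buildWeights(const std::string& letters,
                                           const std::vector<std::string>& words) {
    std::size_t longest = 0;
    for (const auto& w : words) longest = std::max(longest, w.size());

    std::vector<std::vector<int>> wt(letters.size(),
                                     std::vector<int>(words.size(), 0));
    for (std::size_t i = 0; i < letters.size(); ++i) {
        for (std::size_t j = 0; j < words.size(); ++j) {
            if (words[j].find(letters[i]) != std::string::npos) {
                wt[i][j] = static_cast<int>(longest + 1 - words[j].size());
                // A "contains" edge is never 0, so weight 0 can only
                // mean that the letter does not occur in the word.
                assert(wt[i][j] >= 1);
            }
        }
    }
    return wt;
}
```

With the letters "ab" and the words "abc", "b", "xyz", the longest word has length 3, so the edge a-"abc" gets weight 1, the edge b-"b" gets weight 3, and every edge to "xyz" gets weight 0.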

So it is enough to solve the maximum-weight perfect matching problem for weighted complete bipartite graphs. Let the partitions have sizes N₁ and N₂ respectively (N₁<=N₂). We will show an O((N₁+N₂)⁴) algorithm. Note that more efficient algorithms are known, but this one was already fast enough.

The size of a matching is the number of edges in it. The optimal matching of size k (M_k) is a matching whose weight is maximal over all possible matchings of size k. We will find optimal matchings of sizes 0, 1, ... , N₁ in turn. M₀ is clearly the empty set and M_{N₁} is the answer to our problem. We assume that we have found M_k and now find M_{k+1}.

Let's make some definitions first. Let M be a matching. A matched edge (vertex) with respect to M is an edge (vertex) in M. An alternating path is a path that starts at an unmatched vertex in one partition, ends at an unmatched vertex in the other partition, and whose edges alternate between unmatched and matched: the first edge along the path is unmatched, the second is matched, the third is unmatched and so on. For such a path P, let P_M denote its unmatched edges and Q_M its matched edges with respect to M. The cost of P (cost(P)) is the sum of the weights of the edges in P_M minus the sum of the weights of the edges in Q_M: cost(P)=wt(P_M)-wt(Q_M). Augmenting a matching with an alternating path P means "flipping" the edges of P, that is, every matched edge in P becomes unmatched and vice versa. Since P contains one more unmatched edge than matched edges, augmenting increases the size of the matching by one. An augmenting path is an alternating path that increases the weight of the matching when the matching is augmented with it.
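The definitions of cost(P) and of augmenting can be made concrete with a small sketch. The representation is an assumption made for illustration: the matching is stored as two mate arrays, and a path is a vertex list left, right, left, right, ... that starts at an unmatched left vertex and ends at an unmatched right vertex. The names `Matching`, `pathCost`, `augment` and `matchingWeight` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Weights = std::vector<std::vector<int>>;  // wt[left][right]

struct Matching {
    std::vector<int> mateOfLeft;   // right partner of each left vertex, or -1
    std::vector<int> mateOfRight;  // left partner of each right vertex, or -1
};

// Sum of the weights of the matched edges.
long long matchingWeight(const Weights& wt, const Matching& m) {
    long long total = 0;
    for (std::size_t l = 0; l < m.mateOfLeft.size(); ++l)
        if (m.mateOfLeft[l] != -1) total += wt[l][m.mateOfLeft[l]];
    return total;
}

// cost(P) = wt(P_M) - wt(Q_M). Edges at even positions (left -> right)
// are unmatched, edges at odd positions (right -> left) are matched.
long long pathCost(const Weights& wt, const std::vector<int>& path) {
    long long cost = 0;
    for (std::size_t t = 0; t + 1 < path.size(); ++t) {
        if (t % 2 == 0) cost += wt[path[t]][path[t + 1]];
        else            cost -= wt[path[t + 1]][path[t]];
    }
    return cost;
}

// Flip the path: every unmatched edge (left, right) on it becomes matched.
// Overwriting the mate entries implicitly unmatches the old matched edges.
void augment(Matching& m, const std::vector<int>& path) {
    assert(!path.empty() && path.size() % 2 == 0);
    for (std::size_t t = 0; t + 1 < path.size(); t += 2) {
        m.mateOfLeft[path[t]] = path[t + 1];
        m.mateOfRight[path[t + 1]] = path[t];
    }
}
```

For the weights {{5, 4}, {4, 1}} and the matching that pairs left 0 with right 0, the path left 1, right 0, left 0, right 1 has cost 4 - 5 + 4 = 3, and augmenting along it raises the matching weight from 5 to 8.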

Let's go back to the problem of finding M_{k+1} from M_k. We find the augmenting path P with the largest cost. Then we augment M_k with P to obtain M_{k+1}. From the definition of cost(P) it follows that wt(M_{k+1})=wt(M_k)+cost(P). It is easy to prove that M_{k+1} is indeed a maximum matching of size k+1. So we have to find the augmenting path with maximum cost. One possible way to do it is to collapse all unmatched vertices into two new vertices U_L and U_R. Then we direct all matched edges from left to right, retaining their weights, and direct all unmatched edges from right to left, assigning them their negated weights. Then we find the shortest path between U_L and U_R. Its length is the negated cost of the corresponding alternating path, so this path is the augmenting path with the largest cost. Because M_k is optimal, the resulting graph has no negative cycles, so we can use the Bellman-Ford algorithm, which runs in O(NM) for a graph with N vertices and M edges (here M is the number of edges, not a matching). This results in an O(N²M) algorithm overall.
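The shortest-path approach might look roughly like the sketch below. It is not the contest's reference solution, and it makes a few assumptions: the function name `maxWeightMatching` is hypothetical, the orientation is the mirror image of the one described above (unmatched edges run left to right with negated weight, matched edges run right to left with their weight), and instead of building U_L and U_R explicitly, all unmatched left vertices start at distance 0 and the best unmatched right vertex is chosen at the end.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Returns, for every left vertex, its partner in a maximum-weight matching
// that covers the whole left side. wt[l][r] is the edge weight; the graph
// is complete bipartite with at least as many right vertices as left ones.
std::vector<int> maxWeightMatching(const std::vector<std::vector<int>>& wt) {
    const int n1 = static_cast<int>(wt.size());
    const int n2 = n1 ? static_cast<int>(wt[0].size()) : 0;
    assert(n1 <= n2);
    const long long INF = std::numeric_limits<long long>::max() / 4;
    std::vector<int> mateL(n1, -1), mateR(n2, -1);

    for (int k = 0; k < n1; ++k) {  // grow M_k into M_{k+1}
        std::vector<long long> distL(n1, INF), distR(n2, INF);
        std::vector<int> prevR(n2, -1);  // left vertex before r on the path
        for (int l = 0; l < n1; ++l)
            if (mateL[l] == -1) distL[l] = 0;  // plays the role of U_L

        // Bellman-Ford: relax all edges until nothing changes.
        bool changed = true;
        for (int round = 0; round < n1 + n2 && changed; ++round) {
            changed = false;
            for (int l = 0; l < n1; ++l) {
                if (distL[l] == INF) continue;
                for (int r = 0; r < n2; ++r) {
                    if (mateL[l] == r) continue;  // matched edge: right -> left only
                    if (distL[l] - wt[l][r] < distR[r]) {
                        distR[r] = distL[l] - wt[l][r];
                        prevR[r] = l;
                        changed = true;
                    }
                }
            }
            for (int r = 0; r < n2; ++r) {
                const int l = mateR[r];
                if (l == -1 || distR[r] == INF) continue;
                if (distR[r] + wt[l][r] < distL[l]) {
                    distL[l] = distR[r] + wt[l][r];
                    changed = true;
                }
            }
        }

        // The closest unmatched right vertex ends the best augmenting path.
        int best = -1;
        for (int r = 0; r < n2; ++r)
            if (mateR[r] == -1 && (best == -1 || distR[r] < distR[best])) best = r;
        assert(best != -1 && distR[best] < INF);

        // Walk the path backwards and flip its edges.
        for (int r = best; r != -1;) {
            const int l = prevR[r];
            const int next = mateL[l];  // old partner of l, or -1 at the start
            mateL[l] = r;
            mateR[r] = l;
            r = next;
        }
    }
    return mateL;
}
```

For the weights {{5, 4}, {4, 1}} the first round matches left 0 with right 0 (weight 5). The second round finds the path left 1, right 0, left 0, right 1 and flips it, which gives the matching of weight 8.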

The second approach, which is used in the example solution, works as follows. We proceed in iterations numbered i = 0, 1, ..., k. In the i-th iteration we find, for every vertex w_r in one of the partitions, the largest cost of an alternating path between w_r and some unmatched vertex in the second partition, considering only paths of length 2i+1 and shorter. Let's call it c_i(w_r). No augmenting path can be longer than 2k+1, because it contains at most k matched edges. The 0-th iteration is easy: for every vertex w_r in one of the partitions we find the unmatched neighbour connected to it by an edge of maximum weight. The i-th iteration works as follows. For every w_r we check whether there exists a matched w_s in the same partition such that c_{i-1}(w_s)+wt(w_r,mate(w_s))-wt(w_s,mate(w_s)) > c_{i-1}(w_r), where mate(w_s) denotes the matched neighbour of vertex w_s. In other words, there is a more expensive alternating path for w_r that passes through mate(w_s) and w_s. If so, we update c_i(w_r) accordingly. Finally we choose the unmatched w_r whose augmenting path has the largest cost. Every iteration runs in O(N²), there are at most N iterations for every M_k, and 0<=k<=N, so our algorithm runs in O(N⁴).
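One round of this recurrence can be sketched as follows. The sketch only computes the values c_i and does not record the paths or perform the augmentation. The names `baseCosts` and `relaxOnce`, the choice of the row side of the weight matrix as the partition of the w vertices, and the sentinel `NEG_INF` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Weights = std::vector<std::vector<int>>;  // wt[w][u]
const long long NEG_INF = -(1LL << 60);         // "no path known yet"

// c_0(w_r): the heaviest edge from w_r to an unmatched vertex u
// of the other partition (mateOfOther[u] == -1 means u is unmatched).
std::vector<long long> baseCosts(const Weights& wt,
                                 const std::vector<int>& mateOfOther) {
    std::vector<long long> c(wt.size(), NEG_INF);
    for (std::size_t r = 0; r < wt.size(); ++r)
        for (std::size_t u = 0; u < mateOfOther.size(); ++u)
            if (mateOfOther[u] == -1)
                c[r] = std::max<long long>(c[r], wt[r][u]);
    return c;
}

// c_i from c_{i-1}: w_r may step to mate(w_s) over an unmatched edge,
// walk back to w_s over the matched edge, and continue with the best
// path already known for w_s.
std::vector<long long> relaxOnce(const Weights& wt,
                                 const std::vector<int>& mateOfW,
                                 const std::vector<long long>& prev) {
    assert(prev.size() == wt.size() && mateOfW.size() == wt.size());
    std::vector<long long> cur = prev;
    for (std::size_t r = 0; r < wt.size(); ++r) {
        for (std::size_t s = 0; s < wt.size(); ++s) {
            if (s == r || mateOfW[s] == -1 || prev[s] == NEG_INF) continue;
            const int m = mateOfW[s];  // mate(w_s)
            cur[r] = std::max(cur[r], prev[s] + wt[r][m] - wt[s][m]);
        }
    }
    return cur;
}
```

For the weights {{5, 4}, {4, 1}} with w_0 matched to vertex 0 of the other side, the base values are c_0 = (4, 1). One round gives c_1 = (4, 3), because w_1 can go through mate(w_0) for 1 + ... replaced by 4 + 4 - 5 = 3. The unmatched vertex w_1 therefore has an augmenting path of cost 3.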

Internet Problem Solving Contest
