Let us assume that all ℓi are either integers or consist of lowercase English letter only. A label consisting of a mixture of digits and letters cannot be a valid label, and a label that begins with a 0
cannot be a valid label either. We can cut the sequence before the first such offending label before processing it in the ways described below.
For the easy subtask it was sufficient to try all reasonable possibilities (remember that k ≤ 2) and to pick the best one among them.
In other words, we can try the following cases:
Each time we gerate a sequence, we simply find its longest prefix that matches the input sequence.
The previous solution works but depending on the implementation and programming language used it can already be annoyingly slow. Luckily, there is a simple improvement that will speed up the search significantly.
Suppose we are in the case k = 1 and we found the smallest j such that the given label ℓj is a string. Then obviously d1 = j is the only possibility we need to try. Larger d1 won’t give us a better solution because its sequence will have the number j instead of the correct label ℓj in j-th position.
Then, in the case k = 2, this d1 is again the only possible d1 for the same reason. So instead of Θ(n2) pairs (d1, d2) we now only have to try Θ(n) possible d2 with this single d1.
Instead of applying brute force we will now try to deduce the di and si greedily. This solution will be a fairly straightforward generalization of our better solution for the easy subproblem.
We will go through the given labels from 1 to n. At the beginning, there are no pairs (dj, sj).
Each time we process a new label ℓi, there are two possibilities:
The label ℓi is an integer. In this case we need to check two things. First, i must not be a multiple of any of the dj we already have. Second, ℓi must actually be i, not some other integer. (Did you miss this on your first try?) If either check fails, we have no way of saving it, this is the end of the valid prefix.
The label ℓi is a string. In this case we look at the pairs (dj, sj) we already determined, and we use them to construct the (possibly empty) label ℓi′ we should see according to what we know. Now there are three cases:
This way we incrementally construct the only possible set of rules that corresponds to the prefix we already processed. Hence, when we get stuck, we can be certain that the solution we just found is optimal.
An alternate implementation of the same idea is that whenever you determine a new rule (di, si), you immediately look at all multiples of di. If the label of a multiple of di starts with si, you remove the prefix si and keep the rest of the label. And if it doesn’t, you cut off the input sequence at that point.