First of all, we should realize that there is no need for an adaptive strategy. In other words, we don’t have to see the “outcome” of day 1 before determining the optimal order on day 2. This is because there isn’t actually any hidden outcome. We know what we’ll get as soon as we make the order. For example, if the first day order is (1, 1, 2, 3, 4), we know that after we get the meals:
Thus, the entire optimal strategy can be determined in advance, as a sequence of ℓ fixed orders.
How can we tell whether a sequence of orders solves our problem? First of all, each dish must have been ordered at least once. Second, we need to be able to tell each pair of dishes apart. In other words, for each pair of dishes there has to be a day when we ordered different amounts of the two dishes.
This observation alone turns out to be sufficient to solve the easy subproblem. The constraints are small enough, so if you try a sufficient amount of random orders, you are virtually certain to discover one that’s optimal – there are always many of them.
Of course, we can also solve the easy subproblem in a deterministic way: by traversing the entire state space. Imagine any situation where you already spent some days on making orders. How does the information on the dishes look like? The dishes are divided into some equivalence classes. Each equivalence class is a set of dishes we still cannot tell apart – because so far we ordered the same quantity of each of them on each day. For example, after three days there will be one (possibly empty) equivalence class of dishes such that you ordered once on day 1, not at all on day 2, and three times on day 3.
Each division of dishes into equivalence classes represents one possible state. And for each state we want to answer the same question: “What is the smallest number of additional days we need to get from this state to the end of the game?”
This can be done using recursion with memoization. To solve a specific state that is not the final state, try all possible orders for the next day. Each of them will bring you into some new state. Solve each of those recursively, and then your answer is 1 + the minimum of their solutions.
There are several things to note here. First, when processing a given state we do not actually need to know the sequence of orders that led to the state. All the information is already contained in the division of dishes into equivalence classes.
Second, we do not actually care about which dish belongs to which equivalence class. The answer for the state “we still cannot tell dish 1 from dish 2 and we still cannot tell the dishes 3, 4, 5 apart” is clearly the same as the answer for the state where dishes 1 and 4 are still equivalent and dishes 2, 3, and 5 are still equivalent. Once we get rid of this symmetry, we see that our states correspond to integer partitions: different ways of writing a given positive integer (in our case, d) as a sum of positive integers given in non-descending order.
Finally, there is one technicality we need to pay attention to: one of our equivalence classes is actually special. This is the equivalence class of all dishes that have never been ordered. We need to keep track of its size separately, and we need to make sure that in the final state this class is empty.
One good thing to know about integer partitions: even though their number grows exponentially, the base of that exponential function is quite small, and therefore the partition numbers are surprisingly small as well. With 20 dishes, and even with the need to keep one of the partitions separate from the others, we are only looking at a few thousand states.
One slight issue is the number of transitions: the number of possible orders is pd, so we need to bring that under control as well. How can we do that?
One trick that can be used in similar situations: increase the number of states! Instead of constructing and evaluating the entire order for a day at once, we can take the old state and process it incrementally: one equivalence class of dishes at a time. When processing an equivalence class of dishes, we decide how many of the people we have will order dishes from this class, and then we try all distinct possibilites – which can now, again, be seen as integer partitions. (This time, we are partitioning the people we have into groups that each orders a different dish. Hence, we are only interested in partitions that have at most as many elements as the number of dishes in our current equivalence class.)
Hence, in the new solution the general state will consist of four parts: some equivalence classes after the current day (those we already processed), some equivalence classes before the current day (waiting to be processed), the one special equivalence class (dishes that weren’t ever ordered; this one is also waiting to be processed), and the number of people who still have to make their order for this day.
This change will increase the number of states to about 150 000 for the largest test case, but now we know that the number of transitions from each state is upper-bounded by the number of integer partitions of people, and that makes the solution fast enough.