This problem introduced our version of Penney’s Game. An interesting feature of this game is that it is nontransitive. What does that mean? Intuitively, one might expect that there is an ordering of all possible patterns from the “best” one to the “worst” one, and that the optimal strategy for Yvonne is to simply pick the “best” pattern. This is not the case. Starting from k = 3, for any pattern there are other patterns that beat it – in other words, regardless of what Yvonne choses, Zara can always choose a pattern that is designed to beat Yvonne’s specific pattern. Thus, in Penney’s game for k ≥ 3 Zara wins more than 50 percent of all games if she plays optimally.
Of course, our version of the game mixes stuff up a bit. There are two main differences between our game and the original. First, our game is finite: we will run out of cards eventually. Second, choosing a sequence removes some cards from the deck, which influences the probabilities.
The example annotation in the problem statement already contains an important hint: a good strategy for Zara. Regardless of what sequence Yvonne chooses, Zara always has the option to choose the complement of that sequence. (The cards to do that have to be available – do you see why?) And if she does so, she is guaranteed a winning probability of exactly 0.5. Thus, for any valid k there are only two possibilities – either the game is a tie, or Zara has an even better strategy that tilts the game in her favor.
We already know what happens for small k. For k = 1 and k = 2 the game is a tie. For k = 3 in the original game Zara wins 2/3 of all games if both girls play optimally, and we may expect a very similar result here – after both girls choose their sequences, there are still many cards left in the deck and the ratio of reds to blacks is roughly 1:1, so this won’t skew the probability by a lot. Thus, we may expect that for smallish k Zara should win.
What happens for largeish values of k? The easy thing to note is k ≥ 18. Already for k = 18 after both girls choose their sequences there will only be 16 cards left in the deck, which isn’t enough to construct either of the two chosen sequences. Thus, for k ≥ 18 the game will surely end with a coin flip, and therefore it is a tie.
IF you got this far, this was a good place to solve the subproblem L1 by taking an educated guess that Zara wins for 3 ≤ k ≤ 17. And if this doesn’t work, we can always start from k = 17 downwards, change Zara’s win to a tie and resubmit. This strategy would indeed solve L1, albeit with one bad submission. It turns out that the case k = 17 is still a tie.
We will now show that k = 17 is still a tie if both girls play optimally. One of the optimal choices for Yvonne is the sequence “9 times red, then 8 times black”.
Let’s examine Zara’s options:
If she chooses any sequence with 9 or more red cards, there will be too few red cards left in the deck for either sequence to appear. Hence, the coin flip will decide.
If she chooses any sequence with 10 or more black cards, there will be too few black cards left in the deck for Zara’s sequence to appear, so her probability of winning has to be at most 50%.
The only remaining option is some sequence with 8 red and 9 black cards. What happens then? The deck will contain 9 red and 9 black cards, so there are $18\choose 9$ equally likely possible orders of the deck. Two of these contain an occurrence of Yvonne’s sequence and two contain an occurrence of Zara’s sequence. If those four deck orders are all distinct, the situation is clearly symmetric and each girl wins with probability 50%.
For some pairs (Yvonne’s sequence, Zara’s sequence) it will be the case that some order of the 18 cards in the deck will contain both sequences – cards #1 through #17 will form one of the sequences, cards #2 through #18 will form the other sequence. In that case, the winner will be the girl whose sequence appears sooner, and this will skew the overall probability in her favor.
For Yvonne’s sequence we mentioned above (9 times red, then 8 times black) this can happen in only one way: the way where Yvonne wins. (If you place Yvonne’s sequence as cards #2 through #18 in the deck, there will still be 9 red cards among the first 17, and therefore it cannot be Zara’s sequence.)
Hence, if Yvonne chooses the sequence mentioned above, Zara only has a choice between some patterns that win her the game with probability 50% and some patterns that are even worse than that. And this means that (assuming both girls play optimally) the game is a tie for k = 17.
OK, that was as far as we can reasonably go without writing any code. Let’s now take a look at evaluating a single game. That is, we ask the following question: given k and the patterns chosen by Yvonne and Zara, what is the probability that Yvonne wins the game (assuming optimal play)?
How can we answer this question? Obviously, by going through all possible deck orders one at a time. There are 52 − 2k cards in the deck, so there are roughly 252 − 2k possible deck orders. The actual number is a bit smaller, as the number of red and black cards in the deck is fixed, but this is still a useful estimate. Anything close to 246 for k = 3 is more than we can reasonably afford.
How can we answer the question we have in an efficient way? By realizing that during those 252 − 2k many situations are often repeated, and therefore we can use dynamic programming to speed up the previous solution. One possible observation is that during the game all we need to know are the last k cards that were already played (or fewer if the game is just starting) and the number of red and black cards left in the deck. This gives us only something like 2k ⋅ 262 states to examine – which is much better for small k.
However, there is also a much more compact way to represent the state. For example, imagine that Yvonne has the sequence RBRB and Zara has the sequence BBRR. Assume that the last seven cards played from the deck were RRRBBRB. What do we really need to remember about this sequence? Yvonne doesn’t care about the first five of those cards, as those clearly won’t help her form an occurrence of her pattern. The only thing Yvonne currently cares about is that the last two cards (RB) can be extended to form an occurrence of her pattern. Similarly, Zara doesn’t care about the first six cards played from the deck – only the last B can be used by her.
This observation brings us to a much more compact representation of the state during a game: in addition to the number of red and black cards left in the deck, all we need to remember is the longest suffix of the sequence of already played cards that is also a prefix of one of the girls’ sequences.
And as there are at most 2k + 1 distinct prefixes (we only have two patterns, each of length k), this leaves us with at most (2k + 1)⋅262 different states of the game – which is very small for any valid value of k. Thus, we can now take any two patterns of red and black cards and very quickly compute the probability that Yvonne wins.
In order to compute this probability as an exact fraction, just note that we can represent it as (the number of games Yvonne wins) / (the number of possible deck orders × the number of possible coin flips). For a deck with r red and b black cards the denominator is simply 2(r + b)!. We can then use the dynamic programming described above to compute the numerator of this fraction.
In order to find the result if both girls play optimally, we need to evaluate a simple minimax decision tree. This can be expressed in pseudocode as follows:
best_result_yvonne = 0
for each possible pattern Yvonne can choose:
best_result_zara = 1
for each possible pattern Zara can now choose:
pp = probability that Yvonne wins for those two patterns
best_result_zara = min( best_result_zara, pp )
best_result_yvonne = max( best_result_yvonne, best_result_zara )
return best_result_yvonne
In words: Given a particular choice by Yvonne, Zara always chooses the sequence that minimizes the probability that Yvonne wins. Given this information, Yvonne can evaluate each of her options and choose the one that maximizes her winning probability.
This can still be a lot of work: for example, for k = 13 there are exactly 226 games to evaluate, and this is still not the largest case. Thus, we need to optimize this search somehow in order to speed it up.
In our solution we do two improvements: pruning and reordering.
As for pruning, we do the following: For Zara we never look at patterns for which she cannot win (as there will be too few cards of a color left in the deck), as these leave Yvonne with at least 50% win probability.
As for reordering, we do two things. First, we keep all of Yvonne’s options in a priority queue. For each of them we remember the current winning probability and an index that tells us how many of Zara’s answers we already examined. Obviously, the priority queue is ordered by winning probability. In each step, we take the top option (which is the current candidate for being the best answer) and try another of Zara’s answers. When we do the search in this way, we can terminate it as soon as we finish testing the option that currently sits on the top of the priority queue.
The second improvement by reordering is that we “apply pretests”: in order to speed up the search, we start the processing of each of Yvonne’s patterns by testing it against a few possible options for Zara for which we guess that they may be good answers to this particular choice made by Yvonne.
In the infinite version of Penney’s Game the best answer for Zara is usually a sequence such that its suffix of length k − 1 is a prefix of Yvonne’s sequence – in other words, Zara chooses a sequence that can appear immediately before Yvonne’s sequence. Our version is distorted, but it turns out that Zara’s sequences that end in a prefix of Yvonne’s sequence are still among the best choices, so we’ll start by testing some such sequences first.