The easy subtask could be solved really easily. The key is to realize that the change in rules doesn’t actually change anything. Regardless of which door Monty opens, he is essentially telling the contestant: “If your initial choice was wrong, this other unopened door is where the prize is.” Hence, always switch is still the optimal strategy, and the win probability is 2/3: you lose if and only if your initial door choice was correct.
In the general case this is no longer true, Monty’s tiredness can be a useful source of information. For example, imagine that Monty is always tired and there are 10 doors. If you start by choosing door 1 and then Monty opens door 3, it is certain that the prize is behind door 2 (and not door 1, and not any of the doors 4 through 10).
Luckily, the optimal strategy can still be found easily. We can simply compute all the probabilities and use them to make the optimal decision in any state of the game.
We are interested in the following conditional probabilities: Given that I chose door a and then Monty opened door b, what is the probability that the prize is behind door c?
We can compute these probabilities directly from their definition: as the actual probability of that event, divided by the probability that Monty will open door b in any situation. And the denominator is simply a sum over whether Monty is tired or not, and over all doors that can contain the prize.
Once we know the conditional probabilities described above, the optimal strategy follows. First, for a fixed door a we can, for each b, evaluate the probability that Monty opens b and if he does, we find the door c that is most likely to contain the prize and switch to that door. In this way, we can compute our probability of winning if we start by choosing door a. Finally, we do that for each possible a and pick the best one. See our sample implementation for more details.
Of course, instead of letting a program solve the game for us we can also solve it on paper. Let’s see what we can derive about it.
First of all, the initial door choice still doesn’t actually matter. The situation after our initial choice always looks the same: there is one chosen door and an ordered sequence of doors that weren’t chosen. Thus, without loss of generality, we will assume that we chose door d.
There are now three possibilities for what Monty will do:
He may open door 1. Regardless of whether he did so because he was tired or because he chose to do so at random, the only information we get is that door 1 doesn’t contain the prize. All other doors (2 through d − 1) are equally likely to contain the prize, and that probability is now greater than 1/d so we should switch to any of them.
He may open door with a number greater than 2. This means that the door had to be chosen at random, and again we know nothing about the other doors. Thus, again the optimal strategy is to switch to any other door.
He may open door number 2. This is the interesting part, because it may be one of two distinct cases: either he chose it at random, or he is tired and he was forced to choose it because door 1 contains the prize. If the second case has a nonzero probability, it skews the probability in favor of door 1. Hence, the optimal strategy in this case is to switch to door 1.
We may therefore summarize one optimal strategy as follows: Start by choosing door d. Once Monty opens some other door, switch to the unopened door with the smallest number.
And from this summary we can easily compute our win probability: we win if either the prize is in door 1 (probability 1/d) or if the prize is in door 2 and Monty opens door 1 (probability 1 if he is tired and 1/(d − 2) if he is not).
Thus, the final answer is
$$ \frac{1}{d} + \frac{p}{d} + \frac{1-p}{d(d-2)} $$