## A plan you shouldn't follow (even if you think you will)

Here is a case where a plan maximises expected utility, you are sure that you are going to follow the plan, and yet the plan tells you to do things that don't maximise expected utility.

*Middle Knowledge.* In front of you are two doors. If you go through the left door, you come into a room with a single transparent box containing $7. If you go through the right door, you come into a room with two opaque boxes, one black, one white. Your first choice is which door to take. Then you have to choose exactly one box from the room in which you find yourself. A psychologist has figured out which box you would take if you found yourself in the room with the two boxes. She has put $10 into the box she thinks you would take, and $0 into the other.

Suppose you are confident that you'll choose the left door and take the $7. Suppose also that you don't know what the psychologist knows: which of the two boxes you would have taken if you had gone through the other door. Your evidence clearly wouldn't favour one of the boxes over the other. We may assume that some sub-personal mechanism would have broken the tie and made you reach for one of the boxes. The mechanism may well be deterministic, its outcome sensitive to apparently irrelevant environmental circumstances such as the temperature. The psychologist knows how the mechanism works, and she knows what the relevant circumstances in the room are. That's how she knows what you would do.

Now, given all this, we can evaluate the possible plans. Your plan to first choose the left door and then take the transparent box has expected (and certain) payoff $7. How about the alternative plan to first choose the right door and then take the black box? Well, you don't know what's in the black box. You're 50% sure that it contains $10 and 50% sure that it contains $0. The content of the box is settled. So if you were to go through the door on the right and take the black box, you might get $10 or you might get $0, with equal probability. The expected payoff is $5. Same for taking the white box.

$7 is better than $5. Your plan to choose the left door maximises expected utility (assuming utility = monetary payoff).
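The plan-level comparison can be checked with a short script. The 50/50 probabilities over the contents of each opaque box come from the agent's evidence, as described above:

```python
# Expected payoff of each plan, from the agent's evidential perspective.
# The agent has no clue which opaque box the psychologist filled, so
# each contents hypothesis gets probability 0.5.

plans = {
    "left, take transparent box": [(1.0, 7)],           # certain $7
    "right, take black box":      [(0.5, 10), (0.5, 0)],
    "right, take white box":      [(0.5, 10), (0.5, 0)],
}

for plan, outcomes in plans.items():
    ev = sum(p * payoff for p, payoff in outcomes)
    print(f"{plan}: EV = ${ev:g}")
```

Running this confirms the numbers in the text: $7 for the left-door plan, $5 for either right-door plan.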

But here you are, standing in front of the two doors. If you go through the one on the left, you'll get $7. What would happen if instead you went through the door on the right? This is not a question we have considered. We've only looked at the more specific questions of what would happen if you went through the door on the right and took this or that box. If you went through the door on the right, your sub-personal mechanism would select one of the boxes. You don't know which box it would select, but the psychologist knows. She has put $10 into the box it would select. Thus, if you went through the door on the right, you would almost certainly get $10. So the first move in your plan does not maximise expected utility.

This is an example of what I called a "bizarre" case in this recent post. It illustrates the failure of the '(DC4)' principle, according to which, if a plan P maximises expected utility conditional on P, then each of its acts maximises expected utility conditional on P.

It is important to the example that the relevant plan is not a best equilibrium in the decision problem over the plans. I wonder if that is true for all counterexamples to (DC4).

An easy way to block any counterexamples of this kind would be to allow for unspecific plans. If we allow for a plan that merely says that you go through the right-hand door, without settling which box to choose, then your plan to go through the left-hand door and take the $7 no longer maximises expected utility.
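The act-level calculation behind this argument can be sketched as follows. The `accuracy` parameter is an assumption for illustration: it stands for the (near-1) probability that the psychologist correctly identified which box the tie-breaking mechanism would select.

```python
# Act-level expected payoff of the first move, taking the psychologist's
# middle knowledge into account. If you go right, whichever box the
# mechanism selects is (almost certainly) the one she filled with $10.

def ev_go_left():
    return 7.0  # certain $7 behind the left door

def ev_go_right(accuracy):
    # With probability `accuracy` you take the box she filled ($10);
    # otherwise you take the empty one ($0).
    return accuracy * 10 + (1 - accuracy) * 0

print(ev_go_left())       # 7.0
print(ev_go_right(0.99))  # ~9.9: going right beats going left
```

The contrast with the previous calculation is the whole point: evaluated as part of a specific plan, going right is worth $5; evaluated as an act, given what the psychologist knows, it is worth nearly $10.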

# on 09 October 2021, 13:42

I think any plan that is part of a Nash equilibrium but not part of a subgame perfect equilibrium will violate (DC4). But that's a different constraint to being the equilibrium with the highest payout.

The Wikipedia page on subgame perfect equilibrium is good on this. I think the second example is the kind of case that is relevant here, assuming ‘you’ is player 2.

# on 09 October 2021, 17:59

Thanks Brian -- I should have clarified that I'm only looking at plans in single-player games with perfect recall and stable preferences. I don't think the example from the Wikipedia page can be transferred to this setting. But it's a good idea to think about the issue in terms of subgame perfect equilibria.

# on 09 October 2021, 18:10

I'm not sure what it means to call these examples involving predictors 'single-player'. The predictor makes choices, has a utility function (1 for correct predictions, 0 for incorrect), and generally can be modelled as a player. Harper (1985) I think makes this point. In any case, a game is a decision problem where the external world includes other agents, so decision theory should apply there.

So here's one I've been thinking about a bit. (It does crucially involve predictor having a more complicated utility function than 1 for correct prediction, 0 for incorrect.)

Human and predictor (who has all the normal properties of a Newcomb predictor) are playing the following two-stage game. At stage 1, predictor will choose to opt in or opt out. If predictor opts out, predictor gets $2, and human gets $1000. If they opt in, human and predictor simultaneously choose A or B. If they both choose A, both get $3. If they both choose B, both get $1. If they choose differently, they both get $0.
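The structure of this game can be laid out in a small table, with the stage-2 equilibria checked mechanically. This is just a sketch of the game as described above, not part of the original comment:

```python
# Payoffs in the two-stage game, written as (human, predictor).

OPT_OUT = {"human": 1000, "predictor": 2}  # stage-1 opt-out outcome

STAGE2 = {  # (human's choice, predictor's choice) -> (human, predictor)
    ("A", "A"): (3, 3),
    ("B", "B"): (1, 1),
    ("A", "B"): (0, 0),
    ("B", "A"): (0, 0),
}

def is_equilibrium(h, p):
    """True if neither player gains by unilaterally switching in stage 2."""
    u_h, u_p = STAGE2[(h, p)]
    no_h_dev = all(STAGE2[(h2, p)][0] <= u_h for h2 in "AB")
    no_p_dev = all(STAGE2[(h, p2)][1] <= u_p for p2 in "AB")
    return no_h_dev and no_p_dev

for (h, p), (u_h, u_p) in STAGE2.items():
    if is_equilibrium(h, p):
        print(f"({h},{p}) is a stage-2 equilibrium: human {u_h}, predictor {u_p}")
```

Both (A,A) and (B,B) come out as equilibria of the subgame, which is what drives the conflict described below: human's best overall outcome ($1000) requires the predictor to opt out, which the predictor will only do if human would play B, while the subgame's best equilibrium for both players is (A,A).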

The optimal equilibrium for human is planning to play B, and having predictor opt out. But if predictor opts in (as I think they probably should), the optimal equilibrium is for both to play A. So I think 'choose the optimal equilibrium' violates (DC4) in this case. Whether that's bad news for (DC4) or for 'choose the optimal equilibrium' is a further problem.

# on 09 October 2021, 19:33

Interesting! This case violates my restriction that the agent receives no information except through learning about her own acts. If human doesn't find out whether the predictor opted in, I think it's not implausible that she ought to choose B.