## Counterexamples to Good's Theorem

Good (1967) famously "proved" that the expected utility of an informed decision is always at least as great as the expected utility of an uninformed decision. The conclusion is clearly false. Let's have a look at the proof and its presuppositions.

Suppose you can either perform one of the acts A_{1}…A_{n} now, or learn the answer to some question E and afterwards perform one of A_{1}…A_{n}. Good argues that the second option is always at least as good as the first. The supposed proof goes as follows.

We assume a Savage-style formulation of decision theory. For any act A_{j}, the expected utility of choosing A_{j} is

\[ (1)\quad EU(A_{j}) = \sum_{i} Cr(S_{i})\, U(A_{j} \land S_{i}), \]

where S_{i} ranges over a suitable partition of states. The value of choosing A_{j} after learning E_{k} (an answer to E) is

\[ (2)\quad EU(A_{j}/E_{k}) = \sum_{i} Cr(S_{i}/E_{k})\, U(A_{j} \land S_{i} \land E_{k}). \]

Here we assume that the truth of E_{k} does not affect the value of choosing A_{j} in state S_{i}, so that U(A_{j} ∧ S_{i} ∧ E_{k}) = U(A_{j} ∧ S_{i}).

Since Cr(S_{i}) = ∑_{k} Cr(S_{i}/E_{k}) Cr(E_{k}), we can rewrite (1) as

\[ (3)\quad EU(A_{j}) = \sum_{k} Cr(E_{k}) \sum_{i} Cr(S_{i}/E_{k})\, U(A_{j} \land S_{i}) = \sum_{k} Cr(E_{k})\, EU(A_{j}/E_{k}). \]

If you choose between A_{1}…A_{n} now, you will choose an act A_{j} that maximises EU. Thus, where 'N' stands for choosing between A_{1}…A_{n} now,

\[ (4)\quad EU(N) = \max_{j} \sum_{k} Cr(E_{k})\, EU(A_{j}/E_{k}). \]

What if instead you choose to first learn ('L') the answer to E? Let's assume that you will afterwards choose an act that maximises your posterior expected utility. You don't know which act that is, but we can compute its expected value by averaging over the possible learning events:

\[ (5)\quad EU(L) = \sum_{k} Cr(E_{k}) \max_{j} EU(A_{j}/E_{k}). \]

Now compare (4) and (5). EU(N) is the maximum of a weighted average; EU(L) is the corresponding weighted average of the maxima. An average of maxima is always at least as great as the maximum of the averages. QED.
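To see the inequality at work, here is a toy computation (the numbers are made up; Python is used only for bookkeeping). With two acts and two possible answers, (4) takes the maximum of the row averages, while (5) averages the column maxima:

```python
# Toy illustration of the step from (4) to (5). All numbers are made up.

# Cr(E_k): credence in each possible answer to E.
cr_E = [0.5, 0.5]

# eu[j][k] = EU(A_j / E_k): expected utility of act A_j given answer E_k.
eu = [
    [10, 0],   # A_1 pays off only if E_1 is the true answer
    [0, 10],   # A_2 pays off only if E_2 is the true answer
]

# (4): choosing now means maximising the weighted average over answers.
eu_N = max(sum(p * u for p, u in zip(cr_E, row)) for row in eu)

# (5): learning first means averaging the per-answer maxima.
eu_L = sum(p * max(row[k] for row in eu) for k, p in enumerate(cr_E))

print(eu_N, eu_L)  # 5.0 10.0 -- here learning looks strictly better
assert eu_L >= eu_N
```

With these numbers the information is maximally useful, so the gap is large; with acts whose ranking is the same under every answer, (4) and (5) coincide.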

As noted, the argument assumes that the answer to E is certain to make no difference to the value of performing any act A_{j} in any state S_{i}, so that U(A_{j} ∧ S_{i} ∧ E_{k}) = U(A_{j} ∧ S_{i}). The argument also assumes that if you choose L then your future self is certain to follow the standard norms of Bayesian rationality: she updates by conditionalisation and maximises expected utility. Let's grant these assumptions.

Even so, the conclusion is not always true. Four counterexamples:

**Crime Novel.** You have a choice between reading a crime novel and reading a biography. You prefer the crime novel because you like the suspense. Before you make your choice, you have the option of finding out who the villain in the crime novel is (by reading a plot summary), which would spoil the novel for you. After getting the information, you would rather read the biography. You rationally prefer the uninformed choice between the two books over the more informed choice. (Adapted from Bradley and Steele (2016).)

**Rain.** You have a choice between taking a black box and taking a white box. Before you make your choice, you may look through the window and check whether it is raining. A reliable predictor has put $1 into the black box and $0 into the white box if she predicted that you would look outside the window. If she predicted that you would not look outside the window, she has put $2 into the white box and $0 into the black box. You are 50% confident that you were predicted to look outside the window. You rationally prefer the uninformed choice.

Here is why (assuming CDT). If you don't look outside the window, you will take the white box, with expected payoff $1, compared to $0.50 for the black box. If you do look outside the window, you will become confident that you were predicted to look outside the window; as a result, you will take the black box. From your current 50/50 perspective, that choice has an expected payoff of only $0.50.
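The arithmetic can be spelled out in a short sketch (the helper names are mine, not part of the scenario):

```python
# Bookkeeping for the Rain case. The states are the two possible
# predictions, each with prior credence 0.5. Looking does not causally
# affect the boxes, so CDT evaluates payoffs with the prior credence.

cr_predicted_look = 0.5

# Box contents under each prediction: (black, white).
contents = {"look": (1, 0), "no_look": (0, 2)}

def eu_no_look():
    # Without looking you keep the 50/50 credence and take the better box.
    black = 0.5 * contents["look"][0] + 0.5 * contents["no_look"][0]
    white = 0.5 * contents["look"][1] + 0.5 * contents["no_look"][1]
    return max(black, white)  # white ($1) beats black ($0.50)

def eu_look():
    # After looking you take the black box, which -- from your current
    # standpoint -- contains $1 only if the prediction really was "look".
    return cr_predicted_look * contents["look"][0]

print(eu_no_look(), eu_look())  # 1.0 0.5
```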

**Middle Knowledge.** Once again you have a choice between taking a black box and taking a white box. A psychologist has figured out what you are disposed to do in this kind of choice situation, where you have no evidence that favours one of the boxes over the other. She has put $1 into whichever box she thinks you would take, and $0 into the other box. Unbeknownst to the psychologist, I have observed what she put into the boxes. I slip you a piece of paper on which I claim to have written down the colour of the box with the $1. You are 90% confident that what I have written down is true. You rationally prefer not to read my note.

Here is why (assuming CDT). If you don't read my note, you can expect to get $1. If you do read my note, you will take the box whose colour is written on the note. Since there's a 90% chance that this box contains $1, the expected payoff is $0.90.
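The same kind of sketch works here (again, the helper names are mine):

```python
# The Middle Knowledge arithmetic, assuming CDT.

cr_note_true = 0.9   # your credence that the note names the $1 box

def eu_ignore_note():
    # You take whichever box you were disposed to take anyway; the
    # psychologist put the $1 exactly there.
    return 1.0

def eu_read_note():
    # You take the box named on the note; it holds the $1 iff the note
    # is accurate, which it is with probability 0.9.
    return cr_note_true * 1.0

print(eu_ignore_note(), eu_read_note())  # 1.0 0.9
```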

**Newcomb Revelation.** You are facing the standard Newcomb Problem. Before you make your choice, you have the option of looking inside the opaque box. The predictor knew that you would be given this offer, and has factored your response into her prediction. EDT says that you should reject the offer.

This is only a counterexample to Good's Theorem if we assume EDT. But like CDT, EDT can be formulated in Savage's framework. We only have to stipulate that the states in a properly formulated decision problem are probabilistically independent of the acts. In Newcomb's Problem, a suitable partition of states is { prediction accurate, prediction inaccurate }. Good's "proof" does not seem to rely on a causal construal of the states.

To be fair, one might argue that this case violates the assumption that U(A_{j} ∧ S_{i} ∧ E_{k}) equals U(A_{j} ∧ S_{i}). But this isn't the only problem. Consider equation (5) in the above proof.

In Newcomb Revelation, this says that EU(L) = Cr(full) EU(two-box/full) + Cr(empty) EU(two-box/empty), assuming that conditional on either observation, two-boxing maximises expected utility. But suppose you (as the agent in Newcomb Revelation) are convinced that you will one-box and that you will reject the offer to look inside the opaque box. So Cr(full) is close to 1. And evidently EU(two-box/full) is $1,001,000. By (5), the expected utility of looking inside the opaque box is therefore close to $1,001,000. That's clearly wrong.
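With stylised numbers, the misfire is easy to exhibit (the 0.99 credence is my illustrative stand-in for "Cr(full) is close to 1"):

```python
# How equation (5) goes wrong in Newcomb Revelation.

cr_full = 0.99           # credence that the opaque box contains the $1M
M, K = 1_000_000, 1_000  # opaque box contents (if full) and clear box

# Conditional on either observation, two-boxing maximises posterior EU.
eu_twobox_given_full = M + K    # $1,001,000
eu_twobox_given_empty = K       # $1,000

# (5) averages these per-answer maxima with your *current* credences.
eu_L_by_eq5 = cr_full * eu_twobox_given_full + (1 - cr_full) * eu_twobox_given_empty

# ~$991,000 -- absurd: if you peek, the predictor will have foreseen it,
# the box will be empty, and you will walk away with only $1,000.
print(round(eu_L_by_eq5))  # 991000
```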

More generally, equation (5) in the above proof simply isn't an application of Savage-style decision theory. It is a hand-wavy shortcut.

Skyrms (1990) argues that one can patch up Good's proof if one assumes that the states are causal dependency hypotheses, but his argument still looks hand-wavy to me, and the other counterexamples suggest that it is fallacious.

Let's see how far we can get if we use a suppositional formulation of decision theory.

Let { O_{i} } be a partition of "value-level propositions", as in Lewis (1981). Intuitively, the members of this partition settle everything the agent ultimately cares about. In suppositional formulations of decision theory, the expected utility of an act A is given by

\[ EU(A) = \sum_{i} Cr^{A}(O_{i})\, V(O_{i}), \]

where Cr^{A}(O_{i}) is the probability of O_{i} on the supposition A. The relevant type of supposition might be "indicative" (yielding EDT) or "subjunctive" (yielding CDT).
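For concreteness, here is a minimal sketch of how the schema plays out in Newcomb's Problem (the 0.9 predictor accuracy and 0.5 prior are my illustrative assumptions). EDT and CDT share the formula and differ only in how the supposition shifts the credence over the box contents:

```python
# Suppositional expected utility, instantiated for Newcomb's Problem.
# EDT reads Cr^A as conditioning on the act; CDT's subjunctive
# supposition leaves the causally independent box contents unchanged.

M, K = 1_000_000, 1_000  # opaque box contents (if full) and clear box

def eu(cr_full_on_A, payoff_full, payoff_empty):
    # EU(A) = sum over outcomes of Cr^A(outcome) * V(outcome), with the
    # outcomes "opaque box full" and "opaque box empty".
    return cr_full_on_A * payoff_full + (1 - cr_full_on_A) * payoff_empty

# EDT: indicative supposition = conditioning on the act.
edt_onebox = eu(0.9, M, 0)        # Cr(full / one-box) = 0.9
edt_twobox = eu(0.1, M + K, K)    # Cr(full / two-box) = 0.1

# CDT: the contents are fixed, so both acts keep the 0.5 prior.
cdt_onebox = eu(0.5, M, 0)
cdt_twobox = eu(0.5, M + K, K)

assert edt_onebox > edt_twobox    # EDT recommends one-boxing
assert cdt_twobox > cdt_onebox    # CDT recommends two-boxing
```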

Now let's evaluate the two options. First, you might choose directly between A_{1}…A_{n}, without first learning the answer to E. This is the option we called N. We assume that your future self will choose an option that maximises expected utility, and that your basic (uncentred) desires don't change. Let's also assume that these assumptions are resilient under suppositions. Thus we can assume that on the supposition that you choose N you will afterwards choose an option from A_{1}…A_{n} that maximises (posterior) expected utility. This suggests that the expected utility of N is

\[ (1')\quad EU(N) = \max_{j} EU^{N}(A_{j}), \]

where EU^{N}(A_{j}) is the expected utility of A_{j} computed relative to Cr^{N}. I'll return to this assumption below. Let's stick with it for now. By definition,

\[ (2')\quad EU^{N}(A_{j}) = \sum_{i} (Cr^{N})^{A_{j}}(O_{i})\, V(O_{i}). \]

Since (Cr^{N})^{A_{j}}(O_{i}) = ∑_{k} (Cr^{N})^{A_{j}}(O_{i}/E_{k})(Cr^{N})^{A_{j}}(E_{k}), we can expand (2') into

\[ (3')\quad EU^{N}(A_{j}) = \sum_{k} (Cr^{N})^{A_{j}}(E_{k})\, EU^{N}(A_{j}/E_{k}), \]

where EU^{N}(A_{j}/E_{k}) is defined as ∑_{i} (Cr^{N})^{A_{j}}(O_{i}/E_{k}) V(O_{i}). Plugging this into (1'), we have

\[ (4')\quad EU(N) = \max_{j} \sum_{k} (Cr^{N})^{A_{j}}(E_{k})\, EU^{N}(A_{j}/E_{k}). \]

Alternatively, you might delay your choice between A_{1}…A_{n} until after you've learned the answer to E. To begin, we have

\[ (5')\quad EU(L) = \sum_{i} Cr^{L}(O_{i})\, V(O_{i}). \]

As before, probability theory allows expanding and rearranging:

\[\begin{align*} (6')\quad EU(L) &= \sum_{i}\sum_{k} Cr^{L}(O_{i}/E_{k}) Cr^{L}(E_{k}) V(O_{i})\\ &= \sum_{k} Cr^{L}(E_{k}) \sum_{i} Cr^{L}(O_{i}/E_{k}) V(O_{i}). \end{align*} \]

∑_{i} Cr^{L}(O_{i}/E_{k}) V(O_{i}) is the "desirability" of E_{k} from the perspective of Cr^{L}, understood as in Jeffrey (1983). We assume that Cr^{L} is concentrated on worlds at which you are going to choose an act from A_{1}…A_{n} that maximises expected utility after learning the true answer to E. We also assume that the answer to E is all you learn. The desirability of E_{k} from the perspective of Cr^{L} then equals max_{j} EU^{L}(A_{j}/E_{k}), where EU^{L}(A_{j}/E_{k}) is the expected utility of A_{j} computed relative to the probability function (Cr^{L})_{E_{k}} that comes from Cr by first supposing L and then conditioning on E_{k}. Plugging this into (6') yields

\[ (7')\quad EU(L) = \sum_{k} Cr^{L}(E_{k}) \max_{j} EU^{L}(A_{j}/E_{k}). \]

(4') and (7') resemble (4) and (5) in Good's proof. (4') is the maximum of an average, (7') is the average of some maxima. But the subscripts and superscripts are different. To infer that L is at least as good as N, we need the following two assumptions:

\[\begin{align*} (i) \quad&(Cr^{N})^{A_{j}}(E_{k}) = Cr^{L}(E_{k}), \text{ for all }A_{j}, E_{k}.\\ (ii) \quad&(Cr^{N})^{A_{j}}(O_{i}/E_{k}) = ((Cr^{L})_{E_{k}})^{A_{j}}(O_{i}), \text{ for all }O_{i}, A_{j}, E_{k}. \end{align*} \]

Continuing to use superscripts for supposition and subscripts for conditioning, we can rewrite (ii) as

\[ (ii')\quad ((Cr^{N})^{A_{j}})_{E_{k}}(O_{i}) = ((Cr^{L})_{E_{k}})^{A_{j}}(O_{i}), \text{ for all }O_{i}, A_{j}, E_{k}. \]

In EDT, the relevant kind of supposition is conditioning, so we can simplify:

\[\begin{align*} (i_{E})\quad &Cr(E_{k} / N \land A_{j}) = Cr(E_{k} / L),\text{ for all }A_{j}, E_{k}.\\ (ii_{E})\quad &Cr(O_{i} / N \land A_{j} \land E_{k}) = Cr(O_{i} / L \land A_{j} \land E_{k}),\text{ for all }O_{i}, A_{j}, E_{k}. \end{align*} \]

Condition (i_{E}) is violated in Newcomb Revelation. Here the probability of the opaque box being empty is low conditional on one-boxing without peeking, but it is high conditional on peeking.

CDT does not allow simplifying the two conditions, at least not without further assumptions.

(i) is fairly easy to understand. It says that the probability of the various answers E_{k} does not "causally" depend on your choice(s). This is violated in the Rain scenario.

(ii) is hard to understand. In normal cases, however, the order of the operations will make little difference. So we can approximately paraphrase (ii) as follows:

> You are as likely to get a certain amount of utility by choosing A_{j} after finding out E_{k} as by choosing A_{j} without finding out E_{k}.

(Here 'without finding out E_{k}' is meant to imply, as it does in English, that E_{k} is true.) This condition is obviously violated in the Crime Novel case.

Unfortunately, my "proof" still relies on some further assumptions, besides the assumptions of diachronic rationality.

One assumption was smuggled into (1'):

\[ (1')\quad EU(N) = \max_{j} EU^{N}(A_{j}). \]

In effect, this assumes that \( EU(N) = EU(N \land \hat{A}) \), where \( \hat{A} \) is an act that maximises expected utility on the supposition that N. Without this assumption, I don't know how to get the proof off the ground. In EDT, the assumption is harmless, but in CDT it can fail. It fails in Middle Knowledge.

Another problematic assumption in both my "proof" and in Good's is that the possible propositions you might learn form a partition. To see why this matters, return to the Crime Novel scenario.

Let's construe the relevant states as somewhat coarse-grained "dependency hypotheses". If you plan not to learn about the plot, then most of your credence goes to a state S_{1} in which the act of reading the crime novel would bring about a highly desirable experience while the act of reading the biography would bring about a moderately desirable experience. If E_{k} is a summary of the crime novel's plot, then most of your credence conditional on E_{k} still goes to S_{1}. Your enjoyment depends on not *knowing* the villain, not on who the villain *is*. So Cr(S_{1}/E_{k}) is high, for all relevant E_{k}. After you've learned E_{k}, however, Cr(S_{1}) is low. You no longer believe that reading the crime novel would be a great experience.

Since Cr(S_{1}/E_{k}) is not equal to Cr(S_{1}), finding out about the plot is not adequately modelled as conditioning on E_{k}. The problem is that if you find out about the plot, you not only learn E_{k}, but also that you know E_{k}. It is this knowledge (or belief) that breaks the connection between reading the crime novel and having a great experience. Conditional on knowing or believing E_{k}, your credence in S_{1} is low.

Since we want to model your learning event in terms of conditioning, we have to make sure that the propositions { E_{k} } include everything you might learn if you chose L. In the Crime Novel case, each member of { E_{k} } should specify (a) a plot and (b) that you believe that this is the plot. But then { E_{k} } no longer forms a partition. Every element of { E_{k} } now implies that you won't enjoy the novel because you think you already know the villain's identity.

There is nothing special here about the Crime Novel case. In realistic cases, the answers to E will *never* form a partition, if we assume that learning the answer goes by conditioning on the answer.

Bradley, Seamus, and Katie Steele. 2016. "Can Free Evidence Be Bad? Value of Information for the Imprecise Probabilist." *Philosophy of Science* 83 (1): 1–28.

Good, I. J. 1967. "On the Principle of Total Evidence." *The British Journal for the Philosophy of Science* 17 (4): 319–21.

Jeffrey, Richard C. 1983. *The Logic of Decision*. 2nd ed. Chicago: University of Chicago Press.

Lewis, David. 1981. "Causal Decision Theory." *Australasian Journal of Philosophy* 59: 5–30.

Skyrms, Brian. 1990. *The Dynamics of Rational Deliberation*. Cambridge, MA: Harvard University Press.
