## Is it ever rational to calculate expected utilities?

Decision theory says that faced with a number of options, one should choose an option that maximizes expected utility. It does not say that before making one's choice, one should calculate and compare the expected utility of each option. In fact, if calculations are costly, decision theory seems to say that one should never calculate expected utilities.

Informally, the argument goes as follows. Suppose an agent faces a
choice between a number of *straight* options (going left, going
right, taking an umbrella, etc.), as well as the option of calculating
the expected utility of all straight options and then executing
whichever straight option was found to have greatest expected
utility. Now that straight option (whichever it is) could also be taken
directly. And if calculations are costly, taking the option directly
has greater expected utility than taking it as a result of the
calculation.

Let's fill in the details by working through an example, "without loss of generality".

There are two straight options, Left and Right. In addition, there's the option Calculate which eventually leads to either Left or Right, depending on which of these is found to maximize expected utility. (If they are tied, let's stipulate that Calculate leads to Left.) In a non-trivial decision problem, the outcome of choosing Left and of choosing Right depends on the state of the world. We assume there are two relevant states, Bright and Dark. The expected utility of the straight options is given by (1) and (2). Here I've abbreviated Bright and Dark as 'B' and 'D', respectively; 'BL' denotes the outcome that results from going Left in a Bright world (similarly for 'DL', 'BR', and 'DR'). It will be useful to think of outcomes as collections of features the agent cares about.

(1) EU(Left) = P(B) U(BL) + P(D) U(DL).

(2) EU(Right) = P(B) U(BR) + P(D) U(DR).

What if Left is taken as a result of Calculate? In principle, completely different outcomes could come about. For example, there might be a state of the world in which the agent gets richly rewarded if she goes Left as a result of Calculating, but punished if she goes Left without Calculating. But clearly that's not the kind of case we're interested in, and it's not a case where calculating expected utilities is (known to be) costly. We're interested in cases where the outcomes that may result from going Left as a result of Calculate coincide with those that may result from going Left directly except for one feature in which they are worse, reflecting the cost of calculation.

So Calculate can lead to four possible outcomes which coincide with the four outcomes in (1) and (2) except for one respect in which they are worse. I'll therefore abbreviate these outcomes as 'BL-', 'DL-', 'BR-', and 'DR-', respectively. Thus BL- is the outcome of going Left in a Bright world as a result of Calculating, which is somewhat worse than BL (going Left in a Bright world without Calculating).

Let's keep track of the present assumption about the intrinsic cost of Calculating:

(3) U(BL-) < U(BL); U(DL-) < U(DL); U(BR-) < U(BR); U(DR-) < U(DR).

We have four possible outcomes because the result of Calculate depends not only on whether the world is Bright or Dark (B or D), but also on whether Calculate leads to Left or to Right (CL or CR). So we need to extend our state space from { B, D } to the product of { B, D } with { CL, CR }. Then:

(4) EU(Calculate) = P(B & CL)U(BL-) + P(D & CL)U(DL-) + P(B & CR)U(BR-) + P(D & CR)U(DR-).

Now we need another assumption, namely that the immediate result of Calculate (going Left or going Right) is probabilistically independent of whether the world is Bright or Dark:

(5) P(B & CL) = P(B)P(CL); P(B & CR) = P(B)P(CR); P(D & CL) = P(D)P(CL); P(D & CR) = P(D)P(CR).

This may not always be the case, but the exceptions seem highly unusual. After all, our agent knows that the immediate result of Calculate is not sensitive to the external state of the world: it is fixed by her own probabilities and utilities.
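To see what work assumption (5) is doing, here is a small Python sketch of a deliberately artificial case in which it fails. All numbers are hypothetical, and the flat `cost` deducted from every outcome is just one simple way of satisfying (3). The joint credences make the result of Calculate perfectly correlated with the state of the world, which is exactly the kind of sensitivity our agent has reason to rule out.

```python
# Hypothetical joint credences over {B, D} x {CL, CR} that violate (5):
# the result of Calculate is perfectly correlated with the state of the world.
joint = {("B", "CL"): 0.5, ("D", "CL"): 0.0, ("B", "CR"): 0.0, ("D", "CR"): 0.5}

# Hypothetical utilities: Left is good in Bright, Right is good in Dark.
utility = {"BL": 1.0, "DL": 0.0, "BR": 0.0, "DR": 1.0}
cost = 0.1  # assumed flat cost of calculating, one way of satisfying (3)

# Marginal probabilities of the world states.
p_bright = joint[("B", "CL")] + joint[("B", "CR")]
p_dark = 1 - p_bright

eu_left = p_bright * utility["BL"] + p_dark * utility["DL"]   # as in (1)
eu_right = p_bright * utility["BR"] + p_dark * utility["DR"]  # as in (2)

# EU(Calculate) as in (4), without assuming independence.
eu_calculate = (
    joint[("B", "CL")] * (utility["BL"] - cost)
    + joint[("D", "CL")] * (utility["DL"] - cost)
    + joint[("B", "CR")] * (utility["BR"] - cost)
    + joint[("D", "CR")] * (utility["DR"] - cost)
)

print(eu_left, eu_right, eu_calculate)
```

With these made-up credences, Calculate comes out ahead of both straight options despite its cost, because its result carries information about the world. The independence assumption rules that out.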

Using (5), we can rearrange (4) as (6).

(6) EU(Calculate) = P(CL)[P(B)U(BL-) + P(D)U(DL-)] + P(CR)[P(B)U(BR-) + P(D)U(DR-)].

So EU(Calculate) is a mixture of EU(Left) and EU(Right), except that each term is made worse by the cost of calculation, as per (3). Since P(CL) and P(CR) sum to 1, a mixture of EU(Left) and EU(Right) can't exceed the greater of the two, and by (3) EU(Calculate) falls strictly below that mixture. As a result, EU(Calculate) is always less than EU(Left) or EU(Right) or both. So -- with the possible exception of cases where assumption (5) fails -- Calculate is never a rational option. QED.
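The argument can be checked numerically. The following Python sketch computes (1), (2), and (6) for some made-up numbers and confirms that Calculate falls short of the best straight option whatever credence the agent gives to CL. The utilities, the probability of Bright, and the flat `cost` (one simple way of satisfying (3)) are all assumptions chosen for illustration; the function names are mine.

```python
def expected_utility(p_bright, u_bright, u_dark):
    """EU of a straight option, as in (1) and (2)."""
    return p_bright * u_bright + (1 - p_bright) * u_dark


def eu_calculate(p_bright, p_cl, utilities, cost):
    """EU(Calculate) as in (6), assuming independence (5).

    Assumption: calculating deducts a flat `cost` > 0 from every outcome,
    which is one simple way of satisfying (3).
    """
    eu_left_minus = expected_utility(
        p_bright, utilities["BL"] - cost, utilities["DL"] - cost
    )
    eu_right_minus = expected_utility(
        p_bright, utilities["BR"] - cost, utilities["DR"] - cost
    )
    return p_cl * eu_left_minus + (1 - p_cl) * eu_right_minus


# Hypothetical numbers, chosen only for illustration.
utilities = {"BL": 10.0, "DL": 0.0, "BR": 2.0, "DR": 6.0}
p_bright = 0.3
cost = 0.5

eu_left = expected_utility(p_bright, utilities["BL"], utilities["DL"])
eu_right = expected_utility(p_bright, utilities["BR"], utilities["DR"])
best_straight = max(eu_left, eu_right)

# Try credences in CL from 0 to 1 in steps of 0.1.
calculate_values = [
    eu_calculate(p_bright, k / 10, utilities, cost) for k in range(11)
]
always_worse = all(v < best_straight for v in calculate_values)

print(eu_left, eu_right, always_worse)
```

Nothing hinges on the particular numbers: any utilities satisfying (3) and any credences satisfying (5) give the same verdict.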

What shall we make of this strange result? Here are two lines of response, not necessarily exclusive.

First, perhaps it's wrong to model the agent's options as Left,
Right, and Calculate. Instead, we should distinguish between genuine
*act options*, Left and Right, and *process options* such as
Calculate. Calculate is a process option because it's a possible way
of reaching a decision between the act options. Alternative process
options are, for example: trusting one's instincts, or calculating
which option has the best worst-case outcome and then going ahead with
that option. Arguably you can't go Left without choosing any process
option at all. You have to either follow your instinct, calculate
expected utility, or use some other process. So it's wrong to compare
Calculate with Left and Right. We should rather compare Calculate with
other process options like trusting your instinct. Doing that, we'd
probably get the intuitive result that it's sometimes rational to
calculate expected utilities (to varying levels of precision), and
sometimes to trust one's instincts.

The main problem with this line of response (I think) is that it's far from clear that one can't choose Left without first choosing a process for choosing between Left and Right. For how does one choose a process? By first choosing a process for choosing a process? The regress this starts is clearly absurd: when we make a decision, we don't go through an infinite sequence of choosing processes for choosing processes etc. And if the regress can stop at one level, why can't it also stop at the level before? Why can't one simply choose Left, without choosing any process for choosing between Left and Right?

That's not just a theoretical worry. When you come to a fork in the road, it really seems that you can do three things (among others): go left, go right, or sit down and calculate the expected utilities. Each of these is a genuine option. Of course, whatever you end up doing, there will be a psychological explanation of why you did it. Perhaps you did it out of habit, or out of instinct, or as the result of some further computation. But that's equally true for all three options. So I'm not convinced by the first line of response, although I'm also not convinced it can't be rescued.

Here's the second line of response. Calculating expected utilities is a form of a priori (mathematical) reasoning, and there's a well-known problem of making sense of such reasoning in the standard model of Bayesian agents.

More concretely, consider what the agent in the above example should believe about CL and CR. If she knows her own probabilities and utilities, and she knows (as we can assume) that Calculate would lead to choosing an option with greatest expected utility (or to Left in case of ties), then she must also know either that Calculate would lead to Left or that Calculate would lead to Right, for this follows from what she knows and probability 1 is closed under logical consequence. And of course you shouldn't sit down and go through a costly calculation if you already know the result! From a strict Bayesian perspective, a priori reasoning is always a waste of time because the result is always already known.

When we think about whether an agent should calculate expected utilities, the agent we have in mind does not already know the answer. That seems to leave two possibilities: either the agent does not know her own probabilities and utilities, or she is not probabilistically coherent. But if the agent doesn't know her probabilities and utilities, it is unclear how calculating expected utilities is supposed to help. Moreover, intuitively the kind of agent we have in mind need not be uncertain about her own beliefs and basic desires. So it would seem that she must be probabilistically incoherent. But if we're dealing with incoherent agents, it's no longer clear that expected utility maximization is the right standard of choice. We can't assume that the agent should calculate expected utilities iff doing so would maximize expected utility.

The general point is that when we think about whether it's rational to calculate expected utilities, we have implicitly left behind the domain of perfect Bayesian rationality and turned to bounded rationality. Contrary to widespread thought, perfect Bayesian agents don't always calculate expected utility. They never calculate anything, because they already know the result. Before we can say what agents with bounded rationality should do -- including whether and when they should calculate expected utilities -- we need a good model of such agents.

Have you seen Joe Halpern's work on these issues? The papers about "costly computation" and "resource-bounded agents" on his papers page are, I think, relevant.