Is there a dynamic argument for expected utility maximisation?
Why should you maximize expected utility? A well-known answer – discussed, for example, in McClennen (1990), Cubitt (1996), and Gustafsson (2022) – goes as follows.
There's something odd about how people usually discuss iterated prisoner dilemmas (and other such games).
Let's say you and I each have two options: "cooperate" and "defect". If we both cooperate, we get $10 each; if we both defect, we get $5 each; if only one of us cooperates, the cooperator gets $0 and the defector $15.
This game might be called a monetary prisoner dilemma, because it has the structure of a prisoner dilemma if utility is measured by monetary payoff. But that's not how utility is usually understood.
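To make the structure concrete, here is a minimal sketch in Python that encodes the dollar payoffs above and checks the two features that make the game a prisoner dilemma when utility is simply equated with money. That equation is exactly the assumption being questioned here, and the names `PAYOFFS`, `strictly_dominates` and `pareto_improves` are illustrative only.

```python
# Monetary payoffs in dollars as (row player, column player),
# keyed by (row player's move, column player's move).
PAYOFFS = {
    ("cooperate", "cooperate"): (10, 10),
    ("cooperate", "defect"): (0, 15),
    ("defect", "cooperate"): (15, 0),
    ("defect", "defect"): (5, 5),
}
MOVES = ("cooperate", "defect")


def strictly_dominates(a, b):
    """True if move a pays the row player strictly more than move b,
    whatever the other player does."""
    return all(PAYOFFS[(a, other)][0] > PAYOFFS[(b, other)][0] for other in MOVES)


def pareto_improves(x, y):
    """True if outcome x pays both players strictly more than outcome y."""
    return all(px > py for px, py in zip(PAYOFFS[x], PAYOFFS[y]))


# Assumption: utility = dollars. Under that assumption, defecting dominates,
# yet mutual cooperation is better for both than mutual defection.
print(strictly_dominates("defect", "cooperate"))
print(pareto_improves(("cooperate", "cooperate"), ("defect", "defect")))
```

Both checks come out true for these numbers. If the players care about anything besides their own monetary payoff, the utilities differ from the dollar amounts and the checks would have to be rerun on the actual utilities.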
Suppose you prefer $100 today to $105 tomorrow. You also prefer $105 in 11 days to $100 in 10 days. During the next 10 days, your basic preferences don't change, so that at the end of that period (on day 10), you still prefer $100 now (on day 10) to $105 the next day. Your future self then disagrees with your earlier self about whether it's better to get $100 on day 10 or $105 on day 11.
In economics jargon, your preferences are called time inconsistent. Time inconsistency is supposed to be a failure of ideal rationality.
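One familiar way in which such a disagreement between earlier and later selves can arise is hyperbolic discounting. The sketch below assumes that model purely for illustration: the discount function, the rate `k = 0.07`, and the function names are not taken from the text. It compares a smaller reward on day 10 with a larger reward on day 11, first from the perspective of day 0 and then from the perspective of day 10.

```python
def discounted_value(amount, delay_days, k=0.07):
    """Present value under hyperbolic discounting (an assumed model):
    amount / (1 + k * delay)."""
    return amount / (1 + k * delay_days)


def prefers_sooner(now, sooner=(100, 10), later=(105, 11), k=0.07):
    """True if, evaluated on day `now`, the sooner reward is valued more
    highly than the later one. Rewards are (amount, delivery day) pairs."""
    (amount_s, day_s), (amount_l, day_l) = sooner, later
    value_s = discounted_value(amount_s, day_s - now, k)
    value_l = discounted_value(amount_l, day_l - now, k)
    return value_s > value_l


print(prefers_sooner(now=0))   # False: from day 0, $105 on day 11 looks better
print(prefers_sooner(now=10))  # True: on day 10, $100 right away looks better
```

The discount function never changes between day 0 and day 10, yet the ranking of the two rewards flips. With exponential discounting, where each extra day of delay multiplies the value by a constant factor, the ranking would be the same from every vantage point.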
Luc Bovens and Wlodek Rabinowicz (2010 and 2011) present the following puzzle:
Three people are each given a hat to put on in the dark. The hats' colours, either black or white, have been decided by three independent tosses of a fair coin. Then the light goes on and everyone can see the hats of the two others, but not their own. All of this is common knowledge in the group.
Let's call the three players X, Y and Z. There are eight possible distributions of hat colours, each with probability 1/8:
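The eight distributions can be generated mechanically. The following is a minimal sketch in Python, with variable names chosen for illustration, that lists every assignment of hat colours to X, Y and Z together with its probability under three independent fair coin tosses.

```python
from fractions import Fraction
from itertools import product

COLOURS = ("black", "white")
PLAYERS = ("X", "Y", "Z")

# Every way of assigning one of two colours to each of the three players.
distributions = [dict(zip(PLAYERS, combo)) for combo in product(COLOURS, repeat=3)]

# Independent fair tosses make all assignments equally likely.
probability = Fraction(1, len(distributions))

for distribution in distributions:
    print(distribution, probability)
```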
Professor Procrastinate has to make an important phone call. The call is long overdue because Procrastinate has been playing Farmville all week. The problem is that Procrastinate values current pleasure higher than future pleasure. So when he applies his decision theory, he finds that it is better to play some more Farmville now and make the phone call later instead of making the call now: it doesn't matter much whether the call is delayed by a few more hours, and this way the immediate future will be much more pleasant.