CDT for reflective agents (EDC, ch.3)

Chapter 3 of Evidence, Decision and Causality is called "Causalist objections to CDT". It addresses arguments suggesting that while there is an important connection between causation and rational choice, that connection is not adequately spelled out by CDT.

Arif discusses two such arguments. One is due to Hugh Mellor, who rejects the very idea of evaluating choices by the lights of the agent's beliefs and desires. I'll skip over this part because I agree with Arif's response.

The other argument is more important, because it touches on an easily overlooked connection between rational choice and rational credence.

Consider the "Psycho Button" case from Egan (2007).

In front of you is a button. If you press it, all psychopaths will be killed. You would like all psychopaths to be killed, unless you yourself are a psychopath. You are highly confident that you are not a psychopath, and that only a psychopath would push the button. Should you push it?

We assume that whether or not you're a psychopath is outside your current control. You wouldn't become a psychopath by pushing the button. CDT then seems to suggest that you should push the button, since that would most likely lead to a desirable outcome.

But is that really what CDT suggests? There are two problems.

First, suppose pushing the button is indeed the rational choice. And suppose you know that you're rational. Then you know you're going to push the button. And then you can't be confident that you're not a psychopath, since you believe that only a psychopath would push the button!

From the perspective of CDT, it looks like you could only find yourself in the Psycho Button scenario if you believed yourself to be irrational. But it's not obvious that norms of ideal rationality apply straightforwardly to agents who believe that they are irrational. (If you are infallible at prime factorisation, and just computed some prime factors, how confident should you be in the result if you falsely believe that you often get this kind of computation wrong?)

Here's the second problem. Suppose you somehow find yourself in the described scenario and decide to push the button. At this point, you can hardly still be confident that you're not a psychopath, given your belief that only a psychopath would push the button. Deciding to push the button provides strong evidence that you're a psychopath, and thereby that pushing the button is the wrong choice (by the lights of CDT).

One can perhaps imagine an agent who has no reflective access to their state of deliberation – they only learn about what they will do by observing their own behaviour. Real agents are not like that. Plausibly, ideally rational agents are not like that either.

When we deal with "reflective" agents whose beliefs are sensitive to their state of deliberation, we can't simply apply CDT with arbitrarily stipulated initial beliefs. According to CDT, a rational agent in the Psycho Button situation could not decide to push the button, because by the time she had made that decision, not pushing the button would have much greater expected utility.

Nor could she decide against pushing the button. If you decide against pushing the button, you should remain confident that you're not a psychopath, and then it would be better to push the button. The more you're inclined towards one of the options, the more you should be inclined towards the other option. We have what's known as an "unstable" decision problem.

What should an agent do in an unstable decision problem?

One natural idea is that she ought to be in a state of indecision. On this approach, the primary aim of decision theory is not to recommend an act, but to recommend a state of decision or indecision. In almost every realistic decision situation, CDT will recommend a "pure" decision state in which the agent has resolved to perform a certain act. In that case, we can also say that CDT recommends the act. But in pathological cases like Psycho Button, CDT recommends an "impure" state in which the agent is undecided between some acts.

Being undecided goes along with a specific credence about what the agent will end up doing. In a rational state of indecision, all acts between which the agent is undecided must appear equally attractive.

How should we understand a state of indecision? On one view, it is a distinctive psychological state in which the ultimate act is selected by a chancy sub-personal process. (This is the view I assumed in Schwarz (2015).) On another view, there is nothing more to a state of indecision than the agent's uncertain beliefs about what she will do. (Arntzenius (2008) seems to favour this kind of view.)

Another tricky question concerns decision problems with multiple "equilibria" – multiple stable deliberation states at which the agent is not drawn towards an alternative choice. Are all such states rationally acceptable? Or only ones the agent could reach through a certain process of deliberation (perhaps as described in Skyrms (1990))? Or only the "best" ones, by some measure of goodness? I prefer the last option.

I might return to these details later if they become relevant. For now, back to chapter 3 of Evidence, Decision and Causality.

Arif here presents the Psycho Button case and claims that standard CDT recommends pressing the button. In some superficial sense this may be true. But CDT does not say that a rational agent who finds herself in the Psycho Button case would decide to push the button. Rather, the agent should become undecided about what to do, and unsure about whether she is a psychopath.

Arif mentions this kind of response, but presents it as an alternative to CDT. He also argues that it is untenable. The main argument (on pp.61-65) turns on a comparison of three scenarios.

The first scenario is Psycho Button. Call the button in this scenario 'button A'.

The second scenario is like Psycho Button, except that you don't think that only a psychopath would push the button. Also, there is a small fee for pushing the button. Call the button in this scenario 'button B'.

In the third scenario, you have a choice between pushing button A and button B. (Pushing A would again be evidence that you're a psychopath. Pushing B would not, and comes with a small fee.)

The second scenario is easy: everyone can agree that you should push the button. The third scenario has the form of a Newcomb Problem. Any sensible form of CDT will say that you should push button A. But if pushing button B is better than doing nothing, and pushing button A is better than pushing button B, then, Arif argues, pushing button A must be better than doing nothing. And that's the choice you have in Psycho Button. In short, by saying that pushing A is better than pushing B in the third scenario, CDT is implicitly committed to saying that pushing A is better than doing nothing in the first.

The argument assumes that the value of pushing the buttons is independent of the decision problem in which they figure. That's false.

Let's have a look at the third scenario. Here is the decision matrix, with precise numbers to make it more concrete.

              psycho    ¬psycho
push A         -90        10
push B         -91         9

Whether you're a psychopath or not, pushing A is better than pushing B. According to CDT, you should therefore push A.
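The dominance claim can be checked mechanically. Here is a minimal Python sketch; the dictionary layout and the function names are my own illustration, not anything from the book. It confirms that A beats B in each state, and that for any credence in being a psychopath the expected-utility gap is just the fee.

```python
from fractions import Fraction

# Utilities from the third-scenario matrix (rows: acts, columns: states).
UTILITY = {
    "push A": {"psycho": -90, "not psycho": 10},
    "push B": {"psycho": -91, "not psycho": 9},
}

def strictly_dominates(act, other):
    """True if `act` is better than `other` in every state."""
    return all(UTILITY[act][s] > UTILITY[other][s] for s in UTILITY[act])

def expected_utility(act, p_psycho):
    """Expected utility of `act`, given credence `p_psycho` in being a psychopath."""
    row = UTILITY[act]
    return p_psycho * row["psycho"] + (1 - p_psycho) * row["not psycho"]

print(strictly_dominates("push A", "push B"))  # True

# The gap EU(A) - EU(B) is the same for every credence: the fee of 1.
for p in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(p, expected_utility("push A", p) - expected_utility("push B", p))
```

Because the gap does not depend on the credence, no amount of evidence about your own psychopathy can make B look better than A by CDT's lights.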

We might pause to ask how pushing A could be evidence that you're a psychopath, while pushing B is not (as the scenario assumes). Since you have a forced choice between A and B, the psychopaths will be killed either way. The only question is whether you want to additionally pay a small fee. Why would reluctance to pay the fee be an indication of psychopathy? To make sense of the scenario, we have to assume that there is a strong correlation between being a psychopath and following CDT.

Anyway, if you find yourself in this scenario, then (according to CDT) you should be (or become) confident that you are a psychopath and that you will push button A. By doing so, you can expect to realise -90 units of utility.

Compare the decision matrix for the first scenario, Psycho Button:

              psycho    ¬psycho
push A         -90        10
do nothing       0         0

Here you should be (or become) undecided between the two options. More precisely, you should be 90% inclined towards doing nothing and 10% inclined towards pushing the button. This is the unique equilibrium state. In that state, you are 90% confident that you're not a psychopath. Pushing button A therefore has an expected utility of 0.
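To make the arithmetic behind the 90/10 split explicit, here is a minimal Python sketch. It rests on a simplifying assumption that I'm adding for illustration: your credence that you're a psychopath equals your inclination p towards pushing, so that all and only psychopaths push. The function names are hypothetical.

```python
from fractions import Fraction

# Utilities from the Psycho Button matrix.
U_PUSH = {"psycho": Fraction(-90), "not psycho": Fraction(10)}
U_NOTHING = {"psycho": Fraction(0), "not psycho": Fraction(0)}

def expected_utility(utilities, p_psycho):
    """Expected utility of an act, given credence `p_psycho` in being a psychopath."""
    return p_psycho * utilities["psycho"] + (1 - p_psycho) * utilities["not psycho"]

def equilibrium_push_inclination():
    """Solve EU(push) = EU(nothing) for p, assuming credence in psycho = p.

    The gap EU(push) - EU(nothing) is linear in p:
    it is `gap_at_0` when p = 0 and `gap_at_1` when p = 1.
    """
    gap_at_0 = U_PUSH["not psycho"] - U_NOTHING["not psycho"]  # 10
    gap_at_1 = U_PUSH["psycho"] - U_NOTHING["psycho"]          # -90
    return gap_at_0 / (gap_at_0 - gap_at_1)

p = equilibrium_push_inclination()
print(p)                            # 1/10
print(expected_utility(U_PUSH, p))  # 0
```

At p = 1/10 the two acts have the same expected utility. For smaller p, pushing looks better; for larger p, doing nothing looks better. That is why no other inclination is stable.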

So, the value of pushing button A is not the same in the two scenarios. That you prefer A over B in the third scenario, and B over nothing in the second, therefore doesn't imply that you must prefer A to nothing in the first. This only follows if we hold fixed your beliefs – if we don't allow the decision situation you face to influence your beliefs about what you will do, and thereby about whether you are a psychopath.

The lesson is that if we don't want to say that you should push the button in Psycho Button then we should assume that your reflective beliefs about whether you will choose a certain act may depend on what other options are available.

It's odd to even call this an assumption. Of course your beliefs about whether you will choose an act should depend on whether a better alternative is available. In EDT, this effect can be ignored, because EDT makes what you should do independent of what you believe you will do. But the effect is there nonetheless. In CDT, it sometimes makes a big difference.

(What, Arif asks, would happen in a choice between all three options – pushing A, pushing B, and doing nothing? Answer: You should be 90% inclined towards doing nothing and 10% inclined towards pushing A.)
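The three-option answer can be checked with a short Python sketch. It assumes, for illustration only, that your credence in being a psychopath equals your inclination towards pushing A, since pushing B is not evidence of psychopathy. It counts a state as an equilibrium if every act you are at all inclined towards has maximal expected utility. The names are hypothetical.

```python
from fractions import Fraction

# Utilities for the combined choice between A, B and doing nothing.
UTILITY = {
    "push A": {"psycho": -90, "not psycho": 10},
    "push B": {"psycho": -91, "not psycho": 9},
    "do nothing": {"psycho": 0, "not psycho": 0},
}

def expected_utilities(inclination):
    """Expected utility of each act in a given deliberation state."""
    # Assumption: credence in being a psychopath = inclination to push A.
    p_psycho = inclination["push A"]
    return {
        act: p_psycho * row["psycho"] + (1 - p_psycho) * row["not psycho"]
        for act, row in UTILITY.items()
    }

def is_equilibrium(inclination):
    """True if every act with positive weight has maximal expected utility."""
    eu = expected_utilities(inclination)
    best = max(eu.values())
    return all(eu[act] == best for act, weight in inclination.items() if weight > 0)

state = {
    "push A": Fraction(1, 10),
    "push B": Fraction(0),
    "do nothing": Fraction(9, 10),
}
print(is_equilibrium(state))  # True
```

In this state, pushing A and doing nothing both have expected utility 0, while pushing B has -1, which is why B gets no weight.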

On pp.69-73, Arif comments on Arntzenius's presentation of CDT (in Arntzenius (2008)), which correctly takes into account the interaction between rational choice and rational credence.

Arif announces two objections. However, the second objection is not really an objection to the theory, but to a certain money-pump argument that Arntzenius tentatively presents in support of his theory. I don't really care about that argument, so we're left with the other objection.

This objection (on pp.69f.) is that there is something wrong with what Arntzenius says should happen in Psycho Button. According to Arntzenius (and according to any other sensible version of CDT, I would think), you should be 90% confident that you will not press the button, in which case pushing and not pushing have equal expected utility. How, Arif asks, could you be fairly confident that you'll do one thing if you are perfectly indifferent between that and the alternative? Wouldn't this mean that you have given up your agency?

I'm not too concerned about this worry. Consider a simple "Buridan's Ass" case where you are indifferent between two equally appealing stacks of hay. In that case, it seems unproblematic to me that you could rationally resolve to choose one of the stacks, without giving up your agency. Having resolved to choose stack A, you will be confident that you are going to choose that stack even though you are perfectly indifferent between stack A and stack B.

In addition, perhaps we should think of indecision states in unstable decision problems as involving a kind of "reduced agency". On the model I suggested in Schwarz (2015), the eventual act will be chosen by a sub-personal process. The idea is that since you can't rationally decide in favour of either option, something else in your cognitive system must make the choice. On a personal level (i.e., on a level on which you are responsive to reasons), you have full control over your state of decision or indecision, but only probabilistic control over your eventual body movement.

Arntzenius, Frank. 2008. “No Regrets, or: Edith Piaf Revamps Decision Theory.” Erkenntnis 68: 277–97.
Egan, Andy. 2007. “Some Counterexamples to Causal Decision Theory.” Philosophical Review 116: 93–114.
Schwarz, Wolfgang. 2015. “Lost Memories and Useless Coins: Revisiting the Absentminded Driver.” Synthese 192 (9): 3011–36.
Skyrms, Brian. 1990. The Dynamics of Rational Deliberation. Cambridge (Mass.): Harvard University Press.

