Or maybe we’re using the versions of the problems where the blackmailer is not entirely predictable and might still blackmail the functional decision theorist (but be more likely to blackmail the causal decision theorist), or where the Newcomb predictor is not a perfect predictor but only very likely to predict correctly, or where the other prisoner twin might be hit by a cosmic ray with low probability and not make the same decision as you. If so, situations where CDT does better than FDT are less likely than situations where FDT does better, so FDT still comes out ahead.
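To make this concrete, here is a quick expected-value sketch for the imperfect-predictor Newcomb case. The payoff conventions ($1,000 in the left box, a possible $1,000,000 in the right) come from the problem; the function names and the specific accuracy values are my own choices:

```python
# Expected winnings in the probabilistic Newcomb problem, where the
# predictor is correct with probability p rather than being perfect.

def ev_one_box(p, million=1_000_000):
    # One-boxers get the million exactly when the predictor guessed right.
    return p * million

def ev_two_box(p, thousand=1_000, million=1_000_000):
    # Two-boxers always get the thousand, plus the million in the case
    # where the predictor guessed wrong and filled the right box anyway.
    return thousand + (1 - p) * million

for p in (0.5, 0.51, 0.9, 0.99):
    print(p, ev_one_box(p), ev_two_box(p))
```

On these numbers, one-boxing pulls ahead as soon as the predictor is right with probability greater than 0.5005, i.e. for any predictor that is even slightly better than chance relative to the stakes.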

Let’s assume that we’re using the deterministic version of each of these problems, rather than the probabilistic version: the blackmailer is guaranteed to know what decision theory you use and to act accordingly, the Newcomb predictor is guaranteed to predict correctly, your twin is guaranteed to make the same prediction as you, your father is guaranteed to procreate if and only if you do.

Now let’s consider the blackmail problem. The post says, “If you face the choice between submitting to blackmail and refusing to submit (in the kind of case we’ve discussed), you fare dramatically better if you follow CDT than if you follow FDT.” This is true. The problem is that, if you are being blackmailed, this means that you are not going to follow FDT. If you were going to follow FDT, the blackmailer would not have blackmailed you. The fact that you have been blackmailed means you can be 100% certain that you will not follow FDT. In itself, being 100% certain that you will not follow FDT does not prevent you from following FDT. But it does make the situation where you follow FDT and come off worse impossible, which is relevant to our determination of which decision theory is better.

Let’s consider the Newcomb problem. If the Newcomb predictor is guaranteed to predict your choice correctly, it is impossible for an agent using CDT to see a million in the right-hand box.

It never does any good to dismiss a logical inconsistency and to consider what happens anyway.

What happens if we ignore this and suppose that the CDT agent does see a thousand in the left-hand box and a million in the right-hand box? Then using this supposition we can prove that they will get both amounts if they two-box. But since they are a CDT agent, we know that they will two-box, therefore there is nothing in the right-hand box, so we can prove that they will only get a thousand if they two-box. But suppose that they one-box instead. Since they are a CDT agent, we know that they will two-box, so we know that there is nothing in the right-hand box, so we can prove that if they one-box they will get nothing. However, we know that they see a million in the right-hand box, so we can prove that if they one-box, they will get a million. So we can prove that they should one-box, and we can prove that they should two-box. At this point we can conclude that a million and nothing are the same thing, and that a thousand is equal to a million plus a thousand. As the French proverb goes: with enough ifs, you could put Paris in a bottle.
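The impossibility can be checked mechanically. Here is a tiny sketch (the modeling choices are mine, not part of the original problem statement) that enumerates candidate worlds under a perfect predictor and shows that no consistent world contains a two-boxing agent facing a full right-hand box:

```python
from itertools import product

# Enumerate all combinations of (agent's choice, contents of right box)
# and keep only the worlds allowed by a *perfect* predictor.
worlds = []
for choice, right_box_full in product(["one-box", "two-box"], [True, False]):
    # A perfect predictor fills the right box iff the agent one-boxes.
    if right_box_full == (choice == "one-box"):
        worlds.append((choice, right_box_full))

print(worlds)  # [('one-box', True), ('two-box', False)]
assert ("two-box", True) not in worlds
```

The scenario the quoted argument asks us to imagine, a CDT agent (hence a two-boxer) seeing a million in the right-hand box, is precisely the world filtered out here.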

The procreation example is harder to prove inconsistent because it relies on infinite regress.

Here’s a first way to resolve it. Should I procreate? If I do, my life will be miserable. But my father followed the same decision theory I do, so if I choose not to procreate, that means my father will have chosen not to procreate. So I will not exist. So I can prove that, if I end up choosing not to procreate, that means I do not exist. However, I do exist. That’s a contradiction. I guess that means I will not choose not to procreate. Knowing that I will not make that choice does not in itself prevent me from making the choice though. Should I choose not to procreate anyway? Well, I can prove that if I do not procreate, then I will not exist, and that if I do, then my life will be miserable. A miserable life is better than not existing, so I should procreate. However, I know that I exist, and that is the consequent of the implication “if I do not procreate, then I [will] exist”, so the implication is true, whereas if I choose to procreate I still exist but my life is miserable. A miserable life is worse than a non-miserable life, so I should not procreate. Oops, I can prove that I should procreate and that I should not procreate? That’s a contradiction, and this one doesn’t rely on the supposition that I made any particular choice. The world I am living in must be inconsistent.

We can also solve it by directly addressing the infinite regress.

Should I procreate? If I do, my life will be miserable. But my father followed the same thought process I did and would have made the same decisions, so if I choose not to procreate, that means my father will have chosen not to procreate. Then I would not exist, and a miserable life is better than not existing, so I should procreate.

Why did my father procreate, though, if that made his life miserable?

Oh, right. My grandfather followed the same thought process that my father did, so if he chose not to procreate, that means his father would have chosen not to procreate, and so he would not exist either. Since he too considered a miserable life better than not existing, he chose to procreate.

Why did my grandfather procreate, though, if that made his life miserable? What about my great-grandfather? What about—

The recursive buck stops *here*.

My {The Recursive Buck Stops Here}-great-...-great-grandfather did not choose to procreate because that would have made his life miserable. Therefore I do not exist. That’s a contradiction. The assumption that each generation of ancestry uses FDT and only exists if the previous generation chose to procreate is inconsistent with the assumption that any of them exist. No FDT agent can ever face this problem, and no designer can ever have to pick a decision theory for an agent that could have to face this problem. And if we only assume that it is unlikely that the father made a different decision from you, and not that it is certain that he did not, then FDT makes it less likely that you will not exist, and so it again comes out ahead of CDT.

There is one category of situations (the one exception I mentioned) where FDT can leave you worse off than CDT, and that is what happens when “someone is set to punish agents who use FDT, giving them choices between bad and worse options, while CDTers are given great options”. FDT can change your decisions to make them optimal, but it can’t change the initial decision theory you used to make the decisions. It can only pick decisions identical to those of another decision theory. That doesn’t prevent an environment from knowing what your initial decision theory was and punishing you on that basis. This is unsolvable by any decision theory. Therefore it can hardly be taken as a point against FDT.

I said that it never does any good to dismiss a logical inconsistency. I want to clarify that this is not the same as saying that we should dismiss thought experiments because their premises are unlikely. “Extremism In Thought Experiment Is No Vice”. Appealing to our intuitions about extreme cases is informative. But logical impossibility is informative too, and is what we care about when comparing decision theories. Nate Soares has claimed “that *all* decision-making power comes from the ability to induce contradictions: the whole reason to write an algorithm that loops over actions, constructs models of outcomes that would follow from those actions, and outputs the action corresponding to the highest-ranked outcome is so that it is contradictory for the algorithm to output a suboptimal action.”

Anyways, that's of course not to dismiss the rest of your paper.

I actually don't know Nate Soares, but Eliezer Yudkowsky is a
celebrity in the "rationalist" community. Many of his posts on the Less Wrong blog are
gems. I also enjoyed his latest book, *Inadequate Equilibria*.
Yudkowsky seems to be interested in almost everything, but he regards
decision theory as his main area of research. I also work in decision
theory, but I've always struggled with Yudkowsky's writings on this
topic.

Before I explain what I found wrong with the paper, let me review the main idea and motivation behind the theory it defends.

Standard lore in decision theory is that there are situations in which it would be better to be irrational. Three examples.

*Blackmail*. Donald has committed an indiscretion. Stormy has found out and considers blackmailing Donald. If Donald refuses and blows Stormy's gaff, she is revealed as a blackmailer and his indiscretion becomes public; both suffer. It is better for Donald to pay hush money to Stormy. Knowing this, it is in Stormy's interest to blackmail Donald. If Donald were irrational, he would blow Stormy's gaff even though that would hurt him more than paying the hush money; knowing this, Stormy would not blackmail Donald. So Donald would be better off if he were (known to be) irrational.

*Prisoner's Dilemma with a Twin*. Twinky and her clone have been arrested. If they both confess, each gets a 5-year prison sentence. If both remain silent, they can't be convicted and only get a 1-year sentence for obstructing justice. If one confesses and the other remains silent, the one who confesses is set free and the other gets a 10-year sentence. Neither cares about what happens to the other. Here, confessing is the dominant act and the unique Nash equilibrium. So if Twinky and her clone are rational, they'll each spend 5 years in prison. If they were irrational and remained silent, they would get away with 1 year.
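The payoff structure can be written down in a few lines. The sentence lengths are taken from the example; the dictionary encoding is my own sketch:

```python
# Years in prison for Twinky, as a function of (her move, her clone's move).
sentence = {
    ("confess", "confess"): 5,
    ("silent", "silent"): 1,
    ("confess", "silent"): 0,
    ("silent", "confess"): 10,
}

# Against an independent opponent, confessing strictly dominates:
for twin in ("confess", "silent"):
    assert sentence[("confess", twin)] < sentence[("silent", twin)]

# But an exact twin mirrors her choice, so only the diagonal outcomes
# are reachable, and there silence does better:
assert sentence[("silent", "silent")] < sentence[("confess", "confess")]
```

The two assertions together are the whole puzzle: the dominance argument is sound for independent opponents, yet only the diagonal of the payoff table is ever realized against a guaranteed twin.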

*Newcomb's Problem with Transparent Boxes*. A demon invites people to an experiment. Participants are placed in front of two transparent boxes. The box on the left contains a thousand dollars. The box on the right contains either a million or nothing. The participants can choose between taking both boxes (two-boxing) and taking just the box on the right (one-boxing). If the demon has predicted that a participant one-boxes, she put a million dollars into the box on the right. If she has predicted that a participant two-boxes, she put nothing into the box. The demon is very good at predicting, and the participants know this. Each participant is only interested in getting as much money as possible. Here, the rational choice is to take both boxes, because you are then guaranteed to get $1000 more than if you one-box. But almost all of those who irrationally take just one box end up with a million dollars, while most of those who rationally take both boxes leave with $1000.

The driving intuition behind Yudkowsky and Soares's paper is that
decision theorists have been wrong about these (and other) cases: in
each case, the supposedly irrational choice is actually rational.
Whether a pattern of behaviour is rational, they argue, should be
measured by how good it is for the agent. In *Newcomb's Problem with
Transparent Boxes*, one-boxers fare better than two-boxers. So we
should regard one-boxing as rational. Similarly for the other
examples. Standard decision theories therefore get these cases wrong.
We need a new theory.

Functional Decision Theory (FDT) is meant to be that theory. FDT
recommends blowing the gaff in *Blackmail*, remaining silent in
*Prisoner's Dilemma with a Twin*, and one-boxing in *Newcomb's
Problem with Transparent Boxes*.

Here's how FDT works, and how it differs from the most popular form
of decision theory, Causal Decision Theory (CDT). Suppose an agent
faces a choice between two options A and B. According to CDT, the agent
should evaluate these options in terms of their possible consequences
(broadly understood). That is, the agent should consider what might
happen if she were to choose A or B, and weigh the possible outcomes
by their probability. In FDT, the agent should not consider what would
happen if she were to choose A or B. Instead, she ought to consider
what would happen if *the right choice according to FDT were A or
B*.

Take *Newcomb's Problem with Transparent Boxes*. Without loss
of generality, suppose you see $1000 in the left box and a million in
the right box. If you were to take both boxes, you would get a million
and a thousand. If you were to take just the right box, you would get
a million. So Causal Decision Theory says you should take both boxes.
But let's suppose you follow FDT, and you are certain that you do. You
should then consider what would be the case if FDT recommended
one-boxing or two-boxing. These hypotheses are not hypotheses just
about your present choice. If FDT recommended two-boxing, then any FDT
agent throughout history would two-box. And, crucially, the demon
would (probably) have foreseen that you would two-box, so she would
have put nothing into the box on the right. As a result, if FDT
recommended two-boxing, you would probably end up with $1000. To be
sure, you know that there's a million in the box on the right. You can
see it. But according to FDT, this is irrelevant. What matters is what
*would* be in the box relative to different assumptions about
what FDT recommends.
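The contrast between the two suppositions can be sketched in a few lines of Python. This is my own toy formalization, not the paper's; the payoff numbers come from the example, the function names are mine:

```python
# Transparent Newcomb, given that you currently *see* a million
# in the right-hand box.

def cdt_payoff(act, right_box_full=True):
    # CDT holds the observed box contents fixed and supposes only the act.
    return 1_000 * (act == "two-box") + 1_000_000 * right_box_full

def fdt_payoff(act):
    # FDT supposes that the theory itself outputs `act`; a (near-)perfect
    # predictor would then have filled the box iff that output is one-boxing,
    # regardless of what you currently see.
    right_box_full = (act == "one-box")
    return 1_000 * (act == "two-box") + 1_000_000 * right_box_full

assert cdt_payoff("two-box") > cdt_payoff("one-box")   # 1,001,000 vs 1,000,000
assert fdt_payoff("one-box") > fdt_payoff("two-box")   # 1,000,000 vs 1,000
```

The only difference between the two functions is whether the box contents are held fixed or recomputed under the supposition, which is exactly the disagreement described above.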

To spell out the details, one would now need to specify how to compute the probability of various outcomes under the subjunctive supposition that FDT recommended a certain action. Yudkowsky and Soares are explicit that the supposition is to be understood as counterpossible: we need to suppose that a certain mathematical function, which in fact outputs A for input X, instead were to output B. They do not, however, explain how to compute the probability of outcomes under such a counterpossible supposition, so the details are never spelled out. This is flagged as the main open question for FDT.

It is not obvious to me why Yudkowsky and Soares choose to model
the relevant supposition as a mathematical falsehood. For example, why
not let the supposition be: *I am the kind of agent who chooses A in
the present decision problem*? That is an ordinary contingent
(centred) proposition, since there are possible agents who do choose
option A in the relevant problem. These agents may not
follow FDT, but I don't see why that would matter. For some reason, Yudkowsky and
Soares assume that an FDT agent is certain that she follows FDT, and
this knowledge is held fixed under all counterfactual suppositions. I
guess there is a reason for this assumption, but they don't tell
us.

Anyway. That's the theory. What's not to like about it?

For a start, I'd say the theory gives insane recommendations in
cases like *Blackmail*, *Prisoner's Dilemma with a Twin*,
and *Newcomb's Problem with Transparent Boxes*. Take
*Blackmail*. Suppose you have committed an indiscretion that
would ruin you if it should become public. You can escape the ruin by
paying $1 once to a blackmailer. Of course you should pay! FDT says
you should not pay because, if you were the kind of person who doesn't
pay, you likely wouldn't have been blackmailed. How is that even
relevant? You *are* being blackmailed. Not being blackmailed
isn't on the table. It's not something you can choose.

Admittedly, that's not much of an objection. I say you'd be insane not to pay the $1, Yudkowsky and Soares say you'd be irrational to pay. Neither of us can prove that their judgement is right from neutral premises.

What about the fact that FDT agents do better than (say) CDT agents? I admit that if this were a fact, it would be somewhat interesting. But it's not clear if it is true.

First, it depends on how success is measured. If you face the
choice between submitting to blackmail and refusing to submit (in the
kind of case we've discussed), you fare dramatically better if you
follow CDT than if you follow FDT. If you are in *Newcomb's Problem
with Transparent Boxes* and see a million in the right-hand box,
you again fare better if you follow CDT. Likewise if you see nothing
in the right-hand box.

So there's an obvious sense in which CDT agents fare better than FDT agents in the cases we've considered. But there's also a sense in which FDT agents fare better. Here we don't just compare the utilities scored in particular decision problems, but also the fact that FDT agents might face other kinds of decision problems than CDT agents. For example, FDT agents who are known as FDT agents have a lower chance of getting blackmailed and thus of facing a choice between submitting and not submitting. I agree that it makes sense to take these effects into account, at least as long as they are consequences of the agent's own decision-making dispositions. In effect, we would then ask what decision rule should be chosen by an engineer who wants to build an agent scoring the most utility across its lifetime. Even then, however, there is no guarantee that FDT would come out better. What if someone is set to punish agents who use FDT, giving them choices between bad and worse options, while CDTers are given great options? In such an environment, the engineer would be wise not to build an FDT agent.

Moreover, FDT does not in fact consider only consequences of the
agent's own dispositions. The supposition that is used to evaluate
acts is that FDT *in general* recommends that act, not just that
the agent herself is disposed to choose the act. This leads to even
stranger results.

*Procreation*. I wonder whether to procreate. I know for sure that doing so would make my life miserable. But I also have reason to believe that my father faced the exact same choice, and that he followed FDT. If FDT were to recommend not procreating, there's a significant probability that I wouldn't exist. I highly value existing (even miserably existing). So it would be better if FDT were to recommend procreating. So FDT says I should procreate. (Note that this (incrementally) confirms the hypothesis that my father used FDT in the same choice situation, for I know that he reached the decision to procreate.)
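The divergence can be sketched with made-up utilities (all numbers here are my own illustrative choices: miserable existence = 1, non-existence = 0, happy child-free existence = 10):

```python
def ev_fdt(act, p):
    # Supposing FDT outputs `act`: with probability p your father, as a
    # fellow FDT agent facing the same problem, made the same choice,
    # so not procreating risks never having existed at all.
    if act == "procreate":
        return 1                # you exist, miserably
    return (1 - p) * 10         # with probability p you were never born

def ev_cdt(act):
    # CDT holds your existence fixed: you are deciding, so you exist.
    return 1 if act == "procreate" else 10

assert ev_cdt("don't") > ev_cdt("procreate")
assert ev_fdt("procreate", 0.95) > ev_fdt("don't", 0.95)
```

On any such numbers, once p is high enough, FDT recommends the miserable life while CDT recommends the happy one, which is the complaint in the next paragraph.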

In *Procreation*, FDT agents have a much worse life than CDT
agents.

All that said, I agree that there's an apparent advantage of the
"irrational" choice in cases like *Blackmail* or *Prisoner's
Dilemma with a Twin*, and that this raises an important issue. The
examples are artificial, but structurally similar cases arguably come
up a lot, and they have come up a lot in our evolutionary history.
Shouldn't evolution have favoured the "irrational" choices?

Not necessarily. There is another way to design agents who refuse
to submit to blackmail and who cooperate in Prisoner's Dilemmas. The trick is
to tweak the agents' utility function. If Twinky cares about her
clone's prison sentence as much as about her own, remaining silent
becomes the dominant option in *Prisoner's Dilemma with a Twin*.
If Donald develops a strong sense of pride and would rather take
Stormy down with him than submit to her blackmail, refusing to pay
becomes the rational choice in *Blackmail*.

FDT agents rarely find themselves in *Blackmail* scenarios.
Neither do CDT agents with a vengeful streak. If I wanted to design a
successful agent for a world like ours, I would build a CDT agent who
cares what happens to others. My CDT agent would still two-box in
*Newcomb's Problem with Transparent Boxes* (or in the original
Newcomb Problem). But this kind of situation practically never arises
in worlds like ours.

The story I'm hinting at has been well told by others. I'd
especially recommend Brian Skyrms's *Evolution of the Social
Contract* and chapter 6 of Simon Blackburn's *Ruling
Passions*.

So here's the upshot. Whether FDT agents fare better than CDT agents depends on the environment, on how "faring better" is measured, and on what the agents care about. Across their lifetime, purely selfish agents might do better, in a world like ours, if they followed FDT. But that doesn't persuade me that the insane recommendations of FDT are correct.

So far, I have explained why I'm not convinced by the case for FDT. I haven't explained why I didn't recommend the paper for publication. That I'm not convinced is not a reason. I'm rarely convinced by arguments I read in published papers.

The standards for deserving publication in academic philosophy are relatively simple and self-explanatory. A paper should make a significant point, it should be clearly written, it should correctly position itself in the existing literature, and it should support its main claims by coherent arguments. The paper I read sadly fell short on all these points, except the first. (It does make a significant point.)

Here, then, are some of the complaints from my referee report, lightly edited for ease of exposition. I've omitted several other complaints concerning more specific passages or notation from the paper.

A popular formulation of CDT assumes that to evaluate an option A we should consider the probability of various outcomes *on the subjunctive supposition* that A were chosen. That is, we should ask how probable such-and-such an outcome would be if option A were chosen. The expected utility of the option is then defined as the probability-weighted average of the utility of these outcomes. In much of their paper, Yudkowsky and Soares appear to suggest that this is exactly how expected utility is defined in FDT. The disagreement between CDT and FDT would then boil down to a disagreement about what is likely to be the case under the subjunctive supposition that an option is chosen.

For example, consider *Newcomb's Problem with Transparent Boxes*. Suppose (without loss of generality) that the right-hand box is empty. CDT says you should take both boxes because *if you were to take only the right-hand box you would get nothing* whereas *if you were to take both boxes, you would get $1000*. According to FDT (as I presented it above, and as it is presented in *parts* of the paper), we should ask a different question. We should ask what would be the case *if FDT recommended one-boxing*, and what would be the case *if FDT recommended two-boxing*. For much of the paper, however, Yudkowsky and Soares seem to assume that these questions coincide. That is, they suggest that you should one-box because *if you were to one-box, you would get a million*. The claim that you would get nothing if you were to one-box is said to be a reflection of CDT.

If that's really what Yudkowsky and Soares want to say, they should, first, clarify that FDT is a *special case* of CDT as conceived for example by Stalnaker, Gibbard & Harper, Sobel, and Joyce, rather than an alternative. All these parties would agree that the expected utility of an act is a matter of what would be the case if the act were chosen. (Yudkowsky and Soares might then also point out that "Causal Decision Theory" is not a good label, given that they don't think the relevant conditionals track causal dependence. John Collins has made essentially the same point.)

Second, and more importantly, I would like to see some arguments for the crucial claim about subjunctive conditionals. Return once more to the Newcomb case. Here's the right-hand box. It's empty. It's a normal box. Nothing you can do has any effect on what's in the box. The demon has tried to predict what you will do, but she could be wrong. (She has been wrong before.) Now, what would happen if you were to take that box, without taking the other one? The natural answer, by the normal rules of English, is: *you would get an empty box*. Yudkowsky and Soares instead maintain that the correct answer is: *you would find a million in the box*. Note that this is a claim about the truth-conditions of a certain sentence in English, so facts about the long-run performance of agents in decision problems don't seem relevant. (If the predictor is highly reliable, I think a "backtracking" reading can become available on which it's true that you would get a million, as Terry Horgan has pointed out. But there's still the other reading, and it's much more salient if the predictor is less reliable.)

Third, later in the paper it transpires that FDT can't possibly be understood as a special case of CDT along the lines just suggested, because in some cases FDT requires assessing the expected utility of an act by looking exclusively at scenarios in which that act is not performed. For example, in *Blackmail*, not succumbing is supposed to be better because it decreases the chance of being blackmailed. But any conditional of the form *if the agent were to do A, then the agent would do A* is trivially true in English.

Fourth, in other parts of the paper it is made clear that FDT does not instruct agents to suppose that a certain act were performed, but rather to suppose that FDT always were to give a certain output for a certain input.

I would recommend dropping all claims about subjunctive conditionals involving the relevant acts. The proposal should be that the expected utility of act A in decision problem P is to be evaluated by subjunctively supposing not A, but the proposition that FDT outputs A in problem P. (That's how I presented the theory above.) The proposal then wouldn't rely on implausible and unsubstantiated claims about English conditionals.

[I then listed several passages that would need to be changed if the suggestion is adopted.]

I'm worried that so little is said about how subjunctive probabilities are supposed to be revised when supposing that FDT gives a certain output for a certain decision problem. Yudkowsky and Soares insist that this is a matter of subjunctively supposing a proposition that's mathematically impossible. But as far as I know, we have no good models for supposing impossible propositions.

Here are three more specific worries.

First, mathematicians are familiar with reductio arguments, which appear to involve impossible suppositions. "Suppose there were a largest prime. Then there would be a product x of all these primes. And then x+1 would be prime. And so there would be a prime greater than all primes." What's noteworthy about these arguments is that whenever B is mathematically derivable from A, then mathematicians are prepared to accept 'if A were the case then B would be the case', even if B is an explicit contradiction. (In fact, that's where the proof usually ends: "If A were the case then a contradiction would be the case; so A is not the case.")

If that is how subjunctive supposition works, FDT is doomed. For if A is a mathematically false proposition, then anything whatsoever mathematically follows from A. (I'm ignoring the subtle difference between mathematical truth and provability, which won't help.) So then anything whatsoever would be the case on a counterpossible supposition that FDT produces a certain output for a certain decision problem. We would get:

*If FDT recommended two-boxing in Newcomb's Problem, then the second box would be empty*, but also *If FDT recommended two-boxing in Newcomb's Problem, then the second box would contain a million*, and *If FDT recommended two-boxing in Newcomb's Problem, the second box would contain a round square*.

A second worry. Is a probability function revised by a counterpossible supposition, as employed by FDT, still a probability function? Arguably not. For presumably the revised function is still certain of elementary mathematical facts such as the Peano axioms. (If, when evaluating a relevant scenario, the agent is no longer sure whether 0=1, all bets are off.) But some such elementary facts will logically entail the negation of the supposed hypothesis. So in the revised probability function, probability 1 is not preserved under logical entailment; and then the revised function is no longer a classical probability function. (This matters, for example, because Yudkowsky and Soares claim that the representation theorem from Joyce's *Foundations of Causal Decision Theory* can be adapted to FDT, but Joyce's theorem assumes that the supposition preserves probabilistic coherence.)

Another worry. Subjunctive supposition is relatively well-understood for propositions about specific events at specific times. But the hypothesis that FDT yields a certain output for a certain input is explicitly not spatially and temporally limited in this way. We have no good models for how supposing such general propositions works, even for possible propositions.
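The explosion worry can be made explicit with the standard two-step classical derivation (textbook logic, nothing specific to the paper): if ¬A is a mathematical truth the agent holds certain, then supposing A licenses any conclusion B whatsoever.

```latex
\begin{align*}
  A \;&\vdash\; A \lor B          && \text{disjunction introduction} \\
  \neg A,\ A \lor B \;&\vdash\; B && \text{disjunctive syllogism} \\
  A,\ \neg A \;&\vdash\; B        && \text{chaining the two steps}
\end{align*}
```

Since FDT's suppositions are counterpossible, ¬A is available as a background certainty, and the derivation goes through for every B.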

The details matter. For example, assume FDT actually outputs B for problem P, and B' for a different problem P'. Under the counterpossible supposition that FDT outputs A for P, can we hold fixed that it outputs B' for P'? If not, FDT will sometimes recommend choosing a particular act because of the advantages of choosing a *different* act in a *different* kind of decision problem.

Standard decision theories are not just based on brute intuitions about particular cases, as Yudkowsky and Soares would have us believe, but also on general arguments. The most famous of these are so-called representation theorems, which show that the norm of maximising expected utility can be derived from more basic constraints on rational preference (possibly together with basic constraints on rational belief). It would be nice to see which of the preference norms of CDT Yudkowsky and Soares reject. It would also be nice if they could offer a representation theorem for FDT. All that is optional and wouldn't matter too much, in my view, except that Yudkowsky and Soares claim (as I mentioned above) that the representation theorem in Joyce's *Foundations of Causal Decision Theory* can be adapted straightforwardly to FDT. But I doubt that it can. The claim seems to rest on the idea that FDT can be formalised just like CDT, assuming that subjunctively supposing A is equivalent to supposing that FDT recommends A. But as I've argued above, the latter supposition arguably makes an agent's subjective probability function incoherent. More obviously, in cases like *Blackmail*, A is plausibly false on the supposition that FDT recommends A. These two aspects already contradict the very first two points in the statement of Joyce's representation theorem, on p.229 of *The Foundations of Causal Decision Theory*, under 7.1.a.

Yudkowsky and Soares constantly talk about how FDT "outperforms" CDT, how FDT agents "achieve more utility", how they "win", etc. As we saw above, it is not at all obvious that this is true. It depends, in part, on how performance is measured. At one place, Yudkowsky and Soares are more specific. Here they say that "in all dilemmas where the agent's beliefs are accurate [??] and the outcome depends only on the agent's actual and counterfactual behavior in the dilemma at hand -- reasonable constraints on what we should consider 'fair' dilemmas -- FDT performs at least as well as CDT and EDT (and often better)". OK. But how should we understand "depends on ... the dilemma at hand"? First, are we talking about subjunctive or evidential dependence? If we're talking about evidential dependence, EDT will often outperform FDT. And EDTers will say that's the right standard. CDTers will agree with FDTers that subjunctive dependence is relevant, but they'll insist that the standard Newcomb Problem isn't "fair" because here the outcome (of both one-boxing and two-boxing) depends not only on the agent's behavior in the present dilemma, but also on what's in the opaque box, which is entirely outside her control. Similarly for all the other cases where FDT supposedly outperforms CDT.
Now, I can vaguely see a reading of "depends on ... the dilemma at hand" on which FDT agents really do achieve higher long-run utility than CDT/EDT agents in many "fair" problems (although not in all). But this is a very special and peculiar reading, tailored to FDT. We don't have any independent, non-question-begging criterion by which FDT always "outperforms" EDT and CDT across "fair" decision problems.

FDT closely resembles Justin Fisher's "Disposition-Based Decision Theory" and the proposal in David Gauthier's *Morals by Agreement*, both of which are motivated by cases like *Blackmail* and *Prisoner's Dilemma with a Twin*. Neither is mentioned. It would be good to explain how FDT relates to these earlier proposals.

The paper goes to great lengths criticising the rivals CDT and EDT. The apparent aim is to establish that both CDT and EDT sometimes make recommendations that are clearly wrong. Unfortunately, these criticisms are largely unoriginal, superficial, or mistaken.

For example, Yudkowsky and Soares fault EDT for giving the wrong verdicts in simple medical Newcomb problems. But defenders of EDT such as Arif Ahmed and Huw Price have convincingly argued that the relevant decision problems would have to be highly unusual. Similarly, Yudkowsky and Soares cite a number of well-known cases in which CDT supposedly gives the wrong verdict, such as Arif's "Dicing with Death". But again, most CDTers would not agree that CDT gets these cases wrong. (See this blog post for my response to *Dicing with Death*.) In general, I am not aware of any case in which I'd agree that CDT -- properly spelled out -- gives a problematic verdict. Likewise, I suspect Arif does not think there are any cases in which EDT goes wrong. It just isn't true that both CDT and EDT are commonly agreed to be faulty. If Yudkowsky and Soares want to argue that they are, they need to do more than revisit well-known scenarios and make bold assertions about what CDT and EDT say about them.

The criticism of CDT and EDT also contains several mistakes. For example, Yudkowsky and Soares repeatedly claim that if an EDT agent is certain that she will perform an act A, then EDT says she must perform A. I don't understand why. I guess the idea is that (1) if P(B)=0, then the evidential expected utility of B is undefined, and (2) any number is greater than undefined. But lots of people, from Kolmogoroff to Hajek, have argued against (1), and I don't know why anyone would find (2) plausible.

For another example, Yudkowsky and Soares claim that CDT (like FDT) involves evaluating logically impossible scenarios. For example, "[CDTers] are asking us to imagine the agent's physical action changing while holding fixed the behavior of the agent's decision function". Who says that? I would have thought that when we consider what would happen if you took one box in Newcomb's Problem, the scenario we're considering is one in which your decision function outputs one-boxing. We're not considering an impossible scenario in which your decision function outputs two-boxing, you have complete control over your behaviour, and yet you choose to one-box. There are many detailed formulations of CDT. Yudkowsky and Soares ignore almost all of them and only mention the comparatively sketchy theory of Pearl. But even Pearl's theory plausibly doesn't appeal to impossible propositions to evaluate ordinary options. Lewis's or Joyce's or Skyrms's certainly doesn't.

I still think the paper could probably have been published after a few rounds of major revisions. But I also understand that the editors decided to reject it. Highly ranked philosophy journals have acceptance rates of under 5%. So almost everything gets rejected. This one got rejected not because Yudkowsky and Soares are outsiders or because the paper fails to conform to obscure standards of academic philosophy, but mainly because the presentation is not nearly as clear and accurate as it could be.

I'm not convinced that believing that it might be snowing requires taking an explicitly second-level attitude towards a proposition about epistemic possibility, but I agree that if Betty has never thought about the weather then it sounds odd to say that she believes that it might be snowing. So that needs to be explained.

The problem might be related to the problem of logical omniscience. Intuitively, there are things we believe "implicitly" insofar as they are entailed by our beliefs, but we don't believe them "explicitly", whatever that means. The logic of implicit belief is plausibly KD45, the logic of explicit belief clearly isn't. But the ordinary concept of belief is closer to explicit belief.

That is, I'm tempted to say that Betty really does implicitly believe that it might be snowing, even if she's never considered the question.

I still need a better explanation, then, for why we don't have a dual for explicit belief. Perhaps I can make a similar divide-and-conquer move as in the case of knowledge:

Often, when it would be useful to say that someone does not believe that not-p, we know that the subject has considered the question, and then 'believes that might' is equivalent to the dual of 'believes' because the issue you raise doesn't arise.

In other cases, if we know that the subject has not considered the question, it would typically be more informative to say something like 'she has not considered whether p'.

Not sure. Anyway, thanks for the comment!


A thought: I think to really express the compatibility of the above 'possibility modal' in the initial sense you suggest, you would need to say both "Betty does not believe that it is not snowing" and "Betty does not believe that it is snowing," or maybe best is "Betty does not believe either that it is snowing or that it is not snowing." In the way you have it, saying merely "Betty does not believe that it is not snowing," you have to place emphasis on the second 'not' to get the right pragmatic force. This may be part of why it is hard to parse.

I like the idea that 'believes that it might be snowing' ascribes to the believer a belief-compatibility position with respect to the proposition that it is snowing, but I worry that 'believes that it might be snowing' expresses a genuine propositional attitude toward the proposition that it might be snowing. It seems to me that 'believes that it might' expresses an attitude that can indicate the believer's compatibility position, but doesn't out and out express it (instead it expresses the propositional attitude). This is because of the possibility of being in a position where one's belief state is compatible with both p and not p, but one doesn't believe that it might be that p. That is, that one takes no attitude toward the proposition expressed by "it might be the case that p." In such a case, it would be false to say "Betty believes that it might be snowing" even though her belief state is compatible with its snowing/not snowing.

It seems that to say "believes that it might be snowing" truly of Betty, we must know something about Betty's thoughts about the weather. Namely, that she is having thoughts about the weather and finds herself in a position where she isn't committed one way or the other. If she has no thoughts about the weather it would be odd to say of her that she believes that it might be snowing.


What do we use if we want to say that something is compatible with someone's beliefs? Suppose at some worlds compatible with Betty's belief state, it is currently snowing. We could express this by "Betty does not believe that it is not snowing". But (for some reason) that's really hard to parse.

Arguably, the most natural choice is: "Betty believes that it might
be snowing". Here, the possibility modal 'might' is embedded under the
necessity modal 'believes'. Clearly the embedded 'might' is relative
to Betty's belief state: "Betty believes that it might be snowing"
does not state that Betty believes that for all *we* know, it
might be snowing. So 'might', in effect, serves as the dual of
'believes', but it has to be embedded under 'believes' because we need
a transitive verb to indicate the person whose beliefs are compatible
with the relevant proposition.

But why does "believes that might" express the dual of belief, rather than a higher-order belief about belief? Because the logic of belief is arguably KD45, and in KD45, □◇p is equivalent to ◇p.
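The KD45 equivalence between □◇p and ◇p can be checked mechanically on a small model. Here is a minimal Python sketch; the particular three-world frame is my own illustrative choice, chosen only to satisfy the seriality, transitivity and euclideanness that KD45 requires.

```python
from itertools import product

# A small KD45 frame: the accessibility relation R is serial, transitive
# and euclidean, as the D, 4 and 5 axioms require.
worlds = [0, 1, 2]
R = {0: {1, 2}, 1: {1, 2}, 2: {1, 2}}

def box(p, w):   # 'believes': p is true at every accessible world
    return all(v in p for v in R[w])

def dia(p, w):   # the dual: p is true at some accessible world
    return any(v in p for v in R[w])

# For every proposition p (every subset of worlds), box(dia(p)) and
# dia(p) have the same truth value at every world of the frame.
for bits in product([False, True], repeat=len(worlds)):
    p = {w for w, b in zip(worlds, bits) if b}
    dia_p = {w for w in worlds if dia(p, w)}
    assert all(box(dia_p, w) == dia(p, w) for w in worlds)
```

The exhaustive loop over propositions is what makes this more than a spot check: in this frame, 'believes that might p' and 'does not believe not-p' never come apart.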

In fact, this is a nice argument in favour of assuming that the logic of belief is (at least) KD45: the assumption explains why "believes that might" is commonly used to express the dual of belief, and why there's no need to introduce a separate verb for the dual.

What about knowledge? There is also no dual for 'knows' in English. But here the situation is different.

First, unlike "Betty does not believe that it is not snowing", "Betty does not know that it is not snowing" is not too hard to understand.

Second, the logic of knowledge is plausibly weaker than KD45, so "knows that might" is plausibly not equivalent to "not knows not". Indeed, "Betty knows that it might be snowing" does suggest that Betty has higher-order knowledge concerning the possibility of snow, rather than simply a first-order knowledge state that is compatible with snow.

So why don't we have a dual for 'knows'? The reason, I suspect, is that absence of knowledge is less unified than absence of belief. There are different reasons why someone might fail to know not-p, and it's useful to have different expressions for the different cases.

One reason why Betty might fail to know that it is not currently snowing is that it is in fact snowing. If it is snowing, then Betty can't know that it is not snowing, because knowledge entails truth. But in such a case, the norms of pragmatics imply that instead of '~K~p' we should simply say 'p': it is shorter and more informative.

Another reason why Betty might fail to know that it is not currently snowing is that she fails to believe that it isn't snowing. If knowledge entails belief, then lack of belief entails lack of knowledge. So it might be more informative to use the dual of belief ('believes that might') rather than the dual of knowledge, especially if we also don't know whether it is snowing.

Third, if we don't know whether it is snowing, and we know that
Betty doesn't know either, then it is usually better to say that Betty
doesn't know *whether* it is snowing, rather than that she
doesn't know that it is not snowing. Again, it's more informative, and
not more complicated.

These don't cover all possibilities. Sometimes we may know that it is not snowing, and we want to communicate that Betty is not aware of this fact. In that case, we seem to fall back on 'not knows not': "Betty doesn't know that it is not snowing".

In sum, here's my conjecture:

1. We don't have a designated dual of 'believes' because we already have 'believes that might', which serves the same purpose.

2. We don't have a designated dual of 'knows' because there are usually more informative things to say, and we have the means to say these more informative things.

https://www.aliusresearch.org/uploads/9/1/6/0/91600416/alius_bulletin_n%C2%B02__2018_.pdf

Sly Pete and Mr. Stone are playing poker on a Mississippi riverboat. It is now up to Pete to call or fold. My henchman Zack sees Stone's hand, which is quite good, and signals its content to Pete. My henchman Jack sees both hands, and sees that Pete's hand is rather low, so that Stone's is the winning hand. At this point, the room is cleared. A few minutes later, Zack slips me a note which says "If Pete called, he won," and Jack slips me a note which says "If Pete called, he lost." I know that these notes both come from my trusted henchmen, but do not know which of them sent which note. I conclude that Pete folded.

One puzzle raised by this scenario is that it seems perfectly appropriate for Zack and Jack to assert the relevant conditionals, and neither Zack nor Jack has any false information. So it seems that the conditionals should both be true. But then we'd have to deny that 'if p then q' and 'if p then not-q' are contrary.

Frank Jackson (in conversation) pointed out that Gibbard's passage raises another puzzle that is commonly overlooked. That puzzle is about confirmation.

Let C→W be the conditional 'if Pete called, he won'.

Let E1 be Zack's information -- more specifically, the information that Pete knows Mr. Stone's hand.

Let E2 be Jack's information -- specifically, that Mr. Stone has the better hand.

Intuitively,

(1) E1 strongly supports C→W.

(2) E2 strongly supports ~(C→W).

(3) E1 doesn't strongly support ~E2.

(4) E2 doesn't strongly support ~E1.

But if we read "strongly support" as "making highly probable" then these four assumptions are probabilistically inconsistent. (The proof is left as an exercise.)

You might question (3) or (4). Here's a simpler example where (3) and (4) are not in doubt.

We toss two independent, fair coins. There are four possible outcomes: { H1,T1 } x { H2,T2 }.

Let Same be the proposition (H1 & H2) v (T1 & T2).

Let E1 be Same.

Let E2 be T2.

Let H1→Same be the conditional 'if H1 then Same'.

Intuitively,

(1) E1 strongly supports H1→Same: P(H1→Same/E1) > 0.8 (say).

(2) E2 strongly supports ~(H1→Same): P(~(H1→Same)/E2) > 0.8.

But the following is easily provable:

(3) E1 doesn't strongly support ~E2: P(E2/E1) = 1/2.

(4) E2 doesn't strongly support ~E1: P(E1/E2) = 1/2.

(1)-(4) are probabilistically inconsistent. So (1) and (2) can't be true: either E1 doesn't make H1→Same highly probable or E2 doesn't make ~(H1→Same) highly probable (or both).
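To make the inconsistency vivid, here is a small Python sketch of the coin-toss model. The Stalnaker-style selection function (the closest H1-world keeps the second coin fixed) is my own illustrative assumption; under it, (2), (3) and (4) hold, so (1) must fail, just as the inconsistency predicts.

```python
from fractions import Fraction

# The four equiprobable outcomes of two independent fair coin tosses.
worlds = {(c1, c2) for c1 in "HT" for c2 in "HT"}

same = {w for w in worlds if w[0] == w[1]}   # Same (= E1): coins land alike
e2   = {w for w in worlds if w[1] == "T"}    # E2: second coin lands tails

# Assumed selection function: the closest H1-world to w keeps the second
# coin fixed. Then 'H1 -> Same' is true at w iff Same holds at that world.
cond = {w for w in worlds if ("H", w[1]) in same}

def p(a, given=worlds):
    # Conditional probability over the uniform measure on four worlds.
    return Fraction(len(a & given), len(given))

assert p(e2, same) == Fraction(1, 2)      # (3): E1 doesn't support ~E2
assert p(same, e2) == Fraction(1, 2)      # (4): E2 doesn't support ~E1
assert p(worlds - cond, e2) == 1          # (2) holds on this semantics...
assert p(cond, same) == Fraction(1, 2)    # ...so (1) fails: 1/2, not > 0.8
```

On this semantics the conditional is true exactly where the second coin lands heads, so learning Same leaves it at probability 1/2 rather than pushing it above 0.8.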

The lesson is that our intuitions about whether some piece of evidence supports a given conditional cannot be trusted.

The usual contextualist responses to Gibbard's puzzle seem to be of no help here. The only way to block the lesson would be to give up probabilistic measures of evidential support. But even then we retain the lesson that we can't trust intuitions about whether some evidence renders some conditional probable.

The lesson generalizes. If we can't trust these intuitions, then we
also can't trust intuitions about the probability of a conditional in
a given hypothetical scenario -- for that just *is* an intuition
about the extent to which the assumptions of the scenario make the
conditional probable. And then we plausibly also can't trust outright
intuitions about the probability of a conditional, since that's the
probability of the conditional given our total evidence.

The lesson is more or less the same as the lesson taught by Lewisian triviality results. But the Gibbard-Jackson route is different from Lewis's route. In particular, we have never assumed that the intuitive probability of a conditional is the corresponding conditional probability.

That said, there is also a way of turning the Gibbard-Jackson argument into an argument against "Stalnaker's Thesis", that for any rational credence function P, P(A→B) = P(B/A). Here is how.

Return to the coin toss scenario. It is easy to see that

(5) P(Same/H1) = 1/2,

(6) P(Same/H1 & Same) = 1

(7) P(Same/H1 & T2) = 0

By Stalnaker's Thesis, it follows that

(8) P(H1→Same / Same) = 1 and

(9) P(H1→Same / T2) = 0,

since P(*/Same) and P(*/T2) are rational credence functions.

(8) and (9) are stronger versions of (1) and (2), and we know that these can't be true. So Stalnaker's Thesis is also false.
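The same toy model can check the premises (5)-(7) and exhibit the failure of (8). Again the Stalnaker-style selection function (second coin held fixed) is an illustrative assumption on my part, not forced by the argument.

```python
from fractions import Fraction

# The coin-toss model, self-contained: four equiprobable worlds.
worlds = {(c1, c2) for c1 in "HT" for c2 in "HT"}
same = {w for w in worlds if w[0] == w[1]}
h1   = {w for w in worlds if w[0] == "H"}
t2   = {w for w in worlds if w[1] == "T"}

def p(a, given=worlds):
    return Fraction(len(a & given), len(given))

# (5)-(7): the conditional probabilities the argument starts from.
assert p(same, h1) == Fraction(1, 2)    # (5)
assert p(same, h1 & same) == 1          # (6)
assert p(same, h1 & t2) == 0            # (7)

# Stalnaker's Thesis would then require P(H1->Same / Same) = 1 and
# P(H1->Same / T2) = 0. But on a selection semantics that keeps the
# second coin fixed, H1->Same is true exactly where coin 2 lands heads:
cond = {w for w in worlds if ("H", w[1]) in same}
assert p(cond, same) == Fraction(1, 2)  # so (8) fails on this semantics
assert p(cond, t2) == 0                 # while (9) happens to hold
```

This doesn't pin the blame on a particular premise, but it confirms that no probability assignment over these four worlds can satisfy (8) and (9) together with (5)-(7).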

Very interesting post.

I have several questions, but here I will pose only one, I hope not too confused.

Suppose I have non-separable preferences across time, as in the rational addiction model of Becker and Murphy (1988). Preferences in that model are also stationary.

Suppose I can somehow test Time Invariance in preferences.

Can I say that if Time Invariance is satisfied, then those preferences are also time consistent? And that if Time Invariance is NOT satisfied, preferences are time inconsistent?

Thank you for your reply.

So the 'ought' of objective consequentialism evaluates acts "causally", rather than "evidentially". This provides some (intuitive) motivation for using a causal evaluation for the decision-theoretic 'ought' as well. Can we strengthen this observation? How bad would it be to combine objective consequentialism with evidential decision theory?

Here's one attempt to bring out a tension. Imagine an agent whose personal utility function orders possible states of the world in just the way some form of objective consequentialism does, giving highest utility to the "best" states and lowest to the "worst" ones. Suppose also the agent has perfect information about which state would result from each of the options presently available to her. Intuitively, what this agent ought to do in light of her beliefs and desires is precisely what she ought to do according to objective consequentialism. That is, the subjective 'ought' of decision theory and the objective 'ought' of objective consequentialism should here coincide.

In fact, however, the two oughts plausibly do coincide even in evidential decision theory. That's because, as Lewis pointed out in "Causal Decision Theory", conditional on any particular dependency hypothesis (about what the available options would bring about), evidential expected utility and causal expected utility are plausibly equivalent.

So we need a different case to bring out the tension. Here's such a case, inspired by Jack Spencer and Ian Wells.

Consider a Newcomb Problem in which the outcomes are measured not in dollars but in consequentialist utilities. As before, assume the agent facing the problem has subjective utilities that match the consequentialist utilities.

It is clear what the agent ought to do, from the perspective of objective consequentialism: she ought to take both boxes. (Recall that the 'ought' of objective consequentialism evaluates acts causally, by looking at the outcomes the acts would bring about, given all relevant facts about the world -- known and unknown. One relevant fact is the content of the opaque box. If the opaque box is in fact empty, then one-boxing would lead to zero consequentialist utilities and two-boxing to a thousand; if the opaque box is non-empty, then one-boxing would lead to 1 million utilities and two-boxing to 1 million and 1 thousand. Either way, two-boxing would lead to the better state.)

Now here we have an agent with perfectly consequentialist values
who *knows* that she ought to two-box, in the objective
sense. Yet evidential decision theory says it would be irrational for
her to two-box! That's not a logical contradiction. But it surely
sounds unappealing. It would be better to have a decision theory on
which it can't happen that a morally perfect agent is irrational for
choosing an act of which she knows that she morally ought to choose
it.
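The two verdicts can be sketched as a pair of expected-utility calculations for the consequentialist Newcomb Problem. The payoffs follow the standard Newcomb scheme; the predictor accuracy (0.99) and the unconditional credence that the box is full (1/2) are illustrative assumptions of mine, and nothing hinges on their exact values.

```python
from fractions import Fraction

# Newcomb payoffs, read as consequentialist utilities.
U = {("one-box", "full"): 1_000_000, ("one-box", "empty"): 0,
     ("two-box", "full"): 1_001_000, ("two-box", "empty"): 1_000}

acc = Fraction(99, 100)   # assumed predictor accuracy (illustrative)
p_full = Fraction(1, 2)   # assumed unconditional credence that the box is full

# Evidential expected utility: the act is evidence about the box's content.
def eeu(act):
    pf = acc if act == "one-box" else 1 - acc
    return pf * U[(act, "full")] + (1 - pf) * U[(act, "empty")]

# Causal expected utility: the box's content is causally independent of the act.
def ceu(act):
    return p_full * U[(act, "full")] + (1 - p_full) * U[(act, "empty")]

assert eeu("one-box") > eeu("two-box")           # EDT recommends one-boxing
assert ceu("two-box") == ceu("one-box") + 1_000  # two-boxing causally dominates
```

The last line shows why the objective verdict is insensitive to the credences: whatever is in the opaque box, two-boxing yields exactly a thousand utilities more.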

The argument generalizes. For one thing, it generalizes beyond evidential decision theory to other decision theories that recommend one-boxing, such as "timeless decision theory", "disposition-based decision theory", Spohn's recent spin on causal decision theory, and whatever decision theory Teddy Seidenfeld thinks is right.

The argument also generalizes beyond objective consequentialism, given that almost every (sensible) moral theory can be consequentialised. In general, if you think the notion of an objective moral ought is coherent, you probably shouldn't say that one-boxing is the rational choice in Newcomb's Problem.

I am puzzled about these efforts, for two reasons.

First, as Lewis and Kratzer pointed out in the 1970s and 80s,
if-clauses often (according to Kratzer, always) function as
restrictors of quantificational and modal operators. So when we see an
if-clause in the vicinity of a modal like 'the probability that', the
*first* thing we should consider is whether the if-clause
restricts the modal.

How would an if-clause restrict a probability modal? Well, what is the probability of B restricted by A? An obvious answer is that it's the probability of B given A. So if the if-clause in 'the probability that if A then B' restricts the probability modal, then the expression denotes the conditional probability of B given A. Which is just what we find.

In other words, there is independent evidence about if-clauses suggesting that 'the probability that if A then B' should be analysed as 'P(B/A)' rather than 'P(if A then B)'. If that's correct, then what's expressed by

(*) the probability that if A then B equals the conditional probability of B given A

is the trivial identity 'P(B/A) = P(B/A)'. There's no need to make a big effort trying to make (*) true.

The case of subjunctive conditionals is parallel. We have the intuition that

(**) the probability that if A were the case then B would be the case equals the conditional probability of B on the subjunctive supposition that A.

Again, the first thing we should check is whether the if-clause restricts the modal. And, plausibly, subjunctive if-clauses restrict probability modals by subjunctive supposition (aka imaging). And then (**) expresses the trivial 'P(B//A) = P(B//A)'.
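Since conditioning and imaging can come apart, P(B//A) is a genuinely different quantity from P(B/A). Here is a minimal sketch of the difference; the prior and the closest-A-world map are stipulated purely for illustration.

```python
from fractions import Fraction

# A toy prior over four worlds, classified by whether A and B hold there:
#   w1: A & B,  w2: A & ~B,  w3: ~A & B,  w4: ~A & ~B
prior = {"w1": Fraction(1, 10), "w2": Fraction(1, 10),
         "w3": Fraction(2, 5),  "w4": Fraction(2, 5)}
a_worlds = {"w1", "w2"}
b_worlds = {"w1", "w3"}

# A stipulated closest-A-world map (an assumption for illustration).
closest_a = {"w1": "w1", "w2": "w2", "w3": "w1", "w4": "w1"}

# Conditioning: renormalise the prior over the A-worlds.
p_a = sum(p for w, p in prior.items() if w in a_worlds)
p_b_given_a = sum(p for w, p in prior.items()
                  if w in a_worlds and w in b_worlds) / p_a

# Imaging: each world sends its probability to its closest A-world.
imaged = {w: Fraction(0) for w in prior}
for w, p in prior.items():
    imaged[closest_a[w]] += p
p_b_image_a = sum(p for w, p in imaged.items() if w in b_worlds)

assert p_b_given_a == Fraction(1, 2)    # P(B/A)  = 1/2
assert p_b_image_a == Fraction(9, 10)   # P(B//A) = 9/10
```

Conditioning throws away the ~A-worlds and renormalises; imaging instead lets each ~A-world vote through its closest A-world, which is why the two numbers diverge whenever closeness is correlated with B.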

When people try to give a semantics motivated by (*) or (**), they practically never explain what's wrong with the simple and obvious explanation of (*) and (**) that I've just given.

That's one reason why I'm puzzled by these efforts. Here's a second reason.

For concreteness, let's look at subjunctive conditionals, which I'll write 'A > B'. As Lewis shows towards the end of "Probabilities of Conditionals and Conditional Probabilities", if you want to validate 'P(A > B) = P(B//A)', you have to assume a Stalnaker-type semantics for '>' on which, for any world w and any proposition A, there is a unique A-world that is "closest" to w; 'A > B' is true at w iff B is true at the closest A-world.

But if we assume a Stalnaker-type semantics of would counterfactuals 'A > B', then what should we say about might counterfactuals, 'if A were the case then B might be the case' -- for short, 'A *> B'?

Clearly, 'A *> B' can't be the dual of 'A > B', otherwise the two would be equivalent. The only option I can think of is to say, with Stalnaker, that 'A *> B' must be analysed as Might(A > B).

But that's unappealing, especially in the present context.

For one, the idea that 'A *> B' means 'Might(A > B)' is incompatible with a broadly Kratzerian treatment of 'if' and 'might'.

Moreover, syntactically, 'would' and 'might' seem to play similar roles in 'if A then would B' and 'if A then might B'. One would at least like to see some more evidence that 'might' scopes over the conditional and 'would' does not. Relatedly (as I mentioned in an earlier post), it seems to me that

What if A were the case? It might be that B

is equivalent to 'if A were the case then it might be that B'. But surely 'might' in the second sentence doesn't somehow scope over 'if' in the first.

Moreover, let's look at the probability of might counterfactuals. Assuming that 'Might' in 'Might(A > B)' is epistemic, 'Might(A > B)' is true relative to an information state s iff s is compatible with A > B. What is the probability that s is compatible with A > B, relative to s? Unless the information state is unsure about itself, it will be either 0 or 1. Specifically, we get the prediction that P(A *> B) = 1 if P(A > B) > 0 and P(A *> B) = 0 if P(A > B) = 0. But intuitively, 'the probability that if A then might B' is not always 1 or 0.
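The 0-or-1 prediction can be seen in a toy computation. The information state and the worlds at which A > B holds are stipulated for illustration; the point is only that Might(A > B) is constant across the state's worlds, so its probability relative to the state is trivial.

```python
from fractions import Fraction

# A toy information state: three open worlds with their probabilities,
# plus a stipulated set of worlds at which A > B is true (an assumption).
state = {"w1": Fraction(1, 2), "w2": Fraction(3, 10), "w3": Fraction(1, 5)}
a_would_b = {"w1"}   # A > B holds only at w1

# Epistemic Might(A > B): true relative to the state iff A > B holds at
# some world the state leaves open (positive probability).
might_true = any(pr > 0 and w in a_would_b for w, pr in state.items())

# Might(A > B) has the same truth value at every world of the state, so
# its probability relative to the state can only be 1 or 0.
p_might = sum(state.values()) if might_true else Fraction(0)
assert p_might == 1

# By contrast, the embedded would-counterfactual itself sits at 1/2 here.
assert state["w1"] == Fraction(1, 2)
```

So on the Might(A > B) analysis, 'the probability that if A then might B' jumps straight from 0 to 1 as soon as a single A > B world is left open, however small its probability.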

So what you have to say, if you want to analyse 'A *> B' as 'Might(A > B)', is that despite surface appearance, the expression 'the probability that if A then might B' does not denote the probability of the embedded might counterfactual 'if A then might B'. Perhaps the two epistemic modals merge and the expression denotes the probability of 'if A then would B'. Or whatever. But in the present context, it's funny that you have to say such a thing, given that your whole approach is motivated by your commitment to the idea that 'the probability that if A then would B' denotes the probability of the embedded would counterfactual.
