The term is poorly chosen and may lead to considerable confusion, especially for those versed in thermodynamics.

Even aside from the existing connotations of the term, Friston's application bears little conceptual connection to his intended meaning.

But first let me explain Meek and Glymour's proposal.

Causal models encode causal information by a probability measure over a directed acyclic graph. The nodes in the graph are random variables whose values stand for relevant (possible) events in the world; the probability measure stands for the objective chance (or frequency) of various values and combinations of values. In many cases one can assume the "Causal Markov Condition", which ensures that conditional on values for its causal parents, any variable is probabilistically independent of everything except its effects.

For the application to decision theory, it is important that an adequate model need not explicitly represent all causally relevant factors. If a variable X can be influenced through multiple paths, one may represent only some of these and fold the others into an "error term". The error term must, however, be "d-separated" from the explicitly represented causal ancestors of X, which effectively means that it is probabilistically independent of those other causes.

In causal reasoning, we often need to distinguish two ways of updating on a change of a given variable. To illustrate, suppose we know that there's a lower incidence of disease X among people who take substance Y. One hypothesis that would explain this observation is that there's a common cause of reduced X incidence and taking Y. For instance, those who take Y might be generally more concerned about their health and therefore exercise more, which is the real cause of the reduced incidence of X. On this hypothesis, taking Y is evidence that an agent is less likely to have disease X, but if we ran a controlled experiment in which we gave some people Y and others a placebo, the correlation would be expected to disappear. That's how we would test the present hypothesis. To predict what will happen in the experiment on the assumption of the hypothesis, we have to treat taking or not taking Y as an "intervention" that breaks the correlation with possible common causes. (The fact that somebody in the treatment group of the experiment takes Y is no evidence that they're more concerned with health than people in the control group.)

In general, an *intervention* on a variable makes it
independent of its parent variables. What makes this possible are
error terms. In the X and Y example, agents in the treatment group take
Y because they are paid to do so as part of the experiment. This
causal factor is an error term in the model. As required, the error
term is probabilistically independent of the explicitly represented
other cause for taking Y, namely general concern for one's health.
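To make the contrast concrete, here is a minimal simulation (my own illustration; all probabilities are made up): taking Y is driven by health concern, disease X depends only on health concern, and randomized assignment severs the link between Y and the common cause.

```python
import random

random.seed(1)
N = 100_000

def trial(do_y=None):
    """One agent. Returns (takes_y, has_x)."""
    health_conscious = random.random() < 0.5  # the common cause
    if do_y is None:
        # Observation: taking Y mostly tracks health concern.
        takes_y = health_conscious if random.random() < 0.9 else not health_conscious
    else:
        # Intervention: Y is set by the experimenter -- an error term
        # independent of health concern, so the link to the common cause is cut.
        takes_y = do_y
    # Disease X depends only on health concern (via exercise), not on Y itself.
    has_x = random.random() < (0.1 if health_conscious else 0.4)
    return takes_y, has_x

def rate(samples, y_val):
    """Frequency of X among agents with the given Y value."""
    hits = [x for y, x in samples if y == y_val]
    return sum(hits) / len(hits)

obs = [trial() for _ in range(N)]                          # observational data
exp = [trial(do_y=random.random() < 0.5) for _ in range(N)]  # randomized trial

print(round(rate(obs, True), 2), round(rate(obs, False), 2))  # correlated
print(round(rate(exp, True), 2), round(rate(exp, False), 2))  # decorrelated
```

In the observational data, X is markedly rarer among Y-takers; under the intervention, the two rates coincide (up to sampling noise), just as the hypothesis predicts for the controlled experiment.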

Now Meek and Glymour's suggestion is that everyone should use Jeffrey's formula for computing expected utilities via conditional probabilities. The disagreement between Evidential and Causal Decision Theory (EDT and CDT), they suggest, is not a normative disagreement about rational choice, but rather a disagreement over whether the relevant acts are considered as interventions.

For example, in Newcomb's problem, there is a correlation (due to a
common cause) between one-boxing and the opaque box containing a
million dollars. Let B=1 express that the agent chooses to
one-box. Conditional on B=1, there is a high probability that there's a
million in the box. However, conditional on *an intervention to
one-box*, the probability of the million is equal to its
unconditional probability: the correlation disappears, just as it does
in the X and Y example.

Now for the problems.

The first (and most obvious) is that there is no guarantee that
interventions of the relevant kind are available. We can't just assume
that for every value x of any variable A that represents an act, there
is an intervention event *do(A=x)* distinct from *A=x*.

The required assumption is obscured by misleading terminology. If
an agent faces a genuine choice between A=1 and A=2, then one
naturally thinks that she must be free to "intervene" on the value of
A; that she can make *do(A=1)* true or false at will. But
'intervening' and '*do(A=1)*' are technical terms, and in the
required technical sense it is not at all obvious that genuine choices
are always choices between interventions.

Return to Newcomb's problem. The obvious hypothesis about the causal relationships in Newcomb's problem is captured in the following graph.

Here, B is the variable for one-boxing or two-boxing, P is the prediction, O is the outcome, and C is the common cause of prediction and choice: the agent's disposition to one-box or two-box. Let's assume that the predictor is fallible. How does the fallibility come about? There are two possibilities (which could be combined). Either the predictor has imperfect access to the common cause C, or C does not determine B. Suppose the fallibility is of the first kind. That is, we assume that there are causal factors C which fully determine the agent's choice, but the predictor does not have full access to these factors. That's easy to imagine. The causal factors C cause the predictor's evidence E which in turn causes her prediction, but E is an imperfect sign of C: it is possible that E=1 even though C=2, or that E=2 even though C=1. We could model this by introducing an error term on E, or directly on P (if we don't mention E explicitly).

In this version of Newcomb's Problem, there is no error term on B. So there is no possibility of "intervening" on B in the technical sense of causal models. This does not mean that the agent has no real choice. To be sure, the agent doesn't have strong libertarian freedom, since her choice is fully determined by the causal factors C. But who cares? It's highly contentious whether the idea of strong libertarian freedom is even coherent. It's even more contentious that ordinary humans are free in this sense. And almost nobody believes that robots have that kind of freedom. But robots still face decisions. Many are interested in decision theory precisely because they want to program intelligent artificial agents. An adequate decision theory should not presuppose that the relevant agent has libertarian free will.

That's the first problem. Here is the second. Suppose there are
error terms on the right-hand side in Newcomb's problem. More
specifically, let C be the agent's general disposition to follow CDT
or EDT, and suppose acts of one-boxing can be caused not just by C but
also by random electromagnetic fluctuations in the agent's
muscles. These fluctuations are proper error terms because they
decorrelate B from C. That's just what the interventionist seems to
want. But if that's the causal story, it would be wrong to assess the
choiceworthiness of one-boxing and two-boxing by conditionalizing on
*do(B=1)* and *do(B=2)* respectively. For that means to
effectively conditionalize on the relevant electromagnetic fluctuation
events, which are in no sense under the agent's control. They are not
even sensitive to the agent's beliefs and desires (we may assume).

Here the technical nature of the expressions 'intervention' and 'do' becomes obvious. In the technical sense, the random electromagnetic fluctuations are interventions, and they realize *do(B=1)*. But they are not interventions or doings on the part of the agent in any ordinary sense.

The third problem is pointed out in Stern 2017. I'll try to make it a little more explicit than Stern does.

Consider the following causal structure.

Here A represents the agent's possible actions, which may be smoking and not smoking. These are evidentially correlated with some desirable or undesirable outcome O (cancer or not cancer) via a common cause C (as in Fisher's hypothesis about the relationship between smoking and cancer). I is an intervention variable, which, we assume, decorrelates A from C and therefore from O. Think of I as something like the agent's libertarian free will.

The depicted structure is not yet a causal model because it doesn't specify the chances. Suppose the agent's credence is evenly divided between two hypotheses about the relevant chances, H1 and H2. According to H1, I=1 and O=1 both have probability 0.9; according to H2 they both have probability 0.1. (It doesn't matter what else H1 and H2 say.)

By the Principal Principle,

Cr(O=1) = Cr(O=1 / H1)Cr(H1) + Cr(O=1 / H2)Cr(H2) = .9 * .5 + .1 * .5 = .5

Cr(I=1) = Cr(I=1 / H1)Cr(H1) + Cr(I=1 / H2)Cr(H2) = .9 * .5 + .1 * .5 = .5

Since both H1 and H2 treat O and I as independent, it follows again from the Principal Principle that

Cr(O=1 / I=1 & H1) = Cr(O=1 / H1) = .9

Cr(O=1 / I=1 & H2) = Cr(O=1 / H2) = .1

By Bayes' Theorem,

Cr(H1 / I=1) = Cr(I=1 / H1) Cr(H1) / Cr(I=1) = .9 * .5 / .5 = .9

Cr(H2 / I=1) = Cr(I=1 / H2) Cr(H2) / Cr(I=1) = .1 * .5 / .5 = .1

Finally, by the probability calculus,

Cr(O=1 / I=1) = Cr(O=1 / I=1 & H1)Cr(H1 / I=1) + Cr(O=1 / I=1 & H2)Cr(H2 / I=1).

Putting all this together, we have

Cr(O=1) = .5

Cr(O=1 / I=1) = .9 * .9 + .1 * .1 = .82

So although the agent assigns credence 1 to causal hypothesis on which I and O are probabilistically independent, the two variables are not independent in her beliefs.
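The derivation can be verified mechanically; a small sketch that reproduces the numbers above:

```python
# Two chance hypotheses, prior credence 0.5 each. Under H1 the chance
# of I=1 and of O=1 is 0.9; under H2 it is 0.1.
chance = {"H1": 0.9, "H2": 0.1}
prior = {"H1": 0.5, "H2": 0.5}

# Principal Principle: Cr(O=1) = Cr(I=1) = sum over H of chance * Cr(H).
cr_O = sum(prior[h] * chance[h] for h in chance)
cr_I = cr_O  # the same arithmetic

# Bayes' Theorem: Cr(H / I=1) = Cr(I=1 / H) Cr(H) / Cr(I=1).
post = {h: chance[h] * prior[h] / cr_I for h in chance}

# Total probability, using the independence of O and I under each hypothesis.
cr_O_given_I = sum(chance[h] * post[h] for h in chance)

print(round(cr_O, 2))          # 0.5
print(round(cr_O_given_I, 2))  # 0.82
```

So Cr(O=1 / I=1) comes out at .82 against an unconditional Cr(O=1) of .5, confirming that the two variables are correlated in the agent's credence despite being independent under every chance hypothesis she entertains.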

This means that conditional on *do(A=1)*, which is tantamount
to *I=1*, the agent assigns much greater probability to O=1 than
conditional on *do(A=2)*. According to Meek & Glymour et al,
the agent should therefore choose A=1 (via I=1). *But this means to
act on a spurious correlation.*

(The argument does not require an explicit intervention variable. An evidential correlation between A and H1 would do just as well as the assumed correlation between I and H1.)

Stern's observation puts the nail in the coffin of Meek and Glymour's conjecture that CDT and EDT agree on the validity of Jeffrey's formula for calculating expected utilities, but disagree over whether the relevant acts are understood as interventions or ordinary events. In the present example, conditionalizing on interventions in Jeffrey's formula doesn't yield a recognizably causal decision theory.

As a corollary, we can see that there's an important difference between conditionalizing on *do(A=1)* and *subjunctively supposing* A=1, which Joyce 1999 would write as P( * \ A=1), with a backslash. Joyce 2010 suggests that if P( * \ A=1) is understood in terms of imaging or expected chance then there's a close connection between P( * / do(A=1)) and P( * \ A=1), so that the operation of conditionalizing on *do(A=1)* may actually be understood as subjunctive supposition rather than conditionalizing on an intervention event. But the discussion presupposes that we are certain of the objective probabilities. If we are not, conditionalizing on *do(A=1)* is not at all the same as subjunctively supposing A=1.

To get around the third problem, Stern proposes to use Lewis's K-partition formula for calculating expected utilities, on which Jeffrey's formula is applied locally within each "dependency hypothesis" K and expected utility is the weighted average of the results, weighted by the agent's credence in the relevant dependency hypotheses. In Stern's "interventionist decision theory", the dependency hypotheses are identified with causal models. So expected utility is computed as follows (again, I'm slightly more explicit here than Stern himself):

EU(A) = \sum_K Cr(K) \sum_O Cr(O / do(A) & K) V(O)

(Since causal models are effectively hypotheses about chance, this account is perhaps even closer to Skyrms's version of CDT than to Lewis's.)
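To see the formula at work, here is a schematic sketch with hypothetical Newcomb-style numbers (the payoffs and credences are made up for illustration, and `eu` is my own helper, not Stern's notation): two dependency hypotheses, one on which the opaque box is empty, one on which it contains the million.

```python
def eu(act, cr_K, chance):
    """K-partition expected utility:
    EU(A) = sum over K of Cr(K) * sum over O of Ch_K(O / A) * V(O),
    with dollar amounts standing in for utilities."""
    return sum(cr_K[k] * sum(p * o for o, p in chance[k][act].items())
               for k in cr_K)

# Credence in the two dependency hypotheses (causal models).
cr_K = {"empty": 0.5, "full": 0.5}

# Chance of each payoff given the act, under each hypothesis.
chance = {
    "empty": {"one-box": {0: 1.0},         "two-box": {1_000: 1.0}},
    "full":  {"one-box": {1_000_000: 1.0}, "two-box": {1_001_000: 1.0}},
}

print(eu("one-box", cr_K, chance))  # 500000.0
print(eu("two-box", cr_K, chance))  # 501000.0
```

Two-boxing comes out $1000 ahead, and it does so for any assignment of credences to the two hypotheses, since the credences are fixed independently of the act.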

This gets around the problem because any evidence A may provide for or against a particular causal model becomes irrelevant.

Notice that Stern's proposal is "doubly causal", as it were. First, it replaces Jeffrey's formula by the Lewis-Skyrms formula, in order to factor out spurious correlations between acts and causal hypotheses. Second, it replaces ordinary acts A by interventions, do(A). Do we really need both?

Arguably not. Return to Newcomb's problem. Here the Lewis-Skyrms
approach already recommends two-boxing because it distinguishes
*two* relevant dependency hypotheses. According to the first, the
opaque box is empty and so there's a high chance of getting $0 through
one-boxing; according to the second, the opaque box contains $1M and
so there's a high chance of getting $1M through one-boxing.

Can the interventionist also treat these as two different causal models? Yes. Easily. The two models would have the same causal graph, but different objective probabilities. In one model, it is certain that the predictor predicts one-boxing, in the other it is certain that the predictor predicts two-boxing. This may not fit the frequentist interpretation of probabilities in causal models, but this interpretation spells trouble for interventionist accounts of decision anyway, since (a) the Principal Principle for frequencies is much more problematic than for more substantive chances, and (b) population-level statistics make it even harder to find suitable error terms for intervening (as Papineau 2000 points out). If instead we think of the probabilities more along the lines of objective chance (though it could be statistical mechanical chance), it is quite natural to think that at the time of the decision, the contents of the box are no longer a matter of non-trivial chance.

So there are good reasons for the interventionist to follow Lewis and Skyrms and model Newcomb's problem as involving two relevant causal hypotheses K. And then we get two-boxing as the recommendation even if, conditional on each hypothesis, we conditionalize on B rather than do(B).

This is nice because it also solves the first two problems for the interventionist: the availability and eligibility of interventions. On the revised version of Stern's account, we don't need interventions any more.

Of course, the revised version of Stern's account is basically the decision theory of Lewis and Skyrms. The only difference is that dependency hypotheses are spelled out as causal models.

Upshot: The theory of causal models can indeed be useful for thinking about rational choice, because causal models are natural candidates to play the role of dependency hypotheses in K-partition accounts of expected utility. The supposedly central concept of an intervention, however, is not only problematic in this context, but also redundant. We can do better without it.

Howard Raiffa's "Decision Analysis: Introductory Lectures" has a section in chapter 9, The Art of Implementation, addressing this.

I'm not going to retype it here, but see https://books.google.com/books?id=TDwdAQAAMAAJ&focus=searchwithinvolume&q=%22can+you+do+a+decision+analysis+of%22

The other related resource is Chris Sims's "Why There Are No True Bayesians", which is funny, and is available in full here: http://sims.princeton.edu/yftp/Bayes250/NoRealBayesians.pdf

"That is, Bayesian decision theory pays no attention to costs of computation or to the possibility that we can be uncertain about something just because we don’t know how to perform a calculation in the available time."


But I seem to be alone in thinking that fission is the right paradigm for modeling Sleeping Beauty. A much more popular assumption is that Sleeping Beauty is essentially a problem about "losing track of time": as a result of the potential memory loss, it is claimed, Beauty can't tell upon awakening whether it is Monday or Tuesday, and that's what makes her case special. I don't agree that this adequately sums up Beauty's predicament. Surprisingly, though, I think this way of modeling Sleeping Beauty still supports halfing. (That's surprising because almost all authors who endorse the present interpretation are thirders).

Let's begin with an ordinary case where someone loses track of time.

Noisy Awakening I (incomplete). A loud noise wakes you up at night. You have a vague sense that you've slept for a few hours, but the sensation is equally compatible with it being 2 am or 3 am.

It's important to realize that this is not yet a case in which you're necessarily lost in time. For suppose you knew when you fell asleep that a loud noise was going to wake you up at 2am. Remembering this information, you should be confident upon awakening that it is 2am, despite your unspecific sensation of how much time has passed.

So your new credence in what time it is should be affected by two features. First there is your broadly sensory evidence of how long you've slept, as well as other pieces of "new evidence": your perception that it is still dark, etc. Second, there are your previous beliefs about when you might wake up. Since our topic is losing track of time and not forgetting or irrational priors, we can assume that these earlier beliefs were rational and you have no trouble recalling the reasons on which they were based.

It is not obvious how exactly these two factors determine the new beliefs. But the following special case should be uncontroversial.

(*) If before falling asleep you rationally gave credence x to waking up at t1 and 1-x to waking up at t2, and if upon awakening your new evidence is neutral between t1 and t2, then you should now give credence x to the time being t1 and 1-x to the time being t2.

So let's complete our first scenario.

Noisy Awakening I (complete). A loud noise wakes you up at night. You have a vague sense that you've slept for a few hours, but the sensation is equally compatible with it being 2 am or 3 am. Before going to sleep, you rationally gave credence 1/2 to the noise waking you at 2 am and 1/2 to the noise waking you at 3 am.

What should you believe about the time? Answer: you should be 50% confident that it is 2 am and 50% confident that it is 3 am.

Now consider a simplified variant of the Sleeping Beauty problem in which Beauty is rationally certain that the coin lands tails. Before falling asleep, she then assigns credence 1/2 to waking up on Monday and 1/2 to waking up on Tuesday. Upon awakening, any sensations she may have about how much time has passed are presumably defeated by her knowledge of the setup, so we may as well assume that she has no relevant new evidence at all about the time.

If we model this as a "losing track of time" scenario, it is obviously
analogous to *Noisy Awakening*.

The real Sleeping Beauty problem is a little more complicated because there is also the possibility of heads. Let's make the corresponding adjustments to Noisy Awakening.

Noisy Awakening II. Before going to sleep on Sunday evening, you were given the following information. A fair coin will be tossed twice. If it comes up heads at least once, a loud noise will wake you up at 2 am, otherwise (if the coin lands tails twice) the noise will wake you up at 3 am. You fall asleep and are awakened by a loud noise. You have a vague sense that you've slept for a few hours, but the sensation is equally compatible with it being 2 am or 3 am.

What should you believe in *Noisy Awakening II* when you are
woken up by the noise?

Fortunately, we don't need any new principles. For the upshot of the information about the coin tosses is that on Sunday you give credence 3/4 to waking at 2am and 1/4 to waking at 3am. By (*), absent relevant new evidence, you should give credence 3/4 to the hypothesis that it is 2am.

If Sleeping Beauty is a problem about losing track of time, we should give the parallel answer: Beauty should give credence 3/4 to the hypothesis that it is Monday. That's what halfers say. They hold that Beauty's credence should be divided 1/2 - 1/4 - 1/4 between the possibilities Heads & Monday - Tails & Monday - Tails & Tuesday; thirding says it should be divided 1/3 - 1/3 - 1/3.
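The arithmetic behind (*) applied to Noisy Awakening II, and the halfer division it suggests for Sleeping Beauty, can be checked with exact fractions; a small sketch:

```python
from fractions import Fraction

half = Fraction(1, 2)

# Noisy Awakening II: two tosses of a fair coin;
# at least one heads means you are woken at 2 am.
p_2am = 1 - half * half  # 3/4
p_3am = half * half      # 1/4

# Halfer division for Sleeping Beauty, by analogy via (*).
cred = {
    "Heads & Monday":  half,         # heads: a single Monday awakening
    "Tails & Monday":  half * half,  # tails, and it's Monday
    "Tails & Tuesday": half * half,  # tails, and it's Tuesday
}
p_monday = cred["Heads & Monday"] + cred["Tails & Monday"]

print(p_2am)     # 3/4
print(p_monday)  # 3/4
```

So the 3/4 credence in "it is Monday" falls straight out of the halfer's 1/2 - 1/4 - 1/4 division, matching the 2 am credence in the noisy-awakening analogue.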

Just to be clear: this is an argument by analogy. We can't directly
apply (*) to Sleeping Beauty, precisely because Sleeping Beauty is not
a straightforward case in which someone is otherwise perfectly
rational but loses track of time. My point is that *if* we model
Sleeping Beauty as nonetheless analogous to such cases (as I think we
shouldn't: we should rather model it as a case of epistemic fission),
we still get an argument for halfing.

(Sleeping Beauty as a case of fission is interesting because it
arguably reveals the tension between evidentialism and
conservatism. Cases of merely losing track of time don't: in *Noisy
Awakening II*, your evidence upon awakening plausibly supports the
2 am hypothesis to degree 3/4.)

The other side -- popular mainly in theoretical philosophy -- puts no such restrictions on the individuation of outcomes. In the simplest version of this view (Jeffrey's), an outcome is a conjunction of an act and a state. In any case, on this view the specification of outcomes must not omit anything the agent cares about, and we don't assume agents only care about a fixed set of local features such as monetary payoff.

When people criticize decision theory, the target is almost always localist decision theory.

Other well-known examples are putative cases of intransitive preferences (as discussed e.g. in Broome 1991: 100f.) or cases where agents care about fairness or risk: Weirich 1986 nicely explains how e.g. Allais's and Ellsberg's counterexamples to decision theory are really only counterexamples to localist decision theory.

Personally, I find some of the counterexamples to localist decision theory completely convincing. Actual people don't just care about local features of outcomes, and that does not make them irrational.

The puzzle is why anyone ever thought localism is a good idea, and why so many theorists still hold on to it.

As I mentioned in the beginning, there is sadly little discussion about this. Anti-localists like Weirich or Broome or Joyce have rightly pointed out that localism is unmotivated and problematic, but there has been little response from localists.

Let's begin with the problem of normative force. Here it matters where we want to see that force applied. An obvious location is at the interface between beliefs, desires, and choices: decision theory says that if you have such-and-such beliefs, such-and-such desires, and such-and-such options, you should choose option so-and-so. Here we assume that the beliefs and desires are simply given -- perhaps treated as theoretical primitives as e.g. Eriksson and Hajek 2007 have argued for the case of belief. This normative application of decision theory clearly doesn't depend on localism.

Alternatively, we might want to say that decision theory puts normative constraints directly on one's preferences, without linking them to beliefs and desires. Again, non-localist decision theory does endorse such constraints -- for example in the form of the Bolker-Jeffrey axioms.

The only loss in normative force comes when we seek direct constraints on choice behaviour alone. Localist decision theory prohibits behaviour that manifests risk aversion or loss aversion, non-localist decision theory does not. Without bringing in beliefs, desires, or preferences, non-localist decision theory plausibly allows for any choice behaviour whatsoever. But is that a problem?

If we follow Hume and hold that rationality puts no constraints on one's basic desires, I'd say it clearly isn't. If someone has a basic desire to choose option A in decision problem 1 and option B in decision problem 2, then surely the right thing to do by the lights of their beliefs and desires is to choose A in problem 1 and B in problem 2.

But we don't need to go all the way with Hume. I have some sympathy for the idea that rationality puts constraints on one's basic desires. But not the constraints imposed by localism. Localism in effect assumes that all basic desires -- at least to the extent that they are represented by the agent's utility function -- pertain to local features of outcomes. It rules out rational agents with a basic desire for risk, for fairness, for integrity, for doing what one would like everyone to do, for living a life that fits a certain pattern. We may want a special label for such desires, but "irrational" is really not a useful label. (I'll use "non-local" instead.)

The point is that giving up localism does not mean giving up all constraints on the features of outcomes that may be tracked by an agent's utility function. It merely means allowing utilities to track some non-local features.

In fact, it seems to me that non-localist decision theory provides
a much better framework to think about the rationality of desires and
choices. For example, consider the familiar observation that it's good
to be risk-neutral if you're playing poker or trading stocks: here a
propensity either for or against risky choices is likely to deliver
worse results. *In itself*, I don't think this makes it
irrational to be, say, risk-seeking at a particular point in a
particular game of poker. After all, you may well play for the thrill
and not care about money. There's nothing intrinsically irrational
about that. However, many people who actually play poker or trade
stocks care a lot about long-run monetary payoff. They want to be the
kind of person who earns a lot of money in the long run. That's a
legitimate desire. For those agents, risk-seekingness in individual
choices is irrational insofar as it clashes with the "higher level"
desire about the person they want to be. The irrationality lies in the
tension between different desires. Maybe that's not the most
illuminating story, but it certainly strikes me as a lot better than
the localist story which ignores *both* the desire for long-run
performance *and* the desire for risk in a particular choice,
since neither is local.

In sum, on two of the three normative applications of decision theory, localist and non-localist theories are on a par. On the third application, non-localist theories are preferable since the normative claim of localist decision theory is not only incompatible with the Humean doctrine of desire, but should also be rejected by those who want to put substantive constraints on rational desire.

Let's turn to Buchak's second worry: that non-localist decision theory is unsuitable as an "interpretative" theory for defining beliefs and desires from an agent's choice disposition. The reason, which I'm happy to grant, is that without prior constraints on utility one can generally find many radically different beliefs and desires that would fit the agent's choices.

In response, the first thing to be said is that I don't know anybody who seriously believes that one can define beliefs and desires in terms of choice dispositions alone. Even philosophers who (unlike e.g. Eriksson and Hajek) like functionalist theories of mind generally accept that the relevant functional role of beliefs and desires is not exhausted by the production of behaviour. In particular, there's also an important link between attitudes and perceptual input. (See e.g. Lewis 1974 or Lewis 1983.) So we really shouldn't expect beliefs and desires to be determined by choice dispositions, and it's not a problem for non-localist decision theory if that becomes impossible.

Secondly, let's grant for the sake of the argument what nobody believes: that beliefs and desires can be defined by choice dispositions alone. It's true that this won't work if we don't impose prior constraints on beliefs and desires. But it is still a long way from this observation to an argument for localism. Why not impose other constraints on basic desires? Intuitively, many desires that are allowed by localism are quite bizarre -- for example, a basic desire to either own 48526 concert tickets or an amount of dollars that is divisible by 217. Why not rule out desires like this and allow for some intuitively sensible non-local desires such as a desire to maintain one's integrity or a desire for risk?

So like Buchak's first worry, the second worry also hardly provides a reason for siding with localism.

All that said, I can see a certain appeal in localism -- especially in the extreme versions on which utility is a simple function of money or happiness. This turns decision theory ("Expected Utility Theory") into an elegantly simple model with great predictive power. The model may even have a rough fit with real people's behaviour under certain conditions. If we want a tractable model of those situations, the simple model will be much better than a more adequate model which allows for the whole range of what real people really value. But of course the simple model will often reach its descriptive limits, and it has zero normative force. For other contexts, slightly more complex models that take into account a few other things people may desire may be better. For example, I can envisage Buchak's "Risk-Weighted Expected Utility Theory" being useful for modelling certain contexts where attitudes towards risk play an important role but we still want to bracket other non-local desires. (I find the reformulation of Buchak's theory in Pettigrew 2015 more perspicuous though, because it correctly identifies a desire for risk as a desire, not as a third kind of attitude besides belief and desire.) But again this model will have descriptive limits, and has little normative force.

For no context, however, do I think it makes sense to draw the line around local features of outcomes.

But yes, that might be a partial way out: say that unlike fragility, fundamental dispositions must be individuated in such a way that they are not present at any time unless their manifestation is present later. That doesn't strike me as a plausible claim about properties like mass or charge though. It seems perfectly possible (or at least coherent) to assume that a massive object disappears. Moreover, this move still doesn't seem to rule out entirely new objects appearing out of nowhere.


The problem is not new. The underlying point is expressed nicely by Simon Blackburn in "Hume and Thick Connexions" (1990), where it is attributed to Hume:

[S]uppose we grant ourselves the right to think in terms of a thick connexion between one event and another: a power or force whereby an event of the first kind brings about events of the second. Nevertheless there is no contradiction in supposing that the powers and forces with which events are endowed at one time cease at another, nor in supposing that any secret nature of bodies upon which those powers and forces depend itself changes, bringing their change in its wake. Hume emphasises this point in both the Enquiry and the Treatise. (p.242)

A little later:

Equally if the 'nature of matter' is to help, it must also be so that the continuation of matter is not just one more contingency. (p. 244)

Let's make explicit how this applies to dynamical laws. Such laws narrow down the possible futures of physical systems, ideally (if they are deterministic) to a single possibility. Let's say we have a law according to which state S1 at time t1 will lead to state S2 at time t2. And suppose the world at t1 is full of non-Humean whatnots: relations of necessitation, fundamental dispositions, Aristotelian forces, or whatever. It seems perfectly conceivable that all these things either change or simply disappear between t1 and t2. So the presence of the non-Humean whatnots is compatible with many different futures. And so these whatnots cannot ground the dynamical law which reduces the number of possible futures to one.

If this argument is sound, it not only means that the non-Humean accounts in question fail to capture the most important laws of science, but also that they fall prey to what is often claimed to be the main disadvantage of Humean accounts: to offer no explanation for the regularity of the world. If the presence of non-Humean whatnots at any point of time allows for many radically different futures, the dynamical regularity of the world can't be explained by the presence of these whatnots.

We need to have a closer look at the central premise in the above argument, that the presence of the non-Humean whatnots at a given time does not fix what is going to happen later.

Arguably, the premise is more plausible for some whatnots than for
others. Philosophers who like primitive laws might object that these
laws are not things that are "present" at a particular time and
possibly absent or altered afterwards. Perhaps laws do their governing
from outside time. So this view *may* be safe from the problem,
although I would feel a little uneasy to stipulate that it's
absolutely impossible for the basic laws of nature to change.

The problem is more acute for necessitarianism à la Dretske-Tooley-Armstrong and for the recently fashionable movement of dispositionalism.

On the Dretske-Tooley-Armstrong view, one might insist that the second-order necessitation relation between universals is atemporal and so couldn't possibly change. (Again, this would put an absolute ban on changing laws.) But the supposed atemporality of necessitation is arguably at odds with the contingency of this relation (as Helen Beebee points out in "Necessary Connections and the Problem of Induction" (2011), pp.511f.). Worse, it is hard to see how atemporal necessitation relations between universals could rule out the possibility of alien "invaders" (from Handfield 2001): new objects with new properties that suddenly appear out of nowhere. But deterministic dynamical laws rule out that possibility.

To get around the problem, dispositionalists would have to postulate a special kind of fundamental essential property that somehow guarantees a particular future evolution. Most simply, they could say that the present state of the world is essentially such that its future will be so-and-so. Alexander Bird mentions something like this in "The dispositionalist conception of laws" (2005) as a way of accounting for Principles of Least Action:

It might be that the intrinsic properties of the initial state make only one evolution possible, thanks to the dispositional essences of those properties. (p.367)

But notice, first, that ordinary intrinsic properties present in the initial state -- the distribution of mass, charge, spin, etc. -- do not suffice. (As we saw.) We would have to postulate new fundamental properties that look nothing like mass or charge or spin. The property I suggested, of being essentially such that the future will be so-and-so, isn't even dispositional in any meaningful sense.

Second, postulating such properties is intuitively implausible and ad hoc. Dispositionalism draws much of its popularity from the intuitive appeal of saying that a property couldn't be mass if it didn't behave like mass. But now it turns out that what does most of the metaphysical work (in grounding dynamical laws and ensuring the regularity of the world) is an entirely different claim that has no comparable pull: that the present physical state of the world couldn't exist unless the future is so-and-so.

Third, even if we grant that certain facts can be explained by appeal to the essence of the objects that are involved, explanations of this kind aren't automatic. For example, suppose we ask why Kripke became a philosopher. (Another dynamical question.) Suppose somebody answers that Kripke is essentially a philosopher. That is, Kripke is essentially such that he would eventually become a philosopher. Set aside whether this essentialist hypothesis is plausible in itself. (From a Lewisian perspective on essence, it is fine, in a suitable context.) Is it a good answer to my question? I'd say not. The answer doesn't shed any real light on how or why Kripke became a philosopher. Similarly, I think, for the supposed fact that the present state of the world is essentially such that the future is so-and-so: this doesn't really explain why the world evolves the way it does.

Finally (but relatedly), the whole idea of trying to get around the problem by postulating newfangled essential properties assumes that the problem is one about "metaphysical" modality: that we're merely looking for something that rules out metaphysically possible alternative futures. But it's not at all clear that our problem is a problem about metaphysical possibility. If one holds that the actual world is the only metaphysically possible world, would that put to rest the question why the world is regular, or what its dynamical laws are? I don't think so. Our dynamical laws don't just rule out metaphysically possible futures, they also rule out merely epistemically possible futures.

So the problem looks serious. I'm surprised that it is so little discussed.

Informally, the argument goes as follows. Suppose an agent faces a
choice between a number of *straight* options (going left, going
right, taking an umbrella, etc.), as well as the option of calculating
the expected utility of all straight options and then executing
whichever straight option was found to have greatest expected
utility. Now this option (whichever it is) could also be taken
directly. And if calculations are costly, taking the option directly
has greater expected utility than taking it as a result of the
calculation.

Let's fill in the details by working through an example, "without loss of generality".

There are two straight options, Left and Right. In addition, there's the option Calculate which eventually leads to either Left or Right, depending on which of these is found to maximize expected utility. (If they are tied, let's stipulate that Calculate leads to Left.) In a non-trivial decision problem, the outcome of choosing Left and of choosing Right depends on the state of the world. We assume there are two relevant states, Bright and Dark. The expected utility of the straight options is given by (1) and (2). Here I've abbreviated Bright and Dark as 'B' and 'D', respectively; 'BL' denotes the outcome that results from going Left in a Bright world (similarly for 'DL', 'BR', and 'DR'). It will be useful to think of outcomes as collections of features the agent cares about.

(1) EU(Left) = P(B) U(BL) + P(D) U(DL).

(2) EU(Right) = P(B) U(BR) + P(D) U(DR).
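To make (1) and (2) concrete, here is a minimal numeric sketch in Python. All the probabilities and utilities are made-up illustrative values; nothing in the argument depends on these particular numbers.

```python
# Illustrative (assumed) probabilities for the two states and
# utilities for the four possible outcomes.
P = {"B": 0.6, "D": 0.4}                      # P(Bright), P(Dark)
U = {"BL": 10, "DL": 2, "BR": 4, "DR": 8}     # U(outcome)

# (1) EU(Left)  = P(B) U(BL) + P(D) U(DL)
EU_Left = P["B"] * U["BL"] + P["D"] * U["DL"]

# (2) EU(Right) = P(B) U(BR) + P(D) U(DR)
EU_Right = P["B"] * U["BR"] + P["D"] * U["DR"]

print(EU_Left)   # 6.8
print(EU_Right)  # 5.6
```

With these numbers, Left is the straight option with greater expected utility.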

What if Left is taken as a result of Calculate? In principle, completely different outcomes could come about. For example, there might be a state of the world in which the agent gets richly rewarded if she goes Left as a result of Calculating, but punished if she goes Left without Calculating. But clearly that's not the kind of case we're interested in, and it's not a case where calculating expected utilities is (known to be) costly. We're interested in cases where the outcomes that may result from going Left as a result of Calculate coincide with those that may result from going Left directly except for one feature in which they are worse, reflecting the cost of calculation.

So Calculate can lead to four possible outcomes which coincide with the four outcomes in (1) and (2) except for one respect in which they are worse. I'll therefore abbreviate these outcomes as 'BL-', 'DL-', 'BR-', and 'DR-', respectively. Thus BL- is the outcome of going Left in a Bright world as a result of Calculating, which is somewhat worse than BL (going Left in a Bright world without Calculating).

Let's keep track of the present assumption about the intrinsic cost of Calculating:

(3) U(BL-) < U(BL); U(DL-) < U(DL); U(BR-) < U(BR); U(DR-) < U(DR).

We have four possible outcomes because the result of Calculate depends not only on whether the world is Bright or Dark (B or D), but also on whether Calculate leads to Left or to Right (CL or CR). So we need to extend our state space from { B, D } to the product of { B, D } with { CL, CR }. Then:

(4) EU(Calculate) = P(B & CL)U(BL-) + P(D & CL)U(DL-) + P(B & CR)U(BR-) + P(D & CR)U(DR-).

Now we need another assumption, namely that the immediate result of Calculate (going Left or going Right) is probabilistically independent of whether the world is Bright or Dark:

(5) P(B & CL) = P(B)P(CL); P(B & CR) = P(B)P(CR); P(D & CL) = P(D)P(CL); P(D & CR) = P(D)P(CR).

This may not always be the case, but the exceptions seem highly unusual. After all, our agent knows that the immediate result of Calculate is not sensitive to the external state of the world: it is fixed by her own probabilities and utilities.

Using (5), we can rearrange (4) as (6).

(6) EU(Calculate) = P(CL)[P(B)U(BL-) + P(D)U(DL-)] + P(CR)[P(B)U(BR-) + P(D)U(DR-)].

So EU(Calculate) is a mixture of EU(Left) and EU(Right), except that each term is made worse by the cost of calculation, as per (3). As a result, EU(Calculate) is always less than either EU(Left) or EU(Right) or both. So -- with the possible exceptions of cases where assumption (5) fails -- Calculate is never a rational option. QED.
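The numeric sketch can be extended to check this conclusion. Again, the calculation cost c and all other values are illustrative assumptions. Since the agent's probabilities and utilities fix which option Calculate selects, P(CL) is 0 or 1 in this sketch; the dominance argument goes through for any mixture.

```python
# Illustrative (assumed) values, as before.
P = {"B": 0.6, "D": 0.4}
U = {"BL": 10, "DL": 2, "BR": 4, "DR": 8}
c = 0.5                                       # assumed cost of calculating
U_minus = {k: v - c for k, v in U.items()}    # U(BL-), U(DL-), etc., per (3)

EU_Left = P["B"] * U["BL"] + P["D"] * U["DL"]     # (1)
EU_Right = P["B"] * U["BR"] + P["D"] * U["DR"]    # (2)

# Calculate leads to whichever straight option maximizes EU (Left on ties),
# so P(CL) is 1 if EU(Left) >= EU(Right), else 0.
p_CL = 1.0 if EU_Left >= EU_Right else 0.0
p_CR = 1.0 - p_CL

# (6) EU(Calculate) as a mixture of the two discounted expectations.
EU_Calc = (p_CL * (P["B"] * U_minus["BL"] + P["D"] * U_minus["DL"])
           + p_CR * (P["B"] * U_minus["BR"] + P["D"] * U_minus["DR"]))

print(EU_Calc)                            # 6.3
print(EU_Calc < max(EU_Left, EU_Right))   # True
```

Here EU(Calculate) = 6.3 falls short of EU(Left) = 6.8 by exactly the cost of calculation, as the argument predicts.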

What shall we make of this strange result? Here are two lines of response, not necessarily exclusive.

First, perhaps it's wrong to model the agent's options as Left,
Right, and Calculate. Instead, we should distinguish between genuine
*act options*, Left and Right, and *process options* such as
Calculate. Calculate is a process option because it's a possible way
of reaching a decision between the act options. Alternative process
options are, for example: trusting one's instincts, or calculating
which option has the best worst-case outcome and then going ahead with
that option. Arguably you can't go Left without choosing any process
option at all. You have to either follow your instinct, calculate
expected utility, or use some other process. So it's wrong to compare
Calculate with Left and Right. We should rather compare Calculate with
other process options like trusting your instinct. Doing that, we'd
probably get the intuitive result that it's sometimes rational to
calculate expected utilities (to varying levels of precision), and
sometimes to trust one's instincts.

The main problem with this line of response (I think) is that it's far from clear that one can't choose Left without first choosing a process for choosing between Left and Right. For how does one choose a process? By first choosing a process for choosing a process? The regress this starts is clearly absurd: when we make a decision, we don't go through an infinite sequence of choosing processes for choosing processes etc. And if the regress can stop at one level, why can't it also stop at the level before? Why can't one simply choose Left, without choosing any process for choosing between Left and Right?

That's not just a theoretical worry. When you come to a fork in the road, it really seems that you can do three things (among others): go left, go right, or sit down and calculate the expected utilities. Each of these is a genuine option. Of course, whatever you end up doing, there will be a psychological explanation of why you did it. Perhaps you did it out of habit, or out of instinct, or as the result of some further computation. But that's equally true for all three options. So I'm not convinced by the first line of response, although I'm also not convinced it can't be rescued.

Here's the second line of response. Calculating expected utilities is a form of a priori (mathematical) reasoning, and there's a well-known problem of making sense of such reasoning in the standard model of Bayesian agents.

More concretely, consider what the agent in the above example should believe about CL and CR. If she knows her own probabilities and utilities, and she knows (as we can assume) that Calculate would lead to choosing an option with greatest expected utility (or to Left in case of ties), then she must also know either that Calculate would lead to Left or that Calculate would lead to Right, for this follows from what she knows and probability 1 is closed under logical consequence. And of course you shouldn't sit down and go through a costly calculation if you already know the result! From a strict Bayesian perspective, a priori reasoning is always a waste of time because the result is always already known.

When we think about whether an agent should calculate expected utilities, the agent we have in mind does not already know the answer. That seems to leave two possibilities: either the agent does not know her own probabilities and utilities, or she is not probabilistically coherent. But if the agent doesn't know her probabilities and utilities, it is unclear how calculating expected utilities is supposed to help. Moreover, intuitively the kind of agent we have in mind need not be uncertain about her own beliefs and basic desires. So it would seem that she must be probabilistically incoherent. But if we're dealing with incoherent agents, it's no longer clear that expected utility maximization is the right standard of choice. We can't assume that the agent should calculate expected utilities iff doing so would maximize expected utility.

The general point is that when we think about whether it's rational to calculate expected utilities, we have implicitly left behind the domain of perfect Bayesian rationality and turned to bounded rationality. Contrary to widespread thought, perfect Bayesian agents don't always calculate expected utility. They never calculate anything, because they already know the result. Before we can say what agents with bounded rationality should do -- including whether and when they should calculate expected utilities -- we need a good model of such agents.
