wo's weblog

If then else

Sat, 26 Apr 2025 15:53:38 +0000

Bare indicative conditionals are bewildering, but they become surprisingly well-behaved if we add an 'else' clause.

Intuitively, 'if A then B' doesn't make an outright claim about the world. It says that B is the case if A is the case – but what if A isn't the case?

An 'else' clause resolves this question. 'If A then B else C' makes an outright claim. It says that either B or C is the case, depending on whether A is the case. That is: the world is either an A-world, in which case it is also a B-world, or it is a ¬A-world, in which case it is a C-world. For short: (A∧B)∨(¬A∧C).

An example. Someone has drawn a card from a shuffled deck (with 26 black cards and 26 red cards). I say:

(1) If the card is red, it's a six, else it's a seven.

This says that the card is either a red six or a black seven.

So 'if .. then .. else ..' is truth-functional.

Probability judgements seem to confirm this. Intuitively, the probability of (1) is the probability of drawing either a red six or a black seven, which is 1/13.
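
As a quick check on the arithmetic, here is a minimal Python sketch that enumerates a standard 52-card deck and counts the cards that make (1) true on the truth-functional reading. The rank and suit labels and the helper names `if_then_else`, `sentence_1` and `probability` are illustrative assumptions, not part of the original example.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}

# Every card is a (rank, colour) pair; 13 ranks x 4 suits = 52 cards.
DECK = [(rank, colour) for rank, colour in product(RANKS, SUITS.values())]


def if_then_else(a, b, c):
    """Truth-functional reading: (A and B) or (not A and C)."""
    return (a and b) or ((not a) and c)


def sentence_1(card):
    """(1): if the card is red, it's a six, else it's a seven."""
    rank, colour = card
    return if_then_else(colour == "red", rank == "6", rank == "7")


def probability(event):
    """Probability of an event under a uniform draw from the deck."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))


# Two red sixes and two black sevens qualify, out of 52 cards.
print(probability(sentence_1))
```

The count is 4 out of 52, which reduces to the 1/13 mentioned above.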

But isn't 'else' just an abbreviation of 'and otherwise'? Isn't 'if A then B else C' equivalent to 'if A then B and if ¬A then C'?

It surely seems so. (1) and (2) seem interchangeable:

(2) If the card is red, it's a six, and if it's not red, it's a seven.

We get the same judgements about truth-conditions and probabilities.

This is good news for the material analysis of conditionals: the conjunction of the material conditionals A→B and ¬A→C is indeed equivalent to (A∧B)∨(¬A∧C).
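
The equivalence can be verified by brute force over the eight truth-value assignments. The following is a small sketch; the function names are made up for illustration.

```python
from itertools import product


def material(p, q):
    """Material conditional p -> q."""
    return (not p) or q


def conjoined_conditionals(a, b, c):
    """(A -> B) and (not-A -> C)."""
    return material(a, b) and material(not a, c)


def if_then_else(a, b, c):
    """(A and B) or (not-A and C)."""
    return (a and b) or ((not a) and c)


# Compare the two formulas on every assignment of truth values to A, B, C.
rows = list(product([True, False], repeat=3))
equivalent = all(
    conjoined_conditionals(a, b, c) == if_then_else(a, b, c) for a, b, c in rows
)
print(equivalent)
```

The two formulas agree on all eight rows.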

But what should we say if we don't think that ordinary conditionals are material conditionals?

It's hard to see, for example, how conjoining two conditionals with a selection semantics, as in Stalnaker (1968), could turn them into material conditionals.

Or suppose we go with an expressivist semantics in the tradition of Adams (1975) and Edgington (1995), on which conditionals don't express propositions at all: why do we get a proposition if we conjoin two non-propositions with 'and'?

It doesn't help much to tinker with 'and'. We can, for example, consider a discourse with two separate conditionals:

(3) If the card is red, it's a six. If the card is black, it's a seven.

Intuitively, (3) says the same as (1) and (2). The probability that both sentences in (3) are true is 1/13.

Another puzzle:

How do we assess the probability of 'if then else' conditionals (or of conjunctions or sequences or sets of conditionals)?

There's a standard answer to how we evaluate the probability of 'if A then B' (without 'else'): we evaluate the probability of B conditional on A. We evidently can't use this "Ramsey test" method for evaluating 'if A then B else C'.

We could apply the Ramsey test in stages: first evaluate B conditional on A, then evaluate C conditional on ¬A. But then what? Multiply the results?

In any case, this can't be what we're doing. Take the card example again. The probability of a six given red is 1/13 (26 red cards, two of them sixes); the probability of a seven given non-red is also 1/13; how do we get from there to an overall probability of 1/13 for the conjunction?

More dramatically, suppose we assess the probability of (4).

(4) If the card is red, it's a six. If it's not red, it's not a six.

Using the Ramsey test, the probability that the first statement is true is 1/13. But together, the two statements say that the card is either a red six or a non-red non-six. This has probability 1/2. So the probability that the report is true increases from 1/13 to 1/2 when we add the second sentence! Adding information can never increase the probability of truth. Does the second sentence reduce the information in the report?
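
The numbers in the last few paragraphs can be reproduced with a short Python sketch. It computes the Ramsey-test values (conditional probabilities) and the probabilities of the truth-functional readings for (1) and (4). The deck encoding and all helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
COLOURS = ["red", "red", "black", "black"]  # one entry per suit
DECK = list(product(RANKS, COLOURS))  # 52 (rank, colour) pairs


def prob(event, given=lambda card: True):
    """Probability of `event` conditional on `given`, for a uniform draw."""
    pool = [card for card in DECK if given(card)]
    return Fraction(sum(1 for card in pool if event(card)), len(pool))


def red(card):
    return card[1] == "red"


def six(card):
    return card[0] == "6"


def seven(card):
    return card[0] == "7"


# Sentence (1): Ramsey-test values for the two conditionals, and their product.
p_six_given_red = prob(six, given=red)
p_seven_given_not_red = prob(seven, given=lambda c: not red(c))
staged_product = p_six_given_red * p_seven_given_not_red

# Sentence (1) on the truth-functional reading: red six or black seven.
joint_1 = prob(lambda c: (red(c) and six(c)) or (not red(c) and seven(c)))

# Sentence (4) on the truth-functional reading: red six or non-red non-six.
joint_4 = prob(lambda c: (red(c) and six(c)) or (not red(c) and not six(c)))

print(p_six_given_red, p_seven_given_not_red, staged_product)
print(joint_1, joint_4)
```

Multiplying the two Ramsey values for (1) gives 1/169, not 1/13. For (4), the first conditional alone gets 1/13, while the pair gets 1/2.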

PS: There seem to be cases where 'if .. then .. else ..' isn't truth-functional:

(5) If you mow the lawn you get £1000, else you get nothing.

Suppose you don't mow the lawn and you get nothing. Does this make (5) true? Is (5) highly probable as long as it's highly probable that you don't mow and get nothing? Maybe not. (5) seems to have a reading on which it entails that you would have gotten £1000 if you had mowed the lawn.

I suspect the difference is that this reading of (5) isn't a purely "epistemic" conditional, in contrast to (1). But I'm not sure what it is instead.

Adams, Ernest W. 1975. The Logic of Conditionals.

Edgington, Dorothy. 1995. “On Conditionals.” Mind 104 (414): 235–329. https://doi.org/10.1093/mind/104.414.235.

Stalnaker, Robert. 1968. “A Theory of Conditionals.” In Studies in Logical Theory, edited by N. Rescher, 98–112. Oxford: Blackwell.

A new kind of Neo-Fregeanism?

Fri, 07 Mar 2025 13:35:46 +0000

Frege argued that number concepts are, in the first place, second-order predicates. When we talk about numbers as objects, we use a logical device of "nominalization" that introduces object-level representations of higher-level properties. In Grundgesetze, he assumed that every first-order predicate can be nominalized: for every first-order predicate F, there is an associated object – the "extension" of F – such that F and G are associated with the same object iff ∀x(Fx ↔︎ Gx). The number N is then identified with the extension of 'having an extension with N elements'. Unfortunately, the assumption that every predicate has an extension turned out to be inconsistent, so the whole approach collapsed.

But we can introduce more cautious principles of nominalization. Button and Trueman (2024) show how one can conservatively (and hence consistently) extend a theory in a Fregean typed language, without singular terms for properties, to a theory in an extended language in which all predicates expressible in the original language have a nominalization. I'll give a brief summary of the construction.

Let T be the original theory. For any expression F of a higher type, we introduce an object-type expression nom(F). We also introduce an operation app so that app(nom(F), a) = F(a). Finally, we restrict all quantifiers in T by a new predicate real. To the resulting theory, we add some axioms governing nom and app and real. The most important ones, for present purposes, are these (slightly simplified):

Nom-real. ∀F(real(F) ↔︎ E!nom(F)).

Nom-nonreal. ∀F ¬real(nom(F)).

Nom-inj. ∀F∀G(nom(F) = nom(G) ↔︎ F = G).

B&T prove that the new theory is a conservative extension of the original theory.

B&T assume that statements like app(nom(F), a) are literally false, like metaphors or fictional statements. I prefer a different conception of their machinery. We can regard app(nom(F), a) as a convoluted way of saying F(a), which is true as long as the object a is F. The old language is a perspicuous representation of reality. The new language is a stipulative extension with no extra metaphysical commitments.

Now. Can we recover Frege's logicism with this machinery?

Not quite. We'd like to define N as the nominalization of 'having a nominalization that applies to N things'. But we can't do that: nom isn't defined for 'having a nominalization that applies to N things'.

We could follow Frege's idea in Grundlagen and define zero as the nominalization of 'being non-self-identical': 0 =_df nom(λx.¬(x=x)). We'd then like to define 1 as nom(λx.x=0), but again we can't do that because nom is not defined for λx.x=0.

So we need to change the construction.

We could proceed in stages. At stage 1, we introduce nominalizations for all predicates in the original language. At stage 2, we introduce nominalizations for all predicates in the language of stage 1, and so on.

Since λx.x=0 is a stage-1 predicate, it can be nominalized at stage 2. So the number 1 can be defined at stage 2. In general, each number n is definable at stage n+1.

To allow quantification over all numbers, we need to add a transfinite stage:

stage 0: the old theory in the old language;
stage n+1: add nominalization of all predicates in stage n;
stage ω: take the union of all previous stages.
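
To make the bookkeeping of the finite stages concrete, here is a toy Python sketch. It does not implement B&T's theory or the proposed extension. It only tracks at which stage a term becomes available, on the assumption that a predicate is expressible at the earliest stage that contains every nominalization it mentions. The classes `Pred` and `Nom`, and the choice to define n+1 as the nominalization of 'being identical to one of 0, …, n', are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Nom:
    """Object-level term nom(F) for a predicate F."""
    predicate: "Pred"


@dataclass(frozen=True)
class Pred:
    """A predicate, recorded by a label and the nominalizations it mentions."""
    label: str
    mentions: Tuple[Nom, ...] = ()


def pred_stage(pred: Pred) -> int:
    """Earliest stage whose language can express the predicate (0 = old language)."""
    return max((nom_stage(nom) for nom in pred.mentions), default=0)


def nom_stage(nom: Nom) -> int:
    """Earliest stage at which nom(F) exists: one stage after F is expressible."""
    return pred_stage(nom.predicate) + 1


def numerals(count: int) -> list:
    """Build terms for 0, 1, ..., count-1 (assumed definitions, see lead-in)."""
    result = []
    for n in range(count):
        if n == 0:
            pred = Pred("non-self-identical")
        else:
            pred = Pred(f"identical to one of 0..{n - 1}", tuple(result))
        result.append(Nom(pred))
    return result


# Print each number next to the first stage at which its term is available.
for n, term in enumerate(numerals(4)):
    print(n, nom_stage(term))
```

On these assumptions the term for n first appears at stage n+1, so no finite stage contains terms for all the numbers.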

If we generalize the last clause to all limit ordinals, will we reach a fixed point at which every predicate expressible in the language has a nominalization? I think so, because I think the language remains countable.

In any case, we have to adjust some other parts of the B&T machinery. The Nom-real axiom has to go. And I think we have to use different app and nom predicates for each stage, to avoid paradox.

It might be worth spelling out this construction. It seems to have a few advantages over traditional Neo-Fregeanism (as in Hale and Wright (2001)). In particular, it doesn't rely on the magic of abstraction principles (with the bad company of Axiom V). The introduction of numbers as objects is not an instance of abstraction, but of nominalization, and nominalization is a pervasive feature of our language. (We also don't have a special "Julius Caesar problem": that Julius Caesar is not a number is entailed by Nom-nonreal.)

(Thanks to Rob Trueman for discussion.)

Button, Tim, and Robert Trueman. 2024. “A Fictionalist Theory of Universals.” In Higher-Order Metaphysics, edited by Peter Fritz and Nicholas K. Jones. Oxford University Press. doi.org/10.1093/oso/9780192894885.003.0007.

Hale, Bob, and Crispin Wright. 2001. The Reason’s Proper Study: Essays Towards a Neo-Fregean Philosophy of Mathematics. Oxford: Clarendon Press.

Lewis on Quasi-Realism

Fri, 28 Feb 2025 13:09:58 +0000

In "Quasi-Realism is Fictionalism" (Lewis 2005), Lewis seems to suggest that Blackburn's quasi-realism about moral discourse is a kind of fictionalism. The suggestion is bizarre. Has Lewis made a silly mistake? (Spoiler: No.)

Let's compare what quasi-realism and fictionalism say about moral discourse.

Blackburn's quasi-realism (as presented, e.g., in Blackburn (1984, ch.6) and Blackburn (1993)) is a brand of expressivism. According to Blackburn, moral statements like (1) don't serve to describe special facts, but to express moral attitudes.

(1) Eating people is wrong.

The exact nature of moral attitudes won't matter, except that they are not beliefs.

Fictionalism is harder to pin down. Different authors give different definitions; Lewis gives none. But we get a sense of what he has in mind. According to Lewis, a fictionalist is disposed to utter sentences of a certain type even though she doesn't believe that they are true, understanding them as tacitly "prefixed" or "prefaced" by a disclaimer which cancels the commitment to truth. Lewis cites Joyce (2001) as an example of moral fictionalism. Joyce suggests that we should keep uttering things like (1), but clarify – when pressed in the philosophy seminar – that these utterances are only pretend-assertions, not real assertions: that we only make-believe what we say.

These two views about (1) are obviously different. The quasi-realist does not think that (1) is, strictly speaking, false, but that we may nonetheless utter it with an understanding that we don't really believe what it says. Blackburn (2005) repeats this point at length, in response to Lewis, but the point should have been thoroughly clear from Blackburn's other writings. How could Lewis have missed it?

So far, we've only looked at the title of Lewis's essay ("Quasi-Realism is Fictionalism"). Blackburn's response barely engages with the content. Does Lewis have an argument for his surprising claim?

If we skim the paper for such an argument, we find on p.319 what seems to be the central argument:

Blackburn's quasi-realism is just this kind of moral fictionalism. […] One of Blackburn's avowed aims is to earn the right to say what the 'moral realist' does: that means either being or make-believedly being a realist. Another of his avowed aims is to avoid the realist's errors: that means not being a realist. Taking these aims together, he aims to make-believedly be a moral realist.

As Jenkins (2006) points out, this argument seems to rest on the false assumption that realism and fictionalism are the only options: either (1) expresses belief in a mind-independent moral fact, or it expresses make-believing such a fact. Expressivism denies both.

What a silly mistake!

Well, let's stop skimming and have a closer look at what Lewis actually says. I think it's clear that Lewis isn't talking about statements like (1) when he says that the quasi-realist wants to "say what the 'moral realist' does".

Lewis puts 'moral realism' in scare quotes because he uses it in a technical sense. This is explained on p.315f.:

Let us […] reserve the name 'moral realism' for a moral theory that is committed to [a distinctive error]: that there are properties […] such that we can detect them; and such that when we do detect them, that inevitably evokes in us pro- or con-attitudes towards the things that we have detected to have these properties.

So here is what the 'moral realist' says:

(2) There are properties that we can detect and that inevitably evoke pro- or con-attitudes towards the things that we have detected to have these properties.

When Lewis says that Blackburn's avowed aim is to earn the right to say what the 'moral realist' does, I think he obviously meant things like (2), not things like (1), which are not at all distinctive of 'moral realism'.

As the first paragraph of the paper makes clear, Lewis is interested in a well-known puzzle: if the quasi-realist really echoes everything the realist says, if "he even echoes all the realist says about moral psychology and metaethics" (p.314), how is the position different from realism?

This puzzle does not arise for old-fashioned expressivism (as in Ayer (1936), for example). Old-fashioned expressivism is easy to tell apart from realism by the fact that it declares, for example, (3) and (4) to be false.

(3) It is true that eating people is wrong.

(4) It is a fact that eating people is wrong.

The aim of Blackburn's quasi-realist program is to extend the expressivist semantics so as to vindicate our apparently realist moral discourse, including statements like (3) and (4). Lewis's paper begins with the supposition that this program succeeds: that the expressivist semantics for (1) can be extended to (3) and (4) and beyond, up to the point where the quasi-realist has "earned the right to echo everything the moral realist says" (p.314).

One might wonder whether Blackburn really wants to vindicate statements like (2). In his response to Lewis, Blackburn distinguishes between our practice itself and speculative philosophical theorizing about our practice. (2) looks like a piece of philosophical theory, rather than something that's integral to our ordinary practice. The aim of quasi-realism, he explains (on pp.331f. of Blackburn (2005)), is not to vindicate erroneous philosophical theories.

But the example Blackburn gives of an erroneous philosophical theory is not like (2). And there is certainly pressure towards vindicating (2). Blackburn explicitly does want to vindicate statements like (5) and (6) and (7) and (8).

(5) Eating people has the property of being wrong.

(6) Whether something is wrong is independent of us and our attitudes.

(7) We can recognize whether an act is wrong.

(8) If someone recognizes that an act is wrong, they inevitably have a con-attitude towards it.

Isn't (2) just a logical consequence of statements like these? Aren't we allowed to draw the inference?

In any case, if 'moral realism' subscribes to (2) and quasi-realism does not then there is no puzzle. We could easily tell apart the two views by what they say about (2). The premise of Lewis's paper is that realism and quasi-realism can't be told apart in such a simple way.

By hypothesis, then, the quasi-realist is prepared to accept and utter (2). But doesn't the quasi-realist also want to deny (2)? To make the point even more obvious, consider (9).

(9) Hume's projectivist account of morality is deeply mistaken.

It is only a short step from (5) and (6) and (2) and uncontroversial facts about Hume, to (9). So 'moral realism' endorses (9), and quasi-realism – ex hypothesi – follows along. But wouldn't Blackburn want to deny (9)? At the level of belief: doesn't Blackburn believe that Hume's projectivist account is essentially right?

Again, one might respond that quasi-realism was never meant to be that far-reaching: the quasi-realist only wants to echo harmless statements like (3) and (4) and (5), not things like (2) and (9). And again, it will be hard to draw the line, and Lewis simply sets this possibility aside, because it doesn't raise an interesting puzzle. Let's set it aside as well.

Another thing the quasi-realist could do is accept (2) and (9) as true, and leave it at that. Projectivism, she might say, was the ladder on which we've climbed to quasi-realism, but once we have climbed the ladder, we have to kick it away. We have to disavow projectivism, and presumably expressivism, and naturalism.

This looks deeply unappealing. Is there any other way out?

There is: fictionalism!

The quasi-realist might utter (2) and (9), in a suitable context. But she might also clarify, when pressed in the philosophy classroom, that she doesn't genuinely believe (2) and (9).

That's why quasi-realism is fictionalism. The argument goes like this.

'Moral realism' is committed to statements like (2).
If quasi-realism succeeds, it licenses uttering these statements.
But the quasi-realist doesn't believe things like (2).
So the quasi-realist only make-believes things like (2).

This is not unlike the argument quoted above. And it's not a stupid argument.

The crucial point is that (2), unlike (1), is a suitable object of genuine belief, even by the quasi-realist's lights. Quasi-realism is motivated, to a large extent, by the belief that (2) is false. If the quasi-realist's utterance of (2) were to express genuine belief, she would believe that (2) is both true and false.

(It wouldn't help much to declare, implausibly, that (2) expresses a moral attitude. The quasi-realist who doesn't want to kick away the ladder still wants to reject (2) when explaining and motivating her position.)

Here is Lewis's concluding comment, coming right after the argument:

[Quasi-realism] earns the right to agree with all the moral realist says in just the same way explicit fictionalism does, whether or not it goes on to earn that right twice over by offering its special semantics. (pp.319f.)

Lewis here acknowledges that the quasi-realist may already have a way to agree with the realist by means of their special (expressivist) semantics, and that this way is distinct from the fictionalist way. (Hence 'twice over'.) So Lewis clearly didn't think, as Blackburn and Jenkins and other commentators assume, that the quasi-realist interpretation of moral discourse amounts to a fictionalist interpretation. (Jenkins at least raises this as a puzzle, at the end of her paper.)

How does the quasi-realist's special semantics explain why it's OK to utter (2)? I don't know. To my knowledge, Blackburn has nowhere offered a sufficiently detailed semantics (or semantics+pragmatics) that would cover statements like (2).

In fact, one might suspect that the quasi-realist's special semantics only covers statements which express moral attitudes. Since (2) doesn't express a moral attitude, it would follow that the special semantics doesn't license uttering (2). So it wouldn't license echoing everything the realist says. Quasi-realism would have failed on its own terms, as Lewis understands them.

This means that quasi-realism may require fictionalism not only to remain consistent, but also to fulfil its ambition of echoing realism. The special semantics does that job for moral statements, but perhaps not for statements like (2). Here, fictionalism may be needed to fill the gap.

I think that's why Lewis says "whether or not it goes on to earn that right twice over by offering its special semantics", suggesting that it's an open question whether the special semantics alone is enough.

So. Is the quasi-realist account of first-order moral discourse a fictionalist account of that discourse? Of course not. But if the quasi-realist wants to echo more than the realist's first-order discourse, if she also wants to echo more theoretical statements that may seem to follow from our first-order discourse, then she arguably must endorse a kind of fictionalism.

That was Lewis's point.

Ayer, Alfred Jules. 1936. Language, Truth and Logic. Dover Publications.

Blackburn, Simon. 1984. Spreading the Word. Oxford: Clarendon Press.

Blackburn, Simon. 1993. Essays in Quasi-Realism. Oxford: Oxford University Press.

Blackburn, Simon. 2005. “Quasi-Realism No Fictionalism.” In Fictionalism in Metaphysics, edited by Mark Eli Kalderon, 322–38. Oxford University Press.

Jenkins, C. S. 2006. “Lewis and Blackburn on Quasi-Realism and Fictionalism.” Analysis 66 (4): 315–19. doi.org/10.1093/analys/66.4.315.

Joyce, Richard. 2001. The Myth of Morality. Cambridge University Press.

Lewis, David. 2005. “Quasi-Realism Is Fictionalism.” In Fictionalism in Metaphysics, edited by Mark Eli Kalderon, 314–21. Oxford: Clarendon Press.

De Finetti's theorem without symmetries?

Fri, 21 Feb 2025 14:53:16 +0000

Bruno de Finetti (1970) suggested that chance is objectified credence. The suggestion is explained and defended in Jeffrey (1983, ch.12), Skyrms (1980, ch.I), Skyrms (1984, ch.3), and Diaconis and Skyrms (2017, ch.7), but I still find it hard to understand. It seems to assume that rational credence functions are symmetrical in a way in which I think they shouldn't be.

Let's imagine that the world is a countably infinite sequence of coin flips. De Finetti's Theorem states that if a probability measure P over this space is exchangeable, i.e., invariant under finite permutations, then it can be represented as a mixture of iid Bernoulli processes:

\[ P(\omega) \;=\; \int_0^1 \bigl[\text{iid Bernoulli}(\theta)\bigr](\omega)\,d\mu(\theta). \]

That is, \(P\) reasons about the sequence as if it were generated by a stable chance process. The suggestion is that when we reason about chance, we are using this representation of our credence function. We're not really reasoning about an objective physical magnitude.

I want to set aside this philosophical point; I'm confused about a more technical point: I don't think a rational credence over the sequences should be exchangeable.

At this point, Skyrms (and Jeffrey) tell us that one can replace full exchangeability by weaker invariance conditions: partial exchangeability only requires invariance under a subgroup of permutations, stationarity only requires invariance under "time shifts", Markov exchangeability only requires invariance under "2-block permutations". But I don't think a rational credence should have any of these symmetries!

Suppose the first 20 outcomes are HTHTHTHTHTHTHTHTHTHT. What do you think: is the next outcome more likely to be H or T? I think it's more likely to be H. From what we know, this sequence doesn't look random. It might be random, of course, but I'd give significant credence to the hypothesis that it keeps alternating between H and T.

By contrast, the sequence THHTHTHHTTTHHTTHTHHT could just as well continue with H or T. Since it's a permutation of the alternating sequence, exchangeability requires giving the same credence to both. But the first is more likely to continue with H than the second. So we don't have exchangeability.
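
To see what exchangeability demands here, consider a minimal sketch that assumes, purely for illustration, that the mixing measure \(\mu\) in de Finetti's representation is uniform on [0, 1]. Under that assumption the probability of a finite sequence with k heads among n outcomes is k!(n−k)!/(n+1)!, so it depends only on the counts. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def uniform_mixture_prob(seq):
    """P(seq) under a uniform mixture of iid Bernoulli(theta) processes.

    The integral of theta^k * (1 - theta)^(n - k) over [0, 1]
    equals k! (n - k)! / (n + 1)!.
    """
    n, k = len(seq), seq.count("H")
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))


def next_heads_prob(seq):
    """Predictive probability that the next outcome is H, given seq."""
    return uniform_mixture_prob(seq + "H") / uniform_mixture_prob(seq)


alternating = "HT" * 10
scrambled = "THHTHTHHTTTHHTTHTHHT"

# The two sequences are permutations of each other ...
print(sorted(alternating) == sorted(scrambled))
# ... so an exchangeable measure gives them the same probability ...
print(uniform_mixture_prob(alternating) == uniform_mixture_prob(scrambled))
# ... and the same prediction for the next outcome.
print(next_heads_prob(alternating), next_heads_prob(scrambled))
```

Under this exchangeable measure both sequences predict H with probability 1/2. The alternating pattern makes no difference.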

Intuitively, without any further information, I'm unsure whether the events in the sequence are (a) outcomes of a stable chance process, or (b) deterministically generated, or (c) outcomes of a Markov process in which the next outcome depends on the previous outcome, and so on. My credence is a mixture of these possibilities. I think any rational credence function should be such a mixture.

Such a mixture will have no non-trivial invariance group. It's not exchangeable, or partially exchangeable, or stationary, or Markov exchangeable. So how is de Finetti's proposal supposed to get off the ground?

Here's an idea. Ignore the Markov possibility for a moment, and suppose we restrict the possible deterministic patterns to a fixed finite class \(D\). The class will include the alternating sequence HTHTHTHTHT…, the sequence THTHTHTHTHT…, the sequence HHTHHTHHTHHT…, and other sequences with a clear pattern (including perhaps the prime number pattern mentioned in Jeffrey (1983, 207) – I don't understand what Jeffrey says about it). Let \(P\) be a credence that is unsure whether the sequence is (a) a member of \(D\), or (b) the result of a Bernoulli process. \(P\) won't have any interesting symmetries, but \(P\) conditional on \(\neg D\) will.

One might even think that \(P(\cdot \mid \neg D)\) is exchangeable. But it is not, as it rules out the nicely patterned permutations. I think de Finetti's theorem still goes through, though: \(P(\cdot\mid \neg D)\) is a mixture of iid Bernoulli processes. That's because the restriction to \(\neg D\) only rules out a finite set of possibilities that gets measure 0 anyway.

So perhaps friends of de Finetti could say this. A rational prior credence function itself may not have any interesting symmetries. But conditional on \(\neg D\), it probably will. And then we can represent \(P\) as undecided between a range of deterministic hypotheses and a range of chance hypotheses.
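
A toy version of such a \(P\) can be written down directly. The sketch below assumes a class \(D\) of three periodic patterns with equal weight, a prior weight of 1/2 on \(D\), and a uniform mixture of Bernoulli processes for the chance part. All of these numbers and names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

PATTERNS = ["HT", "TH", "HHT"]  # repeating units of the sequences in D (assumed)
W_DET = Fraction(1, 2)          # assumed prior weight on "the sequence is in D"


def bernoulli_mixture(seq):
    """P(seq) under a uniform mixture of iid Bernoulli(theta) processes."""
    n, k = len(seq), seq.count("H")
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))


def deterministic_part(seq):
    """Equal-weight share of the patterns in D that begin with seq."""
    def generates(unit):
        repeated = unit * (len(seq) // len(unit) + 1)
        return repeated.startswith(seq)
    return Fraction(sum(generates(unit) for unit in PATTERNS), len(PATTERNS))


def credence(seq):
    """P(seq): undecided between D and the Bernoulli mixture."""
    return W_DET * deterministic_part(seq) + (1 - W_DET) * bernoulli_mixture(seq)


def credence_given_not_d(seq):
    """P(seq | not-D): D has measure 0 in the Bernoulli part, so only that part remains."""
    return bernoulli_mixture(seq)


def next_heads(prob, seq):
    """Predictive probability of H after seq, for a given measure."""
    return prob(seq + "H") / prob(seq)


alternating = "HT" * 10
scrambled = "THHTHTHHTTTHHTTHTHHT"

print(float(next_heads(credence, alternating)))  # close to 1
print(next_heads(credence, scrambled))           # 1/2
print(credence_given_not_d(alternating) == credence_given_not_d(scrambled))
```

With these assumptions, \(P\) strongly expects H after the alternating sequence but not after its permutation, so \(P\) is not exchangeable. Conditional on \(\neg D\), the two sequences get the same probability again.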

Can we add the Markov possibility back into the mix? I suppose we can. Technically, an iid process is a special case of a Markov process. But if we start with a noninformative prior over the parameters of a Markov process, the iid hypothesis will have measure 0. So we should really treat the iid hypothesis as a separate part of \(P\). Still, de Finetti's theorem for Markov exchangeable measures should tell us how to determine the two parts of \(P(\cdot \mid \neg D)\), assuming we have Markov exchangeability.

I've assumed that there's a hard cut-off between the random and the deterministic possibilities. This may seem problematic. But I'm not assuming that \(P\) treats them as on a par. Realistically, the "deterministic part" of \(P\) should give high credence to simple patterns and increasingly low credence to complex patterns. The above reasoning goes through as long as there are only finitely many deterministic patterns. I suspect one can fix the reasoning to not rely on this requirement. But even so, we can allow for extremely complicated deterministic patterns to which the deterministic part of \(P\) assigns negligible probability.

Is this how the proposal is supposed to work?

de Finetti, Bruno. 1970. Theory of Probability. New York: John Wiley & Sons.

Diaconis, Persi, and Brian Skyrms. 2017. Ten Great Ideas about Chance. Princeton University Press.

Jeffrey, Richard. 1983. The Logic of Decision. 2nd ed. Chicago: University of Chicago Press.

Skyrms, Brian. 1980. Causal Necessity. A Pragmatic Investigation of the Necessity of Laws. New Haven: Yale University Press.

Skyrms, Brian. 1984. Pragmatics and Empiricism. New Haven: Yale University Press.

Are recalcitrant worlds less probable?

Fri, 07 Feb 2025 17:03:44 +0000

The Best-Systems Account of chance promises to explain why beliefs about chance should affect our beliefs about ordinary events, as formalized by the Principal Principle. In this post, I want to discuss a challenge to any such explanation.

First, some background.

For any candidate chance function f, let [f] be the set of worlds of which f is (part of) the best system. According to the Best-Systems Account (BSA), the hypothesis "Ch=f" that f is the true chance function expresses the proposition [f]. In what follows, I'll assume that a world is simply a history of "outcomes", and that the candidate systems can be compressed into a single (possibly parameterized) chance function.

In essence, the Principal Principle then says that for any rational prior credence Cr and history h, Cr(h | Ch=f) = f(h). The BSA promises to explain this link because it implies that Ch=f contains valuable information about the history: If Ch=f is true, the actual history must lie in [f]. On the best-systems interpretation, the Principle says that Cr(h | [f]) = f(h).

We note an immediate problem: f generally assigns positive probability to histories outside [f]. But Cr(h | [f]) is obviously 0 for any h outside [f]. This is the well-known "undermining problem". In response, Hall (1994) and Lewis (1994) suggested that we should reformulate the Principal Principle to say that Cr(h | [f]) = f(h | [f]). This is the "New Principle". I want to set this issue aside.

To explain why Cr(h | [f]) should equal either f(h) or f(h | [f]), we need to have a look at the histories in [f]. What do they look like?

The answer depends on how we spell out the standards for good systems. One important criterion is fit. Lewis suggests that the fit of a candidate chance function f to a history h is measured by the probability f(h) that f assigns to the history.

Another important criterion is simplicity. Without simplicity, the best systematization of a history would simply describe the exact history. If we allow the two criteria to trade off against each other, the best systematization of a sufficiently random-looking history will often be probabilistic.

In easy cases, the relative frequencies in the histories in [f] will exactly match the f-chances: if they don't, there's an alternative chance function f' with greater fit, so the history belongs to [f'] rather than [f]. The only way this wouldn't happen is if the better-fitting chance function is less simple.

Suppose, for example, that a history is a sequence of coin flips in which the coins are distinguishable by their "weight". We might have a history in which there's a noisy statistical dependency between outcomes and weight: coins with greater weight tend to land heads more often. A good way to systematize such a history uses a parameterized chance function that expresses the chance of heads and tails in terms of weight. In principle, one can always find a function whose chances perfectly match the frequencies. But that function might be horrendously complicated. It's easy to imagine cases where a linear function f would win the trade-off between simplicity and fit, even though the frequencies in the history don't precisely match the f-chances.

Now let's look again at the histories in [f], where f is this kind of linearly parameterized chance function. Some histories in [f] have frequencies that closely match the f-chances. Call these well-behaved. Other histories in [f] have frequencies that deviate from the f-chances. Call these recalcitrant.

The recalcitrant histories would be more accurately described by a non-linear chance function. They are in [f], even though f has comparatively low fit to them, because there's no simple alternative to f with greater fit.

Let h be some well-behaved history in [f]. Let h' be some recalcitrant history in [f]. Since f has greater fit to h, and fit is measured by f(h), it follows that f(h) > f(h').

Now here's the challenge.

The Principal Principle requires that Cr(h | [f]) = f(h) and Cr(h' | [f]) = f(h'). We've just seen that f(h) > f(h'). So Cr(h | [f]) > Cr(h' | [f]). And so Cr(h) > Cr(h'), since h and h' both lie in [f]. The unconditional prior credence Cr must favour well-behaved members of [f] over recalcitrant members.

In fact, since conditioning preserves ratios, the ratio of priors Cr(h) / Cr(h') must be exactly f(h) / f(h')!
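
Here is a toy numerical illustration of this constraint. It assumes a hypothetical linear chance function on which the chance of heads equals the coin's weight, two coin types, and two short histories that are simply stipulated to both lie in [f]. The prior weight `cr_f` on [f] is likewise made up. Whatever value it takes, it cancels out of the ratio.

```python
from fractions import Fraction


def chance_of_heads(weight):
    """Hypothetical linear chance function f: chance of heads equals the weight."""
    return weight


def fit(history):
    """Lewis-style fit: the probability f assigns to the whole history."""
    p = Fraction(1)
    for weight, outcome in history:
        ch = chance_of_heads(weight)
        p *= ch if outcome == "H" else 1 - ch
    return p


def flips(weight, outcomes):
    """A run of flips of a coin with the given weight."""
    return [(weight, outcome) for outcome in outcomes]


light, heavy = Fraction(1, 4), Fraction(3, 4)

# Frequencies match the f-chances: 1/4 heads for light coins, 3/4 for heavy ones.
well_behaved = flips(light, "HTTT") + flips(heavy, "HHHT")
# Frequencies deviate from the f-chances: half heads for both coin types.
recalcitrant = flips(light, "HHTT") + flips(heavy, "HHTT")

# Old Principal Principle on the BSA: Cr(h | [f]) = f(h).
# For h in [f], Cr(h) = Cr(h | [f]) * Cr([f]).
cr_f = Fraction(1, 10)  # assumed prior credence in [f]
cr_h = fit(well_behaved) * cr_f
cr_h_prime = fit(recalcitrant) * cr_f

print(fit(well_behaved), fit(recalcitrant))
print(cr_h / cr_h_prime, fit(well_behaved) / fit(recalcitrant))
```

In this toy case f(h) is 9 times f(h'), so the prior must make the well-behaved history exactly 9 times as probable as the recalcitrant one.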

Is there any independent justification for this assumption?

I think there is an independent justification of disfavouring recalcitrant worlds. Recalcitrant worlds are less simple, and a rational prior credence should favour simpler worlds.

One way to think about the simplicity of a history (rather than a theory) is in terms of how its best systematization depends on the weight of simplicity in the criteria for best systems. If a world is highly irregular, it calls for a complex systematization. Only complex systematizations have good fit. If we gradually relax the weight of simplicity in the criteria for best systems, the best systematization will become more and more complicated. By comparison, if a world is more regular, the best system remains best even if we relax the weight of simplicity (up to a point).

For example, if a history's frequencies in the weighted-coins case fit a 5th-order polynomial a little better than a linear relationship, then the linearly parameterized chance function f may be best if simplicity has high weight, but not if the weight is relaxed. By contrast, if a history's frequencies closely fit the linear function f, then f remains the best system for a longer time as we continuously relax the weight of simplicity.

So recalcitrant worlds are less regular. And a rational prior should arguably favour regular worlds.

It's still surprising that the justification of the Principal Principle turns on this assumption about priors.

Also, while the above reasoning may explain why Cr(h) > Cr(h'), a lot more work would have to be done to explain why the ratio Cr(h) / Cr(h') has to exactly match f(h) / f(h'). There is a strong pressure towards a Uniqueness view about rational priors.

There might be another way out. I've assumed, with Lewis, that chance functions assign probabilities to entire histories. I don't think science needs such an ambitious concept of chance.

If we make chance functions more local in scope, we first have to revisit the formulation of the Principal Principle: f(h) is generally undefined for complete histories h. We also have to revisit the undermining problem. We can't move to the New Principle, because chance functions won't be defined conditional on [f]. I think we should simply stick with an approximate version of the old Principle: something like Cr(e | [f]) ≈ f(e) for any e in the domain of f.

These are the adjustments and concessions I assume in Schwarz (2014). I still think the argument I sketched there for a derivation of this Principle from the BSA should work. It doesn't require favouring well-behaved over recalcitrant worlds.

I now think this is a good reason for friends of the BSA to assume that chance is local.

(I'm indebted to Eddy Chen for drawing my attention to the problem of recalcitrant worlds.)

Hall, Ned. 1994. “Correcting the Guide to Objective Chance.” Mind 103: 505–17.

Lewis, David. 1994. “Humean Supervenience Debugged.” Mind 103: 473–90.

Schwarz, Wolfgang. 2014. “Proving the Principal Principle.” In Chance and Temporal Asymmetry, edited by A. Wilson, 81–99. Oxford: Oxford University Press.