
Are the closest worlds the most probable worlds? That's not so obvious. But if so, then I can see how Lewis's semantics would make such a prediction. And if not, then it is hard to see why L-semantics would predict the second occasion.

I regret how much time I have spent on this topic. I first noticed it in 2006, and thought I had a nice explanation. When I posted it on the blog, Kai von Fintel kindly pointed me towards some literature. A little later, Paolo Santorio suggested that my explanation resembles the one in Klinedinst (2007). This seemed right, but I had in mind a more pragmatic implementation. I eventually wrote up my proposal in Schwarz (2021). Although my original interest was sparked by conditionals, that paper focuses on possibility modals, and only briefly mentions how the account might be extended to conditionals. When I got the invitation to write something for Al's festschrift, I thought I could spell out the application to conditionals and compare it with Al's account. But I couldn't really make it work. So I ended up defending a more orthodox derivation based on Kratzer and Shimoyama (2002).

Klinedinst, Nathan W. 2007. “Plurality and Possibility.” PhD thesis, University of California, Los Angeles.

Kratzer, Angelika, and Junko Shimoyama. 2002. “Indeterminate Pronouns: The View from Japanese.” In *Proceedings of the 3rd Tokyo Conference on Psycholinguistics*, 1–25. Tokyo: Hituzi Syobo.

Schwarz, Wolfgang. 2021. “Discourse, Diversity, and Free Choice.” *Australasian Journal of Philosophy* 99 (1): 48–67. https://doi.org/10.1080/00048402.2020.1736108.

What didn't work? These come to mind:

- To make tex4ht compile the source, I had to replace the ifthen package with etoolbox and remove the 'breakable' options from tcolorboxes.

- tex4ht didn't find commands defined with \(re)newcommand in my custom cls file.

- The html produced for tcolorboxes sometimes contained an extra closing </div> and sometimes missed two closing </div>s.

- The links to subsections in the table of contents produced by tex4ht didn't work.

- \ref links to tcolorbox definitions and theorems didn't work.

- Links to sections and chapters and random places in the document didn't work.

I'm afraid I don't have an MWE for any of these, so this is probably not all that useful. You can find my tex source on github.com/wo/logic2. I use make4ht version 0.4 from the 2024 texlive distribution.

I might do it your way for my old papers. It would be nice to have all of them in markdown/orgmode. Unfortunately, these formats don't seem ready yet for complex documents like my logic notes.

It could just be my browser, but some of the cross-references aren't showing up. See for instance the links back to earlier definitions in section 2.4.

I was right. But ah, LaTeX! There are, of course, multiple options. You can use pandoc. Or tex4ht. Or lwarp. Or LaTeXML. All of them *sort of* work, after some fiddling and consulting their thousand-page manuals. But none of them support all the packages I use. And shouldn't those *gather* lists have more line-spacing, etc.?

I ended up using tex4ht, through make4ht. To make sure that it doesn't balk at my tex input and that the output looks the way I like, I wrote a python script full of ugly regular expressions to preprocess the tex and postprocess the html. Here is the result. It's OK. But why are these menial tasks always so hard?

Now remember the case of Pollock's coat (introduced in Nute (1980)). John Pollock considered 'if my coat had been stolen last night…'. He stipulated that there were two occasions on which the coat could have been stolen. By the standards of Lewis (1979), worlds where it was stolen on the second occasion are more similar to the actual world than worlds where it was stolen on the first occasion. Lewis's similarity semantics therefore predicts that if the coat had been stolen, it would have been stolen on the second occasion. This doesn't seem right.

An obvious solution, adopted, for example, in Bennett (2003), assumes that if the antecedent A describes a time interval ("last night"), then the fork time tends to be before that interval. That is, Pollock's counterfactual directs us to worlds that are much like the actual world until shortly before last night, and then deviate in some way to allow a theft of Pollock's coat.

I used to think that this solves the issue. But the solution leads to trouble, twice over.

The first kind of trouble is that it seems to invalidate certain inferences that are patently valid.

Let A be the hypothesis that Pollock's coat was stolen last night. Let A1 be the hypothesis that the coat was stolen on the first occasion, and A2 be the hypothesis that it was stolen on the second occasion. Let C be some ordinary consequent.

The following inference looks patently valid to me:

(1) A1 > C, A2 > C ⊨ A > C.

If the earlier and the later thefts would both have led to C, and there are no other theft possibilities, then a theft last night would surely have led to C!

The "early fork time" response does not validate this inference. It assumes that the fork time for A is more or less that of A1, while the fork time for A2 may be later. It follows that the worlds to which A2 directs us need not be among the worlds to which A directs us.

The inference (1) is valid according to the familiar similarity semantics of Lewis and Stalnaker. It follows that the "early fork time" response is incompatible with these accounts. It requires an antecedent-relative account, discussed and rejected, for example, in Stalnaker (1984, 129ff.) and Bennett (2003, 298ff.). As Stalnaker points out, such accounts also invalidate, for example, the inferences (2) and (3):
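For what it's worth, the validity of (1) in similarity semantics can be confirmed by brute force over small models. Here is a sketch in Python (the model representation is my own, not from any of the cited accounts): a world is a tuple of a similarity rank and truth values for A1, A2, and C, and 'X > C' is true iff all minimally ranked X-worlds are C-worlds.

```python
from itertools import product

def would(worlds, ant, cons):
    """'ant > cons': all closest ant-worlds are cons-worlds.
    Vacuously true if no world satisfies the antecedent."""
    sat = [w for w in worlds if ant(w)]
    if not sat:
        return True
    best = min(w[0] for w in sat)  # w[0] is the similarity rank; 0 = closest
    return all(cons(w) for w in sat if w[0] == best)

A1 = lambda w: w[1]
A2 = lambda w: w[2]
A = lambda w: w[1] or w[2]  # A is the disjunction of A1 and A2
C = lambda w: w[3]

# All three-world models with ranks in {0, 1, 2} and arbitrary truth values.
vals = list(product(range(3), [False, True], [False, True], [False, True]))
counterexamples = sum(
    1 for worlds in product(vals, repeat=3)
    if would(worlds, A1, C) and would(worlds, A2, C) and not would(worlds, A, C))
print(counterexamples)  # 0: the inference never fails in these models
```

The reason is straightforward: every closest A-world satisfies A1 or A2, and is then itself a closest A1-world or a closest A2-world.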

(2) A > B, B > A, A > C ⊨ B > C.

(3) A > B, (A∧B) > C ⊨ A > C.

That was the first kind of trouble. To see the second, I need to bring in backtracking counterfactuals. Recall Lewis's adaptation of an example from Downing, in Lewis (1979, 33):

Jim and Jack quarreled yesterday, and Jack is still hopping mad. We conclude that if Jim asked Jack for help today, Jack would not help him. But wait: Jim is a prideful fellow. He never would ask for help after such a quarrel; if Jim were to ask Jack for help today, there would have to have been no quarrel yesterday. In that case Jack would be his usual generous self. So if Jim asked Jack for help today, Jack would help him after all.

The final counterfactual ('…Jack would help') is a backtracking counterfactual. Informally, backtracking counterfactuals direct us to worlds where the antecedent becomes true in a way that is consistent with salient regularities about the world, such as Jim's pridefulness.

What's important for my purposes is that one can often generate a backtracking reading by moving the fork time back in time. (I think Khoo (2017) suggests that this is how backtracking counterfactuals always work.)

Consider a world that is much like the world in the Downing-Lewis example, until *shortly before yesterday's quarrel*, and then deviates in a minimal way to allow Jim today to ask Jack for help. Given Jim's pridefulness, the world will plausibly deviate in a way that prevents the quarrel. And so Jack will help Jim.

Now let's return to Pollock's coat and the proposal that the relevant fork time for 'if my coat had been stolen last night…' lies at around the start of last night. If the second occasion for theft was towards the end of the night, we should expect backtracking effects when we consider thefts on that occasion. I claim that no such effects can be observed.

Of course, it's not clear what the relevant effects would be in this case. But take a variant of the Downing-Lewis example. Same story as before. Let's not emphasize Jim's pridefulness etc., and consider the statement: 'if Jim had asked Jack for help yesterday morning or today, Jack would have helped him'. Intuitively, this is false, because Jack wouldn't have helped if Jim had asked today, after yesterday's quarrel. But if the fork time for the counterfactual is yesterday morning, we should expect that the counterfactual is true.

What are we to make of this?

One could get around the logic problems by suggesting that the fork time is not semantically sensitive to the antecedent, but only to the conversational context. This would lead to a strict conditional account, in which 'if A' quantifies over all A-worlds that deviate at the contextually determined fork time.

OK. But this doesn't seem to help with the backtracking issue. And I worry that the move is not available for deontic conditionals, where analogous issues arise.

Another idea is that our judgements about conditionals like Pollock's are distorted by whatever gives rise to the Simplification of Disjunctive Antecedents effect – the appearance that 'if A or B, C' entails 'if A, C' and 'if B, C'. It has often been suggested that this is an implicature. Perhaps there is a general mechanism by which conditionals with unspecific antecedents implicate corresponding conditionals with more specific antecedents. This might explain why A>C seems to entail A1>C and A2>C, in the case of Pollock's coat.

But then what is the literal meaning of A>C? Was Lewis (1979) right about the similarity standards after all? I don't think so. 'If my coat had been stolen, it would have been stolen on the second occasion' sounds robustly false, unless a theft on the second occasion was somehow more likely.

So I'm not sure what to think.

Bennett, Jonathan. 2003. *A Philosophical Guide to Conditionals*. New York: Oxford University Press.

Khoo, Justin. 2017. “Backtracking Counterfactuals Revisited.” *Mind*, fzw005. https://doi.org/10.1093/mind/fzw005.

Lewis, David. 1979. “Counterfactual Dependence and Time’s Arrow.” *Noûs* 13: 455–76.

Nute, Donald. 1980. *Topics in Conditional Logic*. Philosophical Studies Series, vol. 20. Dordrecht: Reidel.

Stalnaker, Robert. 1984. *Inquiry*. Cambridge (Mass.): MIT Press.

Let's look at Nick Bostrom's version of the argument, as presented for example in Bostrom (2008).

We compare two possibilities about the prospects of humanity:

*Early Doom*: The total number of humans who will have ever lived is 100 billion.

*Late Doom*: The total number of humans who will have ever lived is 100 trillion.

The argument goes as follows.

Early Doom and Late Doom have roughly equal prior probability. Every Early Doom world is inhabited by 100 billion people; a priori, each of these positions is equally likely to be ours. Similarly for the 100 trillion positions in Late Doom worlds. If we now take into account the fact that there have only been around 50 billion humans so far (i.e., that our "birth rank" is around 50 billion), it follows by Bayes' theorem that Early Doom is vastly more probable than Late Doom.

More precisely, using 'E' for Early Doom, 'L' for Late Doom, and 'R' for the information that our birth rank is around 50 billion, Bayes' theorem gives us:

\[
\begin{align*}
P(E / R) &= \frac{P(R / E) P(E)}{P(R / E) P(E) + P(R / L) P(L)}\\
&= \frac{1/10^{11} \cdot 1/2}{1/10^{11} \cdot 1/2 + 1/10^{14} \cdot 1/2} \approx 0.999.
\end{align*}
\]

Can we conclude that it is 99.9% likely that we will soon go extinct?!
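As a sanity check, the posterior can be computed numerically; a minimal sketch in Python, using the numbers from the text:

```python
# Equal priors for Early Doom (10^11 humans) and Late Doom (10^14 humans).
p_E, p_L = 0.5, 0.5
# Likelihood of any particular birth rank, assuming a uniform distribution
# over positions within each world.
like_E, like_L = 1 / 10**11, 1 / 10**14

# Bayes' theorem.
p_E_given_R = like_E * p_E / (like_E * p_E + like_L * p_L)
print(p_E_given_R)  # ≈ 0.999
```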

The most obvious problem with this argument is that E and L are not the only (a priori) possibilities. What do we get if we drop this assumption?

Let's use 'N' for the total number of humans who will have ever lived. Suppose we start with a uniform prior over N=1 to N=10^{100} (say), generalizing Bostrom's uniform prior over E and L. Within each N=k world, the prior is evenly divided over all humans. Each position in each N=k possibility then has probability 1/(k*10^{100}). This is also the unnormalized posterior probability of N=k after conditioning on our position (birth rank) r, for k>=r. The probability of N=k is therefore inversely proportional to k:

\[
P(N\!=\!k) = \frac{c}{k},
\]

where c is a constant.

This does imply that every small-world hypothesis N=k is much more probable than a corresponding large-world hypothesis N=1000k. On the other hand, there are many more large-world possibilities than small-world possibilities. For example, the probability of N=10^{11} is about equal to the probability that N is between 10^{14} and 10^{14}+1000. So we can be as confident that there will be 100 billion people as that there will be 100 trillion people plus or minus 500. It's not obvious that this should disturb us.
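The near-equality claimed here is easy to verify numerically; a quick sketch, assuming the unnormalized 1/k posterior from above (the constant c cancels in the comparison):

```python
# Unnormalized posterior: P(N = k) proportional to 1/k.
p_hundred_billion = 1 / 10**11  # N = 10^11 exactly

# Probability that N lies within 500 of 10^14: a window of 1000 values.
p_window = sum(1 / k for k in range(10**14 - 500, 10**14 + 500))

print(p_window / p_hundred_billion)  # ≈ 1
```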

In fact, the calculation implies that we are a lot more likely to be among the first half of all humans than among the second half. On the face of it, this may seem unduly optimistic, given that (by definition) half of all humans in any world are among the second half.

One might respond that even if we're among the first half of all humans, we may still be close to extinction, given that the human population is so much larger today than it was in the distant past.

This points at the other obvious flaw in the argument: We have a lot of further information besides our birth rank.

Suppose all you know about a population of bacteria is that it has doubled every hour for the last few days and currently stands at 100 million. What's your probability distribution over how many bacteria will ever have existed in that population?

Hard to say, but the distribution should not be flat. We expect tendencies to project into the future. It's more likely that the population will double again in the next hour than that it will quadruple or halve.

Similarly, the fact that humanity has been growing favours futures with a lot more humans than futures with fewer humans.

But don't we know that the exponential growth of the human population will come to an end soon? Well, yes. We have *a lot* of further information. It's really hard to assess how it all adds up.

Once we see the obvious flaws in the argument, it's not clear why we might want to change the crucial assumption about priors that Bostrom and others have focussed on: that Early Doom and Late Doom have roughly equal prior probability.

In the future, I might use the following variation of the doomsday argument (inspired by some of the cases in Bostrom (2001)):

*Doom II*: We have created a device that will either destroy all humans or ensure our interplanetary survival for millions of years. Which of these will happen depends on whether the Nth digit of a certain physical constant is even (doom) or odd (no doom). We have not been able to measure this digit. How confident should we be that it is even?

Here we can, for simplicity, assume that there are really just two possibilities, much like Early Doom and Late Doom. If we start with a uniform prior over whether the digit is even or odd – as seems reasonable – and take into account our early birth rank, as above, we get the seemingly unreasonable conclusion that the digit is almost certainly even.

Bostrom, Nick. 2001. “The Doomsday Argument, Adam & Eve, UN++, and Quantum Joe.” *Synthese* 127 (3): 359–87. https://doi.org/10.1023/A:1010350925053.

Bostrom, Nick. 2008. “The Doomsday Argument.” *Think* 6 (17-18): 23–28. https://doi.org/10.1017/S1477175600002943.

I'll write 'A>C' for the conditional 'if A then C'. For the purposes of this post, we assume that 'A>C' is true at a world w iff all the closest A worlds to w are C worlds, by some contextually fixed measure of closeness.

It has often been observed that the simplification effect resembles the "Free Choice" effect, i.e., the apparent entailment of '◇A' and '◇B' by '◇(A∨B)', where the diamond is a possibility modal (permission, in the standard example). But there are also important differences.

According to standard modal semantics, '◇(A∨B)' is equivalent to '◇A ∨ ◇B'. But '(A∨B)>C' is not equivalent to '(A>C) ∨ (B>C)'. For example, suppose C is true at the closest A worlds but not at the closest B worlds, and the closest A∨B worlds are B worlds. Then 'A>C' is true, but '(A∨B)>C' is false.

In general, the truth-value of '(A∨B)>C' depends on three factors:

- whether the closest A worlds are C worlds,
- whether the closest B worlds are C worlds, and
- the relative closeness of A and B (i.e., whether the closest A worlds are closer than the closest B worlds or vice versa).

Nothing like the third factor is relevant for '◇(A∨B)'.

I'm not going to go over Franke's model of Free Choice again. What's important is that it involves the following three states:

- t_{A}, where A is permitted but B is not,
- t_{B}, where B is permitted but A is not, and
- t_{AB}, where both A and B are permitted.

We have the following association between these states and the truth-value of relevant messages:

|  | '◇A' | '◇B' | '◇(A∨B)' |
|---|---|---|---|
| t_{A} | 1 | 0 | 1 |
| t_{B} | 0 | 1 | 1 |
| t_{AB} | 1 | 1 | 1 |

For conditionals, he says, the same kind of association holds, "provided we reinterpret the state names":

|  | 'A>C' | 'B>C' | '(A∨B)>C' |
|---|---|---|---|
| t_{A} | 1 | 0 | 1 |
| t_{B} | 0 | 1 | 1 |
| t_{AB} | 1 | 1 | 1 |

This is table 86 on p.44. But how are we supposed to interpret these state names?

There is no interpretation that would make the table correct. The table makes it look as if the truth-value of '(A∨B)>C' is determined by the truth-values of 'A>C' and 'B>C'. But it is not. For example, what about a state in which 'A>C' is true, 'B>C' is false, and '(A∨B)>C' is false, because B is closer than A? This possibility is nowhere to be found in the table.
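To make the counterexample concrete, here is a sketch in Python that encodes the truth conditions for '(A∨B)>C' stated earlier (the encoding is my own):

```python
def avb_c(closest, a_c, b_c):
    """Truth value of '(A v B) > C', given which of A, B is closer
    ('A', 'B', or '=' for a tie), whether 'A>C' holds (a_c), and
    whether 'B>C' holds (b_c)."""
    if closest == 'A':
        return a_c       # the closest A v B worlds are the closest A worlds
    if closest == 'B':
        return b_c       # ... the closest B worlds
    return a_c and b_c   # tie: they include both

# The state missing from the table: 'A>C' true, 'B>C' false, B closer.
print(avb_c('B', True, False))  # False, although 'A>C' is true
```

So the truth values of 'A>C' and 'B>C' alone do not fix the truth value of '(A∨B)>C'.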

So Franke's IBR model of Free Choice does not, in fact, carry over to SDA.

(I would assume that this problem has been noticed before, but it isn't mentioned in Bar-Lev and Fox (2020) or Fox and Katzir (2021), where Franke's model is discussed. Am I missing something?)

Anyway, let's move on.

As I said above, the truth-value of '(A∨B)>C' depends on

- whether the closest A worlds are C worlds,
- whether the closest B worlds are C worlds, and
- the relative closeness of A and B (i.e., whether the closest A worlds are closer than the closest B worlds or vice versa).

There are 12 possible combinations of these three factors. '(A∨B)>C' is true in five of them:

(S1) A is closer, A>C, ¬(B>C)

(S2) A is closer, A>C, B>C

(S3) B is closer, ¬(A>C), B>C

(S4) B is closer, A>C, B>C

(S5) A and B are equally close, A>C, B>C

Here, 'A is closer' means that the closest A worlds are closer than the closest B worlds, and 'A>C' means that the closest A worlds are C worlds.
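One can check this count by enumerating the twelve states; a small sketch in Python (my own encoding of the three factors):

```python
from itertools import product

def avb_c(closest, a_c, b_c):
    # '(A v B) > C' given relative closeness ('A', 'B', or '=' for a tie)
    # and the truth values of 'A>C' (a_c) and 'B>C' (b_c).
    if closest == 'A':
        return a_c
    if closest == 'B':
        return b_c
    return a_c and b_c

states = list(product(['A', 'B', '='], [True, False], [True, False]))
true_states = [s for s in states if avb_c(*s)]
print(len(states), len(true_states))  # 12 5

expected = {
    ('A', True, False),   # S1
    ('A', True, True),    # S2
    ('B', False, True),   # S3
    ('B', True, True),    # S4
    ('=', True, True),    # S5
}
print(set(true_states) == expected)  # True
```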

Note that three of these five cases have both A>C and B>C. Imagine a speaker who thinks that their addressee has uniform priors over all twelve cases. Imagine the speaker knows A>C and B>C. Then '(A∨B)>C' is already a better choice than, say, 'A>C' or 'B>C'. '(A>C) ∧ (B>C)' is better still, but if a higher-level hearer only compares the uttered message to its alternatives, we might expect to get an SDA effect, without any higher-order implicature.

This isn't quite right, though.

With uniform hearer priors, '(A∨B)>C' is a good option to convey A>C ∧ B>C, but it is also a good option to convey other states. In particular, it is the best option (at level 1) among its alternatives for conveying S1 and S3. That's because 'A>C' and 'B>C' (and their negations) are each true in six states and thus confer a lower probability on S1 and S3 than '(A∨B)>C' does.

(Incidentally, this is why Franke's model doesn't work for SDA: '(A∨B)>C' is not a "surprise message" at level 2.)

Here's a simulation that confirms these claims:

```
var states = Cross({
  closest: ['A', 'B', 'A,B'],
  Cness: ['A','B','A,B','-']
})
// C-ness 'A' means that the closest A worlds are C worlds
var meanings = {
  'A>C': function(s) { return s['Cness'].includes('A') },
  'B>C': function(s) { return s['Cness'].includes('B') },
  'AvB>C': function(s) {
    return s['closest'] == 'A' && s['Cness'].includes('A') ||
           s['closest'] == 'B' && s['Cness'].includes('B') ||
           s['closest'] == 'A,B' && s['Cness'] == 'A,B'
  },
  'A>C and B>C': function(s) {
    return s['Cness'].includes('A') && s['Cness'].includes('B');
  },
  '-': function(s) { return true }
};
var alternatives = {
  'A>C': ['A>C', 'B>C', '-'],
  'B>C': ['A>C', 'B>C', '-'],
  'AvB>C': ['A>C', 'B>C', 'AvB>C', '-'],
  'A>C and B>C': keys(meanings),
  '-': ['-']
}
var state_prior = Indifferent(states);
var hearer0 = Agent({
  credence: state_prior,
  kinematics: function(utterance) {
    return function(state) {
      return evaluate(meanings[utterance], state);
    }
  }
});
var speaker1 = function(observation, options) {
  return Agent({
    options: options || keys(meanings),
    credence: update(state_prior, observation),
    utility: function(u,s) {
      return learn(hearer0, u).score(s);
    }
  });
};
display('hearer0 -- A>C is compatible with six states, AvB>C with five:');
showKinematics(hearer0, ['A>C', 'AvB>C']);
var s1 = { closest: 'A', Cness: 'A' };
var s2 = { closest: 'A', Cness: 'A,B' };
display('speaker1 -- prefers AvB>C if she knows the state is S1 or S2');
showChoices(speaker1, [s1, s2], [alternatives['AvB>C']]);
```

To derive the Simplification effect, we need to ensure that speakers don't use '(A∨B)>C' to convey S1 or S3.

There are different ways to achieve this. I'm going to invoke a QUD.

Recall, once more, that the truth-value of '(A∨B)>C' is determined by the truth-value of 'A>C' and 'B>C' and the relative closeness of A and B. Normally, however, we don't expect that speakers who utter '(A∨B)>C' are trying to convey anything about the relative closeness of A and B. What's normally under discussion is whether A>C and whether B>C, not which of A and B is closer.

So let's add a QUD to the model, as in this post. Normally, the QUD is whether A>C and whether B>C. '(A∨B)>C' is then no longer a good option for a speaker who knows that the state is S1 or S3: 'A>C' is better in S1, 'B>C' is better in S3.

With this QUD, '(A∨B)>C' is the best option among its alternatives only in three of the 12 possible states: in S2, S4, and S5. In each of these, we have A>C and B>C. If the level-2 hearer assumes that the speaker chose the best option from among the alternatives of the chosen utterance, he will infer from an utterance of '(AvB)>C' that 'A>C' and 'B>C' are both true:

```
// continues #1
var quds = {
  'state?': function(state) { return state },
  'A>C?B>C?': function(state) { return state['Cness'] }
};
var makeHearer = function(speaker, state_prior, qud) {
  return Agent({
    credence: state_prior,
    kinematics: function(utterance) {
      return speaker ?
        function(s) {
          var sp = speaker(s, alternatives[utterance], qud);
          return sample(choice(sp)) == utterance;
        } :
        function(s) {
          return evaluate(meanings[utterance], s);
        }
    }
  });
};
var makeSpeaker = function(hearer, state_prior, qud, cost) {
  return function(observation, options) {
    return Agent({
      options: options || keys(meanings),
      credence: update(state_prior, observation),
      utility: function(u,s) {
        var qu = quds[qud];
        return marginalize(learn(hearer, u), qu).score(qu(s)) - cost(u);
      }
    });
  };
};
var cost = function(utterance) {
  return utterance == '-' ? 2 : utterance.length/20;
};
var qud = 'A>C?B>C?';
var hearer0 = makeHearer(null, state_prior, qud);
var speaker1 = makeSpeaker(hearer0, state_prior, qud, cost);
var hearer2 = makeHearer(speaker1, state_prior, qud);
showKinematics(hearer2, ['AvB>C']);
```

('Cness: "A,B"' means that the closest A worlds and the closest B worlds are both C worlds.)

I've defined this simulation with factory functions so that one can easily create more agents and check different parameters. For example, if you change `qud` to `'state?'`, the level-2 hearer doesn't become convinced of A>C and B>C.

I've assumed that the SDA effect arises because the relative closeness of A and B is normally not under discussion when we evaluate '(A∨B)>C'.

This might shed light on a puzzle about the distribution of SDA.

(1) If Spain had fought with the Axis or the Allies, it would have fought with the Axis.

A speaker who utters (1) would not be interpreted as believing that Spain would have fought with the Axis if it had fought with the Allies (even though it is theoretically possible to fight on both sides).

Similarly for (2), from Lassiter (2018):

(2) If Spain had fought with the Axis or the Allies, it would probably have fought with the Axis.

Why don't we get SDA here?

We might, of course, say that the inference is cancelled due to the implausibility of the conclusion. But perhaps we can say more.

Clearly, when somebody utters (1) or (2), the relative closeness of the two possibilities is under discussion. The point of (1) is precisely to state that Spain joining the Allies is a more remote possibility than Spain joining the Axis.

In the context of (1) and (2), then, the QUD is not `'A>C?B>C?'`. Perhaps it is `'state?'`, or perhaps it is which of A and B is closer. The above model predicts that this breaks the derivation of SDA.

The hypothesis that SDA depends on the QUD is supported by the following observation, due to Nute (1980).

Consider (3):

(3) If Spain had fought with the Axis or the Allies, Hitler would have been happy.

In a normal context, (3) conveys that Hitler would have been happy no matter which side Spain had fought on, which is false. So here the SDA effect is in place. Now Nute observes that (3) can become acceptable if it is uttered right after (1).

A similar point could be made with (4):

(4) If Spain had fought with the Axis or the Allies, Hitler would probably have been happy, for surely Spain would have chosen the Axis.

The 'for surely' explanation in (4) clarifies that relative remoteness is under discussion, so that SDA isn't licensed. Likewise, if (3) is uttered right after (1), the relative remoteness question raised by (1) is still in place.

(Can we explain why (1), (2), and (4) make the relative remoteness of A and B salient? Presumably the explanation is that these sentences would be infelicitous if the relative remoteness were irrelevant, so it becomes relevant by accommodation. Might be useful to write a simulation for this.)

I don't like the above model.

I'm not sure why. I think it's because the inference is driven by quantitative likelihood comparisons – for example, that '(AvB)>C' is true in 5 of the 12 states, as opposed to 6 of 12 for 'A>C'. Is our language faculty really sensitive to these quantitative differences?

The likelihood dependence also means that the inference only works for certain kinds of state priors.

I've assumed that the state prior is uniform. But the most striking examples of Simplification are cases like (3), where one disjunct is clearly more remote than the other.

(3) If Spain had fought with the Axis or the Allies, Hitler would have been happy.

The above model runs into trouble here.

If it is common knowledge that A is closer than B, then '(AvB)>C' is semantically equivalent to 'A>C'. A hearer should be puzzled why the speaker would use the needlessly complex '(AvB)>C'.

The problem doesn't just arise if it is certain that A is closer than B. Here is a prior according to which it is *almost certain* that A is closer than B:

```
// continues #2
var state_prior = update(Indifferent(states), { closest: 'A' }, { new_p: 0.99 });
viz.table(state_prior);
```

(The call to `update` Jeffrey-conditionalizes the uniform prior on the information that A is closer than B, with a posterior probability of 0.99.)

With this prior, a fully informed speaker would never utter '(AvB)>C' if the QUD is 'A>C?B>C?':

```
// continues #3
var hearer0 = makeHearer(null, state_prior, 'A>C?B>C?');
var speaker1 = makeSpeaker(hearer0, state_prior, 'A>C?B>C?', cost);
var hearer2 = makeHearer(speaker1, state_prior, 'A>C?B>C?');
showKinematics(hearer2, ['AvB>C']);
```

This isn't a decisive objection. One might argue that the computation of SDA is insulated from the worldly knowledge that A is closer than B. One could also argue that a hearer might be unsure about whether the speaker intrinsically prefers uttering 'A>C' over the slightly more complex '(AvB)>C'. We can still predict SDA if there's no preference for simpler utterances:

```
// continues #4
var no_cost = function(utterance) { return 0 };
var speaker1 = makeSpeaker(hearer0, state_prior, 'A>C?B>C?', no_cost);
var hearer2 = makeHearer(speaker1, state_prior, 'A>C?B>C?');
showKinematics(hearer2, ['AvB>C']);
```

But let's try a different approach.

In section 1, I emphasized some differences between SDA and Free Choice.

In particular, '(AvB)>C' is not (literally) equivalent to 'A>C ∨ B>C', whereas '◇(A∨B)' is equivalent to '◇A ∨ ◇B'.

Still, '(AvB)>C' *entails* 'A>C ∨ B>C'. A literal-minded speaker would therefore only utter '(AvB)>C' if she knows that at least one of A>C and B>C obtains.
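This entailment can be checked over the same twelve-state space as before; a quick sketch in Python (encoding mine):

```python
from itertools import product

def avb_c(closest, a_c, b_c):
    # '(A v B) > C' given relative closeness ('A', 'B', '=' for a tie)
    # and the truth values of 'A>C' (a_c) and 'B>C' (b_c).
    if closest == 'A':
        return a_c
    if closest == 'B':
        return b_c
    return a_c and b_c

# In every state where '(A v B) > C' is true, 'A>C' or 'B>C' is true as well.
entails = all(a_c or b_c
              for closest, a_c, b_c in
              product(['A', 'B', '='], [True, False], [True, False])
              if avb_c(closest, a_c, b_c))
print(entails)  # True
```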

Let's assume that the speaker has a preference for simpler utterances, that A worlds are likely to be closer than B worlds, as in the prior from source block #3, and that the relative closeness of A and B is not under discussion. As we saw in simulation #4, a literal-minded speaker who is fully informed about the state would then always prefer 'A>C' or 'B>C' over '(AvB)>C'.

What if the speaker isn't fully informed? Suppose all she knows is that at least one of A>C and B>C obtains. In that case, 'A>C' and 'B>C' would be bad. '(AvB)>C' would be better. The speaker doesn't know that it is true, but *with respect to the QUD* it wouldn't communicate anything false.

Or suppose the speaker knows that either A is closer than B and A>C holds, or B is closer than A and B>C holds. In this case, she knows that '(AvB)>C' is true, without knowing that 'A>C' is true or that 'B>C' is true.

In sum, a literal-minded speaker would prefer '(AvB)>C' among its alternatives iff (i) she knows that at least one of A>C and B>C obtains, and (ii) she lacks a certain kind of further information.

Imagine a hearer who believes himself to be addressed by such a speaker. Hearing '(AvB)>C', he could infer (i) and (ii).

This is analogous to what the hearer would infer from an utterance of '◇(A∨B)' in the case of Free Choice. It doesn't involve any frequency comparisons.

What would the hearer infer from 'A>C'? Intuitively, he should be able to infer that 'B>C' is false. Recall that the QUD is whether A>C and whether B>C. There seems to be a general mechanism by which, if the QUD is whether X and whether Y, and a speaker says X, one can infer that ¬Y.

So let's assume that 'A>C' would prompt an inference to ¬(B>C). This is analogous to the inference from '◇A' to ¬◇B.

Now imagine a higher-level speaker who thinks that he is addressing such a hearer. Imagine she knows that A>C and B>C both obtain. Uttering 'A>C' would be bad, as it would convey ¬(B>C). Uttering 'B>C' would be equally bad. '(AvB)>C' would be better. It would be the best option among its alternatives.

As a result, a hearer on the next level who presumes that the speaker is well-informed would regard '(AvB)>C' as indicating A>C and B>C.

On this model, the derivation of SDA really is a lot like the derivation of Free Choice in the previous post.

Let's write a simulation to check that it works.

We need to allow for imperfectly informed speakers. But we have 12 possible states now. This means that there are 2^{12}-1 = 4095 ways to be informed or uninformed. If we consider all possibilities, the simulation becomes painfully slow.

To speed things up, I'll only consider six kinds of speaker information:

- The speaker is fully informed.
- The speaker is fully informed about which of A and B is closer, but lacks any information about A>C and B>C.
- The speaker is fully informed about A>C and B>C, but lacks information about which of A and B is closer.
- The speaker knows whether at least one of 'A>C' and 'B>C' is true.
- The speaker knows whether '(A∨B)>C' is true.
- The speaker knows nothing.

```
// continues #5
var access = { // maps states to observations
  'full': function(s) { return s },
  'closest': function(s) { return { closest: s.closest } },
  'Cness': function(s) { return { Cness: s.Cness } },
  'A>CvB>C': function(s) {
    return s.Cness == '-' ?
      { Cness: '-' } :
      function(t) { return t.Cness != '-' }
  },
  'AvB>C': function(s) {
    var tv = evaluate(meanings['AvB>C'], s);
    return function(t) { return evaluate(meanings['AvB>C'], t) == tv };
  },
  'none': function(s) { return states }
}
var access_prior = {
  'full': 0.4,
  'closest': 0.03,
  'Cness': 0.03,
  'A>CvB>C': 0.02,
  'AvB>C': 0.02,
  'none': 0.5
};
```

As in earlier posts, I assume a default presumption that the speaker is fully informed.

```
// continues #6
var makeHearer = function(speaker, state_prior, qud) {
  return Agent({
    credence: join({ 'state': state_prior, 'access': access_prior }),
    kinematics: function(utterance) {
      return speaker ?
        function(s) {
          var obs = evaluate(access[s.access], s.state);
          var spkr = speaker(obs, alternatives[utterance], qud);
          return sample(choice(spkr)) == utterance;
        } :
        function(s) {
          return evaluate(meanings[utterance], s.state);
        }
    }
  });
};

var makeSpeaker = function(hearer, state_prior, qud, cost) {
  return function(observation, options) {
    return Agent({
      options: options || keys(meanings),
      credence: update(state_prior, observation),
      utility: function(u, s) {
        var qu = quds[qud];
        var hearer_state_credence = marginalize(learn(hearer, u), 'state');
        return marginalize(hearer_state_credence, qu).score(qu(s)) - cost(u);
      }
    });
  };
};

var state_prior = update(Indifferent(states), { closest: 'A' }, { new_p: 0.99 });
// var state_prior = Indifferent(states);
var qud = 'A>C?B>C?';
// var qud = 'state?';

var hearer0 = makeHearer(null, state_prior, qud);
var speaker1 = makeSpeaker(hearer0, state_prior, qud, cost);
var hearer2 = makeHearer(speaker1, state_prior, qud);
var speaker3 = makeSpeaker(hearer2, state_prior, qud, cost);
var hearer4 = makeHearer(speaker3, state_prior, qud);
```

(You can see how the effect depends on the state prior and the QUD by uncommenting `var state_prior = Indifferent(states);` or `var qud = 'state?';`. With a uniform state prior, we would get SDA by the same mechanism as in section 2.)

```
// continues #7
display('hearer2:');
showKinematics(hearer2, ['A>C', 'AvB>C']);
```

As predicted, upon hearing 'AvB>C', the level-2 hearer infers that (i) at least one of A>C and B>C obtains, and that (ii) the speaker lacks information.

Upon hearing 'A>C', the level-2 hearer only has a slight tendency to think that 'B>C' is false.

To get the desired effect, 'AvB>C' should be a better choice for communicating A>C ∧ B>C than 'A>C' and 'B>C', at the next level up. In the case of Free Choice, a slight tendency to infer that '◇B' is false based on an utterance of '◇A' was not enough, because ◇A ∧ ◇B was even more unlikely conditional on '◇(A∨B)'. In the present case, 'AvB>C' turns out to yield a comparatively high credence of around 40% in A>C ∧ B>C. This is enough to derive SDA:

```
// continues #8
display('hearer4:');
showKinematics(hearer4, ['AvB>C']);
```

Like the first model, this model relies on subtle likelihood comparisons, and therefore on specific assumptions about the priors. For example, the derivation doesn't work in a painfully slow model that treats all ways of being uninformed as equally likely:

```
// continues #8
var access_prior = { 'full': 0.45, 'partial': 0.05, 'none': 0.5 };

var get_observation = {
  'full': function(state) { return state },
  'partial': function(state) {
    // return uniform distribution over all partial observations compatible with state
    var observations = filter(function(obs) {
      return obs.includes(state) && obs.length > 1 && obs.length < states.length
    }, powerset(states));
    return uniformDraw(observations);
  },
  'none': function(state) { return states }
};

var makeHearer = function(speaker, state_prior, qud) {
  return Agent({
    credence: join({ 'state': state_prior, 'access': access_prior }),
    kinematics: function(utterance) {
      return function(s) {
        var obs = evaluate(get_observation[s.access], s.state);
        var spkr = speaker(obs, alternatives[utterance], qud);
        return sample(choice(spkr)) == utterance;
      }
    }
  });
};

var hearer2 = makeHearer(speaker1, state_prior, qud);
showKinematics(hearer2, ['A>C', 'AvB>C']);
```

Here, A>C ∧ B>C has a slightly greater credence under 'A>C' than under 'AvB>C', so the level-3 speaker would prefer 'A>C' to communicate A>C ∧ B>C, and we won't get an SDA effect.

A better model would make sure that 'A>C' strongly conveys ¬(B>C). The non-arbitrariness requirement from my previous post crudely serves this purpose:

```
// continues #8
var makeHearer = function(speaker, state_prior, qud) {
  return Agent({
    credence: join({ 'state': state_prior, 'access': access_prior }),
    kinematics: function(utterance) {
      return function(s) {
        var obs = evaluate(access[s.access], s.state);
        var spkr = speaker(obs, alternatives[utterance], qud);
        return bestOption(spkr) == utterance;
      }
    }
  });
};

var hearer2 = makeHearer(speaker1, state_prior, qud);
display('hearer2:');
showKinematics(hearer2, ['A>C', 'AvB>C']);
var speaker3 = makeSpeaker(hearer2, state_prior, qud, cost);
var hearer4 = makeHearer(speaker3, state_prior, qud);
display('hearer4:');
showKinematics(hearer4, ['AvB>C']);
```
