**Example 1**. My left arm is paralysed. 'I can't lift my (left)
arm any more', I tell my doctor. In fact, though, I *can* lift
the arm, in the way I can lift a cup: by grabbing it with the other
arm. When I say that I can't lift my left arm, I mean that I can't
lift the arm *actively*, using the muscles in the arm. I said
that I can't do P, but what I meant is that I can't do Q, where Q is
logically stronger than P.

**Example 2**. I have bought a piano and just taken my first
lesson. So I can play a few basic tunes. Can I play the piano? In most
contexts, the answer would be no. Normally, when we say that someone
can play the piano, we mean that they can *play reasonably well*,
which is logically stronger than *play*.

(Without context, the standards vary widely, as this forum thread illustrates, where someone asked, 'at what point can you say that you can play the piano?'. Answers range from 'when you can play all of Chopin's etudes in one sitting' to 'anyone can play the piano'.)

**Example 3**. I'm standing in front of a safe, but I don't know
the combination. I only have a minute. 'I can't open the safe', I
say. But whatever the right combination is, I *can* dial that
combination. So I can open the safe, but only by luck. What I can't do
is *open the safe deliberately* or *at will*, which is
logically stronger than *open the safe*.

**Example 4**. You don't know the way to the train station. I
could walk you there, but due to a disability I can only walk very
slowly, so that you would miss your train. 'I can't walk you to the
station', I say, meaning that I can't walk you there *in
time*.

It is not obvious that these are examples of a single phenomenon. But they all have in common that a 'can P' statement is interpreted as 'can Q', where Q is logically stronger than P.

The effects arguably also arise for deontic 'can' and 'must'. For
example, 'you must (or must not) raise your arm' is naturally
understood as conveying an obligation to *actively* raise the
arm.

What might explain these effects? I can think of five explanations, none of them very good.

**Explanation 1**. The appearance of a strengthened prejacent
comes about through a contextual restriction on the domain over which
the modal quantifies. I don't see how this could work, if we want to
retain the idea that the domain of the modal is given by contextually
salient worlds compatible with relevant circumstances (or the most
"ideal" of these worlds relative to some salient ordering). Consider
example 1. The circumstances surely allow me to lift my left arm with
my right arm. We might try to say that worlds in which I don't
"actively" perform an act are ignored. But in worlds where I lift my
left arm with my right arm, I *am* actively performing that
act. And it is unclear what the relevant ideal could be. (The best
candidate is perhaps an ideal of normalcy, but that doesn't
generalise.)

**Explanation 2**. The prejacents are ambiguous. 'Lift an arm'
is ambiguous between actively lifting an arm and passively lifting an
arm; 'opening the safe' is ambiguous between opening the safe
deliberately and opening it by luck. But this doesn't seem right. If I
opened the safe by luck, the claim that I did not open the safe is
unambiguously false.

**Explanation 3**. The strengthened prejacent is an
implicature. After all, it can be cancelled: 'I can open the safe, but
only by luck'; 'I can play the piano, but only poorly'; 'I can lift my
arm, but only with the other arm'. But I can't think of a Gricean
explanation for the supposed inference. In the three examples, a
sentence is uttered that is literally false (on the present
explanation). What kind of reasoning leads us from there to the
conclusion that the speaker wanted to convey an alternative
proposition that is true? Paradigm examples of implicatures
strengthen the content of an assertion; here it weakens the
content. Also, the supposed implicatures are equally present in
questions ('can you raise your arm?'), which makes it hard to explain
them in terms of norms of assertion.

**Explanation 4**: The prejacent is strengthened by a process of
"free enrichment". Recanati, Bach, Carston and other have argued that
when we process utterances, we often supplement the uttered sentence
by further, unarticulated constituents that don't have to be
pronounced because they can be taken for granted in the relevant
context. Perhaps this happens in the prejacent of our modals. For
example, 'I can't raise my arm' is understood as 'I can't raise my arm
actively' – with the adverb 'actively' supplemented in the
contextual processing. This would explain the observed effects, but
the whole idea of free enrichment is controversial, and there is (to
my knowledge) no precise model that would predict when an enrichment
occurs, and what kind of enrichments can occur.

**Explanation 5**: Modals have a hidden parameter of adverbial
type that can either be left empty or supplied by conversational
context. If the parameter is supplied, it restricts the interpretation
of the prejacent. In the context of example 1, "active" actions are
relevant, so an unarticulated 'actively' modifier is passed to 'can',
which restricts not the accessible worlds but the interpretation of
the prejacent. This makes the right predictions, but one would like to
have some independent evidence for the postulated mechanism.

(*) If you always maximize expected utility, then over time you're likely to maximize actual utility.

Since "utility" is (by definition) something you'd rather have more of than less, (*) does look like a decent consideration in favour of maximizing expected utility. But is (*) true?

Not in full generality. A well-known counterexample is the "gambler's ruin" scenario. Suppose your utility is measured in pounds sterling. Initially you have £1. Now a fair coin is tossed over and over. On each toss, you have the opportunity to bet your total wealth. If the coin lands heads, you get back three times what you bet. If the coin lands tails, you lose everything. As an expected utility maximizer, you would accept the bet each time. You are then practically certain to end up with £0 over time. So maximizing expected utility does not make it likely that in the long run you'll have a lot of actual utility.
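The arithmetic behind the scenario is easy to check with a toy simulation. This is a sketch of my own; the 50-toss horizon and the number of runs are arbitrary choices:

```python
import random

def gamble(tosses, rng):
    """Bet total wealth on each toss: heads pays back 3x the stake, tails loses it."""
    wealth = 1.0
    for _ in range(tosses):
        if rng.random() < 0.5:   # heads: triple the wealth
            wealth *= 3
        else:                    # tails: ruined
            return 0.0
    return wealth

rng = random.Random(0)
runs = 10_000
ruined = sum(1 for _ in range(runs) if gamble(50, rng) == 0.0)
print(f"fraction ruined after 50 tosses: {ruined / runs}")
```

Each bet has positive expected value (0.5 × 3w = 1.5w > w), yet surviving 50 tosses requires 50 consecutive heads, which has probability 2^-50 — so essentially every run ends in ruin.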

So the long-run argument must be a little more complicated. Perhaps
(*) holds in a lot of normal cases. Then we could argue that in
*those* cases, one should maximize expected utility. And perhaps
we could cover the "non-normal" cases by arguing that the same
principle should be used for all cases.

So under what conditions is (*) true?

The only answer I've come across in conversation and in the literature refers to repeated decision problems and the Laws of Large Numbers. (This is one of two arguments for the expected utility norm discussed by Ray Briggs in their Stanford Encyclopedia article on the norm.) The argument is simple.

Suppose you face the very same decision problem again and again, with the same options, same outcomes, same probabilities, and same utilities. Focus on a particular option, and assume it is chosen over and over. The Law of Large Numbers implies that the relative frequency of every possible outcome is likely to converge to the probability of that outcome. Consequently, the average actual utility of the option is likely to converge to its expected utility. Which is just what (*) says.

As Ray points out, the argument is not very convincing, because the conditions for (*) are so unusual. In real life, we practically never face the very same decision problem again and again.

In addition, the Law of Large Numbers only tells us what happens *in
the limit*. So the argument does not actually favour expected
utility maximization over, say, the alternative strategy of
*minimizing* expected utility in the first 10^100 decisions and
thereafter maximizing expected utility. In the infinite limit, this
strategy converges to the same average utility as maximizing expected
utility.

But these problems can be fixed. Let's start with the easier one, the second.

Take any option X in the repeated decision problem, and let O be one
of the outcomes it might produce. Let p be the probability of O (given
X) in a single trial. The number of times that O comes about in n
trials then has a Binomial distribution with mean np and variance
np(1-p). As n gets larger, the relative frequency of O among all
trials is therefore likely to be close to the probability p –
and not just in the infinite limit. For example, with p=0.5 and n=100,
the probability that the relative frequency lies between 0.4 and 0.6
is about 0.96. So, for any possible outcome of X, the relative frequency of
that outcome is likely to quickly approach its probability. And so the
average utility of X is likely to *quickly* approach its expected
utility.
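The figure for the fair-coin case can be computed exactly with the standard library — nothing here beyond the binomial formula just described (with p = 0.5, each sequence of 100 trials has probability 2^-100):

```python
from math import comb

n = 100
# P(40 <= #successes <= 60) for Binomial(n=100, p=0.5):
prob = sum(comb(n, k) for k in range(40, 61)) / 2**n
print(f"P(0.4 <= freq <= 0.6) = {prob:.4f}")
```

The exact value is about 0.965, in line with the normal approximation (the frequency has mean 0.5 and standard deviation 0.05, so the interval spans ±2 standard deviations).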

Now let's see if we can drop the assumption that the same decision problem is faced again and again. With the help of some probability theory, this turns out to be relatively easy, once the question is expressed in the right way.

Suppose an agent faces n decision problems in a row; the problems need
not be identical. Let a *strategy* be a function that selects one
option in each problem. Hold fixed some such strategy S. Let U_i be a
random variable that maps each state in the i-th decision problem to
the utility of following strategy S in that problem. Let T = \sum_i
U_i. So T is a random variable for the total (actual) utility gained
by following strategy S across all n problems. We want to compare T
with the sum of the *expected* utilities of following S in all n
problems. Notice that the expected utility of following S in problem i
is simply the mean of U_i. So what we need to show is that

(**) As n gets large, the sum T of n random variables U_i is likely to (quickly) approach the sum of the means of these variables.

We can't prove (**), because it is not generally true. But it
*is* true in a wide range of cases. In particular, suppose the
probabilities in the different decision problems are independent. Then
elementary probability theory already implies that the mean of T
equals the sum of the means of the U_i, and the variance of T equals
the sum of the variances of the U_i, assuming these means and
variances exist. If the U_i distributions satisfy certain further
assumptions (such as Lindeberg's
condition), then a generalised form of the Central
Limit Theorem reveals that T will in fact approach a Gaussian
distribution with that mean and variance. And the Berry-Esseen
Theorem reveals that under certain assumptions, the approximation
happens quickly.
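A quick numerical sketch of (**) for heterogeneous problems. The payoff distributions below are made up for illustration — independent Gaussian utilities with different means and spreads per problem, so the independence and Lindeberg-style conditions are satisfied:

```python
import random

rng = random.Random(1)
n = 1000

# Each "problem" i: following the fixed strategy S yields a utility U_i drawn
# from a problem-specific distribution (different means and spreads).
means = [rng.uniform(-5, 5) for _ in range(n)]
spreads = [rng.uniform(0.5, 3.0) for _ in range(n)]

total = sum(rng.gauss(m, s) for m, s in zip(means, spreads))
expected_total = sum(means)

# The total is within a few standard deviations (~60 here) of the expected
# total, so the per-problem gap is small even though the problems differ.
gap_per_problem = abs(total - expected_total) / n
print(f"per-problem gap between actual and expected total utility: {gap_per_problem:.3f}")
```

Since the variance of the total grows linearly in n while the gap is divided by n, the per-problem gap shrinks like 1/√n — the "quickly" part of the claim.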

So under the assumptions just mentioned, over time the total utility gained by following any given strategy S is indeed likely to (quickly) approach the sum of the expected utility of the option selected by S in the individual problems. In other words, you're likely to maximize total actual utility by maximizing expected utility in each decision problem.

We've still made some fairly strong assumptions. In particular:

(1) I've assumed that which n decision problems are faced does not depend on the choices made in earlier decision problems. This is not the case in the "gambler's ruin" scenario. It seems plausible to me that the assumption could be weakened, but I'm not sure how.

(2) I've assumed that the agent has a probability over the joint space comprising the states of all individual decision problems, and that these probabilities are independent across different decision problems. In real life, one might have thought that our probabilities usually change between decision problems, and that the probabilities of the states are rarely independent. Again, I think these assumptions can plausibly be justified and/or relaxed. For example, the relevant joint probability doesn't have to be the agent's initial probability before the first problem; we could take the probabilities over the states in the second problem to be given by the agent's probabilities over these states after the first problem has been resolved. This needs to be spelled out more carefully though. There are also variants of the Central Limit Theorem that don't assume full independence.

We also needed to assume that the individual decision problems satisfy certain further constraints such as Lindeberg's condition. Concretely, this means that we have to rule out (for example) that the utilities in one decision problem are vastly greater than the utilities in all others; otherwise the actual total utility will be almost entirely determined by the choice in this single problem. That doesn't seem too unrealistic.

Anyway, I feel I'm reinventing the wheel. Surely all this has been noticed before. Strangely, I can't find any discussion of it anywhere. The only version of the long-run argument that I've seen in the literature is the silly one involving infinite repetitions of the very same decision problem.

Oops, I initially got both Ray Briggs's name and gender wrong. Sorry. Corrected.

An allegedly attractive feature of realist structuralism is that it is faithful to mathematical practice. Unlike various forms of eliminativism or fictionalism, we can accept mathematical theorems as literally true statements about an objective, mind-independent part of reality. Unlike classical Platonism, we don't have to assume that there is a special realm of abstract particulars. According to realist structuralism, the number 2 is not a special particular, but a "place in a structure". In fact, the number 2 figures in different structures, and thus has different properties depending on whether we do arithmetic, real analysis, or complex analysis.

OK, but what exactly is a place in a structure? If we're realist structuralists, we don't want to say that the number 2 is an abstract particular. So presumably it is itself a structural property. What else could it be?

On this view, the number 2 is a property that can be instantiated by different particulars within an instantiation of a more complex structure. More specifically, let C(x) be the conjunction of all predicates true of the number 2 in the complex field, expressed in terms of the structural relations of addition, subtraction, etc., and logical expressions. Then the number 2 is the property of being an x such that C(x).

On reflection, though, this doesn't actually work. For one thing, if we want to take mathematics at face value, we now have to say that '2+0=2' states that the result of applying the addition operation to the aforementioned property C(x) and a similarly defined complex property C'(x) is the property C(x). That is, the addition operation must be defined to operate on structural properties. But note that the addition operation '+' also figures in C(x). And here it arguably can't be interpreted as an operation on structural properties. After all, we want to say that in any instantiation of the complex field by some particulars P, addition relates the elements of P, not abstract properties that remain constant from instantiation to instantiation.

I haven't seen this problem discussed anywhere. But another problem has recently received some attention. The problem is that the numbers i and -i have the exact same structural properties (in the complex field). That is, the complete structural description of i also completely describes -i, and vice versa. So if numbers are "places in a structure", and a place in a structure is a structural property, then it seems to follow that the number i is the very same number as the number -i. But we don't want to say that i = -i.

So as realist structuralists, we shouldn't say that numbers are structural properties.

But then what are the numbers, if we don't have anything else in our ontology than properties and concrete particulars? It's not helpful, I think, to keep talking about "places in a structure", or about "parts of complex properties". These are metaphors, and as far as I can tell there is no good explanation of what they could mean.

It seems to me that what a realist structuralist should say instead is that numbers – even numbers in the context of complex analysis – are not determinate things at all: numerals like '2' and 'i' are not straightforwardly referring terms.

Rather, we have the complex structure C, in Plato's heaven. This structure has two "places" for i and -i, in the sense that any instantiation of the structure will identify two individuals as i and -i. That's what we mean when we say that i is not equal to -i. The statement (i ≠ -i) is not a statement about two specific things – two individuals, or two properties. Translated into ontologese, it is a universally quantified statement about all (possible) instantiations of the structure C.

But now we're two thirds of the way to eliminative structuralism. According to eliminative structuralism, mathematics is not the study of a special domain at all. Not of special, abstract particulars. Nor of special, abstract structures. Rather, mathematical statements are interpreted as universally quantified statements about all (possible) instances of relevant axioms.

The upshot, I think, is that it's harder to "take mathematics at face value" than many structuralists claim. A statement like 'i ≠ -i' seems to express the non-identity of two definite things. But it's hard to see how it could do that.

Indeed, it is hard to see how on *any* view the terms 'i' and
'-i' could have determinate reference. Even if you're a classical
Platonist and believe in a special domain of complex numbers, how does
our word 'i' manage to latch onto a specific element of that domain?

There might be a nice application here for Kit Fine's "semantic relationism".

For example, suppose we have an indeterministic coin that we don't toss. In this context, I'd say (1) is true and (2) is false.

(1) If I had tossed the coin it might have landed heads.

(2) If I had tossed the coin it would have landed heads.

These intuitions are controversial. But if they are correct, then the
might counterfactual (1) can't express that the corresponding would
counterfactual is epistemically possible. For we know that the would
counterfactual is false. That is, the 'might' here doesn't scope over
the conditional. Rather, the might counterfactual (1) seems to express
the dual of the would counterfactual (2), as Lewis suggested in
*Counterfactuals*: 'if A then might B' seems to be equivalent to
'not: if A then would not-B'.

On the other hand, consider the following situation. We know that
the laws of nature entail *either* that whenever A happens then B
happens *or* that whenever A happens then C happens; we don't
know which. In the first case, if the laws of nature entail that
whenever A happens then B happens (for ordinary A and B), it seems to
me that (3) is true.

(3) If A had happened then B would have happened.

Similarly, in the second case, if the laws of nature entail that whenever A happens then C happens, then (4) is true.

(4) If A had happened then C would have happened.

So we know that one of (3) or (4) is true. Now consider the corresponding might counterfactuals.

(5) If A had happened then B might have happened.

(6) If A had happened then C might have happened.

Intuitively, these are both true as well. But if might counterfactuals are the dual of would counterfactuals, then (5) entails the negation of (4) and (6) entails the negation of (3), assuming B and C are logically incompatible. (By duality, (5) is equivalent to 'it is not the case that if A had happened then not-B would have happened'. This contradicts (4).) So the might counterfactuals (5) and (6) can't be the duals of the corresponding would counterfactuals.

Instead, the 'might' here does seem to scope over the corresponding would conditional: we don't know which of the would counterfactuals (3) and (4) is true, and that seems to be expressed by (5) and (6).

Are there also cases where 'might' takes narrow scope in the consequent of a would counterfactual? I remember that I used to think so, but sadly I can't remember any relevant example.

Over time, I changed my mind. Nowadays, I'd like to say that 'would' and 'might' are epistemic modals that are evaluated relative to a subjunctive supposition. That is, a subjunctive 'if' clause updates the information state of the utterance context by "imaging" on the antecedent; 'would' then expresses that the updated information state supports the consequent, while 'might' expresses that the updated information state is compatible with the consequent.

What does this view predict for the above two kinds of scenarios?

In the case of the indeterministic coin, the intuition that (1) is true and (2) false is vindicated. Supposing (subjunctively) that the coin is tossed, it is uncertain how it lands. So we can say that it might land heads, but not that it would land heads. In general, the new view essentially vindicates the duality of might and would counterfactuals: 'might B' is true relative to a certain subjunctive supposition A iff 'would not-B' is false relative to that supposition.

But now we run into trouble with the unknown laws scenario, where duality seems to fail.

To be sure, the intuitions here don't say that either (3) or (4)
is actually assertable. Subjunctively supposing A, we can't say that B
*would* have happened, nor that C *would* have happened. The
subjunctive supposition A supports 'would B' only if it is evaluated
under the indicative supposition that the laws of nature say 'if A
then B'. But are we right when we judge that either (3) or (4) is
true? Is the disjunction of (3) and (4) assertable even though neither
of the disjuncts is assertable?

There are different ways to go here. One possibility is to revise the account I have sketched and argue that (would and might) counterfactuals are evaluated not relative to our subjective information state imaged on the antecedent, but relative to some more objective information state -- the (actual) objective chance function, for example. But it's not clear how that helps. One of (3) or (4) will come out as clearly true. But (5) and (6) come out false, unless 'might' scopes over the conditional. And I don't think it does. Consider (5').

(5') What if A had happened? It might be that B had happened.

This seems to me to say the same thing as (5). But it's not plausible that the 'might' in the second sentence somehow scopes over the 'if' in the first.

I'd rather stick with the idea that counterfactuals are evaluated relative to subjective information states. The problem raised by the laws case is then related to a more general problem: to explain why counterfactuals intuitively seem to describe objective and possibly unknown facts about the world.

Let's have a closer look at the "imaging" function that defines subjunctive supposition. Roughly speaking, when we subjunctively suppose A, we shift the (subjective) probability of any world w to the A-world closest to w. Which A-world is closest to w is determined by intrinsic facts about w: the laws, the past, or whatever. Some worlds are such that the closest A-worlds are B-worlds, others are not. On the view I sketched, 'if A then would B' is assertable only if the worlds in our subjective information state are all of the first kind. That's how counterfactuals appear to describe an objective feature of the world (and that's how the Lewis-Stalnaker account comes out almost right on the new account).
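The imaging rule just described can be illustrated with a toy model of the indeterministic coin from the start of the post. The worlds, the prior, and the closeness map below are stipulations of mine; since the untossed world has two equally close A-worlds (the coin is indeterministic), I split its mass evenly, in the style of "general imaging" rather than Lewis's unique-closest-world version:

```python
# Worlds: 'n' = coin not tossed, 'h' = tossed & heads, 't' = tossed & tails.
prior = {'n': 0.5, 'h': 0.25, 't': 0.25}

# Closest A-world(s) from each world, where A = 'the coin is tossed'.
# A-worlds are closest to themselves; the untossed world's mass splits.
closest = {'n': {'h': 0.5, 't': 0.5}, 'h': {'h': 1.0}, 't': {'t': 1.0}}

def image(prior, closest):
    """Shift each world's probability to its closest A-world(s)."""
    post = {w: 0.0 for w in prior}
    for w, p in prior.items():
        for target, share in closest[w].items():
            post[target] += p * share
    return post

post = image(prior, closest)

heads_worlds = {'h'}
# 'might heads': the imaged state is compatible with heads.
might_heads = any(post[w] > 0 for w in heads_worlds)
# 'would heads': the imaged state supports heads (all live worlds are heads-worlds).
would_heads = all(w in heads_worlds for w, p in post.items() if p > 0)

print(post)                      # {'n': 0.0, 'h': 0.5, 't': 0.5}
print(might_heads, would_heads)  # True False
```

This reproduces the verdicts on (1) and (2): after imaging on 'the coin is tossed', 'might heads' comes out true and 'would heads' false, and the two are duals relative to the imaged state.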

In the scenario with the unknown laws, we know that the world is one of two ways: the laws either say 'if A then B' or 'if A then C'. On the supposition that it is the first way, (3) is assertable; on the supposition that it is the second way, (4) is assertable. If we read 'S is true at w' as 'S is assertable on the supposition w', then (3) is true at some worlds in our information state and (4) is true at the remaining worlds. If 'A or B' is true at w iff one (or more) of A and B is true at w, the disjunction of (3) and (4) comes out true at all worlds in our information state. The disjunction is true even though neither disjunct is assertable.

That looks promising to me. But it all needs to be spelled out more carefully.

Someday.

3. Brendan Fong and David Spivak wrote a free textbook on applied category theory. I've dabbled in category theory every now and then (e.g. when trying to work through some of Silvio Ghilardi's work on counterpart semantics), but I never really got the hang of "why?". Peter Smith's (also free) Gentle Introduction to Category Theory helped a little, but Fong and Spivak approach the question much more directly. John Baez is currently going through the book in an "online course", giving his own commentary and exercises on each chapter. (Well, on chapter 1, so far.)
