<?xml version="1.0" encoding="iso-8859-1"?>

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns="http://purl.org/rss/1.0/">

<channel rdf:about="http://www.umsu.de/wo/">
  <title>wo's weblog</title>
  <link>http://www.umsu.de/wo/</link>
  <description>Musings in Analytical Philosophy</description>
  <items>
    <rdf:Seq>
      <rdf:li resource="http://www.umsu.de/wo/2012/578" />
      <rdf:li resource="http://www.umsu.de/wo/2012/577" />
      <rdf:li resource="http://www.umsu.de/wo/2012/576" />
      <rdf:li resource="http://www.umsu.de/wo/2012/575" />
      <rdf:li resource="http://www.umsu.de/wo/2011/574" />
    </rdf:Seq>
  </items>
</channel>

  <item rdf:about="http://www.umsu.de/wo/2012/578">
    <title>Counterexamples to Stalnaker's Thesis</title>
    <link>http://www.umsu.de/wo/2012/578</link>
    <dc:date>2012-04-21T15:27:00+02:00</dc:date>
    <description><![CDATA[<p>I like a broadly Kratzerian account of conditionals. On this account, the function of if-clauses is to restrict the space of possibilities on which the rest
of the sentence is evaluated. For example, in a sentence of the form
'the probability that if A then B is x', the if-clause restricts the
space of possibilities to those where A is true; the probability of B
relative to this restricted space is x iff the unrestricted
conditional probability of B given A is x. This account therefore
validates something that <i>sounds</i> exactly like
"Stalnaker's Thesis" for indicative conditionals:</p>

<blockquote>
   Thesis: P(if A then C) = P(C/A).
</blockquote>

<p>On the account I like, if you say 'P(if A then C)' in
English, you almost inevitably end up saying something that denotes
the conditional probability P(C/A), rather than the unconditional
probability of some proposition expressed by 'if A then C'.</p>

<p>So it's interesting that Vann McGee and Stefan Kaufmann have found
intuitive counterexamples to Stalnaker's Thesis. One of Kaufmann's
examples in <a href="http://philpapers.org/rec/KAUCAT">"Conditioning against the grain"</a> goes as follows. There are two bags. In bag X, most balls are
red, and most of the red balls have black spots. In bag Y, few balls
are red, and few of those balls have black spots. You are 75%
confident that the bag in front of you is bag Y. Now consider the
statement:</p>

<blockquote>
    (1) If you pick a red ball, it will not have black spots.
</blockquote>

<p>Many people apparently intuit (1) to have fairly high probability. I
take that to mean that they would assent to</p>

<blockquote>
    (1') Probably, if you pick a red ball, it will not have black spots.
</blockquote>

<p>This contradicts the Thesis, because getting a red ball is evidence
that the bag in front of you is bag X, in which case it is rather
likely that the ball has black spots.</p>

<p>As Kaufmann observes, if these facts are made salient -- if one
points out that picking a red ball is much more likely if it's bag X
rather than Y, and that most red balls in bag X have spots -- then
people's intuitions switch and they deem (1) to have low
probability. So it looks like the Thesis is right about some
contexts, but not about others.</p>

<p>Kaufmann's explanation is that there are two ways of evaluating
conditional probabilities, one "local" and one "global". Globally,
'P(if A then B)' denotes P(B/A); locally, 'P(if A then B)' denotes the
expectation of P(B/A) relative to a certain partition, here the
partition of bags { X, Y }:</p>

<blockquote>
    (L) P(if A then B) = P(B/AX)P(X) + P(B/AY)P(Y).
</blockquote>

<p>The idea, which sounds plausible, is that when we judge (1) to be probable, we hold fixed that
P(Y)=0.75 and note that P(No Spots / Red &amp; Y) is high, which by
(L) means that the probability of (1) is high.</p>
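
<p>To see how far the two evaluations can come apart, here is a minimal sketch in Python. The ball counts are assumptions made up for illustration (the scenario as described above only says "most" and "few"); only the 75% credence in bag Y comes from the text. With these assumed counts, the global value P(No Spots / Red) comes out as 2/5, while (L) gives 7/10.</p>

<pre>
```python
from fractions import Fraction as F

# Hypothetical bag contents, assumed purely for illustration.
bags = {
    "X": {"red_spotted": 9, "red_plain": 1, "other": 2},
    "Y": {"red_spotted": 1, "red_plain": 9, "other": 50},
}
prior = {"X": F(1, 4), "Y": F(3, 4)}  # 75% confident that it is bag Y


def p_red(bag):
    """P(Red / bag): chance of drawing a red ball from the given bag."""
    counts = bags[bag]
    return F(counts["red_spotted"] + counts["red_plain"], sum(counts.values()))


def p_plain_given_red(bag):
    """P(No Spots / Red and bag)."""
    counts = bags[bag]
    return F(counts["red_plain"], counts["red_spotted"] + counts["red_plain"])


# Global evaluation: ordinary conditional probability P(No Spots / Red).
p_red_and_plain = sum(prior[b] * p_red(b) * p_plain_given_red(b) for b in bags)
p_red_total = sum(prior[b] * p_red(b) for b in bags)
global_value = p_red_and_plain / p_red_total

# Local evaluation, rule (L): hold the prior over bags fixed.
local_value = sum(prior[b] * p_plain_given_red(b) for b in bags)

print("global:", global_value)
print("local: ", local_value)
```
</pre>

<p>The difference arises because the global evaluation reweights the bags by how likely each is to yield a red ball, whereas (L) keeps the prior weights P(X) and P(Y).</p>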

<p>But why would we use (L) to evaluate
conditional probabilities? The "global" evaluation that conforms to
Stalnaker's Thesis is predicted by the general Kratzer-style semantics of 'if' and 'probability'. Where does
the "local" reading come from?</p>

<p>Kaufmann suggests that the two evaluations correspond to different
ways of supposing A, and also that the local evaluation can be understood
as giving the expected conditional chance of B given A, since chance is credence conditionalised on the true
member of a relevant partition. Both of these remarks suggest that (L)
could give the <i>subjunctive</i> conditional probability of B given
A, P(B\A), rather than the indicative conditional probability P(B/A). Indeed,
the kind of compartmentalised conditioning that figures in (L) is
precisely what Lewis uses in "Causal Decision Theory" to define the
imaging function for subjunctive conditional probabilities.</p>

<p>So maybe that's what's going on: when people judge (1) to be
probable, they read the conditional as subjunctive. This isn't too
implausible, I think, because in English the distinction between the
subjunctive and indicative reading is usually only marked in the past
tense. Read subjunctively, the intuitive judgement about (1) is correct, as can
be seen if one enforces this reading by saying "if you <i>were</i> to
pick a red ball, it <i>would</i> not have black spots".</p>

<p>The hypothesis that the subjunctive reading is in play might
also be supported by the fact that the intuition about (1) becomes
much weaker -- I think -- if the sentence is put into the past. Suppose you've drawn a ball but haven't looked at it
yet. Consider:</p>

<blockquote>
    (1'') If you picked a red ball, then it does not have black spots.
</blockquote>

<p>The hypothesis also fits the phenomenon that people's intuitions
flip when it is pointed out that picking a red ball makes it more
likely that it's bag X than bag Y: this context, where the topic is
what is evidence for what, makes the indicative reading salient.</p>

<p>So far, so good. Unfortunately, the present story does not work for
McGee's examples. Here is one Kaufmann discusses as well. Initially,
you believe that Murdoch died in an accident. Then somebody who you
think is probably Sherlock Holmes says that Murdoch was killed, that
Brown is probably the murderer, and that in any case</p>

<blockquote>
    (2) If Brown didn't kill Murdoch, then someone else did.
</blockquote>

<p>According to McGee, most people now regard (2) as highly
probable. However, if it turns out that Brown didn't kill Murdoch,
then you'd lose your confidence that the speaker is Holmes, and thus
return to your judgment that Murdoch died in an accident. So the
(indicative) conditional probability corresponding to (2) is low.</p>

<p>Kaufmann doesn't find this problematic, since it conforms to his
local evaluation rule (L), this time using the partition { he's
Holmes, he's not Holmes }.  But this application of (L) cannot
plausibly be taken to give the subjunctive conditional probability of
someone else killing Murdoch given that Brown didn't kill him. The
subjunctive probability is surely low. If you think that Brown
probably killed Murdoch, you will not judge it very probable that if
Brown hadn't killed him then someone else would have. Moreover, it is
anyway implausible that people are reading (2) subjunctively, because
it is in the past tense.</p> 

<p>The reason why Kaufmann's rule (L) here doesn't yield subjunctive
conditional probability is that it uses a bad partition { Holmes, not
Holmes }. (This also makes it implausible to describe (L) as computing
expected conditional chance.) Roughly speaking, the cells of a good
partition would say enough about the world and its causal
structure so that, combined with either the assumption that Brown did
kill Murdoch or that he didn't, each cell would entail whether someone
else killed Murdoch. Applying (L) to such a partition yields a low
conditional probability.</p>
    
<p>(L) is
partition-dependent: the "local" probability of a conditional depends
on the chosen partition. By choosing a suitable partition, we can let
the local probability have almost any value we like. Kaufmann stresses
that not all partitions are acceptable for (L), and that the right
partitions must somehow encode the "causal structure of the scenario"
[p.598]. But it isn't clear why this makes { Holmes, not Holmes }
acceptable.</p>

<p>Let's redescribe Kaufmann's first example with a different partition.
Again, you get to draw a ball from either bag X or bag Y; X contains
mostly red balls with mostly black spots, Y has few red balls, few of
which have black spots; based on your evidence, you are 75% certain
that the bag in front of you is bag Y. If the contents of the bags are
precisely specified (as they are in Kaufmann's example), it is possible to calculate
your probability for the hypothesis that you draw a red ball from bag
Y. Let this hypothesis be called RY. Given your evidence, the
probability of RY is quite low, say 0.05. So you're very confident
that not-RY is true. Moreover, if not-RY is indeed true and you draw a red
ball, then the ball can only come from bag X, in which case it
probably has black spots. Now consider</p>

<blockquote>
    (1) If you pick a red ball, it will not have black spots.
</blockquote>

<p>I suspect many would judge (1) to have low probability in this
context, lower than P(No Spots/Red) and much lower than the
subjunctive P(No Spots\Red). But the scenario is exactly the same as
Kaufmann's -- I've just made a different partition salient.</p>
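
<p>The partition-dependence can be made concrete with a small sketch. Everything numerical here is an assumption for illustration: the ball counts are made up, and with them P(RY) comes out as 1/8 rather than the 0.05 used above. The helper names <code>prob</code>, <code>cond</code> and <code>local</code> are likewise hypothetical. Under these assumptions, rule (L) yields 7/10 for the partition { X, Y } but only 1/5 for { RY, not-RY }, while the ordinary conditional probability stays at 2/5.</p>

<pre>
```python
from fractions import Fraction as F

# Hypothetical bag contents, assumed purely for illustration.
counts = {
    "X": {"red_spotted": 9, "red_plain": 1, "other": 2},
    "Y": {"red_spotted": 1, "red_plain": 9, "other": 50},
}
prior = {"X": F(1, 4), "Y": F(3, 4)}

# A "world" is a pair (bag, kind of ball drawn); P maps worlds to credences.
P = {
    (bag, ball): prior[bag] * F(n, sum(balls.values()))
    for bag, balls in counts.items()
    for ball, n in balls.items()
}


def prob(event):
    """Probability of an event, given as a set of worlds."""
    return sum(P[w] for w in event)


def cond(b, a):
    """Ordinary conditional probability P(b / a)."""
    return prob(b.intersection(a)) / prob(a)


def local(b, a, partition):
    """Rule (L): P(b / a and cell), averaged with weights P(cell).

    Assumes a has positive probability within every cell.
    """
    return sum(cond(b, a.intersection(cell)) * prob(cell) for cell in partition)


worlds = set(P)
red = {w for w in worlds if w[1] != "other"}
no_spots = {w for w in worlds if w[1] != "red_spotted"}
bag_x = {w for w in worlds if w[0] == "X"}
bag_y = worlds.difference(bag_x)
ry = red.intersection(bag_y)      # red ball drawn from bag Y
not_ry = worlds.difference(ry)

print("P(No Spots / Red):     ", cond(no_spots, red))
print("(L) with {X, Y}:       ", local(no_spots, red, [bag_x, bag_y]))
print("(L) with {RY, not-RY}: ", local(no_spots, red, [ry, not_ry]))
```
</pre>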

<p>Here is one lesson we might draw. There aren't just two kinds of
conditional probabilities, indicative and subjunctive, but infinitely
many, one for each choice of a partition. Every partition induces an
imaging function and thereby a type of subjunctive supposition. We
could then also fold indicative conditional probability into the
subjunctive kind, induced by the single-membered partition. Context
usually determines which partition is salient for statements about
conditional probability (i.e. for statements that look like statements
about the probability of a conditional).</p>

<p>Maybe. But if that's true, I'd like it to follow from the general
semantics of 'if' and 'probability'. Neither of these, by itself, seems to
be sensitive to the contextually salient partition -- at least not to
the extent required for the present proposal to work.</p>

<p>I prefer another, perhaps more obvious, explanation: people who
intuit that (2) is probable and (1) very improbable (in the revised
context) have made a mistake.</p>

<p>Where does the mistake come from? In part, it may come from the
fact that the (standard, indicative) conditional probability is a bit
hard to determine in these cases, because one has to keep track of two
factors that pull in opposite directions. For example, in the case of
(2), the hypothesis that Brown didn't kill Murdoch supports that
someone else did it within the "Holmes" cell of the partition, but
simultaneously lowers the probability of that cell and thereby the
probability that Murdoch was killed.</p>

<p>More importantly, I think the mistake comes from the grammatical
illusion that a question about the probability of a conditional is
a question about the probability of a certain proposition. If A is
a proposition and { X, Y } a partition, then of course</p>

<blockquote>    
       P(A) = P(A/X)P(X) + P(A/Y)P(Y).
</blockquote>

<p>So we can always evaluate the probability of a proposition by
considering its probability under different hypotheses and then taking
the weighted average. The result never depends on the chosen
partition. When asked about the probability that if A then B, we
mistakenly apply the same recipe, not realising that 'the probability
that if A then B on the assumption that X' denotes P(B/AX) rather than
something of the form P(A-&gt;B/X).</p>

<p>Consider another of McGee's examples. Quantum mechanics entails
that</p>

<blockquote>
    (3) If all atoms in this table decay within the next second, then
        Z amount of energy is released,
</blockquote>

<p>for some particular value Z. McGee suggests that if we trust
quantum mechanics, then we will assign high probability to
(3). However, P(Z released / table decays) is low, since seeing the
table suddenly decay would dramatically lower our confidence in
quantum mechanics.</p>

<p>If the probability of (3) is the probability of a certain
proposition that's entailed by quantum mechanics, then it is clear why
trusting quantum mechanics requires assigning high probability to
(3). But on the Kratzerian account, there is no such proposition, at
least not if the conditional is read indicatively. (It could also be
read as a nomologically strict conditional, in which case the failure
of Stalnaker's Thesis is unproblematic.) On the indicative reading,
there is no proposition that is (i) entailed by quantum mechanics and
(ii) whose probability is in question when we ask about the
probability of (3). Perhaps it is the prima facie plausibility that there is a proposition satisfying (i)
and (ii) that explains why we mistakenly think the probability of (3)
must be high, even on the indicative reading.</p>

]]></description>
  </item>
    <item rdf:about="http://www.umsu.de/wo/2012/577">
    <title>Possible worlds and non-principal ultrafilters</title>
    <link>http://www.umsu.de/wo/2012/577</link>
    <dc:date>2012-02-25T18:23:00+01:00</dc:date>
    <description><![CDATA[<p>It is natural to think of a possible world as something like an extremely specific story or theory. Unlike an ordinary story or theory, a possible world leaves no question open. If we identify a theory with a set of propositions, a possible world could be defined as a theory T which is</p>
<ol>
<li>maximally specific: T contains either P or ~P, for every proposition P;</li>
<li>consistent: T does not contain both P and ~P, for any proposition P;</li>
<li>closed under conjunction and logical consequence: if T contains both P and Q, then it contains their conjunction P &amp; Q, and if T contains P, and P entails Q, then T contains Q.</li>
</ol>

<p>It is often useful to go in the other direction and identify propositions with sets of possible worlds. We can then analyse entailment as the subset relation, negation as complement and conjunction as intersection. Of course, we may not want to say that a world is a (non-empty) set of (consistent) propositions and also that a consistent proposition is a non-empty set of worlds, since these sets should eventually bottom out. But that doesn't seem very problematic, and it is easily fixed as long as there is a simple 1-1 correspondence between worlds and logically closed, consistent and maximally specific theories. In particular, one might suspect that on the present definitions, every logically closed, consistent and maximally specific theory uniquely corresponds to a possible world, namely the sole member of the intersection of the theory's members.</p>

<p>But it looks like this is false. Since there are infinitely many worlds, one can show (e.g. in ZFC) that there are sets of sets of worlds that are logically closed, consistent and maximally specific, but do not single out any particular world: the <a href="https://en.wikipedia.org/wiki/Ultrafilter">non-principal ultrafilters</a> on the space of worlds. The non-principal ultrafilters contain the negation of { W } for every world W. So these theories are true at no world whatsoever. They are nevertheless consistent, since they don't contain any proposition together with its negation.</p>

<p>This is odd. I would like to say that although I sometimes define theories as sets of propositions and propositions as sets of worlds, one can (if one wants) just as well go in the other direction and define possible worlds as logically closed, consistent and maximally specific theories. But the two definitions don't seem to line up. I somehow need to exclude the non-principal ultrafilters, without talking about their set-theoretic construction (which would presuppose my own order of definition). I suppose this could be done by strengthening the closure condition, e.g. by saying that whenever T contains some propositions, then it also contains the (possibly infinite and uncountable) conjunction of those propositions. Would that work? Is there a better response?</p>
]]></description>
  </item>
    <item rdf:about="http://www.umsu.de/wo/2012/576">
    <title>Poor one-boxers</title>
    <link>http://www.umsu.de/wo/2012/576</link>
    <dc:date>2012-02-18T20:12:00+01:00</dc:date>
    <description><![CDATA[<p>Imagine you're a hedonist who doesn't care about other people, nor
about your past or your distant future. All you care about is how much
money you can spend today. Fortunately, you're on a pension that pays
either $100 or $1000 every day, plus an optional bonus. How much you
get is determined as follows. Every morning, a psychologist shows up
to study your brain. Then he puts two boxes in front of you, one
opaque, the other transparent. You can choose to take either both boxes or
only the opaque one.  The transparent box contains a $10 bill. The
opaque box contains nothing if the psychologist has predicted that you
will take both boxes; if he has predicted that you will take one box,
it contains $100. The psychologist's predictions are about 99%
accurate. The content of your boxes is your bonus payment. In addition, you get your
ordinary payment, which is either $100 or $1000 depending on how many
boxes you took the previous day: if you took both, you now get $1000,
otherwise $100. The ordinary payment is given to you before the psychologist 
studies your brain, so by the time you choose between the two boxes, you already 
know how much you received. What do you do?</p>

<p>Evidential decision theory says that you should take only the opaque
box, while causal decision theory says you should take both. Since you already 
know whether your ordinary payment is $100 or $1000, your decision between the boxes has neither causal nor evidential impact on today's ordinary payment. 
It does have an impact on how much you will get tomorrow, but you don't care about
tomorrow. What's left is a standard Newcomb problem.</p>

<p>The upshot is that if you follow evidential decision theory, you
have $200 to spend most days, and occasionally only $100 (if the
prediction went wrong). If you follow causal decision theory, you have
$1010 most days, and occasionally $1110.</p>
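
<p>These figures can be checked with a short sketch. It assumes that the predictor is exactly 99% accurate (the story only says "about 99%") and that the agent makes the same choice every day, so that the ordinary payment is fixed by that choice. The function name <code>daily_outcomes</code> is hypothetical.</p>

<pre>
```python
from fractions import Fraction as F

ACCURACY = F(99, 100)  # assumed exact; the story says "about 99%"


def daily_outcomes(takes_both):
    """Map each possible amount for today to its probability.

    Assumes the agent made the same choice yesterday as today.
    """
    ordinary = 1000 if takes_both else 100   # fixed by yesterday's choice
    transparent = 10 if takes_both else 0    # the $10 bill, if taken
    # The opaque box holds $100 just in case "one box" was predicted.
    p_one_box_predicted = (1 - ACCURACY) if takes_both else ACCURACY
    return {
        ordinary + transparent + 100: p_one_box_predicted,
        ordinary + transparent: 1 - p_one_box_predicted,
    }


def expected(outcomes):
    """Expected amount of money for the day."""
    return sum(amount * p for amount, p in outcomes.items())


one_boxer = daily_outcomes(takes_both=False)
two_boxer = daily_outcomes(takes_both=True)

print("one-boxer:", one_boxer, "expected:", expected(one_boxer))
print("two-boxer:", two_boxer, "expected:", expected(two_boxer))
```
</pre>

<p>On these assumptions the one-boxer can expect $199 a day and the two-boxer $1011.</p>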

<p>In this scenario, the two-boxers might start to ridicule the
one-boxers. "Look how much money we got again today, and how many
beautiful things we could buy. You, on the other hand, have barely
enough for food and rent. Wouldn't you rather be one of us?"</p>

<p>The ridicule is, of course, unfair. When the two-boxers take their
two boxes, they don't care what effect this has on their payment the
following day. They choose two boxes only because this is guaranteed
to give them $10 more today than whatever they would get if they chose
only one box. It just happens that their choice also makes them get
$1000 the following day. The payment setup rewards two-boxers
and punishes one-boxers -- but not because two-boxers are making the
better choice. The setup could just as well have been the other way
round, so that one-boxers would have been much better off than
two-boxers.</p>

<p>Nevertheless, the ridicule is quite similar to the ridicule
two-boxers often get from one-boxers in the standard Newcomb
problem. "If you're so smart", say the one-boxers, "why ain'cha rich?
Look how much money we got; wouldn't you rather be one of us?" We
two-boxers reply that we're poor not because we made the wrong choice,
but because the Newcomb setup rewards one-boxers and punishes two-boxers:
one-boxers mostly get to choose between $100 (in the opaque box)
and $110 (in both boxes together), while two-boxers are given a
choice between $0 (in the opaque box) and $10 (together). 
That's why one-boxers end up with more money -- not because
they made the better choice. (Indeed, if they had taken both boxes, they would
be even richer.)</p>

<p>It would be nice if we could add that the setup might just as well
have been the other way round, rewarding two-boxers rather than
one-boxers. However, it is difficult to come up with such a reversed
setup. Lewis (in "Why ain'cha rich?") conjectured that it can't be done. 
The story above at least seems to come close.</p>

<p>(The story can also be adapted to several agents rather than a
single agent at different times. Suppose infinitely many copies of you
have been created and assigned to the numbers ...-3,-2,-1,1,2,3,...,
with the number 0 reserved for yourself. Each of you first gets either
$100 or $1000. Then you face a Newcomb problem as above. If player i
takes both boxes, player i+1 gets $1000, otherwise they get $100. All you care
about is your own personal profit, and thus all your copies likewise
only care about their own personal profit. If you're a one-boxer (or
more generally an evidential decision theorist), you end up with
either $200 or $100. If you're a two-boxer (a causal decision
theorist), you get either $1010 or $1110.)</p>
]]></description>
  </item>
    <item rdf:about="http://www.umsu.de/wo/2012/575">
    <title>Travel plans</title>
    <link>http://www.umsu.de/wo/2012/575</link>
    <dc:date>2012-02-18T14:00:00+01:00</dc:date>
    <description><![CDATA[<p>I will probably be in Germany from about mid May until the end of June this year.</p>
]]></description>
  </item>
    <item rdf:about="http://www.umsu.de/wo/2011/574">
    <title>Two new papers</title>
    <link>http://www.umsu.de/wo/2011/574</link>
    <dc:date>2011-11-18T17:39:00+01:00</dc:date>
    <description><![CDATA[<p>One: <a href="https://github.com/wo/papers/blob/master/montagovian.pdf?raw=true">Variations on a Montagovian Theme</a>.<br>
Two: <a href="https://github.com/wo/papers/blob/master/fission-update.pdf?raw=true">Belief Dynamics across Fission</a>.</p>

<p>As always, comments are much appreciated.</p>]]></description>
  </item>
  </rdf:RDF>

