Wolfgang Schwarz

Blog

Posts on: Bayesianism

Dynamic rationality

The standard dynamic norm of Bayesianism, conditionalization, is clearly inadequate if credences are defined over self-locating propositions. How should it be adjusted?

This question was popular around 2005-2015. Chris Meacham and I came up with the same answer, which we published in (Meacham 2010), (Schwarz 2012), and (Schwarz 2015). I showed that the replacement norm that we proposed has all the traditional virtues of conditionalization. For example, (under the usual idealized conditions) following the norm uniquely maximizes expected accuracy, and an agent is invulnerable to diachronic Dutch books iff they follow the norm.

De Finetti's theorem without symmetries?

Bruno de Finetti (1970) suggested that chance is objectified credence. The suggestion is explained and defended in Jeffrey (1983, ch.12), Skyrms (1980, ch.I), Skyrms (1984, ch.3), and Diaconis and Skyrms (2017, ch.7), but I still find it hard to understand. It seems to assume that rational credence functions are symmetrical in a way in which I think they shouldn't be.

Are recalcitrant worlds less probable?

The Best-Systems Account of chance promises to explain why beliefs about chance should affect our beliefs about ordinary events, as formalized by the Principal Principle. In this post, I want to discuss a challenge to any such explanation.

First, some background.

For any candidate chance function f, let [f] be the set of worlds of which f is (part of) the best system. According to the Best-Systems Account (BSA), the hypothesis "Ch=f" that f is the true chance function expresses the proposition [f]. In what follows, I'll assume that a world is simply a history of "outcomes", and that the candidate systems can be compressed into a single (possibly parameterized) chance function.

The Wednesday Sleeping Beauty Problem

In 2009, at the ANU, Mike Titelbaum organized a small workshop on the Sleeping Beauty problem. I gave a talk in which I argued that the answer to the problem depends on whether we accept genuinely diachronic norms on rational belief: if yes, halfing is the most plausible answer; if no, we get thirding. A successor of this talk is now forthcoming in Noûs. Here's a PDF. In this post, I want to discuss a surprisingly hard question Kenny Easwaran raised in the Q&A after my talk:

How confident should Beauty be on Wednesday that the coin has landed heads?

Reasoning about doom

I occasionally teach the doomsday argument in my philosophy classes, with the hope of raising some general questions about self-locating priors. Unfortunately, the usual formulations of the argument are problematic in so many ways that it's hard to get to these questions.

Let's look at Nick Bostrom's version of the argument, as presented for example in Bostrom (2008).

Nair on adding up reasons

Often there are many reasons for and against a certain act or belief. How do these reasons combine to an overall reason? Nair (2021) tries to give an answer.

Nair's starting point is a little more specific. Nair intuits that there are cases in which two equally strong reasons combine to a reason that is twice as strong as the individual reasons. In other cases, however, the combined reason is just as strong as the individual reasons, or even weaker.

To make sense of this, we need to explain (1) how strengths of reason can be represented numerically, and (2) under what conditions the strengths of different reasons add up.

Isaacs and Russell on updating without evidence

Isaacs and Russell (2023) propose a new way of thinking about evidence and updating.

The standard Bayesian picture of updating assumes that an agent has some ("prior") credence function Cr and then receives some (total) new evidence E. The agent then needs to update Cr in light of E, perhaps by conditionalizing on E. There is no room, in this picture, for doubts about E. The evidence is taken on board with absolute certainty.

The standard picture thereby assumes that the agent's cognitive system is perfectly sensitive to a certain aspect of the world: if E is true, the agent is certain to update on E; if E is false, the agent is certain to not update on E.

The subjective Bayesian answer to the problem of induction

Some people – important people, like Richard Jeffrey or Brian Skyrms – seem to believe that Laplace and de Finetti have solved the problem of induction, assuming nothing more than probabilism. I don't think that's true.

I'll try to explain what the alleged solution is, and why I'm not convinced. I'll pick Skyrms as my adversary, mainly because I've just read Skyrms and Diaconis's Ten Great Ideas about Chance, in which Skyrms presents the alleged solution in a somewhat accessible form.

Belief downloaders and epistemic bribes

Greaves (2013) describes a case in which adopting a single false belief would (supposedly) be rewarded by many true beliefs.

Emily is taking a walk through the Garden of Epistemic Imps. A child plays on the grass in front of her. In a nearby summerhouse are n further children, each of whom may or may not come out to play in a minute. They are able to read Emily's mind, and their algorithm for deciding whether to play outdoors is as follows. If she forms degree of belief 0 that there is now a child before her, they will come out to play. If she forms degree of belief 1 that there is a child before her, they will roll a fair die, and come out to play iff the outcome is an even number. […]

Shangri La Variations

There are two paths to Shangri La. One goes by the sea, the other by the mountains. You are on the mountain path and about to enter Shangri La. You can choose how your belief state will change as you enter through the gate, in response to whatever evidence you may receive. At the moment, you are (rationally) confident that you have travelled by the mountains. You know that you will not receive any surprising new evidence as you step through the gate. You want to maximize the expected accuracy of your future belief state – at least with respect to the path you took. How should you plan to change your credence in the hypothesis that you have travelled by the mountains?

Higher-order evidence and non-ideal rationality

I've read around a bit in the literature on higher-order evidence. Two different ideas seem to go with this label. One concerns the possibility of inadequately responding to one's evidence. The other concerns the possibility of having imperfect information about one's evidence. I have a similar reaction to both issues. I haven't seen it in the papers I've looked at. Pointers very welcome.

I'll begin with the first issue.

Let's assume that a rational agent proportions her beliefs to her evidence. This can be hard. For example, it's often hard to properly evaluate statistical data. Suppose you have evaluated the data, reached the correct conclusion, but now receive misleading evidence that you've made a mistake. How should you react?

Some (e.g. Christensen (2010)) say you should reduce your confidence in the conclusion you've reached. Others (e.g. Tal (2021)) say you should remain steadfast and not reduce your confidence.

Self-locating priors, primary intensions, and cosmological measures

If a certain hypothesis entails that N percent of all observers in the universe have a certain property, how likely is it that we have that property – conditional on the hypothesis, and assuming we have no other relevant information?

Answer: It depends on what else the hypothesis says. If, for example, the hypothesis says that 90 percent of all observers have three eyes, and also that we ourselves have two eyes, then the probability that we have three eyes conditional on the hypothesis is zero.

This effect is easy to miss because many hypotheses that appear to be just about the universe as a whole secretly contain special information about us. Consider the following passage from Carroll (2010), cited in Arntzenius and Dorr (2017):

An alternative model of permissivism about epistemic risk

In the previous post I argued that rational priors must favour some possibilities over others, and that this is a problem for Richard Pettigrew's model of Jamesian permissivism. It also points towards an alternative model that might be worth exploring.

I claim that, in the absence of unusual evidence, a rational agent should be confident that observed patterns continue in the unobserved part of the world, that witnesses tell the truth, that rain experiences indicate rain, and so on. In short, they should give low credence to various skeptical scenarios. How low? Arguably, our epistemic norms don't fix a unique and precise answer.

Pettigrew on epistemic risk and the demands of rationality

Pettigrew (2021) defends a type of permissivism about rational credence inspired by James (1897), on which different rational priors reflect different attitudes towards epistemic risk. I'll summarise the main ideas and raise some worries.

(There is, of course, much more in the book than what I will summarise, including many interesting technical results and some insightful responses to anti-permissivist arguments.)

Decreasing accuracy through learning

Last week I gave a talk in which I claimed (as an aside) that if you update your credences by conditionalising on a true proposition then your credences never become more inaccurate. That seemed obviously true to me. Today I tried to quickly prove it. I couldn't. Instead I found that the claim is false, at least on popular measures of accuracy.

The problem is that conditionalising on a true proposition typically increases the probability of true propositions as well as false propositions. If we measure the inaccuracy of a credence function by adding up an inaccuracy score for each proposition, the net effect is sensitive to how exactly that score is computed.
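
To make the effect concrete, here is a minimal sketch under stated assumptions: three worlds, a made-up prior, and inaccuracy measured by summing the squared distance between credence and truth value over every proposition (a Brier-style score). The numbers and the names `WORLDS`, `ACTUAL`, `brier_inaccuracy`, and `conditionalize` are purely illustrative and not taken from the talk.

```python
from itertools import combinations

WORLDS = ("w1", "w2", "w3")
ACTUAL = "w1"  # hypothetical: the world that is in fact actual


def propositions(worlds):
    """All subsets of the set of worlds, each treated as a proposition."""
    return [frozenset(c)
            for r in range(len(worlds) + 1)
            for c in combinations(worlds, r)]


def credence(cr, prop):
    """Credence in a proposition is the summed credence of its worlds."""
    return sum(cr[w] for w in prop)


def brier_inaccuracy(cr, actual):
    """Sum over all propositions of (credence - truth value) squared."""
    return sum((credence(cr, a) - (1.0 if actual in a else 0.0)) ** 2
               for a in propositions(WORLDS))


def conditionalize(cr, evidence):
    """Standard conditionalization on a set of worlds with positive credence."""
    total = credence(cr, evidence)
    return {w: (cr[w] / total if w in evidence else 0.0) for w in cr}


# Illustrative prior: the actual world w1 starts out very improbable.
prior = {"w1": 0.01, "w2": 0.49, "w3": 0.50}
evidence = frozenset({"w1", "w2"})  # true, since it contains the actual world
posterior = conditionalize(prior, evidence)

before = brier_inaccuracy(prior, ACTUAL)
after = brier_inaccuracy(posterior, ACTUAL)
print(f"inaccuracy before: {before:.4f}, after: {after:.4f}")
```

With these numbers the score rises from about 2.94 to about 3.84: the evidence rules out w3, but most of the freed-up probability flows to the false hypothesis w2.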

How to serve two epistemic masters

In this 2018 paper, J. Dmitri Gallow shows that it is difficult to combine multiple deference principles. The argument is a little complicated, but the basic idea is surprisingly simple.

Suppose A and B are two weather forecasters. Let r be the proposition that it will rain tomorrow, let A=x be the proposition that A assigns probability x to r; similarly for B=x. Here are two deference principles you might like to follow:

Spelling out a Dutch Book argument

Dutch Book arguments are often used to justify various epistemic norms – in particular, that credences should obey the probability axioms and that they should evolve by conditionalization. Roughly speaking, the argument is that if someone were to violate these norms, then they would be prepared to accept bets which amount to a guaranteed loss, and that seems irrational.

But it's hard to spell out how exactly the argument is meant to go. In fact, I'm not aware of any satisfactory statement. Here's my attempt.

Imaginary Foundations

My paper "Imaginary Foundations" has been accepted at Ergo (after rejections from Phil Review, Mind, Phil Studies, PPR, Nous, AJP, and Phil Imprint). The paper has been in the making since 2005, and I'm quite fond of it.

The question I address is simple: how should we model the impact of perceptual experience on rational belief? That is, consider a particular type of experience – individuated either by its phenomenology (what it's like to have the experience) or by its physical features (excitation of receptor cells, or whatever). How should an agent's beliefs change in response to this type of experience?

Simplicity and indifference

According to the Principle of Indifference, alternative propositions that are similar in a certain respect should be given equal prior probability. The tricky part is to explain what should count as similarity here.

Van Fraassen's cube factory nicely illustrates the problem. A factory produces cubes with side lengths between 0 and 2 cm, and consequently with volumes between 0 and 8 cm^3. Given this information, what is the probability that the next cube that will be produced has a side length between 0 and 1 cm? Is it 1/2, because the interval from 0 to 1 is half of the interval from 0 to 2? Or is it 1/8, because a side length of 1 cm means a volume of 1 cm^3, which is 1/8 of the range from 0 to 8?
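
The two competing answers can be checked with a few lines of exact arithmetic. This is only a sketch: the function names are hypothetical, and each function simply encodes one of the two candidate ways of applying indifference (uniform over side length versus uniform over volume).

```python
from fractions import Fraction

MAX_SIDE = Fraction(2)        # cm, the largest side length the factory produces
THRESHOLD_SIDE = Fraction(1)  # cm, we ask for P(side length <= 1 cm)


def prob_uniform_over_side(threshold, max_side):
    """Indifference over side length: compare interval lengths."""
    return threshold / max_side


def prob_uniform_over_volume(threshold, max_side):
    """Indifference over volume: compare the corresponding volume ranges."""
    return threshold ** 3 / max_side ** 3


print(prob_uniform_over_side(THRESHOLD_SIDE, MAX_SIDE))    # 1/2
print(prob_uniform_over_volume(THRESHOLD_SIDE, MAX_SIDE))  # 1/8
```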

Mechanistic evidence for probabilistic models

You observe a process that generates two kinds of outcomes, 'heads' and 'tails'. The outcomes appear in seemingly random order, with roughly the same amount of heads as tails. These observations support a probabilistic model of the process, according to which the probability of heads and of tails on each trial is 1/2, independently of the other outcomes.

How observations about frequencies confirm or disconfirm probabilistic models is well understood in Bayesian epistemology. The central assumption that does most of the work is the Principal Principle, which states that if a model assigns (objective) probability x to some outcomes, then conditional on the model, the outcomes have (subjective) probability x. It follows that models that assign higher probability to the observed outcomes receive a greater boost of subjective probability than models that assign lower probability to the outcomes.

Experts with self-locating beliefs

Imagine you and I are walking down a long path. You are ahead, but we can communicate on the phone. If you say, "there are strawberries here" and I trust you, I should not come to believe that there are strawberries where I am, but that there are strawberries wherever you are. If I also know that you are 2 km ahead, I should come to believe that there are strawberries 2 km down the path. But what's the general rule for deferring to somebody with self-locating beliefs?

Sleeping Beauty as losing track of time

What makes the Sleeping Beauty problem non-trivial is Beauty's potential memory loss on Monday night. In my view, this means that Sleeping Beauty should be modeled as a case of potential epistemic fission: if the coin lands tails, any update Beauty makes to her beliefs in the transition from Sunday to Monday will also fix her beliefs on Tuesday, and so the Sunday state effectively has two epistemic successors, one on Monday and one on Tuesday. All accounts of epistemic fission that I'm aware of then entail halfing.

Beliefs, degrees of belief, and earthquakes

There has been a lively debate in recent years about the relationship between graded belief and ungraded belief. The debate presupposes something we should regard with suspicion: that there is such a thing as ungraded belief.

Compare earthquakes. I'm not an expert on earthquakes, but I know that they vary in strength. How exactly to measure an earthquake's strength is to some extent a matter of convention: we could have used a non-logarithmic scale; we could have counted duration as an aspect of strength, and so on. So when we say that an earthquake has magnitude 6.4, we characterize a central aspect of an earthquake's strength by locating it on a conventional scale.

Reduction and coordination

The following principles have something in common.

Conditional Coordination Principle.
A rational person's credence in a conditional A->B should equal the ratio of her credence in the corresponding propositions A&B and A; that is, Cr(A->B) = Cr(B/A) = Cr(A&B)/Cr(A).
Normative Coordination Principle.
On the supposition that A is what should be done, a rational agent should be motivated to do A; that is, very roughly, Des(A/Ought(A)) > 0.5.
Probability Coordination Principle.
On the supposition that the chance of A is x, a rational agent should assign credence x to A; that is, roughly, Cr(A/Ch(A)=x) = x.
Nomic Coordination Principle.
On the supposition that it is a law of nature that A, a rational agent should assign credence 1 to A; that is, Cr(A/L(A)) = 1.

All these principles claim that an agent's attitudes towards a certain kind of proposition rationally constrain their attitudes towards other propositions.

Confirmation and singular propositions

In discussions of the raven paradox, it is generally assumed that the (relevant) information gathered from an observation of a black raven can be regimented into a statement of the form Ra & Ba ('a is a raven and a is black'). This is in line with what a lot of "anti-individualist" or "externalist" philosophers say about the information we acquire through experience: when we see a black raven, they claim, what we learn is not a descriptive or general proposition to the effect that whatever object satisfies such-and-such conditions is a black raven, but rather a "singular" proposition about a particular object -- we learn that this very object is black and a raven. It seems to me that this singularist doctrine makes it hard to account for many aspects of confirmation.

Belief update: shifting, pushing, and pulling

It is widely agreed that conditionalization is not an adequate norm for the dynamics of self-locating beliefs. There is no agreement on what the right norms should look like. Many hold that there are no dynamic norms on self-locating beliefs at all. On that view, an agent's self-locating beliefs at any time are determined on the basis of the agent's evidence at that time, irrespective of the earlier self-locating belief. I want to talk about an alternative approach that assumes a non-trivial dynamics for self-locating beliefs. The rough idea is that as time goes by, a belief that it is Sunday should somehow turn into a belief that it is Monday.

Undermining and confirmation

Next, undermining. Suppose we are testing a model H according to which the probability that a certain type of coin toss results in heads is 1/2. On some accounts of physical probability, including frequency accounts and "best system" accounts, the truth of H is incompatible with the hypothesis that all tosses of the relevant type in fact result in heads. So we get a counterexample to simple formulations of the Principal Principle: on the assumption that H is true, we know that the outcomes can't be all-heads, even though H assigns positive probability to all-heads. In such a case, we say that all-heads is undermining for H.

Inadmissible evidence in Bayesian Confirmation Theory

Suppose we are testing statistical models of some physical process -- a certain type of coin toss, say. One of the models in question holds that the probability of heads on each toss is 1/2; another holds that the probability is 1/4. We set up a long run of trials and observe about 50 percent heads. One would hope that this confirms the model according to which the probability of heads is 1/2 over the alternative.

(Subjective) Bayesian confirmation theory says that some evidence E supports some hypothesis H for some agent to the extent that the agent's rational credence C in the hypothesis is increased by the evidence, so that C(H/E) > C(H). We can now verify that observation of 500 heads strongly confirms that the coin is fair, as follows.
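
One way to run such a check numerically is sketched below. It assumes (hypothetically, since the excerpt does not fix these details) 1000 independent tosses, only the two rival models with chance of heads 1/2 and 1/4, equal prior credence in each, and likelihoods set by the models' own probabilities in line with the Principal Principle. The function name `posterior_fair` is illustrative. Log-odds are used because the raw likelihoods underflow.

```python
import math


def posterior_fair(heads, tosses, prior_fair=0.5, p_fair=0.5, p_alt=0.25):
    """Posterior credence in the 'fair' model against a single rival model.

    Assumes independent tosses; the binomial coefficient is the same for
    both models and therefore cancels in the likelihood ratio.
    """
    tails = tosses - heads
    log_lik_fair = heads * math.log(p_fair) + tails * math.log(1 - p_fair)
    log_lik_alt = heads * math.log(p_alt) + tails * math.log(1 - p_alt)
    log_prior_odds = math.log(prior_fair) - math.log(1 - prior_fair)
    log_posterior_odds = log_prior_odds + log_lik_fair - log_lik_alt
    return 1.0 / (1.0 + math.exp(-log_posterior_odds))


# Hypothetical data: 500 heads in 1000 tosses.
c_h = 0.5
c_h_given_e = posterior_fair(heads=500, tosses=1000, prior_fair=c_h)
print(c_h_given_e > c_h)  # True: C(H/E) > C(H), so E confirms H
```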

The broken duplication machine

Fred has bought a duplication machine at a discount from a series in which 50 percent of all machines are broken. If Fred's machine works, it will turn Fred into two identical copies of himself, one emerging on the left, the other on the right. If Fred's machine is broken, he will emerge unchanged and unduplicated either on the left or on the right, but he can't predict where. Fred enters his machine, briefly loses consciousness and then finds himself emerge on the left. In fact, his machine is broken and no duplication event has occurred, but Fred's experiences do not reveal this to him.

Centred propositions and objective epistemology

Given some evidence E and some proposition P, we can ask to what extent E supports P, and thus to what extent an agent should believe P if their only relevant evidence is E. The question may not always have a precise answer, but there are both intuitive and theoretical reasons to assume that the question is meaningful – that there is a kind of (imprecise) "evidential probability" conferred by evidence on propositions. That's why it makes sense to say, for example, that one should proportion one's beliefs to one's evidence.

The lure of free energy

There's an exciting new theory in cognitive science. The theory began as an account of message-passing in the visual cortex, but it quickly expanded into a unified explanation of perception, action, attention, learning, homeostasis, and the very possibility of life. In its most general and ambitious form, the theory was mainly developed by Karl Friston -- see e.g. Friston 2006, Friston and Stephan 2007, Friston 2009, Friston 2010, or the Wikipedia page on the free-energy principle.

Against countable additivity

Imagine the universe has a centre that regularly produces new stars which then drift away at a constant speed. This has been going on forever, so there are infinitely many stars. We can label them by age, or equivalently by their distance from the centre: star 1 is the youngest, then comes star 2, then star 3, and so on, without end. The stars in turn produce planets at regular intervals. So the older a star, the more planets surround it. Today, something happened to one (and only one) of the planets. Let's say it exploded. Given all this, what is your credence that the unfortunate planet belonged to the first 100 stars? What about the second 100? It would be odd to think that the event is more likely to have happened at one of the first 100 stars than at one of the next 100, since the latter have far more planets. Similarly if we compare the first 1000 stars with the next 1000, or the first million with the next million, and so on. But there is no countably additive (real-valued) probability measure that satisfies this constraint.

Conditional chance and rational credence

Two initially plausible claims:

  1. Sometimes, a possible chance function conditionalized on a proposition A yields another possible chance function.
  2. Any rational prior credence function Cr conditional on the hypothesis Ch=f that f is the (actual, present) chance function should coincide with f; i.e., Cr(A / Ch=f) = f(A) for all A (provided that Cr(Ch=f)>0).

Claim 1 is supported by the popular idea that chances evolve by conditionalizing on history, so that the chance at time t2 equals the chance at t1 conditional on the history of events between t1 and t2. Claim 2 is a weak form of the Principal Principle and often taken to be a defining feature of chance.

The input problem for Jeffrey conditioning

You can't predict the stock market by looking at tea leaves. If an episode of looking at tea leaves makes you believe that the stock market will soon collapse, then -- assuming your previous beliefs did not support the collapse hypothesis, nor the hypothesis that tea leaves predict the stock market -- your new belief is unjustified and irrational. So there are epistemic norms for how one's opinions may change through perceptual experience.

Such norms are easily accounted for in the traditional Bayesian picture where each perceptual experience is associated with an evidence proposition E on which any rational agent should condition when they have the experience. But what if perceptual experiences don't confer absolute certainty on anything? Jeffrey pointed out that if there is a partition of propositions { E_i } = E_1,...,E_n such that (1) an experience changes their probabilities to some values { p_i } = p_1,...,p_n, and (2) the experience does not affect the probabilities conditional on any member of the partition, then the new probability assigned to any proposition A is the weighted average of the old probabilities of A conditional on the members of the partition, each weighted by the new probability of that member. This rule is often called "Jeffrey conditioning" and sometimes "generalised conditioning", but unlike standard conditioning it isn't a dynamical rule at all: it is a simple consequence of the probability calculus. To get genuine epistemic norms on the dynamics of belief through perceptual experience, Jeffrey's rule must be supplemented with a story about how a given experience, perhaps together with an agent's previous belief state, may fix the partition { E_i } and values { p_i } that determine a Jeffrey update. This is the "input problem" for Jeffrey conditioning.
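
In symbols, the rule says that the new probability of A is the sum over i of p_i times the old probability of A given E_i. The sketch below implements this for a finite set of worlds. The worlds, the prior, and the function name `jeffrey_update` are illustrative assumptions, and the partition and new weights are simply stipulated as inputs, which is exactly the part the input problem asks about.

```python
def jeffrey_update(old, partition, new_weights):
    """Jeffrey-update a probability distribution over finitely many worlds.

    old:         dict mapping each world to its old probability
    partition:   list of disjoint sets of worlds that jointly cover all worlds
    new_weights: new probabilities p_i of the partition cells (summing to 1)

    Each cell must have positive old probability. Probabilities conditional
    on a cell are left unchanged; only the cells are re-weighted.
    """
    new = {w: 0.0 for w in old}
    for cell, p_i in zip(partition, new_weights):
        old_cell = sum(old[w] for w in cell)
        for w in cell:
            new[w] = p_i * old[w] / old_cell
    return new


# Illustrative numbers only.
old = {"a": 0.1, "b": 0.3, "c": 0.2, "d": 0.4}
partition = [{"a", "b"}, {"c", "d"}]   # E_1, E_2
new = jeffrey_update(old, partition, new_weights=[0.8, 0.2])

A = {"a", "c"}
new_prob_A = sum(new[w] for w in A)
# Same value as the weighted average 0.8 * P(A/E_1) + 0.2 * P(A/E_2).
weighted_average = 0.8 * (0.1 / 0.4) + 0.2 * (0.2 / 0.6)
print(new_prob_A, weighted_average)
```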

Bayes factors

Suppose a rational agent makes an observation, which changes the subjective probability she assigns to a hypothesis H. In this case, the new probability of H is usually sensitive to both the observation and the prior probability. Can we factor out the prior probability to get a measure of how the experience bears on the probability of H, independently of the prior probability?

A common answer, going back to Alan Turing and I. J. Good, is to use Bayes factors. The Bayes factor B(H) for H is the ratio (P'(H)/P'(not-H))/(P(H)/P(not-H)) of new odds on H to old odds. Thus the new odds on H are the old odds multiplied by the Bayes factor. For example, if the prior credence in H was 0.25 and the posterior is 0.5, then the odds on H changed from 1:3 to 1:1, and so the Bayes factor of the update is 3. The same Bayes factor would characterise an update from probability 0.01 to about 0.03 (odds 1:99 to 1:33) or from 0.9 to about 0.96 (odds 9:1 to 27:1).
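
The arithmetic in these examples is easy to reproduce. The following is a small sketch with hypothetical helper names (`odds`, `bayes_factor`, `update_by_factor`); it assumes all probabilities are strictly between 0 and 1.

```python
def odds(p):
    """Odds on H corresponding to probability p (requires 0 < p < 1)."""
    return p / (1 - p)


def bayes_factor(prior, posterior):
    """Ratio of new odds on H to old odds on H."""
    return odds(posterior) / odds(prior)


def update_by_factor(prior, factor):
    """New probability of H after multiplying the old odds by the factor."""
    new_odds = odds(prior) * factor
    return new_odds / (1 + new_odds)


print(bayes_factor(0.25, 0.5))               # factor for the update 0.25 -> 0.5
print(round(update_by_factor(0.01, 3), 2))   # about 0.03
print(round(update_by_factor(0.9, 3), 2))    # about 0.96
```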

The puzzle of the hats

Luc Bovens and Wlodek Rabinowicz (2010 and 2011) present the following puzzle:

Three people are each given a hat to put on in the dark. The hats' colours, either black or white, have been decided by three independent tosses of a fair coin. Then the light goes on and everyone can see the hats of the two others, but not their own. All of this is common knowledge in the group.

Let's call the three players X, Y and Z. There are eight possible distributions of hat colours, each with probability 1/8:

Self-locating belief and diachronic Dutch Books

If beliefs are modeled by a probability distribution over centered worlds, belief update cannot work simply by conditionalisation. How then does it work? The most popular answer in philosophy goes as follows.

Let P be an agent's credence function at time t1, P' the credence function at t2, and E the evidence received at t2. Since E is a centered proposition, it can be true at multiple points within a world. Suppose, however, that the agent assigns probability 0 to worlds at which E is true more than once. Then to compute P', first conditionalise P on the uncentered fragment of E -- i.e. the strongest uncentered proposition entailed by E. This rules out all worlds at which E is true nowhere. Second, move the center of each remaining world to the (unique) point at which E is true.
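
The two steps can be sketched in code. This is an illustration under simplifying assumptions, not a statement of any published formalism: centered worlds are modelled as (world, center) pairs, credence functions as dictionaries over such pairs, and the name `shift_update` is hypothetical.

```python
from collections import defaultdict


def shift_update(prior, evidence):
    """Two-step update for centered credences.

    prior:    dict mapping (world, center) pairs to probabilities
    evidence: set of (world, center) pairs at which E is true

    Assumes E is true at most once in any world with positive probability.
    """
    # Collect, for each world, the centers at which E is true.
    e_centers = defaultdict(list)
    for world, center in evidence:
        e_centers[world].append(center)

    # Probability of each uncentered world, ignoring where the center is.
    world_mass = defaultdict(float)
    for (world, _center), p in prior.items():
        world_mass[world] += p

    for world, centers in e_centers.items():
        if len(centers) > 1 and world_mass[world] > 0:
            raise ValueError(f"E is true more than once in world {world!r}")

    # Step 1: conditionalise on the uncentered fragment of E
    # (the worlds in which E is true somewhere).
    total = sum(world_mass[w] for w in e_centers)

    # Step 2: move the center of each surviving world to the E-point.
    return {(w, cs[0]): world_mass[w] / total
            for w, cs in e_centers.items() if world_mass[w] > 0}


# Illustrative example: three worlds, all currently centered at time t1.
prior = {("w1", "t1"): 0.2, ("w2", "t1"): 0.3, ("w3", "t1"): 0.5}
evidence = {("w1", "t2"), ("w2", "t3")}   # E is true nowhere in w3
posterior = shift_update(prior, evidence)
print(posterior)
```

Here w3 is ruled out in the first step, and the remaining probability is renormalised and relocated to the points at which E holds.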

Assessing the evidence differently

Alice is randomly selected from her population to be tested for a rare genetic disorder that affects about one in 10,000 people. The test is accurate 99 percent of the time, both among subjects that have the disorder and among subjects that don't. Alice's test comes back positive.

Call the information in the previous paragraph E, and suppose it's all you know about the situation. How confident are you that Alice has the disorder?

Letting our subjective probabilities be guided by the stated frequencies, we can use Bayes' Theorem to figure out that P(disorder | positive) = P(positive | disorder) * P(disorder) / (P(positive | disorder) * P(disorder) + P(positive | ~disorder) * P(~disorder)) = 0.99 * 0.0001 / (0.99 * 0.0001 + 0.01 * 0.9999) = 0.0098. Assume then that your degree of belief is about 0.01.
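
For readers who want to vary the numbers, the same calculation can be written as a short function. The function and parameter names are hypothetical; "accurate 99 percent of the time" is read as a true-positive rate and a true-negative rate of 0.99 each, as in the calculation above.

```python
def posterior_given_positive(base_rate, true_positive_rate, true_negative_rate):
    """P(disorder | positive) by Bayes' Theorem."""
    true_positives = true_positive_rate * base_rate
    false_positives = (1 - true_negative_rate) * (1 - base_rate)
    return true_positives / (true_positives + false_positives)


p = posterior_given_positive(base_rate=0.0001,
                             true_positive_rate=0.99,
                             true_negative_rate=0.99)
print(round(p, 4))  # 0.0098
```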

Conditional probabilities and Humphreys' Paradox

Expressions like 'P(A/B)', or 'the probability of A given B', seem to be used in various different ways. On one usage, P(A/B) equals P(AB)/P(B), at least if P(B) > 0. Call this the ratio usage. Simple versions of the ratio usage define P(A/B) as P(AB)/P(B), and so entail that P(A/B) is undefined whenever P(B)=0. But I would like to admit views into the family on which P(A/B) is taken as a primitive binary probability, governed by something like the Popper-Renyi conditions.

Frequentism and the end of time

This paper (recently featured on the physics arXiv blog) argues that if the universe never comes to an end, then the universe will probably come to an end within the next 5 billion years. The reasoning, as far as I can tell, goes roughly like this.

First, define the probability of an event of type A given an event of type B as the total number of A events over the number of B events. If the universe is infinite, then the total number of A events and B events will often be infinite. But infinity over infinity isn't well-defined. So to have well-defined probabilities, the relevant counts of A and B events must be restricted, e.g. to a finite initial segment of the universe.

Preferring the less reliable method

Compare the following two ways of responding to the weather report's "probability of rain" announcement.

Good: Upon hearing that the probability of rain is x, you come to believe to degree x that it will rain.
Bad: Upon hearing that the probability of rain is x, you become certain that it will rain if x > 0.5, otherwise certain that it won't rain.

The Bad process seems bad, not just because it may lead to bad decisions. It seems epistemically bad to respond to a "70% probability of rain" announcement by becoming absolutely certain that it will rain. The resulting attitude would be unjustified and irrational.

Can evidence be inadmissible?

First, a quick reminder of history. David Lewis once proposed a principle (the 'Principal Principle') linking rational credence and objective chance. It says (or rather, entails) that your rational credence in any proposition A, on the assumption that the objective chance of A is x, should also be x, no matter what (further) evidence E you have:

OP: P(A | ch(A)=x & E) = x.

This principle, the 'Old Principle', is widely taken to suffer from two defects. First, suppose your evidence E includes ~A. Then probability theory ensures that P(A | ch(A)=x & E) = 0, irrespective of x. Lewis responded by restricting OP to cases where E is 'admissible'. He suggested that a (true) proposition is admissible iff it is entailed by the history of the world up to now together with the laws of nature.

Mike Titelbaum on Shifting and Sleeping Beauty

In the last entry, I have suggested that

EEP) P_2(A) = P_1(+A|+E)

is a sensible rule for updating self-locating beliefs. Here, E is the total evidence received at time 2 (the time of P_2), and '+' denotes a function that shifts the evaluation index of propositions, much like 'in 5 minutes': '+A' is true at a centered world w iff A is true at the next point from w where new information is received. (EEP) therefore says that upon learning E, your new credence in any proposition A should equal your previous conditional credence that A will obtain at the next point when information comes in, given that this information is E.

Terry Horgan on Sleeping Beauty

I've been participating in a couple of workshops here at ANU lately, and I thought I'd share some notes. First, we had a little Sleeping Beauty workshop where Terry Horgan and Mike Titelbaum defended thirding, and I defended halfing. Unfortunately, I think we didn't quite get to the heart of our disagreement. Each of us said their own thing, without saying enough about what's wrong with the reasoning of the other sides. So I'll do that here. I start with Terry's account.

Truth-conduciveness and rational priors

We Bayesians are sometimes bugged about ultimate priors: what probability function would suit a rational agent before the incorporation of any evidence? The question matters not because anyone cares about what someone should believe if they popped into existence in a state of ideal rationality and complete empirical ignorance. It matters because the answer also determines what conclusions rational agents should draw from their evidence at any later point in their life. Take the total evidence you have had up to now. Given this evidence, is it more likely that Obama won the 2008 election or that McCain won it? There are distributions of priors on which your evidence is a strong indicator that McCain won. Nevertheless, this doesn't seem like it's a rational conclusion to draw. So there must be something wrong with those priors.

When centering matters

Dark clouds are gathering. Soon it will be raining. When it does, I will believe that it is raining. I do not yet believe that it is raining even though I do believe that my well-informed future self will believe that it is raining. I thereby violate the 'Principle of Reflection'. Once we allow for centered propositions that change their truth-value between times and places, Reflection, like its close cousin Conditioning, becomes a very implausible norm of rationality.

Sleeping Beauty, Dutch Books and Newcomb's Problem

A curious aspect of the Sleeping Beauty debate is the role of Dutch Books. At first sight, it looks as if Dutch Book considerations support thirding (see e.g. Hitchcock 2004). However, as Halpern 2006 shows, Beauty can also be Dutch Booked if she is a thirder. Some have argued that these arguments might fail because in Sleeping Beauty type cases, credences and betting odds can come apart (see e.g. Bradley and Leitgeb 2006). I disagree. Instead, I will argue that her vulnerability to Dutch Books doesn't show that Beauty is irrational -- at least not if she is a halfer.

Desire Reflection

Bas van Fraassen's Reflection Principle says that your current beliefs should be in line with your current beliefs about your future beliefs. More precisely,

PRB: P_1(A | P_2(A)=x) = x.

P_1 is your credence at time 1, P_2 your credence at time 2. PRB says that conditional on the assumption that at time 2 you believe A to degree x, you should already believe A to degree x at time 1. For agents who believe that they will (or might) change their beliefs in irrational ways between the two times, PRB is not a reasonable demand: if you know that you will be hit on the head tomorrow and consequently believe that the Earth is flat, you shouldn't believe that the Earth is flat now. On the other hand, if you're certain you will not change your beliefs in any such irrational way between now and tomorrow, then PRB is reasonable: suppose tomorrow you will believe that the Earth is flat by rationally responding to some very surprising new information; then you can infer that there exists some such information strongly supporting that the Earth is flat. But the fact that there is evidence for P is of course itself evidence for P. Hence you should already believe today that the Earth is probably flat.
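The favourable case can be checked in a toy model. The following is a minimal sketch, assuming an agent who is certain that at time 2 they will conditionalize on whichever cell of an evidence partition is true; the worlds, the prior values and the cell names `e1`, `e2`, `e3` are hypothetical and only serve as illustration. Under that assumption, P_1(A | P_2(A)=x) comes out as x for every value x that P_2(A) can take.

```python
from fractions import Fraction

# Hypothetical toy model: each world is (prior credence, is A true?, evidence cell).
worlds = [
    (Fraction(1, 10), True, "e1"),
    (Fraction(3, 10), False, "e1"),
    (Fraction(2, 10), True, "e2"),
    (Fraction(1, 10), False, "e2"),
    (Fraction(3, 40), True, "e3"),
    (Fraction(9, 40), False, "e3"),
]

def p1(pred):
    """Time-1 credence in the set of worlds satisfying pred."""
    return sum(p for p, a, e in worlds if pred(a, e))

def p2_of_A(cell):
    """Time-2 credence in A after conditionalizing on the evidence cell."""
    return p1(lambda a, e: a and e == cell) / p1(lambda a, e: e == cell)

def reflection_check():
    """Map each possible value x of P_2(A) to P_1(A | P_2(A) = x)."""
    cells = {e for _, _, e in worlds}
    results = {}
    for x in {p2_of_A(c) for c in cells}:
        # The proposition "P_2(A) = x" is the union of the cells that yield x.
        matching = {c for c in cells if p2_of_A(c) == x}
        num = p1(lambda a, e: a and e in matching)
        den = p1(lambda a, e: e in matching)
        results[x] = num / den
    return results

for x, value in sorted(reflection_check().items()):
    print(f"P_1(A | P_2(A) = {x}) = {value}")
```

The sketch only covers uncentered propositions and an agent who is sure of their own future rationality; it says nothing about the head-injury case, where the time-2 credence is not obtained by conditionalization.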

How to change one's mind

Suppose beliefs locate us in centered logical space: to believe something is to rule out not only ways a universe might be, but ways things might be for an individual at a time. Then there will be two kinds of rational belief change: we can learn something new about our present situation, and we can change our situation and adjust our beliefs to this change. The rule for changes of the first kind is conditionalization. The rule for changes of the second kind doesn't have an official name yet, as far as I know. (In the AGM/KM framework, it is called "update", but we Bayesians often use "update" for conditioning.) In practice, the two rules always go hand in hand: you never learn something new without changing your situation, and you hardly ever change your situation without learning anything new.

In this paper, I try to spell out the two rules, and their combination: Believing in afterlife: conditionalization in a changing world (PDF).

I'm a bit unhappy with some parts of the story, and I should probably say more about alternative accounts in the literature, and why I don't like them. So hopefully there will be an update soon. In the meantime, comments are as always very welcome!

What if I went by the Sea?

This is a follow-up to the previous post on Shangri La. As before, the story is that a fair coin decides which path you take to Shangri La: on heads, you travel by the Mountains, on tails, by the Sea. If you arrive at Shangri La via the Sea, the guardians will replace your Sea memories with Mountain memories.

In the other post, I said that if you actually traveled by the Mountains, you should remain confident that you traveled by the Mountains, even though you would have ended up with the same evidence had you traveled by the Sea.

I'm certain that I went by the Mountains

(This is more or less the talk I gave at the "Epistemology at the Beach" workshop last Sunday.)

"A wise man proportions his belief to the evidence", says Hume. But to what evidence? Should you proportion your belief to the evidence you have right now, or does it matter what evidence you had before? Frank Arntzenius ("Some problems for conditionalization and reflection", JoP, 2003) tells a story that illustrates the difference:

...there is an ancient law about entry into Shangri La: you are only allowed to enter, if, once you have entered, you no longer know by what path you entered. Together with the guardians you have devised a plan that satisfies this law. There are two paths to Shangri La, the Path by the Mountains, and the Path by the Sea. A fair coin will be tossed by the guardians to determine which path you will take: if heads you go by the Mountains, if tails you go by the Sea. If you go by the Mountains, nothing strange will happen: while traveling you will see the glorious Mountains, and even after you enter Shangri La you will for ever retain your memories of that Magnificent Journey. If you go by the Sea, you will revel in the Beauty of the Misty Ocean. But just as you enter Shangri La, your memory of this Beauteous Journey will be erased and replaced by a memory of the Journey by the Mountains.

When experts disagree on probabilities

A coin is to be tossed. Expert A tells you that it will land heads with probability 0.9; expert B says the probability is 0.1. What should you make of that?

Answer: if you trust expert A to degree a and expert B to degree b and have no other relevant information, your new credence in heads should be a*0.9 + b*0.1. So if you give equal trust to both of them, your credence in heads should be 0.5. You should be neither confident that the coin will land heads, nor that it will land tails. -- Obviously, you shouldn't take the objective chance of heads to be 0.5, contradicting both experts. Your credence of 0.5 is compatible with being certain that the chance is either 0.1 or 0.9. Credences are not opinions about objective chances.
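The arithmetic can be spelled out in a few lines. This is a minimal sketch of the weighted average above; the function name `pooled_credence` is made up for illustration, and the requirement that the trust weights sum to 1 is an assumption the text leaves implicit.

```python
from fractions import Fraction

def pooled_credence(trust, forecasts):
    """Weighted average of the experts' probabilities.

    Assumes the trust weights sum to 1 (an assumption, not stated in the text).
    """
    if sum(trust) != 1:
        raise ValueError("trust weights must sum to 1")
    return sum(t * f for t, f in zip(trust, forecasts))

# Expert A says 0.9, expert B says 0.1.
forecasts = [Fraction(9, 10), Fraction(1, 10)]

equal = pooled_credence([Fraction(1, 2), Fraction(1, 2)], forecasts)
skewed = pooled_credence([Fraction(3, 4), Fraction(1, 4)], forecasts)

print(equal)   # 1/2
print(skewed)  # 7/10
```

With equal trust the result is 1/2, although neither expert assigns the coin a chance of 1/2: the pooled number is a credence spread over the two chance hypotheses, not a third hypothesis about the chance.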

Another argument for halfing

What about this much simpler argument for halfing:

As usual, Sleeping Beauty wakes up on Monday, knowing that she will have an indistinguishable waking experience on Tuesday iff a certain fair coin has landed tails. Thirders say her credence in the coin landing heads should be 1/3; halfers say it should be 1/2.

Now suppose before falling asleep each day, Beauty manages to write down her present credence in heads on a small piece of paper. Since that credence was 1/2 on Sunday evening, she now (on Monday) finds a note saying "1/2".

Exploding desks and indistinguishable situations

I've thought a bit about belief update recently. One thing I noticed is that it is often assumed in the literature (usually without argument) that if you know that there are two situations in your world that are evidentially indistinguishable from your current situation, then you should give them roughly the same credence. Although I agree with some of the applications, the principle in general strikes me as very implausible. Here is a somewhat roundabout counter-example that has a few other interesting features as well.

Beliefs and thresholds

Following up on Weng-Hong (1, 2, 3), here are a few thoughts on thresholds for belief.

If beliefs come in different degrees of strength, what do we mean when we say not that Fred believes that P with strength x, but simply that Fred believes that P? Perhaps we mean that Fred believes that P with sufficient strength, where context may help determine what counts as sufficient. However, on this account, the following principles should be obviously invalid (both descriptively and normatively):

Two arguments against modeling probabilities by size of propositions

To my surprise, there are quite a few people here at ANU who believe that probabilities of various kinds can be modeled in terms of relative size of propositions: something has probability 1 if it is true in all (or 100%) of the relevant worlds, probability 0 if it is true in none (or 0%), and probability 0.5 if it is true in half of the worlds (or 50%). I also find it surprisingly hard to explain why I think that's wrong. Here are two arguments I've come up with so far (apart from obvious worries about making sense of these fractions in infinite and proper-class cases).

Indifference

I've been assigned some boring administrative work, but that's finished now, I hope. Here are some rough thoughts on indifference and Adam Elga's Dr. Evil paper (PDF).

There are many possible individuals whose mental state is subjectively indistinguishable from my current mental state insofar as they all share my current phenomenal experiences and my (real or quasi-) memories. Some of them inhabit worlds that are exactly as I believe the actual world is, and are located in that world exactly where I believe I am located in the actual world. Others occupy very different places in very different worlds: they are brains in vats or inhabitants of gruesome counterinductive worlds. How should I distribute my credence among all these possibilities?

Optimism

Eliezer Yudkowsky, in his Intuitive Explanation of Bayesian Reasoning, argues that it is irrational to justify the belief that if a biological war will break out it won't wipe out humanity by pointing out that one is an optimist:

p(you are currently an optimist | biological war occurs within ten years and wipes out humanity) =
p(you are currently an optimist | biological war occurs within ten years and does not wipe out humanity)
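Why the equality matters can be seen from Bayes' theorem: when a piece of evidence is equally likely under a hypothesis and its negation, conditionalizing on it leaves the credence in the hypothesis unchanged. The following is a minimal sketch with made-up numbers, where H stands for "the war wipes out humanity" (given that it occurs) and the evidence is "you are currently an optimist"; the function name `posterior` and all values are hypothetical.

```python
from fractions import Fraction

def posterior(prior_h, like_h, like_not_h):
    """P(H | E) by Bayes' theorem for a binary hypothesis H."""
    joint_h = prior_h * like_h
    joint_not_h = (1 - prior_h) * like_not_h
    return joint_h / (joint_h + joint_not_h)

prior = Fraction(1, 5)  # hypothetical prior credence in H

# Equal likelihoods, as in the equation above: the evidence is irrelevant.
same = posterior(prior, Fraction(7, 10), Fraction(7, 10))

# Unequal likelihoods, for contrast: the evidence shifts the credence.
diff = posterior(prior, Fraction(9, 10), Fraction(3, 10))

print(same)  # 1/5
print(diff)  # 3/7
```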

Always Conditionalize?

Let P be a proposition of which you neither believe that it's true nor that it's false, say Goldbach's Conjecture. Since you know that you don't believe P (otherwise you couldn't have chosen it), your conditional subjective probability for [P and I don't believe P] given P should be close to 1. However, if you were to learn that P, your subjective probability for [P and I don't believe P] shouldn't be close to 1, but close to 0. So is this a case where you shouldn't conditionalize?
