RSA models with partially informed speakers

Let's model a few situations in which the hearer does not assume that the speaker has full information about the topic of their utterance.

1. 'Some apples are red'

Goodman and Stuhlmüller (2013) consider a scenario in which a speaker wants to communicate how many of three apples are red. The hearer isn't sure whether the speaker has seen all the apples. Chapter 2 of problang.org gives two models of this scenario. The first makes very implausible predictions. The second is very complicated. Here's a simple model that gives the desired results.

var states = ['RRR','RRG','RGR','GRR','RGG','GRG','GGR','GGG'];
var meanings = {
  'all': function(state) { return !state.includes('G') },
  'some': function(state) { return state.includes('R') },
  'none': function(state) { return !state.includes('R') },
  '-': function(state) { return true }
}
var observation = function(state, access) {
    return filter(function(s) {
        return s.slice(0,access) == state.slice(0,access);
    }, states);
}
var hearer0 = Agent({
    credence: Indifferent(states),
    kinematics: function(utterance) {
        return function(state) {
            return evaluate(meanings[utterance], state);
        }
    }
});
var speaker1 = function(obs) {
    return Agent({
        options: keys(meanings),
        credence: update(Indifferent(states), obs),
        utility: function(u,s){
            return learn(hearer0, u).score(s);
        }
    });
};
showChoices(speaker1, [observation('RRR', 2), observation('GGG', 2)]);

I'll briefly pause here for an explanation. I assume that the speaker either has access to all three apples or only to the first two apples, or only to the first apple. The observation function takes a state and an access level (1, 2, or 3) and returns the information a speaker would have about the state at the given access level. For example, for state 'RRG' and access level 2, the function returns the set { 'RRG', 'RRR' }. The level-1 speaker speaker1 is parameterized by some such information. If you run the code, you see what a level-1 speaker would say (a) if her information state is { 'RRG', 'RRR' }, and (b) if her information state is { 'GGG', 'GGR' }.
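
To get a feel for the observation function, you can evaluate it directly. A minimal check (the expected outputs assume the ordering of states defined above):

// continues #1
display(observation('RRG', 2)); // ['RRR', 'RRG']: the first two apples are red, the third is unknown
display(observation('RRG', 3)); // ['RRG']: full information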

The level-2 hearer is unsure about the speaker's access level and performs a joint inference about the access level and the state:

// continues #1
var hearer2 = Agent({
    credence: Indifferent(Cross({'state':states, 'access':[1,2,3]})),
    kinematics: function(utterance) {
        return function(s) {
            var obs = observation(s.state, s.access);
            return sample(choice(speaker1(obs))) == utterance;
        }
    }
});
showKinematics(hearer2, keys(meanings))

The results make sense. For example, if the speaker says 'all', the hearer infers that she (the speaker) has full access to 'RRR'. If the speaker says 'some', she has seen at least one red apple and has not seen 'RRR'. And so on. The scalar inference from 'some' to 'not all' hasn't completely disappeared. It has only become weaker.
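
If you want to look at a single update rather than the whole table, the learn helper should work on the level-2 hearer just as it does on hearer0 (a minimal sketch, assuming learn is what showKinematics calls under the hood):

// continues the code above
viz.table(learn(hearer2, 'some'));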

The second model on problang.org makes similar predictions. The first uses a naive "planning as inference" algorithm to compute the speaker's choice, like so:

// continues #1
var alpha = 10
var speaker1choice = function(obs) {
    return Infer(function() {
        var u = uniformDraw(keys(meanings));
        var s = uniformDraw(states);
        condition(obs.includes(s));
        var utility = learn(hearer0, u).score(s);
        factor(alpha * utility);
        return u;
    });
};
viz.table(speaker1choice(observation('RRG',2)))

In essence, this makes a speaker choose an utterance in proportion to how good the utterance would be in a possible state of full information. A speaker who only sees that the first two apples are red will strongly prefer 'all'.
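
For comparison, here is what the utility-maximizing level-1 speaker from the first code box does with the same partial observation:

// continues the code above
showChoices(speaker1, [observation('RRG', 2)]);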

2. An ignorance implicature

Let's turn to a possibly more interesting example.

Recall the apples and orange juice scenario from my post on scalar implicatures. I'll assume that it is conversationally relevant which juices are on offer. Let's also add one more option to the available utterances. Besides 'have apple', 'have orange', 'have apple and orange', and their negations, the speaker can choose 'have apple or orange':

var states = Cross('apple', 'orange'); // = [{apple: true, orange: true}, ...]
var meanings = {
    'have apple': function(state) { return state['apple'] },
    'not have apple': function(state) { return !state['apple'] },
    'have orange': function(state) { return state['orange'] },
    'not have orange': function(state) { return !state['orange'] },
    'have apple and orange': function(state) { return state['apple'] && state['orange'] },
    'have apple or orange': function(state) { return state['orange'] || state['apple'] },
    'have no juice': function(state) { return !state['apple'] && !state['orange'] }
};
var hearer0 = Agent({
    credence: Indifferent(states),
    kinematics: function(utterance) {
        return function(state) {
            return evaluate(meanings[utterance], state);
        }
    }
});
var speaker1 = function(state) {
    return Agent({
        options: keys(meanings),
        credence: Indifferent([state]),
        utility: function(u,s){
            return learn(hearer0, u).score(s);
        }
    });
};
var hearer2 = Agent({
    credence: Indifferent(states),
    kinematics: function(utterance) {
        return function(state) {
            return sample(choice(speaker1(state))) == utterance;
        }
    }
});
showKinematics(hearer2, keys(meanings));

Predictably, the level-1 speaker never utters 'have apple or orange', no matter what state she has observed. Consequently, the level-2 hearer doesn't know what to think if he hears 'have apple or orange'. He can't conditionalize on an event with probability 0.
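
You can confirm the first point by listing the level-1 speaker's choices for every state; 'have apple or orange' never comes out on top:

// continues #4
showChoices(speaker1, states);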

A real hearer, of course, would infer that the speaker is not fully informed (or not fully cooperative).

We've built informedness and cooperativity into the model: the hearer simulates the speaker as being certain of the true state and as wanting to make the hearer's beliefs accurate with respect to that state.

Let's add the possibility that the speaker may have limited information. Concretely, I'll assume that the hearer is unsure about which question the speaker knows the answer to.

// continues #4
var access = {
    'apple? orange?': function(s) { return s },
    'apple?': function(s) { return s.apple },
    'orange?': function(s) { return s.orange },
    'apple or orange?': function(s) { return s.apple || s.orange }
};
var speaker1 = function(observation) {
    return Agent({
        options: keys(meanings),
        credence: update(Indifferent(states), observation),
        utility: function(u,s){
            return learn(hearer0, u).score(s);
        }
    });
};
var hearer2 = Agent({
    credence: join({
        'state': Indifferent(states),
        'access': Categorical({ vs: keys(access), ps: [0.7, 0.1, 0.1, 0.1] })
    }),
    kinematics: function(utterance) {
        return function(s) {
            var observation = cell(access[s.access], s.state, states);
            return sample(choice(speaker1(observation))) == utterance;
        }
    }
});
showKinematics(hearer2, ['have apple', 'have apple and orange', 'have apple or orange']);

Here I've defined four questions to which the speaker might know the answer. A speaker who knows the answer to 'apple? orange?' is fully informed about the state. A speaker who only knows the answer to 'apple?' is fully informed about the availability of apple juice. And so on. The level-2 hearer is unsure about the state and the speaker's knowledge. Categorical({ vs: keys(access), ps: [0.7, 0.1, 0.1, 0.1] }) encodes the latter uncertainty. It defines a distribution over the four questions, giving probability 0.7 to the first ('apple? orange?') and 0.1 to each of the others.

This time, the level-1 speaker can utter 'have apple or orange'. She does so whenever (a) she only knows the answer to 'apple or orange?' and (b) that answer is positive. As a result, the level-2 hearer infers (a) and (b) from an utterance of 'have apple or orange'. We get an ignorance implicature.
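
To double-check (a) and (b), you can hand the speaker the relevant information state directly, constructing it by hand rather than through the cell helper:

// continues the code above
// information state of a speaker who only knows that the answer to 'apple or orange?' is positive
var posObs = filter(function(s) { return s.apple || s.orange }, states);
showChoices(speaker1, [posObs]);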

We also see that 'have apple' no longer renders 'not have orange' certain. Instead, the level-2 hearer becomes 82% confident that there's no orange juice and 18% confident that there is orange juice but the speaker doesn't know about it.

3. Exclusivity lost?

It is often assumed that disjunctions have an exclusivity implicature – that 'A or B' implicates 'not both'. My model doesn't predict this.

In fact, it's hard to see how the exclusivity implicature could be derived from assumptions about rationality and cooperativity.

To be sure, a speaker who knew that A and B are both true should say 'A and B' rather than 'A or B'. If a fully informed speaker does not utter 'A and B', we can therefore infer that A and B are not both true. The problem is that a speaker who utters 'A or B' thereby reveals that she is not fully informed.

We could predict the implicature if we made strange assumptions about the priors – for example, if we assumed that the speaker is more likely to be informed about A and B if both are true than if both are false. But what could motivate such an assumption?

This is a potential problem for RSA models, and for neo-Gricean models more generally. (By contrast, exclusivity is easily predicted by grammatical theories of implicature, along the lines of Chierchia, Fox, and Spector (2012).) Maria Aloni mentions the problem in her SEP entry on disjunction. So I assume it is well-known. I don't know how people respond.

I'm not sure how serious the problem is, in part because I'm not sure about the robustness of exclusivity inferences, and in part because there might be non-obvious ways of deriving the implicature after all.

4. Innocent exclusion

The apple and orange juice case illustrates an attractive feature of RSA models: we don't need a special "consistency check" to prevent the derivation of inconsistent implicatures.

A naive Gricean algorithm for computing scalar implicatures might say that if a speaker utters a sentence S instead of a stronger alternative S', the hearer may infer that the alternative is false. Assuming that a disjunction 'A or B' has each disjunct as an alternative, this algorithm would predict that one can infer the falsity of both A and B from an utterance of 'A or B', even though this is inconsistent with the literal meaning of the utterance.

Sauerland (2004) presents an improved algorithm. According to Sauerland, the hearer must first consider what the utterance reveals about what the speaker doesn't know. By Gricean reasoning, an utterance of 'A or B' reveals that the speaker doesn't know either of the stronger alternatives A and B. In a second step, the hearer may strengthen these \(\neg K\phi\) hypotheses to \(K\neg\phi\), provided the strengthened hypothesis is consistent with what has been inferred in the first step. In the case of 'A or B', the first step yields \(\neg KA\), \(\neg KB\), and the literal meaning \(K(A\vee B)\). The consistency check blocks the strengthening of \(\neg KA\) to \(K\neg A\): \(K\neg A\) and \(K(A\vee B)\) together entail \(KB\), which contradicts \(\neg KB\).

Spector (2006), Fox (2007), and Schwarz (2016) point out that Sauerland's algorithm can still lead to inconsistent inferences.

Schwarz (2016) considers sentence (1):

(1) Al hired at least two cooks.

Like plain disjunctions, (1) triggers an ignorance implicature: the speaker doesn't know how many cooks Al hired.

Now suppose the alternatives to 'at least two' include 'at least three', 'at least four', 'exactly three', 'exactly four', etc. By Sauerland's algorithm, we first infer from an utterance of 'at least two' that the speaker doesn't know the stronger alternatives 'exactly two', 'at least three' etc. That is, the first step yields

(2) \(K[\geq2], \neg{}K[=\!2], \neg{}K[=\!3], \neg{}K[=\!4], \neg{}K[\geq3], \neg{}K[\geq4]\), etc.

In the second step we strengthen all these \(\neg{}K\phi\) facts to \(K\neg\phi\), provided the strengthening is consistent with (2). 'At least four' passes this test: if the open possibilities are {2,3} then all items in (2) are true, and so is \(K\neg[\geq4]\). 'Exactly three' also passes the test: if the open possibilities are {2,4} then all items in (2) are true, and so is \(K\neg[=\!3]\). But if we add both \(K\neg[\geq4]\) and \(K\neg[=\!3]\) to (2), we get a contradiction: together with \(K[\geq2]\), they leave 2 as the only open possibility, so \(K[=\!2]\) would hold, contradicting \(\neg{}K[=\!2]\).
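
To see the contradiction mechanically, here is a small brute-force sketch of Sauerland's procedure, independent of the RSA machinery. Knowledge states are represented as sets of open possibilities for the number of cooks (the helper names are mine):

// non-empty sets of open possibilities compatible with K[>=2]
var infoStates = [[2],[3],[4],[2,3],[2,4],[3,4],[2,3,4]];
// the speaker knows p iff p holds in every open possibility
var K = function(p, info) {
    return filter(function(w) { return !p(w) }, info).length == 0;
};
var atLeast = function(n) { return function(w) { return w >= n } };
var exactly = function(n) { return function(w) { return w == n } };
var negate = function(p) { return function(w) { return !p(w) } };
// the primary inferences (2) drawn from 'at least two'
var primary = function(info) {
    return K(atLeast(2), info) && !K(exactly(2), info) && !K(exactly(3), info) &&
        !K(exactly(4), info) && !K(atLeast(3), info) && !K(atLeast(4), info);
};
// strengthening the propositions in 'excluded' to K(not phi) is consistent with (2)
// iff some knowledge state satisfies (2) together with all the strengthenings
var consistent = function(excluded) {
    return filter(function(info) {
        return primary(info) &&
            filter(function(p) { return !K(negate(p), info) }, excluded).length == 0;
    }, infoStates).length > 0;
};
display(consistent([atLeast(4)]));             // true: witnessed by {2,3}
display(consistent([exactly(3)]));             // true: witnessed by {2,4}
display(consistent([atLeast(4), exactly(3)])); // false: no knowledge state is left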

Schwarz concludes that neo-Gricean models require something stronger than Sauerland's consistency check: we need to check whether the hypotheses \(\phi\) for which we infer \(K\neg\phi\) are innocently excludable, in the sense of Fox (2007).

Here is a simple RSA model of the 'at least two' scenario, showing that no explicit check for innocent exclusion is needed.

var states = [1,2,3,4];
var meanings = {
    'one': function(state) { return state >= 1 },
    'at least one': function(state) { return state >= 1 },
    'exactly one': function(state) { return state == 1 },
    'two': function(state) { return state >= 2 },
    'at least two': function(state) { return state >= 2 },
    'exactly two': function(state) { return state == 2 },
    'three': function(state) { return state >= 3 },
    'at least three': function(state) { return state >= 3 },
    'exactly three': function(state) { return state == 3 },
    'four': function(state) { return state >= 4 },
    'at least four': function(state) { return state >= 4 },
    'exactly four': function(state) { return state == 4 }
};
var alternatives = {
    'one': ['one', 'two', 'three', 'four'],
    'two': ['one', 'two', 'three', 'four'],
    'three': ['one', 'two', 'three', 'four'],
    'four': ['one', 'two', 'three', 'four'],
    'at least one': keys(meanings),
    'exactly one': keys(meanings),
    'at least two': keys(meanings),
    'exactly two': keys(meanings),
    'at least three': keys(meanings),
    'exactly three': keys(meanings),
    'at least four': keys(meanings),
    'exactly four': keys(meanings)
};
var hearer0 = Agent({
    credence: Indifferent(states),
    kinematics: function(utterance) {
        return function(state) {
            return evaluate(meanings[utterance], state);
        }
    }
});
var speaker1 = function(observation, options) {
    return Agent({
        options: options,
        credence: update(Indifferent(states), observation),
        utility: function(u,s){
            return learn(hearer0, u).score(s);
        }
    });
};
var hearer2 = Agent({
    credence: join({
      'state': Indifferent(states),
      'access': { 'full': 0.9, 'partial': 0.1 }
    }),
    kinematics: function(utterance) {
        return function(s) {
            // partial access: the speaker can't rule out one other nearby number of cooks
            var obs = s.access == 'full' ? s.state : [s.state, s.state+uniformDraw([-1,-2,1,2])];
            var speaker = speaker1(obs, alternatives[utterance]);
            return sample(choice(speaker)) == utterance;
        }
    }
});
showKinematics(hearer2, ['two', 'at least two'])

As you can see, the level-2 hearer infers that the speaker has partial access to the state when he hears 'at least two', and makes sensible inferences about that state.

What happens is this.

I assume that simple number words ('one', 'two') only have simple number words as alternatives, while all other available options have all options as alternatives. As in my model of plural NPs in the post on scalar implicatures, the alternatives constrain the hearer's reconstruction of the speaker's reasoning: the hearer wonders why the speaker chose the observed utterance from among its alternatives.

Have a look at speaker1:

// continues #6
showChoices(speaker1, [[2], [2,3]], [['one', 'two', 'three', 'four']])
showChoices(speaker1, [[2], [2,3]], [keys(meanings)])

Among the one-word options, speaker1 chooses 'two' if she knows that the state is 2, and also if she merely knows that the state is 2 or 3. Among all options, she chooses 'exactly two' if she knows that the state is 2 and either 'at least two' or 'two' if she merely knows that the state is 2 or 3.

speaker1 never utters 'two' or 'at least two' if she is fully informed. A level-2 hearer who assumes full informedness can still make sense of 'two', because 'two' is the best among its alternatives in knowledge state {2}.
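
A quick way to check this is to display the choices of a fully informed speaker for each state:

// continues #6
showChoices(speaker1, [[1], [2], [3], [4]], [keys(meanings)]);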

The level-2 hearer is 90% confident that the speaker is fully informed. The remaining 10% of his credence goes to different states of partial information. For example, if the true state is 3, then the speaker's information state might be {3} (most likely) or {1,3}, {2,3}, {3,4}, or {3,5}. Realistically, there are many more ways of having partial information, but I don't think including them would affect the result.

(Incidentally, this is another example where two expressions, here 'two' and 'at least two', are predicted to have different effects, despite having the same literal meaning.)

5. The informed speaker assumption

The above models all make an implausible prediction. Suppose you have no strong prior views about my state of knowledge with respect to the three apples. Then I utter 'some of the apples are red'. I think you'd come to believe that I've probably seen all the apples and that some but not all the apples are red.

Or suppose you have no strong views about my state of knowledge with respect to how many cooks Al has hired. Then I utter 'Al has hired two cooks'. You would infer that I'm probably well-informed and that Al did not hire three cooks.

The above models don't predict this.

An utterance of 'two' seems to convey that the speaker is well-informed. How could this come about?

A natural idea is that it is a higher-order implicature. We've seen that 'at least two' implicates ignorance, while 'two' does not. Uninformed speakers should therefore prefer 'at least two', and informed speakers should prefer 'two', so as to avoid triggering a false ignorance implicature.

(I assume this kind of implicature has been studied, but I don't think I've come across it in the literature.)

The explanation I just gave seems to assume that the speaker cares not only about the hearer's accuracy concerning the state of the world, but also about their accuracy concerning the speaker's state of information with respect to the questions under discussion. This is a reasonable assumption.

Interestingly, we can predict the informedness implicature arising from 'two' even without the assumption. The above model with fixed alternatives does not predict it. If we switch to a model with uncertainty about the speaker's cost function, as in the previous blog post, the effect appears.

var states = [1,2,3,4];
var meanings = {
    'one': function(state) { return state >= 1 },
    'at least one': function(state) { return state >= 1 },
    'exactly one': function(state) { return state == 1 },
    'two': function(state) { return state >= 2 },
    'at least two': function(state) { return state >= 2 },
    'exactly two': function(state) { return state == 2 },
    'three': function(state) { return state >= 3 },
    'at least three': function(state) { return state >= 3 },
    'exactly three': function(state) { return state == 3 },
    'four': function(state) { return state >= 4 },
    'at least four': function(state) { return state >= 4 },
    'exactly four': function(state) { return state == 4 }
};
var complexity = function(utterance) {
    // multi-word utterances ('at least two', 'exactly two') are costlier than bare numerals
    return utterance.includes(' ') ? 3 : 1;
}
var makeHearer = function(speaker) {
    return Agent({
        credence: join({
            'state': Indifferent(states),
            'access': { 'full': 0.9, 'partial': 0.1 },
            'chattiness': Indifferent([0,1,2])
        }),
        kinematics: speaker ? makeKinematics(speaker) : level0kinematics
    });
};
var makeKinematics = function(speaker) {
    return function(utterance) {
        return function(s) {
            var obs = s.access == 'full' ? s.state : [s.state, s.state+uniformDraw([-1,1])];
            var sp = speaker(obs, s.chattiness);
            return sample(choice(sp)) == utterance;
        }
    }
};
var level0kinematics = function(utterance) {
    return function(s) {
        return evaluate(meanings[utterance], s.state);
    }
};
var makeSpeaker = function(hearer) {
    return function(observation, chattiness) {
        return Agent({
            options: keys(meanings),
            credence: update(Indifferent(states), observation),
            utility: function(u,s) {
                // informativity: how accurate the hearer becomes about the state
                var q = marginalize(learn(hearer, u), 'state').score(s);
                // cost: complexity is penalized, less so for chattier speakers (chattiness 2 = no penalty)
                var c = (chattiness-2)*complexity(u)/3;
                return q + c;
            }
        });
    }
};
var hearer0 = makeHearer();
var speaker1 = makeSpeaker(hearer0);
var hearer2 = makeHearer(speaker1);
var speaker3 = makeSpeaker(hearer2);
var hearer4 = makeHearer(speaker3);
var speaker5 = makeSpeaker(hearer4);
var hearer6 = makeHearer(speaker5);
showKinematics(hearer6, ['two', 'at least two'])

I need a lot of speakers and hearers here, so I've defined a few helper functions to create them.

In outline, the effect arises as follows.

When a speaker says 'at least two', a relatively naive hearer can infer that the speaker does not have a strong preference for simplicity; otherwise she would have chosen the semantically equivalent 'two'. This, in turn, means that she would have said 'exactly two' if that had led to significantly greater hearer accuracy. It would have done so if the speaker's information state was { 2 }. So the speaker's information state is probably { 2,… }.

When the speaker says 'two', however, the same hearer can't rule out that the speaker has a strong preference for simplicity. If she does, she might not have said 'exactly two' even if her information state was { 2 }. So the speaker's information state might be { 2 } and it might be { 2,… }.

In sum, this hearer finds 2 more probable if he hears 'two' than if he hears 'at least two'. At the next level, a speaker with a slight preference for simplicity might therefore prefer 'two' over 'exactly two' if her information state is { 2 }, but prefer 'at least two' over 'two' if her information state is { 2,3 }.
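
To watch the effect develop across levels, you can compare the lower-level hearers with hearer6:

// continues #8
showKinematics(hearer2, ['two', 'at least two']);
showKinematics(hearer4, ['two', 'at least two']);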

6. Exclusivity regained?

The effect we've just seen for 'two' and 'at least two' might help shed light on the exclusivity implicature arising from 'A or B'.

Even though 'two' and 'at least two' are semantically equivalent, a speaker who merely knows that Al hired two or more cooks would use 'at least two', while a speaker who knows that Al hired exactly two cooks would use 'two'.

Now compare 'A or B' and 'A or B or both'. These are semantically equivalent. But the latter signals that the speaker's information state is compatible with \(A \wedge B\) (just as 'at least two' signals that the speaker's information state is compatible with [>2]).

A relatively naive hearer who encounters 'A or B or both' can reason that (a) the speaker does not have a strong preference for simplicity, and hence that (b) the speaker would probably have said 'A or B but not both' if she could rule out \(A \wedge B\); since she didn't say 'A or B but not both', it follows that (c) the speaker's information state is compatible with \(A \wedge B\). No such conclusion can be drawn from an utterance of 'A or B'. This initial asymmetry might get amplified at higher levels.

I've briefly tried to confirm this idea with a simulation, but I haven't been able to get it to work. I suspect that it should be relatively easy to predict the exclusivity implicature, however, if we assume that the speaker cares about the hearer's accuracy with respect to the speaker's information state.

Chierchia, Gennaro, Danny Fox, and Benjamin Spector. 2012. “Scalar Implicature as a Grammatical Phenomenon.” In Semantics: An International Handbook of Natural Language Meaning. de Gruyter.
Fox, Danny. 2007. “Free Choice and the Theory of Scalar Implicatures.” In Presupposition and Implicature in Compositional Semantics, edited by U. Sauerland and P. Stateva, 71–120. Basingstoke: Palgrave Macmillan.
Goodman, Noah D., and Andreas Stuhlmüller. 2013. “Knowledge and Implicature: Modeling Language Understanding as Social Cognition.” Topics in Cognitive Science 5 (1): 173–84. doi.org/10.1111/tops.12007.
Sauerland, Uli. 2004. “Scalar Implicatures in Complex Sentences.” Linguistics and Philosophy 27 (3): 367–91.
Schwarz, Bernhard. 2016. “Consistency Preservation in Quantity Implicature: The Case of at Least.” Semantics and Pragmatics 9 (1): 1–47. doi.org/10.3765/sp.9.1.
Spector, Benjamin. 2006. “Aspects de La Pragmatique Des Opérateurs Logiques.” PhD thesis, Paris 7.
