## Identity (equality) in automated semantic tableaux

It's been quiet here. I haven't had much time or energy for philosophy since the pandemic turned me into a stay-at-home dad on top of the regular job(s). At least teaching has finally come to an end a few weeks ago. I've used my newly found free time to build support for identity into the tree prover.

This is something I've wanted to do for a long time. Every five years or so I look into it, but give up because it's too hard. Here I'll explain the challenges, and the approach I chose.

Logic textbooks usually introduce two rules for handling identity: "Leibniz' Law" (in both directions) and a rule that allows closing branches on which there is a node of the form ¬t=t. These rules go back to Jeffrey 1967.

In automated tableaux, Jeffrey's rules have two drawbacks. One is that they can often be applied in many ways, and trying all possible applications is inefficient. This is especially problematic in the presence of function terms. For example, if a branch contains f(a)=a and Pa, Leibniz' Law allows adding Pf(a), Pf(f(a)), and so on, forever.

To help with this problem, one can impose some restrictions on applications of Leibniz' Law, without affecting completeness. First, one can restrict the rule to literals. (This restriction is even imposed in some intro textbooks, such as Priest 2008.) Second, one can require that any application of the rule must replace a "larger" term by a "smaller" term, so that 'a' can't be replaced by 'f(a)'. Third, one may require that if Leibniz' Law is applied to an identity statement, then it can only replace the larger side of the identity. Even with these restrictions, however, the search space for automated rule applications is often huge.

The second problem with Jeffrey's rules is that they are not complete for free-variable tableaux.

When automating tableaux, it is common to instantiate universal formulas not with closed terms but with free variables, and only replace the variables with closed terms if that allows closing a branch. (See the Wikipedia page for the tableaux method.) Now consider a branch with the following nodes:

1. Pg(a)
2. ¬Pg(f(x))
3. f(g(b)) = y

We can't apply Leibniz' Law, nor can we close the branch by unification (i.e., by replacing free variables with closed terms). But if we replace 'x' with 'g(b)', we can apply Leibniz' Law to nodes 2 and 3 to get ¬Pg(y). If we further replace 'y' with 'a', we can close the branch.

So we need a new rule that combines Leibniz' Law with a substitution of free variables: If a branch contains s=t as well as a node A[r], and there is a substitution that renders the terms t and r identical, then one may append A[s] to the branch and apply the substitution (to the whole tableau). We also need a new closure rule: If a branch contains ¬(s=t) and there is a substitution that renders the terms s and t identical, then one may apply the substitution and close the branch. These rules were introduced in Fitting 1990.

(Notice that the resulting tableaux, after the substitutions have been applied, look like tableaux that are expanded with Jeffrey's rules.)

We still have to deal with the first problem. In fact, that problem gets worse in free-variable tableaux. For one thing, when we have free variables, we almost always have function terms, due to the skolemisation required when expanding existential nodes. Moreover, it is hard to implement the above-mentioned limitations on the "size" of terms, because we often don't know which of two terms (say, f(x) and y) eventually ends up being smaller once the free variables are replaced by terms.

Another completeness-preserving restriction proves convenient. Instead of cluttering a tree with pointless applications of identity rules, one can delay their application until some sequence of them allows closing a branch. This means that when we expand a branch on a tableau, we can occasionally check if the branch could be closed with the identity rules. If the answer is negative, we continue with the ordinary expansion rules on the unchanged branch.

Let's have a closer look at what we need to do to check if a branch can be closed with Fitting's rules.

We can usually ignore most of the branch. We only need to look at all the identity statements on the branch as well as candidates for a complementary pair. Fa and ¬Fb, for example, are a candidate for a complementary pair: if we can derive a=b, the branch can be closed. In effect, then, our task is to check if the identity statements on a branch allow deriving another identity statement (here, a=b), or a set of such statements. (If the branch contains Rab and ¬Rcd, we need to derive both a=b and c=d.)

This kind of problem is called a "rigid E-unification problem" (see Beckert 1998). It is a hard problem. In general, deciding whether a rigid E-unification problem has a solution is NP-complete.

To illustrate, return to the example from above. Nodes 1 and 2 are a candidate complementary pair. We could turn them into an actual complementary pair (by Leibniz' Law) if we could show that g(a) = g(f(x)). So that's our goal. We want to derive this equality from the equalities on the branch -- i.e., from node 3: f(g(b)) = y -- with the help of Leibniz' Law, in Fitting's version. Equivalently: We are looking for a substitution (of terms for free variables) under which we can derive g(a) = g(f(x)) from f(g(b)) = y by Leibniz' original Law. There's no simple and efficient algorithm for this kind of task.

There is another complication. A substitution that allows closing the current branch might prevent closure of another open branch. For example, suppose we've expanded a branch to the following nodes. (I forgot where I found this example.)

1. z=a v z=b
2. ¬g(x)=g(y) v x=y
3. g(a)=b
4. ¬g(g(a)=a

Expanding node 1 introduces two branches, one with ¬z=a, the other with z=b. We can close the first branch if we replace z by g(g(a)). On the remaining branch, we can expand 2, which yields two more branches, one with ¬g(x)=g(y) and one with x=y. We can close the first of these if we replace both x and y with a. But that turns x=y into a=a, preventing closure of the remaining branch (except by starting over with expansion of 1 and 2 and choosing different substitutions on the next round).

To get around this, it looks like we need to search for a substitution that allows closing all open branches at the same time. This requires solving a "simultaneous rigid E-unification problem". And that task isn't just NP-complete, but altogether undecidable (as shown in Degtyarev and Voronkov 1995).

So we need a different approach. What we can do is tentatively close a branch if we found a solution to one of its E-unification problems, but be prepared to backtrack and re-open the branch if we can't close the whole tree without exceeding a certain level of complexity (say, a certain total number of nodes).

That's the approach I use in my prover.

To tackle the E-unification problems on individual branches, I use a version of the "BSE" calculus presented in Degtyarev and Voronkov 1998.

Remember that in order to reduce the search space, we would like to restrict applications of Leibniz' Law so that they always replace larger terms by smaller terms (among other things). But we often don't know whether a term will eventually be larger or smaller than another, because that depends on the substitution of free variables. In the BSE calculus, each application of Leibniz' Law therefore imposes a constraint on any substitution that may eventually be used to solve the problem.

Formally, the calculus operates on pairs of a rigid E-unification problem and a constraint. (The problems in Degtyarev and Voronkov all have a single equality as goal, but this is easy to generalise.)

There are three rules. One is called 'rrbs' and looks as follows:

        [... l=r ... ⊢ s[p]=t], [C]
-------------------------------------------
[... l=r ... ⊢ s[r]=t], [C, l=p, l>r, s[p]>t]


Above the line, we have a rigid E-unification problem with goal s[p]=t. ('s[p]' denotes a term s that contains p.) The equations that can be used to solve the goal are to the left of the turnstile and include some equation l=r. Above the line, we also have some constraint C. After application of the rrbs rule, the goal of the unification problem has changed to s[r]=t and the constraint has been extended by the following three conditions.

First, any allowable substitution must render the terms l and p syntactically identical, so that the inference from s[p]=t to s[r]=t based on l=r is a genuine application of Leibniz's Law.

Second, any allowable substitution must render the term l (and therefore p) larger than r, because we want to respect the constraint that larger terms are always replaced by smaller ones.

Similarly for the third condition, which reflects the constraint that we only want to replace the larger side of an equality.

The rule can only be applied if the resulting constraint is satisfiable.

There's a similar rule ('lrbs') for applying Leibniz' Law to the equalities on the left-hand side of the turnstile.

The third rule ('er') says that a problem is solved if the current constraint is compatible with the syntactic identity of the goal terms.

The BSE rules can be added to the ordinary tableau rules. Degtyarev and Voronkov show that the resulting rules are sound and complete. (The completeness proof is surprisingly difficult and doesn't look like ordinary completeness proofs for tableaux.)

The BSE calculus doesn't settle the order in which the rules should be applied, and to which equations. An algorithm based on the BSE calculus is helpfully outlined in Franssen 2008. I have loosely followed Franssen's approach. One difference is that I've replaced his depth-first search with a less efficient breadth-first search. This is partly because the breadth-first search often finds simpler solutions. More importantly, it makes it easier to give the browser a rest during difficult computations. I also haven't implemented any check for satisfiability of ordering conditions in what Franssen calls "solved forms" (mainly out of laziness). This means that I allow replacing smaller terms by larger terms under certain (rare) conditions. In the context of my general proof search strategy this shouldn't affect completeness.

I have added one improvement that I haven't seen in the literature: The BSE rules (effectively, Fitting's rules) are restricted so that they never replace terms inside skolem terms. This sometimes allows for a dramatic reduction of the search space, and it is intuitively clear that it doesn't affect completeness. (I say "intuitively" because I don't have a completeness proof for my approach, since I don't want to face the task of adapting the proof in Degtyarev and Voronkov.)

A nice feature of this general approach to handling identity is that once a branch is closed, it can easily be converted into a branch that looks like it has been solved with Jeffrey's rules. This is important for me because I want the output of my prover to look like a familiar textbook tableau.

I've tried the new prover on a few dozen examples and it generally does what it should, unless you challenge it with something really mean, like this problem from Beckert 1997.

I'm sure there are still bugs. Let me know if you find one!

# on 28 June 2021, 17:45

Wow! The identity feature makes your tree prover even more valuable. Even without it, the generator has allowed me to make proofs that I probably wouldn’t have been able to reliably create without the help of such a tool.

Being interested in computerized database communication it struck me from the first use that if relational database theory weren’t so widely misunderstood, the generator would be an essential component of every relational database compiler. Most of the applied theory of that subject is not only pathetically informal but cries out for automated validation both of design and application. The potential audience for such a compiler numbers in the millions.

The thought also occurred that if it became a so-called progressive Web app, it might become very widely used by the many people who have subjects just as susceptible to automation.

I’ve thanked you before and all I can do is thank you again and add congratulations.