CS 76 / 176

First-order logic

Professor Devin Balkcom
devin@cs.dartmouth.edu
office: Sudikoff 206

http://www.cs.dartmouth.edu/~cs76

First-order logic

From Self-Reliance, Ralph Waldo Emerson:

$\forall x ((\operatorname{mind}(x) \wedge \operatorname{little}(x)) \Rightarrow \operatorname{hobgoblin}(x, consistency))$

What I think he meant to say was:

$\exists x (\operatorname{mind}(x) \wedge \operatorname{great}(x) \wedge \neg \operatorname{consistent}(x))$

Or even stronger:

$\forall x ((\operatorname{mind}(x) \wedge \operatorname{great}(x)) \Rightarrow \neg \operatorname{consistent}(x))$

If great minds were precise, I guess he would have said "consistency is the hobgoblin only of little minds."

Problems with propositional logic

Proliferation of symbols:

$B_{1,1} \iff (P_{1,2} \vee P_{2, 1})$, ...

Problems with natural language

Can be imprecise and depend on context. "Spring":

  • mechanical?
  • a season?
  • flowing water?

Using natural language for communication assumes a large shared knowledge base, and seems to require some probability-based reasoning.

Problems with programming languages

Most formal languages are procedural rather than declarative. You can have objects, but don't expect Java to reason about them automatically.

There are exceptions. Prolog can reason about statements like "All cars are red." (Constraint satisfaction and some formal logic tools are built into Prolog!)

Knowledge representation

We want something that:

  1. has objects (like NL)
  2. is declarative (like NL, propositional logic)
  3. is context-independent (unlike NL, no "hidden" KB)
  4. has a precise syntax

First-order logic

First-order logic makes a stronger ontological commitment (what exists in the world) than propositional logic.

Propositional logic: facts that are true or false

First-order logic: objects, and relationships between them that hold (true) or do not hold (false).

You can think of first-order objects as graph vertices, and relationships as (hyper-)edges. Sentences constrain possible (hyper-)graph structures.

First-order logic

Propositional logic and first-order logic make a similar epistemological commitment (degree of knowledge): true/false values.

Probabilistic reasoning allows values of $[0, 1]$: fact or relationship holds with probability $x$.

First-order logic

First-order logic includes:

  1. Objects (think of vertices)
  2. Relations or maps take a tuple of objects and return true or false: is there a (hyper-)edge?
  3. Functions are $n$-to-one maps that return an object, not true/false. But formally, these are relations, too.

Examples

Relation (binary): $\operatorname{siblings}(Leia, Luke)$

Property (a unary relation): $\operatorname{evil}(emperor)$

Function: $\operatorname{father}(Luke)$ is $DarthVader$

or equivalently,

$\operatorname{father}(ObiWan, Luke) = false$

$\operatorname{father}(Leia, Luke) = false$

$\operatorname{father}(DarthVader, Luke) = true$

Terms (built from symbols): Richard, John, leftleg(Richard), leftleg(John), leftleg(leftleg(John)), ...

Syntax

Three types of symbols:

  1. Constant symbols
  2. Predicate symbols
  3. Function symbols

Connectives: $\wedge | \vee | \Rightarrow | \Leftrightarrow | \neg$

Constant symbols: $A | X | John$

Equality: $=$; two symbols refer to the same object.

Variables: $x, y, z$

Quantifiers: $\forall, \exists$; ways to refer to groups of objects.

Semantics

sentences + (model, interpretation) $\mapsto$ true/false

Interpretation is new, and makes use of the $=$ sign. We will come back to it.

Models, objects, relations

Models

model: a set of true/false values for every relation among objects. (Think of a set of directed edges of a graph, with a different color for each relation.)

$r_1(O_1, O_2) = t$, $r_1(O_1, O_3) = f$, $r_1(O_2, O_1) = f$, ...

Models

How many models?

For a binary relation (possible edges), there are $n^2$ possible object pairs (2-tuples); for a ternary relation, $n^3$ possible 3-tuples; in general, a $k$-ary relation has $n^k$ possible $k$-tuples.

That's just the number of tuples. Each can be true or false. So for each relation, we get a factor of $2^{n^k}$ models.

By the way, $n$ might be infinite. (Maybe the objects are natural numbers, which can be described in FOL.)
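
For example, with $n = 3$ objects and a single binary relation, there are $3^2 = 9$ possible pairs, and so $2^9 = 512$ possible models; each additional relation multiplies in another such factor.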

Interpretation

Computational complexity gets even worse. The syntax doesn't bind symbols to particular objects.

Symbols: Luke, DarthVader, Emperor, Palpatine, Anakin. Five symbols, but how many objects?

Intended interpretation

Multiple interpretations may be consistent with a set of first-order logic sentences. We need $DarthVader = Anakin$ in the intended interpretation.

Quantifiers

Universal ($\forall$)

$\forall x \operatorname{king}(x) \Rightarrow \operatorname{person}(x)$

What it really means (universal instantiation):

$\operatorname{king}(John) \Rightarrow \operatorname{person}(John)$

($\wedge$) $\operatorname{king}(Richard) \Rightarrow \operatorname{person}(Richard)$

($\wedge$) $\operatorname{king}(crown) \Rightarrow \operatorname{person}(crown)$

Quantifiers

Existential ($\exists$)

$\neg \exists x \operatorname{wise}(x) \wedge \operatorname{person}(x)$

What it really means (existential instantiation):

$\neg$ ( $(\operatorname{wise}(John) \wedge \operatorname{person}(John))$

$ \vee (\operatorname{wise}(Richard) \wedge \operatorname{person}(Richard))$

$\vee (\operatorname{wise}(crown) \wedge \operatorname{person}(crown))$ )

Quantifiers

Instantiation gives a simple proof that De Morgan's rules apply to quantifiers.

$\neg(A \wedge B \wedge C) \equiv \neg A \vee \neg B \vee \neg C$. So,

$\neg \forall x \, (\operatorname{person}(x) \Rightarrow \operatorname{king}(x)) \equiv$

$\exists x (\neg (\operatorname{person}(x) \Rightarrow \operatorname{king}(x))) \equiv$

$\exists x (\neg (\neg \operatorname{person}(x) \vee \operatorname{king}(x))) \equiv$

$\exists x (\operatorname{person}(x) \wedge \neg \operatorname{king}(x))$

Quantifiers

Universal: usually used with $\Rightarrow$, since $\vee$ or $\wedge$ is too strong. Example (too strong):

$\forall x \operatorname{person}(x) \wedge \operatorname{king}(x)$

But as George Orwell said,

$\forall x \operatorname{animal}(x) \Rightarrow \operatorname{equal}(x)$

Existential: usually with $\wedge$. (Implication is usually too weak, since any false premise will satisfy the quantifier.)

Equality operator

The equality operator is used to indicate that two symbols refer to the same object.

"Every person has at least two brothers."

Wrong: $\forall x \exists y, z \operatorname{person}(x) \Rightarrow \operatorname{brother}(x, y) \wedge \operatorname{brother}(x, z)$

Correct: need to include $y \ne z$.
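
One way to write it, keeping the structure above:

$\forall x \, \exists y, z \; \operatorname{person}(x) \Rightarrow (\operatorname{brother}(x, y) \wedge \operatorname{brother}(x, z) \wedge y \ne z)$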

Nested quantifiers

"Everybody loves somebody":

$\forall x \exists y \operatorname{loves}(x, y)$

"Everybody is loved by somebody":

$\forall x \exists y \operatorname{loves}(y, x)$

"There is somebody everybody loves":

$\exists y \forall x \operatorname{loves}(x, y)$
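
A quick way to see that quantifier order matters: check the sentences against a small, finite model. Here is a Python sketch (my own toy example; the people and the loves relation are made up for illustration):

people = ["Alice", "Bob", "Carol"]
loves = {("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Bob")}   # loves(x, y) pairs

# "Everybody loves somebody": forall x exists y loves(x, y)
everybody_loves_somebody = all(any((x, y) in loves for y in people) for x in people)

# "There is somebody everybody loves": exists y forall x loves(x, y)
somebody_everybody_loves = any(all((x, y) in loves for x in people) for y in people)

print(everybody_loves_somebody)   # True: each person loves at least one person
print(somebody_everybody_loves)   # False: no single person is loved by everyone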

Example: natural numbers

Numbers are good for counting, ordering, and measuring. Let's describe them with FOL.

  • A constant symbol: $0$
  • A property: $\operatorname{NaturalNumber}()$ or $N()$ for short.
  • A function symbol: $S()$

Peano axioms

  1. $N(0)$
  2. $\forall n \, N(n) \Rightarrow N(S(n))$

    So, $0$, $S(0)$, $S(S(0))$, $S(S(S(0)))$ are all natural numbers.

    We could call them 0, 1, 2, 3, and so on, but not yet.

  3. $\forall n \, N(n) \Rightarrow S(n) \ne 0$
  4. $\forall m, n \, ((N(n) \wedge N(m)) \wedge (n \ne m)) \Rightarrow S(n) \ne S(m)$

Axioms (3) and (4) describe the intended interpretation:

    "$0$ is not the successor of any natural number, and no two natural numbers have the same successor."

Natural numbers: intended interpretation

  3. $\forall n \, N(n) \Rightarrow S(n) \ne 0$
  4. $\forall m, n \, ((N(n) \wedge N(m)) \wedge (n \ne m)) \Rightarrow S(n) \ne S(m)$

We want a single chain: $0, S(0), S(S(0)), \ldots$

We need to rule out things like a successor that loops back to $0$, or two different numbers sharing the same successor.

Addition

One more function symbol, $+$, and two rules:

  5. $\forall m \, N(m) \Rightarrow +(0, m) = m$

    Additive identity: add 0 $\mapsto$ same number

Addition

  6. $\forall m, n \, (N(n) \wedge N(m)) \Rightarrow +(S(m), n) = S(+(m, n))$

Example: Compute $+( S(S(0)), S(S(S(0))) )$

Rule (6): move an $S$ from the first term outwards:

$S(+( S(0), S(S(S(0))) ))$

Play it again, Sam:

$S(S(+( 0, S(S(S(0))) )))$

Addition

$S(S(+( 0, S(S(S(0))) )))$

Rule (5) says adding zero to a number gives that number, so we can remove the addition symbol from the inner part:

$S(S(S(S(S(0)))))$

Repeated applications of rule (6), followed by one application of rule (5), give us a machine to add any two natural numbers.

Awesome!
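
Here is a small Python sketch of that machine (my own illustration, not course code): a natural number is a nested application of $S$ to $0$, and add applies rule (6) until the first argument is zero, then rule (5).

ZERO = ("0",)

def S(n):
    # successor function symbol, applied to a term
    return ("S", n)

def add(m, n):
    if m == ZERO:
        return n                  # rule (5): +(0, n) = n
    _, m_pred = m                 # m has the form S(m_pred)
    return S(add(m_pred, n))      # rule (6): +(S(m'), n) = S(+(m', n))

two = S(S(ZERO))
three = S(S(S(ZERO)))
print(add(two, three))            # five nested S's around 0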

First-order inference

Modus ponens: $\frac{A \Rightarrow B \quad A}{B}$

Say we have the knowledge base:

$\forall x \, king(x) \wedge greedy(x) \Rightarrow evil(x)$

$king(John)$

$greedy(John)$

Want to show $evil(John)$. How?

Find a substitution $s$ (here $\{x / John\}$) so that:

$king(John) \wedge greedy(John) \Rightarrow evil(John)$
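
A minimal sketch of applying such a substitution in Python (illustration only; sentences are nested tuples and variables are strings starting with "?"):

def subst(theta, sentence):
    # replace each variable in the sentence according to theta
    if isinstance(sentence, str):
        return theta.get(sentence, sentence)
    return tuple(subst(theta, part) for part in sentence)

rule = ("=>", ("and", ("king", "?x"), ("greedy", "?x")), ("evil", "?x"))
print(subst({"?x": "John"}, rule))
# ('=>', ('and', ('king', 'John'), ('greedy', 'John')), ('evil', 'John'))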

Instantiation

Idea #1: convert FOL to propositional logic by getting rid of quantifiers.

Universal instantiation: $\frac{\forall v \quad \alpha}{\operatorname{subst}(\{v/g\}, \alpha)}$,

for any variable $v$, sentence $\alpha$, any ground term $g$.

Existential instantiation: $\frac{\exists v \quad \alpha}{\operatorname{subst}(\{v/k\}, \alpha)}$,

for any variable $v$ and constant symbol $k$ that does not appear elsewhere.

Skolemization

Existential instantiation: $\frac{\exists v \quad \alpha}{\operatorname{subst}(\{v/k\}, \alpha)}$.

Example:

From $\exists x \, crown(x) \wedge onhead(x, John)$

infer $crown(c_1) \wedge onhead(c_1, John)$

The new name $c_1$ is called the Skolem constant.

Propositionalization

$\forall x \, king(x) \wedge greedy(x) \Rightarrow evil(x)$

$king(John)$

$greedy(John)$

Apply universal instantiation first, using all possible ground terms:

$king(John) \wedge greedy(John) \Rightarrow evil(John)$

$king(Richard) \wedge greedy(Richard) \Rightarrow evil(Richard)$

Now view $king(John)$, $greedy(John)$, etc., as propositional symbols.
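
A sketch of propositionalization in Python (illustration only, same nested-tuple representation as before): instantiate each universally quantified rule with every ground term, then treat the resulting ground atoms as propositional symbols.

from itertools import product

def subst(theta, sentence):
    if isinstance(sentence, str):
        return theta.get(sentence, sentence)
    return tuple(subst(theta, part) for part in sentence)

def variables(sentence):
    # collect variables (strings starting with '?') appearing in a sentence
    if isinstance(sentence, str):
        return {sentence} if sentence.startswith("?") else set()
    return set().union(*(variables(part) for part in sentence))

rule = ("=>", ("and", ("king", "?x"), ("greedy", "?x")), ("evil", "?x"))
ground_terms = ["John", "Richard"]

vars_in_rule = sorted(variables(rule))
for values in product(ground_terms, repeat=len(vars_in_rule)):
    print(subst(dict(zip(vars_in_rule, values)), rule))
# prints one ground (propositional) instance of the rule per ground term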

Herbrand's theorem

Do we have a complete algorithm for entailment?

Function symbols can lead to infinite ground terms:

$0$, $S(0)$, $S(S(0))$, $\ldots$

Herbrand's theorem (1930): if a sentence is entailed by the original FOL KB, then there exists a proof involving only a finite subset of the propositionalized KB.

Since any such subset has a maximum depth of ground terms, generate all terms of depth 0, then depth 1, then depth 2, and so on (BFS): $John$, $Richard$; $father(John)$, $father(Richard)$; $father(father(John))$, ...
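
A sketch of that BFS-style generation in Python (illustration only; just the two constants and the unary function symbol father):

constants = ["John", "Richard"]
functions = ["father"]            # unary function symbols only, for simplicity

def terms_of_depth(d):
    # ground terms whose function nesting depth is exactly d
    if d == 0:
        return list(constants)
    return [f"{f}({t})" for f in functions for t in terms_of_depth(d - 1)]

for depth in range(3):
    print(depth, terms_of_depth(depth))
# 0 ['John', 'Richard']
# 1 ['father(John)', 'father(Richard)']
# 2 ['father(father(John))', 'father(father(Richard))']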

Semi-decidability

Propositionalization is complete: any entailed sentence can be proven.

But what if a sentence is not entailed? Then the procedure may never terminate, generating ever deeper ground terms: FOL entailment is only semi-decidable.

More efficient entailment

It seems silly to generate $king(Richard)$ and $greedy(Richard)$ to prove $evil(John)$.

Unification and lifting: Find some $x$ such that $x$ is a king, and $x$ is greedy; infer that this $x$ is evil.

More generally, if there exists a substitution $\theta$ that makes the premise identical to sentences already in the KB, then we can infer the conclusion.

Unification

$\forall x \, king(x) \wedge greedy(x) \Rightarrow evil(x)$

$king(John)$

$\forall y \, greedy(y)$

We'd like to conclude that John is evil, since John is a king and everyone is greedy. We need a substitution for:

  1. variables in the implication sentence
  2. variables in sentences to be matched

Apply $\{x / John\}$, $\{y/John\}$ to the implication premises $king(x)$, $greedy(x)$, and to the KB sentences $king(John)$ and $greedy(y)$.

Generalized modus ponens

For atomic sentences $p_i$, $p_i'$, and $q$, where there is a substitution $\theta$ s.t. $\operatorname{subst}(\theta, p_i') = \operatorname{subst}(\theta, p_i)$ for all $i$,

$$ \frac{p_1', p_2', ... p_n' \quad (p_1 \wedge p_2...\wedge p_n \Rightarrow q)}{\operatorname{subst}(\theta, q)}$$

GMP is an example of lifting: a first-order version of a propositional logic rule.
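
For the king/greedy KB above: $p_1' = king(John)$, $p_2' = greedy(y)$, $p_1 = king(x)$, $p_2 = greedy(x)$, $q = evil(x)$, and $\theta = \{x/John, y/John\}$, so GMP concludes $\operatorname{subst}(\theta, q) = evil(John)$.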

Unification

$\operatorname{unify}(knows(John, x), knows(John, Jane))$

$= \{x/Jane\}$

$\operatorname{unify}(knows(John, x), knows(y, Bill))$

$= \{x/Bill, \, y/John\}$

$\operatorname{unify}(knows(John, x), knows(y, mother(y)))$

$= \{y/John, \, x/mother(John)\}$

Standardizing apart

$\operatorname{unify}(knows(John, x), knows(x, Elizabeth))$

fails

John knows everybody. Everybody knows Elizabeth. Surely John knows Elizabeth? Why doesn't it work?

Need a new variable: standardize apart

Implement unification by standardizing apart, then exploring inputs side-by-side.
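
A compact Python sketch of such a unifier (my own illustration, loosely following the textbook's algorithm; sentences are nested tuples and variables are strings starting with "?"):

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def unify(x, y, theta):
    # return a substitution unifying x and y, extending theta; None means failure
    if theta is None:
        return None
    if x == y:
        return theta
    if is_var(x):
        return unify_var(x, y, theta)
    if is_var(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            theta = unify(xi, yi, theta)
        return theta
    return None

def unify_var(var, t, theta):
    if var in theta:
        return unify(theta[var], t, theta)
    if is_var(t) and t in theta:
        return unify(var, theta[t], theta)
    return {**theta, var: t}      # occurs check omitted for brevity

def standardize_apart(sentence, suffix):
    # rename variables so that two sentences never share a variable name
    if is_var(sentence):
        return sentence + suffix
    if isinstance(sentence, tuple):
        return tuple(standardize_apart(part, suffix) for part in sentence)
    return sentence

a = ("knows", "John", "?x")                      # John knows everybody
b = ("knows", "?x", "Elizabeth")                 # everybody knows Elizabeth
print(unify(a, b, {}))                           # None: shared variable name blocks it
print(unify(a, standardize_apart(b, "2"), {}))   # {'?x2': 'John', '?x': 'Elizabeth'}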

Datalog

Resolution is expensive (even in prop logic), and sometimes you don't need it. Use GMP instead.

Datalog KB: no function symbols, first-order definite clauses:

$\forall x \, king(x) \wedge greedy(x) \Rightarrow evil(x)$

or equivalently, as a clause with exactly one positive literal: $\forall x \, \neg king(x) \vee \neg greedy(x) \vee evil(x)$

Backward chaining

(From AIMA.) "The law says that it is a crime for an American to sell weapons to hostile nations. The country Nono, an enemy of America, has some missiles, and all of its missiles were sold to it by Colonel West, who is American." Prove that West is a criminal.

  1. $American(x) \wedge weapon(y) \wedge sells(x, y, z) \wedge hostile(z) \Rightarrow criminal(x)$
  2. $\exists x \, owns(Nono, x) \wedge missile(x)$. Not a definite clause. Skolemize and apply And-Elimination:

  3. $owns(Nono, M_1)$
  4. $missile(M_1)$

Backward chaining

(From AIMA.) "The law says that it is a crime for an American to sell weapons to hostile nations. The country Nono, an enemy of America, has some missiles, and all of its missiles were sold to it by Colonel West, who is American." Prove that West is a criminal.

  5. $missile(x) \wedge owns(Nono, x) \Rightarrow sells(West, x, Nono)$
  6. $missile(x) \Rightarrow weapon(x)$
  7. $enemy(x, America) \Rightarrow hostile(x)$
  8. $American(West)$
  9. $enemy(Nono, America)$

And-or graph
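
A sketch of the proof that backward chaining finds (the and-or tree, in text form):

To prove $criminal(West)$ with rule (1), we need $American(West)$, $weapon(y)$, $sells(West, y, z)$, and $hostile(z)$.

  • $American(West)$ is fact (8).
  • $weapon(y)$: by rule (6) it suffices to show $missile(y)$; fact (4) gives $y = M_1$.
  • $sells(West, M_1, z)$: rule (5) with $z = Nono$ needs $missile(M_1)$ and $owns(Nono, M_1)$, which are facts (4) and (3).
  • $hostile(Nono)$: by rule (7) it suffices to show $enemy(Nono, America)$, which is fact (9).

All four conjuncts succeed, so $criminal(West)$ follows.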