Philosophy of Abstraction Logic

This article is obsolete. It has been superseded by https://doi.org/10.47757/pal.2.

Dec 21, 2021

How can we do mathematics on a computer?

This question is bound to evoke a wide variety of responses. Many will ﬁnd the question somewhat silly and point out that a computer easily handles mathematics, for example via a calculator app. Others will question whether mathematics can be done on a computer at all, because mathematics is a social process happening within the mathematical community.

As has become abundantly clear in the last two decades, computers can support and facilitate social processes, for good and bad. This aspect of mathematics does not seem to be an unsurmountable obstacle to its computerisation. Attempts like the ProofPeer [1] project have been made to overcome this obstacle, but so far none of these attempts have been truly successful.

A central and maybe the most important aspect of mathematical activity is the formulation and proving of theorems. How can we do this on a computer, beyond just writing it up in TeX [2]?

In principle, this question has been answered. We can do it by using logic as a foundation of mathematics. Logic can be implemented via software (and/or hardware), and we can then use this logic to formulate and prove theorems. A large-scale example of this is the formal proof of the Kepler conjecture [3].

To allow the formulation of theorems, logic must provide a language in which to express these theorems. To make proofs of theorems possible, logic must clearly explain what a proof is, and how to certify that something is a proof of a given theorem.

There is no consensus on whether there is a single best logic to use as a foundation of mathematics. There are many different logics, and they differ in how (much) mathematics can be expressed in them. For example, the formal proof of the Kepler conjecture uses simply-typed higher-order logic (HOL) [4].

On one end of the spectrum one can ﬁnd the opinion that there cannot be a single best logic, and that the best we can do is to develop methods [5] which exploit commonalities among the plethora of logics.

On the other end of the spectrum there is the opinion, championed by Quine [6], that ﬁrst-order logic (FOL) is the proper foundation of mathematics. In Quine’s opinion, there is a more or less clear boundary between logic and mathematics, and other logics than FOL are not really logic, but simply mathematics dressed up as logic. He considers second-order logic (SOL) just as another way of formulating set theory, which in his opinion does not qualify as logic, but as mathematics.

Indeed, following Quine one can ask, what is the difference between logic and mathematics?

This article argues that the answer to this question is the key to understanding that there indeed is a best logic for serving as a foundation of mathematics.

The short answer is: There is a border between logic and mathematics, but it is not ﬁxed in one place, and it can be moved. Moving the border to the right incorporates more and more speciﬁc mathematics into the logic. Because there are many different mathematical theories that can potentially be turned into logics, moving the border to the right leads to a branching out into many different logics. On the other hand, moving the border to the left reverses the effect of branching out, and leads to a single logic at the root, which we call Abstraction Logic (AL).

AL is therefore the simplest of all logics that can serve as a foundation of mathematics. At the same time, it is also the most universal one, as all other logics to the right of it can be represented in it as mathematical theories. This is why we proclaim AL to be the best foundational logic.

The rest of this article is a longer elaboration of this short answer. Instead of diving into the technical details of AL, which are explained elsewhere [7], we will focus on a conceptual understanding of AL.

We have speculated earlier that many people will associate mathematics foremost with calculation.

Although this is of course a narrow view of mathematics, it is at the same time surprisingly general: Each term in the language of an AL can be viewed as a calculation, if one is prepared to understand the notion of calculation in a broad enough sense.

Consider the following simple logic which we will call Calculator Logic (CL). A term in CL is either a decimal literal denoting natural numbers, like

42

or

0

, or an equality

a = b

, an addition

a + b

, a subtraction

a - b

, unary minus

-a

, a multiplication

a \times b

, or a division

a \div b

, where both

a

and

b

are terms. We use the usual rules of precedence, and brackets

(a)

to override them.

The language of CL then consists of all of these terms. The theorems of CL are those terms of CL that evaluate to true. A proof of a theorem

A

consists of evaluating

A

and certifying that the result is true.

Therefore

300 - (42 + 8) \times 2 = -(1000 \div (-5))

is certainly a theorem, and

0 = 1

is not. The term

6 \times 7

is not a theorem, because it evaluates to

42

instead of true.

We deﬁne

x \div y

to be the unique integer

z

such that

x = y \times z

. If no such integer exists, or is not unique, then the result of a division will fail. This is the case for the expression

1 \div 0

, which fails, and is deﬁnitely not true. Therefore

1 \div 0

is not a theorem. But what about

1 \div 0 = 0 \div 0

? Or

1 \div 0 = 7 \div 5

? And

8 \div 5 = 7 \div 5

? Are these theorems? Both the left-hand side and the right hand side of these equalities fail, and we decide that fail

=

fail actually evaluates to true. Thus all three equalities are theorems in CL.

What about the term

(5 = 2 + 2) = 4

? Is that a theorem? No, it is not, because the left-hand side

(5 = 2 + 2)

evaluates to false, and false is not equal to

4

.

We have been quite pragmatic about the results true, false and fail. We treated them as normal values that can occur during or as the result of a calculation. We could just as well expand the deﬁnition of CL terms by introducing dedicated terms for these values:

\top

for true,

\bot

for false, and

\Finv

for fail. This would yield new theorems like

1 \div 0 = \Finv

and

(5 = 2 + 2) = \bot

. It also allows terms like

2 + \top

. This is a term that does not seem to make much sense, so we decide that evaluating it fails and yields the theorem

2 + \top = \Finv

. Note that adding the terms

\top

,

\bot

and

\Finv

to CL does not create any issues that were not already inherent in our approach:

2 + (3 = 3)

also fails.

CL is not an abstraction logic, as it does not feature variables, for example. But as we will see later →, it can easily be embedded into an AL such that the terms of CL correspond to a subset of the terms of the AL, and such that a term in CL is a theorem if and only if the corresponding term in AL is a theorem.

Some will recoil with horror from CL as presented here. They will say, surely something like

2 + (3 = 3)

should not be allowed to count as a term in the ﬁrst place? The situation bears obvious similarity with the division of programming language camps into proponents of dynamically typed programming languages like Lisp vs. proponents of statically typed programming languages like ML.

But there is a fundamental difference. While in programming languages static types serve as important safe guards against programming errors, in a logic there already is an ultimate safe guard and arbiter of truth: The theorem.

This obvious fact has been twisted in a rather ingenious way by declaring types to be the same as theorems [8]. While this is certainly a viable way of approaching logic, it is by no means the only one, or even a very natural one, for that matter. Instead, using abstraction logic one can consider types as a useful mathematical theory [9], just like set theory, instead of hard-wiring types into the logic.

Abstraction logic presumes that all values we want to calculate with and reason about live in a single universe $\mathcal{U}$ . This is the same assumption that ﬁrst-order logic makes. Unlike FOL, abstraction logic makes the additional assumption that there also is a value true in

\mathcal{U}

. Apart from this, AL makes no a-priori demands on

\mathcal{U}

.

The assumption that there is a single universe is an important reason why AL is conceptually so simple. Any value can in principle be combined with any other value without any a-priori restrictions. This is also what makes set theory so attractive, but unlike set theory, abstraction logic has no ontological presuppositions other than that

\mathcal{U}

must contain true.

Values are combined via operations and operators.

An

n

-ary operation (of

\mathcal{U}

) is a function taking

n

arguments where each argument is a value in

\mathcal{U}

, and returning a result that again is a value in

\mathcal{U}

.

An

n

-ary operator (of

\mathcal{U}

) is a function taking

n

arguments where each argument is either a value or an

m

-ary operation, depending on the argument position;

m

is ﬁxed for a given position. The result of applying an operator is still always a value.

We consider any value to also be an

0

-ary operation, and any

n

-ary operation to also be an

n

-ary operator. If we want to emphasise that an operation is not a value we call that operation proper. Similarly, we call an operator proper if it is not an operation.

Note that proper operations and operators are not part of the universe. There are simply too many of them for this to be the case if

\mathcal{U}

contains more than one value, as we know due to Cantor’s theorem. Therefore all attempts to include all operations and/or all operators in the universe are doomed.

A most important difference between abstraction logic and ﬁrst-order logic is that while FOL only considers arbitrary operations (in the form of functions and relations), AL also allows the use of arbitrary operators. The only proper operators allowed in FOL are

\forall

and

\exists

.

An abstraction is a name, intended to denote an operator. A signature is a collection of abstractions, where each abstraction is associated with a shape. A shape encodes both what kind of operator an abstraction can possibly denote, and how to interpret a term formed by applying the abstraction.

In general, a shape has the form

(m; p_0, \ldots, p_{n-1})

where

m

is the valence and

n

the arity of the shape. The

p_i

are subsets of

\{0, \ldots, m-1\}

. To avoid degenerate cases, we demand that the union of all

p_i

must equal

\{0, \ldots, m-1\}

.

We say that the operator

o

is compatible with the shape

(m; p_0, \ldots, p_{n-1})

in all of these cases:

In all other cases

o

is not compatible with the shape.

Because degenerate shapes are excluded, a value is only compatible with shape

(0;)

, a unary operation must have shape

(0; \emptyset)

, a binary operation must have shape

(0; \emptyset, \emptyset)

, and a proper unary operator necessarily has shape

(1; \{0\})

.

An abstraction can only denote operators that are compatible with its shape.

We assume an inﬁnite pool

X

of variables.

A term, relative to a given signature, is either a variable application or an abstraction application.

A variable application has the form

x[t_0, \ldots, t_{n-1}]

where

x

is a variable, and the

t_i

are themselves terms. Instead of

x[]

we usually just write

x

. We say that

x

occurs with arity $n$ .

An abstraction application has the form

(a\, x_0 \ldots x_{m-1}.\, t_0 \ldots t_{n-1})

where

a

is an abstraction belonging to the signature,

m

is the valence of

a

, and

n

is the arity of

a

. The

t_i

are the terms to which the abstraction is being applied. The

x_j

are distinct variables.

Often, instead of

(a.)

we just write

a

, instead of

(a.\, s\, t)

we write

s\, a\, t

when appropriate, and instead of

(a.\, t)

we might write

a\, t

or

t\, a

, depending on whether we consider

a

to be a preﬁx or postﬁx unary operator. In these cases we might disambiguate via brackets

(\ldots)

and the usual assumptions about associativity and precedence.

We consider the arity with which a variable occurs to be an implicit part of the variable name. This means that the two occurrences of the variable

x

in

x[x]

actually refer to two different variables. We could emphasise this by writing

x^1[x^0]

instead, but we choose the simpler syntax, as no ambiguities can arise. Because of this convention, we do not need to distinguish between wellformed terms and non-wellformed terms. As long as abstraction applications use the correct arity and valence according to the signature, all terms are automatically wellformed.

A logic signature is a signature

\mathfrak{S}

that contains at least three abstractions:

An abstraction logic is a logic signature

\mathfrak{S}

together with a list

{\mathbb{A}}

of terms called axioms.

Obviously,

\top

represents the value true,

⇒

is implication, and

\forall

stands for universal quantiﬁcation.

A signature

\mathfrak{S}'

extends a signature

\mathfrak{S}

if every abstraction of

\mathfrak{S}

is also an abstraction of

\mathfrak{S}'

, and has the same shape in both

\mathfrak{S}

and

\mathfrak{S}'

.

An abstraction logic

{\mathcal{L}}'

extends an abstraction logic

{\mathcal{L}}

if the signature of

{\mathcal{L}}'

extends the signature of

{\mathcal{L}}

, and if every axiom of

{\mathcal{L}}

is also an axiom of

{\mathcal{L}}'

(modulo $\alpha$ -equivalence →).

Deduction Logic ${\mathcal{L}}_D$ has signature

\mathfrak{S}_D

consisting of the minimally required three abstractions, and the following axioms

{\mathbb{A}}_D

:

The relevance of

{\mathcal{L}}_D

is founded in the Deduction Theorem →.

Axioms 2 to 5 might surprise despite their familiar form, because one would expect them to apply only to truth values, not every value in the entire universe. But because abstraction logic has only a single universe

\mathcal{U}

, every value must also be considered to be a truth value, because for any term

t

one can ask the question: Is

t

true?

Deduction Logic with Equality ${\mathcal{L}}_E$ extends

{\mathcal{L}}_D

. It has signature

\mathfrak{S}_E

which extends

\mathfrak{S}_D

by adding the equality abstraction

=

. It has axioms

{\mathbb{A}}_E

, adding the following axioms to

{\mathbb{A}}_D

:

{\mathcal{L}}_E

is arguably the most important abstraction logic from a theoretical point of view. We will show that every abstraction logic extending

{\mathcal{L}}_E

is complete →.

Deduction Logic with Equality and Falsity

{\mathcal{L}}_F

extends

{\mathcal{L}}_E

. Its signature

\mathfrak{S}_F

extends

\mathfrak{S}_E

by adding

\bot

, denoting false, logical negation

¬

, and inequality

≠

. Its axioms

{\mathbb{A}}_F

add the following axioms to

{\mathbb{A}}_E

:

The possibility of deﬁning

\bot

like this has been noted as early as 1951 by Church [10], albeit Church let the quantiﬁer range over booleans only. Much later in 2013 Kripke found it remarkable enough to exhibit it in a short note about Fregean Quantiﬁcation Theory [11]. Unlike Church, and like us, Kripke lets the quantiﬁer range over the entire universe, which he calls a Fregean domain.

Intuitionistic Logic

{\mathcal{L}}_I

extends

{\mathcal{L}}_F

. Its signature

\mathfrak{S}_I

adds abstractions to

\mathfrak{S}_F

for conjunction

∧

, disjunction

∨

, equivalence

\Leftrightarrow

and existential quantiﬁcation

∃

. Its axioms

{\mathbb{A}}_I

are obtained by adding the following axioms to

{\mathbb{A}}_F

:

Classical Logic

{\mathcal{L}}_K

extends

{\mathcal{L}}_I

with the law of the excluded middle:

Let us explore how Calculator Logic can be represented as an AL. Our starting point is

{\mathcal{L}}_K

. We follow an axiomatisation of arithmetic [12] described by Wang.

Here is the list of primitive CL abstractions we add:

\begin{array}{c|c} {\includegraphics[height=0.5em]{6947DBE3-CD70-4EFF-BF53-C6C89DEEED94}} & {\includegraphics[height=0.5em]{62D10A52-6FE8-4A24-8915-CD1CF7EE78A6}} \\\hline 0 & {\includegraphics[height=0.5em]{9BEC655E-072B-4910-A134-EFF89F08B9ED}} \\ 1 & {\includegraphics[height=0.5em]{2476E8B4-20C5-4D48-A303-4CAD63F0D9B1}} \\ x + y & {\includegraphics[height=0.5em]{12C24EB7-BDD6-47A7-95C3-01A79BE7498A}} \\ x - y & {\includegraphics[height=0.5em]{C9CBD968-08E8-484F-8425-E305F5E32DD2}} \\ - x & {\includegraphics[height=0.5em]{E977CB53-6090-4826-8A0D-DEF5B102AB65}} \\ x \times y & {\includegraphics[height=0.5em]{4AFA6C80-C161-40C7-9413-F3EE14BC3C71}} \\ x \div y & {\includegraphics[height=0.5em]{5F6278CE-F1EB-44BF-A742-BD8770E1198F}} \\ (\text{Num}.\, x) & x\ {\includegraphics[height=0.5em]{EC23184C-48A0-4BEF-BE8E-929234F1EF03}} \\ (\text{Pos}.\, x) & x\ {\includegraphics[height=0.5em]{DA0DFF61-A987-4FDC-AC38-229B17F293DE}} \\ \end{array}

Note that as long as abstractions have different arities and/or valence, as subtraction and minus do, they can share the same symbol and still be told apart. Just as we do for variables, we treat valence and arity of an abstraction as part of its name.

We add derived CL abstractions via deﬁnitions → of the form

A ≝ B

, where

A

is an application of an abstraction

a

. This introduces

a

as an abstraction, and adds

A = B

to the list of axioms. The shape of

a

will be obvious from the structure of

A

.

\begin{array}{rcl} (\text{Num}.\, x\, y) & ≝ & (\text{Num}.\, {x}) ∧ (\text{Num}.\, {y}) \\ (\text{Num}.\, x\,y\, z) & ≝ & (\text{Num}.\, {x}) ∧ (\text{Num}.\, {y}) ∧ (\text{Num}.\, {z})\\ (\operatorname{closed} x.\, D[x]\, F[x]) & ≝ & (\forall x.\, D[x] ⇒ D[F[x]]) \\ (\operatorname{closed} x \,y.\, D[x]\, F[x, y]) & ≝ & (\forall x\, y.\, D[x] ∧ D[y] ⇒ D[F[x, y]]) \\ (\operatorname{comm} x \,y.\, F[x, y]) & ≝ & (\forall x\, y.\, (\text{Num}.\, x\, y) ⇒ F[x, y] = F[y, x]) \\ (\operatorname{assoc} x \,y.\, F[x, y]) & ≝ & (\forall x\, y\, z.\, (\text{Num}.\, x\,y\, z) ⇒ \\ & & \phantom{(\forall}F[x, F[y, z]] = F[F[x, y], z]) \\ (\operatorname{distr} x \,y.\, F[x, y]\, G[x, y]) & ≝ & (\forall x\, y\, z.\, (\text{Num}.\, x\,y\, z) ⇒ \\ & & \phantom{(\forall}F[G[x, y], G[x, z]] = G[x, F[y, z]]) \\ (\operatorname{right-neutral} x\,y.\, F[x,y]\, e) & ≝ & (\text{Num}.\, e)\ ∧ \\ & & (\forall x.\, (\text{Num}.\, x) ⇒ F[x, e] = x) \\ (\text{either-or}.\, A\, B\, C) & ≝ & (A ∨ B ∨ C)\, ∧ \\ & & ¬(A ∧ B) ∧ ¬(A ∧ C) ∧ ¬(B ∧ C) \end{array}

The purpose of the following deﬁnitions is to support decimal literals of integers:

\begin{array}{rcl} 2 & ≝ & 1 + 1 \\ 3 & ≝ & 2 + 1 \\ 4 & ≝ & 3 + 1 \\ 5 & ≝ & 4 + 1 \\ 6 & ≝ & 5 + 1 \\ 7 & ≝ & 6 + 1 \\ 8 & ≝ & 7 + 1 \\ 9 & ≝ & 8 + 1 \\ (\text{dec}.\, x\, y) & ≝ & (9 + 1) \times x + y \end{array}

E.g. the literal

57104

can then be understood as syntactic sugar for

(\text{dec}.\, {(\text{dec}.\, {(\text{dec}.\, {(\text{dec}.\, 5\, 7)}\, 1)}\, 0)}\, 4)

The remaining axioms of CL are:

The last axiom is the axiom of induction. It can be represented in AL effortlessly.

The axioms we have introduced so far cover the standard behaviour of arithmetic, but do not say anything about terms like

2 + \top

or

1 \div 0

. This can be an advantage, because it means the theorems proven for this abstraction logic are applicable in more contexts. For example, there might be a context in which we would want

1 \div 0 = \infty

to hold. Or maybe we would prefer to leave division by zero mostly unspeciﬁed except for requiring

(\text{Num}.\, x) ⇒ (\text{Num}.\, {x \div 0})

. Therefore, we might want to leave terms like this undeﬁned on purpose.

The disadvantage of undeﬁned terms is that we cannot use theorems to prove that they are undeﬁned; this is because they are undeﬁned, and we cannot say much about them. This is where type systems shine: many undeﬁned terms are automatically rejected because type checking has proven that the term is illtyped.

To emulate this feature of type systems, we would like to be able to prove a theorem of the form “

t

is undeﬁned”. We cannot really do that, as explained before. But we can arrange things so that we can prove a theorem of the form “

t

is an error”. Simplifying further, we decide that there is a single error fail, which we denote by

\Finv

. We can then show that

t

fails by proving the theorem

t = \Finv

.

We introduce the following tools to deal with errors:

With these, we make the following changes to CL:

This says that the only values in the universe are booleans, numbers, and fail. Furthermore, all arithmetical operations fail if not applied to numbers. Division fails if it is not uniquely determined.

Farmer proposes a traditional approach to undeﬁnedness [13] as a basis for formalising undeﬁnedness. He mentions that undeﬁned could be modelled as a third truth value besides

\top

and

\bot

, but warns that

Our approach to undeﬁnedness resolves these issues. There is a third value

\Finv

with

\Finv ≠ \top

and

\Finv ≠ \bot

, but logically, it is the same as

\bot

, because we can prove

\Finv \Leftrightarrow \bot

. Furthermore, there is no need to change the implementation of the logic at all, as all we needed were a few additional abstractions and axioms.

A valuation

\nu

in

\mathcal{U}

is a function that assigns to each variable

x

in

X

occurring with arity

n

an

n

-ary operation

\nu(x, n)

of

\mathcal{U}

. Given a valuation

\nu

in

\mathcal{U}

, distinct variables

x_0, \ldots, x_{k-1}

and values

u_0, \ldots, u_{k-1}

in

\mathcal{U}

, we deﬁne an updated valuation via

\nu[x_0 \coloneqq u_0, \ldots, x_{k-1} \coloneqq u_{k-1}](y, n) = \begin{cases}u_i & \text{for}\ y = x_i\ \text{and}\ n = 0 \\ \nu(y, n) & \text{otherwise}\end{cases}

An abstraction algebra is a universe

\mathcal{U}

together with a signature

\mathfrak{S}

and an interpretation

I(a)

of each abstraction

a

in

\mathfrak{S}

as a meta value of

\mathcal{U}

compatible with the shape of

a

. We say that

a

denotes

I(a)

(in the given abstraction algebra).

Abstraction algebras are a generalisation of what Rasiowa calls abstract algebras [14]. An abstract algebra is an abstraction algebra that only contains abstractions with zero valence. Rasiowa uses abstract algebras as models for propositional calculi, i.e. to provide an algebraic semantics for propositional logics. We take this a step further and use abstraction algebras to provide models for abstraction logics.

Given such an abstraction algebra, and a valuation

\nu

in

\mathcal{U}

, every term

t

denotes a value

\llbracket t \rrbracket

in the universe

\mathcal{U}

. The calculation of this value is straightforward, and recurses over the structure of

t

:

Two terms

s

and

t

are called $\alpha$ -equivalent with respect to a given signature

\mathfrak{S}

iff for all abstraction algebras

\mathfrak{A}

with this signature, and all valuations in (the universe of)

\mathfrak{A}

,

s

and

t

denote the same value. It can easily be checked whether two terms are

\alpha

-equivalent by ﬁrst computing their de Bruijn representations, and then checking if the two representations are identical. A proof of this can be found in the original AL report [7] under the moniker “equivalence of

\alpha

-equivalence and semantical equivalence”.

A variable

x

occurs free with arity

n

in a term

t

if there is an occurrence of

x

with arity

n

in

t

that is not bound by any surrounding abstraction. Because abstractions bind only variables of arity

0

, any occurrence of

x

in

t

with arity

n > 0

is free.

It is clear that the value of

t

depends only on the assignments in the valuation to those variables

x

which are free in

t

. Therefore the value of a closed term

t

, i.e. a term without any free variables, does not depend on the valuation at all, but only on the abstraction algebra in which the calculation takes place.

An $n$ -ary template has the form

[x_0 \ldots x_{n-1}.\, t]

, where the

x_i

are distinct variables and

t

is a term. The template binds all free occurrences of

x_i

in

t

of arity

0

. A

0

-ary template

[.\, t]

is usually just written

t

.

A substitution

\sigma

is a function deﬁned on a domain

D

that maps a variable

x

, given an arity

n

such that

(x, n)

belongs to

D

, to an

n

-ary template

\sigma(x, n)

.

The purpose of a substitution

\sigma

is to be applied to a term

t

, yielding another term

{t} / {\sigma}

as the result of the substitution. This is done by determining those free occurrences of

x

in

t

of arity

n

that are in

\sigma

’s domain, and replacing them with

\sigma(x, n)

. This means that expressions of the form

[x_0 \ldots x_{n-1}.\, t][s_0, \ldots, s_{n-1}]

must be resolved to actual terms when replacing variables occurring with non-zero arity. This can in turn be done by applying to

t

the substitution that maps

x_i

to

s_i

.

The usual caveats of substitution in the presence of bound variables apply. For example, assume

\sigma

maps

x

to

y

. Then applying

\sigma

to

(a\, y.\, (b.\, x\, y))

by performing a direct replacement of

x

with

y

would yield

(a\, y.\, (b.\, y\, y))

. This is not what we want, as can be seen as follows. Note ﬁrst that

(a\, y.\, (b.\, x\, y))

is

\alpha

-equivalent to

(a\, z.\, (b.\, x\, z))

, i.e. the meaning of those two terms is the same in all abstraction algebras with respect to all valuations. Therefore it should not matter to which of the two terms we apply the substitution to, the results should be the same, or at least

\alpha

-equivalent. But applying

\sigma

to the second term (again via direct replacement) yields

(a\, z.\, (b.\, y\, z))

, which is not

\alpha

-equivalent to

(a\, y.\, (b.\, y\, y))

.

These issues can be dealt with by deﬁning substitution via de Bruijn terms instead of terms. The details of this have been worked out in the original AL report [7]. For our purposes here we will just note that we can deal with these issues by renaming bound variables appropriately before performing replacement to avoid clashes between free and bound variables. Also note that there is no canonical result of applying a substitution, but that the result is determined only up to

\alpha

-equivalence.

The main property of substitution is that, given a background valuation

\nu

, we can turn any substitution

\sigma

into a valuation

\nu_\sigma

. The role of the background valuation is to provide meaning for free variables that are either not in the domain of

\sigma

, or free in some template of

\sigma

.

The valuation

\nu_\sigma

has the property that for any term

t

\llbracket {t} \rrbracket_\sigma = \llbracket {{t} / {\sigma}} \rrbracket

holds. Here

\llbracket {\cdot} \rrbracket

denotes evaluation with respect to

\nu

, and

\llbracket {\cdot} \rrbracket_\sigma

calculates the value with respect to

\nu_\sigma

.

We deﬁne

\nu_\sigma

as follows for a variable

x

occurring with arity

n

:

Further details and a proof of above property using de Bruijn terms can be found in the original AL report [7].

What is the meaning of an abstraction logic? Terms have been given meaning relative to an abstraction algebra and a valuation earlier →. Correspondingly, a model of an abstraction logic

{\mathcal{L}}

with signature

\mathfrak{S}

and axioms

{\mathbb{A}}

is almost an abstraction algebra

{\mathfrak{A}}

with the same logic signature

\mathfrak{S}

, a universe

\mathcal{U}

, and an interpretation

I

, such that we have for all valuations

\nu

in

\mathcal{U}

and all axioms

a

in

{\mathbb{A}}

\llbracket {a} \rrbracket = I(\top)

There are two important modiﬁcations to be made before arriving at the actual deﬁnition of a model:

To address the ﬁrst point, we deﬁne a logic algebra to be an abstraction algebra with logic signature

\mathfrak{S}

such that

⇒

and

\forall

have denotations satisfying the following minimum requirements:

To address the second point, we deﬁne a valuation space

\mathcal{V}

of an abstraction algebra

{\mathfrak{A}}

to be a collection of valuations in the universe of

{\mathfrak{A}}

such that:

These requirements for

\mathcal{V}

will turn out to be just strong enough to show soundness → of any abstraction logic, and weak enough to prove completeness → for any abstraction logic extending

{\mathcal{L}}_E

.

A model of an abstraction logic

{\mathcal{L}}

is then a pair

({\mathfrak{A}}, \mathcal{V})

such that

{\mathfrak{A}}

is a logic algebra with the same signature as

{\mathcal{L}}

, and

\mathcal{V}

is a valuation space of

{\mathfrak{A}}

, such that

\llbracket {a} \rrbracket = I(\top)

for all valuations

\nu

in

\mathcal{V}

and all axioms

a

in

{\mathbb{A}}

.

A model is called degenerate if the universe of

{\mathfrak{A}}

consists of a single element. Note that every abstraction logic has a degenerate model.

If every model of

{\mathcal{L}}

is also a model of the logic with the same signature as

{\mathcal{L}}

and the single axiom

t

, then we say that

t

is valid in

{\mathcal{L}}

and write

{\mathcal{L}} ⊨ t

Given a logic

{\mathcal{L}}

with signature

\mathfrak{S}

and axioms

{\mathbb{A}}

, a proof in

{\mathcal{L}}

of a term

t

is one of the following:

If there exists a proof of

t

in

{\mathcal{L}}

, we say that

t

is a theorem of

{\mathcal{L}}

and write

{\mathcal{L}} ⊢ t

The Deduction Theorem for abstraction logic is that any proof of

A

in a logic

{\mathcal{L}}^+

with axioms

{\mathbb{A}}

and the additional axiom

\phi

can be translated into a proof of

\phi' ⇒ A

in the logic

{\mathcal{L}}

with the same signature, but only axioms

{\mathbb{A}}

, as long as

{\mathcal{L}}

extends

{\mathcal{L}}_D

.

Here

\phi'

is the universal closure of

\phi

, i.e.

\phi' = ∀ x_0. ∀ x_1. \ldots ∀ x_{n-1}.\, \phi

where the

x_i

are the distinct free variables of

\phi

. Note that this implies that

\phi

cannot contain free variables of non-zero arity for the Deduction Theorem to apply without any further conditions.

The proof of the Deduction Theorem is simply by recursing over the structure of the original proof and translating it step by step.

A special case of the deduction theorem is

{\mathcal{L}}_D ⊢ A ⇒ A

We can prove this directly: Substituting

A

for

C

in Axiom 3 → and then applying modus ponens using Axiom 2 → yields the theorem

(A ⇒ B) ⇒ (A ⇒ A)

. Substituting

B ⇒ A

for

B

and applying modus ponens using Axiom 2 → again, we obtain the theorem

A ⇒ A

.

In a system for machine-assisted theorem proving like Practal [15] it makes sense to bake in native support for

{\mathcal{L}}_D

so that moving theorems from

{\mathcal{L}}^+

to

{\mathcal{L}}

doesn’t necessarily involve the actual translation of proofs.

Abstraction logic is sound, i.e. every theorem of

{\mathcal{L}}

is also valid in

{\mathcal{L}}

:

{\mathcal{L}} ⊢ t \quad\text{implies}\quad {\mathcal{L}} ⊨ t

We show by induction over

p

that if

p

is a proof of

t

in

{\mathcal{L}}

, then

t

is valid in

{\mathcal{L}}

:

An abstraction logic

{\mathcal{L}}

is called complete if every valid term is also a theorem:

{\mathcal{L}} ⊨ t \quad \text{implies}\quad {\mathcal{L}} ⊢ t

If

{\mathcal{L}}

extends

{\mathcal{L}}_E

, then

{\mathcal{L}}

is complete.

To prove this, we construct the Rasiowa Model $(\mathfrak{R}, \mathcal{V})$ of ${\mathcal{L}}$ , which is a model for

{\mathcal{L}}

. The construction relies heavily on the fact that

{\mathcal{L}}

extends

{\mathcal{L}}_E

. The universe of the Rasiowa Model consists of equivalence classes of terms, where two terms

s

and

t

are called equivalent if

s = t

is a theorem of

{\mathcal{L}}

. This is a variation of a standard approach to prove completeness, originally due to Lindenbaum and Tarski. We write

{[t]_{\mathfrak{R}}}

for the equivalence class to which a term

t

belongs. The details of the construction can be found in the original AL paper [7] (the Rasiowa Model is called Rasiowa Abstraction Algebra there, which is a misnomer). In the construction we make essential use of the fact that

\mathcal{V}

does not need to be the class of all valuations, but can be custom tailored as long as we can guarantee that

\mathcal{V}

has the properties required of a valuation space.

Most importantly, there exists a valuation in

\mathcal{V}

with

\llbracket {t} \rrbracket = {[{t}]_{\mathfrak{R}}}

for any term

t

, where

\llbracket {\cdot} \rrbracket

evaluates with respect to that valuation. Furthermore, for the interpretation

I

of

\mathfrak{R}

we have

I(\top) = {[{\top}]_{\mathfrak{R}}}

From this we can see immediately that

{\mathcal{L}}

is complete. Because assume that

t

is valid. Then

{[t]_{\mathfrak{R}}} = \llbracket t \rrbracket = I(\top) = {[{\top}]_{\mathfrak{R}}}

which shows that

t = \top

is a theorem of

{\mathcal{L}}

, and thus

t

is a theorem as well.

An abstraction logic

{\mathcal{L}}

is called inconsistent if

{\mathcal{L}} ⊢ ∀ x.\, x

and consistent if it is not inconsistent.

If an abstraction logic

{\mathcal{L}}

is inconsistent and an extension of a deduction logic

{\mathcal{L}}_D

, then all models of

{\mathcal{L}}

are degenerate. This is easy to see. Assume

{\mathcal{L}}

is inconsistent. Then

∀ x.\,x

is a theorem in

{\mathcal{L}}

. Substituting

[x.\, x]

for

A

in Axiom 4 → of

{\mathcal{L}}_D

and applying modus ponens, it follows that

x

is a theorem. That means that

x

is valid in any model

({\mathfrak{A}}, \mathcal{V})

of

{\mathcal{L}}

. In particular, for any valuation

\nu

in

\mathcal{V}

we have

\llbracket x \rrbracket = I(⊤)

. There is such

\nu

because

\mathcal{V}

is non-empty. For any value

u

in the universe

\mathcal{U}

of

{\mathfrak{A}}

the valuation

\nu[x \coloneqq u]

is in

\mathcal{V}

as well, and therefore

u = \llbracket {x} \rrbracket_{x \coloneqq u} = I(⊤)

. That means

\mathcal{U}

consists only of

I(⊤)

, thus any model of

{\mathcal{L}}

is degenerate.

What about the other direction? If every model of an abstraction logic

{\mathcal{L}}

is degenerate, then

\llbracket {∀ x.\, x} \rrbracket = I(⊤)

holds. Therefore

{\mathcal{L}} ⊨ ∀ x.\, x

But is

∀ x.\, x

then a theorem of

{\mathcal{L}}

? We have seen that this is indeed the case if

{\mathcal{L}}

extends

{\mathcal{L}}_E

, because then

{\mathcal{L}}

is complete →.

Therefore, if

{\mathcal{L}}

extends

{\mathcal{L}}_E

, then inconsistency of an abstraction logic

{\mathcal{L}}

is equivalent to all of its models being degenerate.

It is thus straightforward to see that

{\mathcal{L}}_K

is consistent. As a model, use a two-element Boolean Algebra [16] extended in the obvious way with operators denoting

\forall

and

\exists

. The axioms of

{\mathcal{L}}_K

can then be checked by simply exhausting all possibilities.

A head

h

for an abstraction

a

is an abstraction application of the form

(a\, x_0\ldots x_{m-1}.\, t_0 \ldots t_{n-1})

such that:

The shape of

a

can easily be determined from a head for

a

. It must be

(m; p_0, \ldots, p_{n-1})

, where

p_i = \{{j_0}, \ldots, {j_{k-1}}\}

given the above form of

t_i

.

Given a logic

{\mathcal{L}}

extending

{\mathcal{L}}_E

, a deﬁnition has the form

h ≝ t

where

h

is a head for an abstraction

a

not contained in

{\mathcal{L}}

, and

t

is a term of

{\mathcal{L}}

such that all of

t

’s free variables are also free variables of

h

.

A deﬁnition results in a logic

{\mathcal{L}}'

which is an extension of

{\mathcal{L}}

. The signature of

{\mathcal{L}}'

is that of

{\mathcal{L}}

extended by the abstraction

a

. The axioms of

{\mathcal{L}}'

are the axioms of

{\mathcal{L}}

plus the axiom

h = t

.

Furthermore,

{\mathcal{L}}'

is a conservative extension of

{\mathcal{L}}

, i.e.

{\mathcal{L}}'

is not only an extension of

{\mathcal{L}}

, but if

t

is a term of

{\mathcal{L}}

, then

t

is a theorem of

{\mathcal{L}}

iff

t

is a theorem of

{\mathcal{L}}'

. This is straightforward to prove by induction over the structure of proofs, and considering the more general statement that

t'

is a theorem of

{\mathcal{L}}

iff

t

is a theorem of

{\mathcal{L}}'

. Here

t'

results from

t

by rewriting away all occurrences of

a

using the equation

h = t

.

Therefore if

{\mathcal{L}}'

is inconsistent, then

{\mathcal{L}}

is inconsistent as well, because

\forall x.\, x

is a term of

{\mathcal{L}}

. Consequently,

{\mathcal{L}}

is consistent iff

{\mathcal{L}}'

is consistent.

We can also allow conditional deﬁnitions of the form

c ⇒ h ≝ t

by considering this as an abbreviation for

h ≝ (\text{if}.\, c\, t)

where the abstraction “

\text{if}

” is governed by the single axiom

(\text{if}.\, \top\, x) = x

, which is assumed to be already contained in

{\mathcal{L}}

. This allows us to derive

c ⇒ h = t

in

{\mathcal{L}}'

.

Abstraction logic was born out of the desire to ﬁnd solid logical foundations for Practical Types [9]. Discussions on the Lean Zulip chat with Mario Carneiro and others left me convinced that while Practical Types were basically the right approach, in order to make a convincing case I needed to be able to explain the logical foundations with a level of detail that I was not able to at that moment.

During my search for logical foundations for Practical Types, I came across the work of

. The algebraic approach to logic [14] as Rasiowa applied it to propositional logic made immediate sense to me. Rasiowa chose to generalise this approach to predicate logic by following Mostowski and interpreting quantiﬁers as least upper bounds and greatest lower bounds. To me, that does not seem to be the proper way to generalise Rasiowa’s approach. Instead, I believe abstraction logic is. Just like Rasiowa restricts her models by restricting abstract algebras to implicative algebras to ensure that implication satisﬁes modus ponens, AL meets minimal requirements of implication and universal quantiﬁcation by restricting abstraction algebras to logic algebras.

Why did Rasiowa miss this generalisation, which now seems obvious with hindsight? The reason may simply be that operators were not part of her normal toolbox, while working with higher-order functions is standard in both programming and mechanised theorem proving today. Another possible reason is that working with a Fregean domain [11] might have felt alien to her. But on the other hand, her models for propositional logic are topological spaces, and it must have felt somewhat of a waste of space to ﬁll them just with mere truth values. As we can see now, these spaces can represent whole universes.

Our approach seems to be conceptually simpler and less involved than Rasiowa’s approach to predicate logics. An important indicator for this is how simple, elegant and without any need to exclude special cases our completeness proof for abstraction logic is (once the Rasiowa Model is constructed). Completeness holds for all abstraction logics from

{\mathcal{L}}_E

on, including intuitionistic abstraction logic, classical abstraction logic, and all logics in between and beyond. On the other hand, completeness for predicate logics based on Rasiowa’s approach is in general complicated territory [17]. Ono concludes his paper [18] from 1995 on the completeness of predicate logics with respect to algebraic semantics with the words

It seems that abstraction logic just might be this extension of algebraic semantics that Ono was looking for.

Given that AL is a generalisation of algebraic semantics for propositional logics, we can formulate all propositional logics from Rasiowa’s book [14] as ALs. As an example, we do this for positive implicative logic.

There is a caveat though:

\top

and

\forall

are part of any AL by deﬁnition, but these abstractions are not present in positive implicative logic. We will not be bothered by this, and just assume the following two axioms:

The ﬁrst axiom is obvious. The second axiom effectively renders

\forall

a complicated way of writing

\top

.

The remaining axioms of positive implicative logic are:

The semantics of this AL is identical to the semantics given by Rasiowa. Note that in our sense, this AL is of course inconsistent, but this doesn’t mean anything here, because

\forall

is essentially meaningless. Rasiowa shows that this logic is both complete and consistent, in the sense that there are terms that are not theorems.

Another approach is to deﬁne propositional logic as an abstraction logic such that the universe only consists of propositions. Starting from

{\mathcal{L}}_I

or

{\mathcal{L}}_K

, we add the axiom

(A \Leftrightarrow B) ⇒ (A = B)

and obtain intuitionistic propositional AL

{\mathcal{L}}_{IP}

or classical propositional AL

{\mathcal{L}}_{KP}

, respectively. Completeness follows from both

{\mathcal{L}}_{IP}

and

{\mathcal{L}}_{KP}

extending

{\mathcal{L}}_E

. Consistency can be conﬁrmed as before for

{\mathcal{L}}_K

→.

AL will be hard-pressed to implement any paraconsistent logic, as AL assumes there being only a single value representing true. This is at odds with Priest’s 3-valued Logic of Paradox [19], for example, which has an additional value paradoxical which is also “true”. In particular, any AL extending

{\mathcal{L}}_F

has the principle of explosion as a theorem:

A ⇒ ((¬ A) ⇒ B)

It should be possible to adapt abstraction logic to allow multiple different values all representing “true”. But as no paraconsistent logic provides a widely accepted foundation of mathematics, there seems no need to accommodate paraconsistency at this point.

To represent ﬁrst-order logic (FOL) as an abstraction logic, we should be able to use the same method as for Calculator Logic →. But instead of just numbers “

\text{Num}

”, we embed an entire FOL-universe “

{\includegraphics[height=0.5em]{28213F44-E527-4295-B82C-BE6B32A651A9}}

” in our AL-universe

\mathcal{U}

. As for Calculator Logic, we build FOL on top of

{\mathcal{L}}_K

, and we model undeﬁnedness → via

\Finv

. FOL operators would be deﬁned in such a way that illformed FOL formulas evaluate to

\Finv

. We will not ﬂesh out this construction in detail at this point, but leave this for another time.

I hope a picture is forming by now why I believe Abstraction Logic to be the best logic. It is doubtful that this claim can ever be proven. But it doesn’t need to be. I mean best logic in a very practical sense, and therefore if the claim is indeed true, Abstraction Logic should be able to form the basis of a theorem proving system which surpasses other systems based on just FOL, set theory, HOL, or dependent types. In fact, it should be able to accommodate all of these foundations, as they all should be axiomatisable in Abstraction Logic.

So, let’s build Practal and see!

Philosophy of Abstraction Logic

Introduction

What is Logic?

Calculator Logic

Values, Operations, Operators

Abstraction and Shape

Syntax

Logic

Several Abstraction Logics

Calculator Logic as an AL

Primitive Abstractions

Deﬁnitions

Literals

Axioms

Undeﬁnedness

Semantics

Substitution

Substitution as Valuation

Models

Proofs

The Deduction Theorem

Soundness

Completeness

Consistency

Deﬁnitions

Rasiowa’s Algebraic Approach

Propositional Logic

Paraconsistent Logic

First-Order Logic

The Best Logic?

References