Abstraction Logic has been introduced in a previous, rather technical article. In this article we take a step back and look at Abstraction Logic from a conceptual point of view. This will make it easier to appreciate the simplicity, elegance, and pragmatism of Abstraction Logic. We will argue that Abstraction Logic is the best logic for serving as a foundation of mathematics.
This question is bound to evoke a wide variety of responses. Many will find the question somewhat silly and point out that a computer easily handles mathematics, for example via a calculator app. Others will question whether mathematics can be done on a computer at all, because mathematics is a social process happening within the mathematical community.
As has become abundantly clear in the last two decades, computers can support and facilitate social processes, for good and bad. This aspect of mathematics does not seem to be an unsurmountable obstacle to its computerisation. Attempts like the ProofPeer [1] project have been made to overcome this obstacle, but so far none of these attempts have been truly successful.
A central and maybe the most important aspect of mathematical activity is the formulation and proving of theorems. How can we do this on a computer, beyond just writing it up in TeX [2]?
In principle, this question has been answered. We can do it by using logic as a foundation of mathematics. Logic can be implemented via software (and/or hardware), and we can then use this logic to formulate and prove theorems. A large-scale example of this is the formal proof of the Kepler conjecture[3].
What is Logic?
To allow the formulation of theorems, logic must provide a language in which to express these theorems. To make proofs of theorems possible, logic must clearly explain what a proof is, and how to certify that something is a proof of a given theorem.
There is no consensus on whether there is a single best logic to use as a foundation of mathematics. There are many different logics, and they differ in how (much) mathematics can be expressed in them. For example, the formal proof of the Kepler conjecture uses simply-typed higher-order logic (HOL)[4].
On one end of the spectrum one can find the opinion that there cannot be a single best logic, and that the best we can do is to develop methods [5] which exploit commonalities among the plethora of logics.
On the other end of the spectrum there is the opinion, championed by Quine [6], that first-order logic (FOL) is the proper foundation of mathematics. In Quine’s opinion, there is a more or less clear boundary between logic and mathematics, and other logics than FOL are not really logic, but simply mathematics dressed up as logic. He considers second-order logic (SOL) just as another way of formulating set theory, which in his opinion does not qualify as logic, but as mathematics.
Indeed, following Quine one can ask, what is the difference between logic and mathematics?
This article argues that the answer to this question is the key to understanding that there indeed is a best logic for serving as a foundation of mathematics.
The short answer is: There is a border between logic and mathematics, but it is not fixed in one place, and it can be moved. Moving the border to the right incorporates more and more specific mathematics into the logic. Because there are many different mathematical theories that can potentially be turned into logics, moving the border to the right leads to a branching out into many different logics. On the other hand, moving the border to the left reverses the effect of branching out, and leads to a single logic at the root, which we call Abstraction Logic (AL).
AL is therefore the simplest of all logics that can serve as a foundation of mathematics. At the same time, it is also the most universal one, as all other logics to the right of it can be represented in it as mathematical theories. This is why we proclaim AL to be the best foundational logic.
The rest of this article is a longer elaboration of this short answer. Instead of diving into the technical details of AL, which are explained elsewhere [7], we will focus on a conceptual understanding of AL.
Calculator Logic
We have speculated earlier that many people will associate mathematics foremost with calculation.
Although this is of course a narrow view of mathematics, it is at the same time surprisingly general: Each term in the language of an AL can be viewed as a calculation, if one is prepared to understand the notion of calculation in a broad enough sense.
Consider the following simple logic which we will call Calculator Logic (CL). A term in CL is either a decimal literal denoting natural numbers, like 42 or 0, or an equality a=b, an addition a+b, a subtraction a−b, unary minus −a, a multiplication a×b, or a division a÷b, where both a and b are terms. We use the usual rules of precedence, and brackets (a) to override them.
The language of CL then consists of all of these terms. The theorems of CL are those terms of CL that evaluate to true. A proof of a theorem A consists of evaluating A and certifying that the result is true.
Therefore 300−(42+8)×2=−(1000÷(−5)) is certainly a theorem, and 0=1 is not. The term 6×7 is not a theorem, because it evaluates to 42 instead of true.
We define x÷y to be the unique integer z such that x=y×z. If no such integer exists, or is not unique, then the result of a division will fail. This is the case for the expression 1÷0, which fails, and is definitely not true. Therefore 1÷0 is not a theorem. But what about 1÷0=0÷0? Or 1÷0=7÷5? And 8÷5=7÷5? Are these theorems? Both the left-hand side and the right hand side of these equalities fail, and we decide that fail=fail actually evaluates to true. Thus all three equalities are theorems in CL.
What about the term (5=2+2)=4? Is that a theorem? No, it is not, because the left-hand side (5=2+2) evaluates to false, and false is not equal to 4.
We have been quite pragmatic about the results true, false and fail. We treated them as normal values that can occur during or as the result of a calculation. We could just as well expand the definition of CL terms by introducing dedicated terms for these values: ⊤ for true, ⊥ for false, and Ⅎ for fail. This would yield new theorems like 1÷0=Ⅎ and (5=2+2)=⊥. It also allows terms like 2+⊤. This is a term that does not seem to make much sense, so we decide that evaluating it fails and yields the theorem 2+⊤=Ⅎ. Note that adding the terms ⊤, ⊥ and Ⅎ to CL does not create any issues that were not already inherent in our approach: 2+(3=3) also fails.
CL is not an abstraction logic, as it does not feature variables, for example. But as we will see later →, it can easily be embedded into an AL such that the terms of CL correspond to a subset of the terms of the AL, and such that a term in CL is a theorem if and only if the corresponding term in AL is a theorem.
Some will recoil with horror from CL as presented here. They will say, surely something like 2+(3=3) should not be allowed to count as a term in the first place? The situation bears obvious similarity with the division of programming language camps into proponents of dynamically typed programming languages like Lisp vs. proponents of statically typed programming languages like ML.
But there is a fundamental difference. While in programming languages static types serve as important safe guards against programming errors, in a logic there already is an ultimate safe guard and arbiter of truth: The theorem.
This obvious fact has been twisted in a rather ingenious way by declaring types to be the same as theorems [8]. While this is certainly a viable way of approaching logic, it is by no means the only one, or even a very natural one, for that matter. Instead, using abstraction logic one can consider types as a useful mathematical theory[9], just like set theory, instead of hard-wiring types into the logic.
Values, Operations, Operators
Abstraction logic presumes that all values we want to calculate with and reason about live in a singleuniverse U. This is the same assumption that first-order logic makes. Unlike FOL, abstraction logic makes the additional assumption that there also is a value true in U. Apart from this, AL makes no a-priori demands on U.
The assumption that there is a single universe is an important reason why AL is conceptually so simple. Any value can in principle be combined with any other value without any a-priori restrictions. This is also what makes set theory so attractive, but unlike set theory, abstraction logic has no ontological presuppositions other than that U must contain true.
Values are combined via operations and operators.
An n-ary operation (of U) is a function taking n arguments where each argument is a value in U, and returning a result that again is a value in U.
An n-ary operator (of U) is a function taking n arguments where each argument is either a value or an m-ary operation, depending on the argument position; m is fixed for a given position. The result of applying an operator is still always a value.
We consider any value to also be an 0-ary operation, and any n-ary operation to also be an n-ary operator. If we want to emphasise that an operation is not a value we call that operation proper. Similarly, we call an operator proper if it is not an operation.
Note that proper operations and operators are not part of the universe. There are simply too many of them for this to be the case if U contains more than one value, as we know due to Cantor’s theorem. Therefore all attempts to include all operations and/or all operators in the universe are doomed.
A most important difference between abstraction logic and first-order logic is that while FOL only considers arbitrary operations (in the form of functions and relations), AL also allows the use of arbitrary operators. The only proper operators allowed in FOL are ∀ and ∃.
Abstraction and Shape
An abstraction is a name, intended to denote an operator. A signature is a collection of abstractions, where each abstraction is associated with a shape. A shape encodes both what kind of operator an abstraction can possibly denote, and how to interpret a term formed by applying the abstraction.
In general, a shape has the form (m;p0,…,pn−1) where m is the valence and n the arity of the shape. The pi are subsets of {0,…,m−1}. To avoid degenerate cases, we demand that the union of all pi must equal {0,…,m−1}.
We say that the operator o is compatible with the shape (m;p0,…,pn−1) in all of these cases:
The arity n is zero (which implies m=0, because we excluded degenerate shapes) and o is a value.
The valence m is zero, the arity n is non-zero, and o is an n-ary operation. Note that o is then necessarily a proper operation.
The valence m is non-zero (which implies n>0) and o is an n-ary operator, such that o accepts in argument position i a ∣pi∣-ary operation. Here ∣pi∣ is the size of pi. Note that o is then necessarily a proper operator.
In all other cases o is not compatible with the shape.
Because degenerate shapes are excluded, a value is only compatible with shape (0;), a unary operation must have shape (0;∅), a binary operation must have shape (0;∅,∅), and a proper unary operator necessarily has shape (1;{0}).
An abstraction can only denote operators that are compatible with its shape.
Syntax
We assume an infinite pool X of variables.
A term, relative to a given signature, is either a variable application or an abstraction application.
A variable application has the form x[t0,…,tn−1] where x is a variable, and the ti are themselves terms. Instead of x[] we usually just write x. We say that xoccurs with arityn.
An abstraction application has the form (ax0…xm−1.t0…tn−1) where a is an abstraction belonging to the signature, m is the valence of a, and n is the arity of a. The ti are the terms to which the abstraction is being applied. The xj are distinct variables.
Often, instead of (a.) we just write a, instead of (a.st) we write sat when appropriate, and instead of (a.t) we might write at or ta, depending on whether we consider a to be a prefix or postfix unary operator. In these cases we might disambiguate via brackets (…) and the usual assumptions about associativity and precedence.
We consider the arity with which a variable occurs to be an implicit part of the variable name. This means that the two occurrences of the variable x in x[x] actually refer to two different variables. We could emphasise this by writing x1[x0] instead, but we choose the simpler syntax, as no ambiguities can arise. Because of this convention, we do not need to distinguish between wellformed terms and non-wellformed terms. As long as abstraction applications use the correct arity and valence according to the signature, all terms are automatically wellformed.
Logic
A logic signature is a signature S that contains at least three abstractions:
The abstraction ⊤, intended to denote a value.
The abstraction ⇒, intended to denote a binary operation.
The abstraction ∀, intended to denote a proper unary operator.
An abstraction logic is a logic signature S together with a list A of terms called axioms.
Obviously, ⊤ represents the value true, ⇒ is implication, and ∀ stands for universal quantification.
A signature S′extends a signature S if every abstraction of S is also an abstraction of S′, and has the same shape in both S and S′.
An abstraction logic L′extends an abstraction logic L if the signature of L′ extends the signature of L, and if every axiom of L is also an axiom of L′ (modulo α-equivalence→).
Several Abstraction Logics
Deduction Logic LD has signature SD consisting of the minimally required three abstractions, and the following axioms AD:
Axioms 2 to 5 might surprise despite their familiar form, because one would expect them to apply only to truth values, not every value in the entire universe. But because abstraction logic has only a single universe U, every value must also be considered to be a truth value, because for any term t one can ask the question: Is t true?
Deduction Logic with Equality LE extends LD. It has signature SE which extends SD by adding the equality abstraction =. It has axioms AE, adding the following axioms to AD:
x=x
x=y⇒(A[x]⇒A[y])
A⇒(A=⊤)
LE is arguably the most important abstraction logic from a theoretical point of view. We will show that every abstraction logic extending LE is complete→.
Deduction Logic with Equality and FalsityLF extends LE. Its signature SF extends SE by adding ⊥, denoting false, logical negation ¬, and inequality =. Its axioms AF add the following axioms to AE:
⊥=(∀x.x)
¬A=(A⇒⊥)
(x=y)=¬(x=y)
The possibility of defining ⊥ like this has been noted as early as 1951 by Church [10], albeit Church let the quantifier range over booleans only. Much later in 2013 Kripke found it remarkable enough to exhibit it in a short note about Fregean Quantification Theory[11]. Unlike Church, and like us, Kripke lets the quantifier range over the entire universe, which he calls a Fregean domain.
Intuitionistic LogicLI extends LF. Its signature SI adds abstractions to SF for conjunction ∧, disjunction ∨, equivalence ⇔ and existential quantification ∃. Its axioms AI are obtained by adding the following axioms to AF:
(A∧B)⇒A
(A∧B)⇒B
A⇒(B⇒(A∧B))
A⇒(A∨B)
B⇒(A∨B)
(A∨B)⇒((A⇒C)⇒((B⇒C)⇒C))
(A⇔B)=((A⇒B)∧(B⇒A))
A[x]⇒(∃x.A[x])
(∃x.A[x])⇒((∀x.A[x]⇒B)⇒B)
Classical LogicLK extends LI with the law of the excluded middle:
A∨¬A
Calculator Logic as an AL
Let us explore how Calculator Logic can be represented as an AL. Our starting point is LK. We follow an axiomatisation of arithmetic [12] described by Wang.
Primitive Abstractions
Here is the list of primitive CL abstractions we add:
abstraction01x+yx−y−xx×yx÷y(Num.x)(Pos.x)meaningzero
one
addition
subtraction
minus
multiplication
division
xis an integer
xis a positive integer
Note that as long as abstractions have different arities and/or valence, as subtraction and minus do, they can share the same symbol and still be told apart. Just as we do for variables, we treat valence and arity of an abstraction as part of its name.
Definitions
We add derived CL abstractions via definitions→ of the form A=defB, where A is an application of an abstraction a. This introduces a as an abstraction, and adds A=B to the list of axioms. The shape of a will be obvious from the structure of A.
The purpose of the following definitions is to support decimal literals of integers: 23456789(dec.xy)=def=def=def=def=def=def=def=def=def1+12+13+14+15+16+17+18+1(9+1)×x+y E.g. the literal 57104 can then be understood as syntactic sugar for (dec.(dec.(dec.(dec.57)1)0)4)
Axioms
The remaining axioms of CL are:
(right-neutralxy.(x+y)0)
(right-neutralxy.(x×y)1)
(closedxy.(Num.x)(x+y))
(closedxy.(Num.x)(x×y))
(closedx.(Num.x)(−x))
(commxy.x+y)
(assocxy.x+y)
(commxy.x×y)
(assocxy.x×y)
(distrxy.(x+y)(x×y))
(Num.xy)⇒x−y=x+(−y)
(Num.xyz)⇒y=0⇒(x÷y=z)=(x=z×y)
(Num.x)⇒x−x=0
(Num.xyz)⇒z=0⇒z×x=z×y⇒x=y
(closedxy.(Pos.x)(x+y))
(closedxy.(Pos.x)(x×y))
(Num.x)⇒(either-or.(Pos.x)(x=0)(Pos.−x))
A[1]⇒(∀x.A[x]⇒A[x+1])⇒(Pos.x)⇒A[x]
The last axiom is the axiom of induction. It can be represented in AL effortlessly.
Undefinedness
The axioms we have introduced so far cover the standard behaviour of arithmetic, but do not say anything about terms like 2+⊤ or 1÷0. This can be an advantage, because it means the theorems proven for this abstraction logic are applicable in more contexts. For example, there might be a context in which we would want 1÷0=∞ to hold. Or maybe we would prefer to leave division by zero mostly unspecified except for requiring (Num.x)⇒(Num.x÷0). Therefore, we might want to leave terms like this undefined on purpose.
The disadvantage of undefined terms is that we cannot use theorems to prove that they are undefined; this is because they are undefined, and we cannot say much about them. This is where type systems shine: many undefined terms are automatically rejected because type checking has proven that the term is illtyped.
To emulate this feature of type systems, we would like to be able to prove a theorem of the form “t is undefined”. We cannot really do that, as explained before. But we can arrange things so that we can prove a theorem of the form “t is an error”. Simplifying further, we decide that there is a single error fail, which we denote by Ⅎ. We can then show that tfails by proving the theorem t=Ⅎ.
We introduce the following tools to deal with errors:
Abstraction “the” is added, denoting a proper unary operator.
(∃1x.A[x])⇒A[x]⇒(thex.A[x])=x
¬(∃1x.A[x])⇒(thex.A[x])=Ⅎ
With these, we make the following changes to CL:
(either-or.((x=⊤)∨(x=⊥))(Num.x)(x=Ⅎ))
(domain-boundxy.(Num.x)(x+y))
(domain-boundx.(Num.x)(−x))
(domain-boundxy.(Num.x)(x×y))
We drop the previous axiom governing division.
x÷y=(thez.(x=z×y))
This says that the only values in the universe are booleans, numbers, and fail. Furthermore, all arithmetical operations fail if not applied to numbers. Division fails if it is not uniquely determined.
Farmer proposes a traditional approach to undefinedness[13] as a basis for formalising undefinedness. He mentions that undefined could be modelled as a third truth value besides ⊤ and ⊥, but warns that
[...] it is very unusual in mathematical practice to use truth values beyond true and false. Moreover, the presence of a third truth value makes the logic more complicated to use and implement.
Our approach to undefinedness resolves these issues. There is a third value Ⅎ with Ⅎ=⊤ and Ⅎ=⊥, but logically, it is the same as ⊥, because we can prove Ⅎ⇔⊥. Furthermore, there is no need to change the implementation of the logic at all, as all we needed were a few additional abstractions and axioms.
Semantics
A valuationν in U is a function that assigns to each variable x in X occurring with arity n an n-ary operation ν(x,n) of U. Given a valuation ν in U, distinct variables x0,…,xk−1 and values u0,…,uk−1 in U, we define an updated valuation via ν[x0:=u0,…,xk−1:=uk−1](y,n)={uiν(y,n)fory=xiandn=0otherwise
An abstraction algebra is a universe U together with a signature S and an interpretation I(a) of each abstraction a in S as a meta value of U compatible with the shape of a. We say that adenotesI(a) (in the given abstraction algebra).
Abstraction algebras are a generalisation of what Rasiowa calls abstract algebras[14]. An abstract algebra is an abstraction algebra that only contains abstractions with zero valence. Rasiowa uses abstract algebras as models for propositional calculi, i.e. to provide an algebraic semantics for propositional logics. We take this a step further and use abstraction algebras to provide models for abstraction logics.
Given such an abstraction algebra, and a valuation ν in U, every term t denotes a value [[t]] in the universe U. The calculation of this value is straightforward, and recurses over the structure of t:
If t=x where x is a variable, then [[t]]=ν(x,0).
If t=x[t0,…,tn−1] is a variable application with n>0, then [[t]]=f(u0,…,un−1) where f=ν(x,n) is the n-ary operation assigned to x via the valuation ν, and ui is the value of ti with respect to ν.
If t=(a.) then [[t]]=u, where a denotes the value u.
If t=(a.t0…tn−1) for n>0, then [[t]]=f(u0,…,un−1) where a denotes the n-ary operation f, and ui is the value of ti with respect to ν.
Assume t=(ax0…xm−1.t0…tn−1) for m>0, and let the shape of a be (m;p0,…,pn−1). Let F be the n-ary operator denoted by a. We define the arguments ri to which we will apply F as follows. If pi=∅, then ri is the value denoted by ti with respect to ν. If pi={j0,…,jk−1} for ∣pi∣=k>0, then the abstraction abinds the variables xj0,…,xjk−1 occurring with arity 0 in ti. Thus ri is the k-ary operation which maps k values u0,…,uk−1 to the value denoted by ti with respect to the valuation ν[xj0:=u0,…,xjk−1:=uk−1]. Then [[t]]=F(r0,…,rn−1)
Two terms s and t are called α-equivalent with respect to a given signature S iff for all abstraction algebras A with this signature, and all valuations in (the universe of) A, s and t denote the same value. It can easily be checked whether two terms are α-equivalent by first computing their de Bruijn representations, and then checking if the two representations are identical. A proof of this can be found in the original AL report [7] under the moniker “equivalence of α-equivalence and semantical equivalence”.
A variable x occurs free with arity n in a term t if there is an occurrence of x with arity n in t that is not bound by any surrounding abstraction. Because abstractions bind only variables of arity 0, any occurrence of x in t with arity n>0 is free.
It is clear that the value of t depends only on the assignments in the valuation to those variables x which are free in t. Therefore the value of a closed term t, i.e. a term without any free variables, does not depend on the valuation at all, but only on the abstraction algebra in which the calculation takes place.
Substitution
An n-ary template has the form [x0…xn−1.t], where the xi are distinct variables and t is a term. The template binds all free occurrences of xi in t of arity 0. A 0-ary template [.t] is usually just written t.
A substitutionσ is a function defined on a domainD that maps a variable x, given an arity n such that (x,n) belongs to D, to an n-ary template σ(x,n).
The purpose of a substitution σ is to be applied to a term t, yielding another term t/σ as the result of the substitution. This is done by determining those free occurrences of x in t of arity n that are in σ’s domain, and replacing them with σ(x,n). This means that expressions of the form [x0…xn−1.t][s0,…,sn−1] must be resolved to actual terms when replacing variables occurring with non-zero arity. This can in turn be done by applying to t the substitution that maps xi to si.
The usual caveats of substitution in the presence of bound variables apply. For example, assume σ maps x to y. Then applying σ to (ay.(b.xy)) by performing a direct replacement of x with y would yield (ay.(b.yy)). This is not what we want, as can be seen as follows. Note first that (ay.(b.xy)) is α-equivalent to (az.(b.xz)), i.e. the meaning of those two terms is the same in all abstraction algebras with respect to all valuations. Therefore it should not matter to which of the two terms we apply the substitution to, the results should be the same, or at least α-equivalent. But applying σ to the second term (again via direct replacement) yields (az.(b.yz)), which is not α-equivalent to (ay.(b.yy)).
These issues can be dealt with by defining substitution via de Bruijn terms instead of terms. The details of this have been worked out in the original AL report [7]. For our purposes here we will just note that we can deal with these issues by renaming bound variables appropriately before performing replacement to avoid clashes between free and bound variables. Also note that there is no canonical result of applying a substitution, but that the result is determined only up to α-equivalence.
Substitution as Valuation
The main property of substitution is that, given a background valuationν, we can turn any substitution σ into a valuation νσ. The role of the background valuation is to provide meaning for free variables that are either not in the domain of σ, or free in some template of σ.
The valuation νσ has the property that for any term t[[t]]σ=[[t/σ]] holds. Here [[⋅]] denotes evaluation with respect to ν, and [[⋅]]σ calculates the value with respect to νσ.
We define νσ as follows for a variable x occurring with arity n:
If n=0, then νσ(x,n)=[[x/σ]].
If n>0, then for any values u0,…,un−1 we define νσ(x,n)(u0,…,un−1)=[[t]]y:=u where σ(x,n)=[y0…yn−1.t], and [[⋅]]y:=u evaluates with respect to the valuation ν[y0:=u0,…,yn−1:=un−1].
Further details and a proof of above property using de Bruijn terms can be found in the original AL report [7].
Models
What is the meaning of an abstraction logic? Terms have been given meaning relative to an abstraction algebra and a valuation earlier →. Correspondingly, a model of an abstraction logic L with signature S and axioms A is almost an abstraction algebra A with the same logic signature S, a universe U, and an interpretation I, such that we have for all valuations ν in U and all axioms a in A[[a]]=I(⊤) There are two important modifications to be made before arriving at the actual definition of a model:
We are not interested in all possible abstraction algebras as models, but only in those that are logic algebras.
We are not interested in testing the axioms with all possible valuations, but only with those that belong to a space of valuations depending on the model itself.
To address the first point, we define a logic algebra to be an abstraction algebra with logic signature S such that ⇒ and ∀ have denotations satisfying the following minimum requirements:
I(⇒)(I(⊤),u)=I(⊤) implies u=I(⊤) for all values u in U.
Let f be any unary operation of U such that f(u)=I(⊤) for all values u in U. Then I(∀)(f)=I(⊤).
To address the second point, we define a valuation spaceV of an abstraction algebra A to be a collection of valuations in the universe of A such that:
V is not empty.
If ν is a valuation belonging to V, u is a value in the universe of A, and x is a variable in X, then ν[x:=u] also belongs to V.
If ν is a valuation belonging to V, and if σ is a substitution with respect to the signature of A, then νσ also belongs to V.
These requirements for V will turn out to be just strong enough to show soundness→ of any abstraction logic, and weak enough to prove completeness→ for any abstraction logic extending LE.
A model of an abstraction logic L is then a pair (A,V) such that A is a logic algebra with the same signature as L, and V is a valuation space of A, such that [[a]]=I(⊤) for all valuations ν in V and all axioms a in A.
A model is called degenerate if the universe of A consists of a single element. Note that every abstraction logic has a degenerate model.
If every model of L is also a model of the logic with the same signature as L and the single axiom t, then we say that t is valid in L and write L⊨t
Proofs
Given a logic L with signature S and axioms A, a proof in L of a term t is one of the following:
An axiom invocationAX(t) such that t is an axiom in A.
An application of substitutionSUBST(t,σ,ps) where
ps is proof of s in L
σ is a substitution
t and s/σ are α-equivalent
An application of modus ponensMP(t,ph,pg) such that
ph is a proof of h in L
pg is a proof of g in L
g is identical to h⇒t
Introduction of the universal quantifierALL(t,x,ps) where
x is a variable
ps is a proof of s in L
t is identical to (∀x.s)
If there exists a proof of t in L, we say that t is a theorem of L and write L⊢t
The Deduction Theorem
The Deduction Theorem for abstraction logic is that any proof of A in a logic L+ with axioms A and the additional axiom ϕ can be translated into a proof of ϕ′⇒A in the logic L with the same signature, but only axioms A, as long as L extends LD.
Here ϕ′ is the universal closure of ϕ, i.e. ϕ′=∀x0.∀x1.…∀xn−1.ϕ where the xi are the distinct free variables of ϕ. Note that this implies that ϕ cannot contain free variables of non-zero arity for the Deduction Theorem to apply without any further conditions.
The proof of the Deduction Theorem is simply by recursing over the structure of the original proof and translating it step by step.
A special case of the deduction theorem is LD⊢A⇒A We can prove this directly: Substituting A for C in Axiom 3 → and then applying modus ponens using Axiom 2 → yields the theorem (A⇒B)⇒(A⇒A). Substituting B⇒A for B and applying modus ponens using Axiom 2 → again, we obtain the theorem A⇒A.
In a system for machine-assisted theorem proving like Practal [15] it makes sense to bake in native support for LD so that moving theorems from L+ to L doesn’t necessarily involve the actual translation of proofs.
Soundness
Abstraction logic is sound, i.e. every theorem of L is also valid in L: L⊢timpliesL⊨t We show by induction over p that if p is a proof of t in L, then t is valid in L:
Assume p=AX(t). Then t is an axiom and is therefore valid.
Assume p=SUBST(t,σ,ps). Then ps is a proof of s, and therefore s is valid. Let ν be a valuation in V for some model (A,V) of L. Then [[t]]=[[s/σ]]=[[s]]σ Because νσ is in V as well, and s is valid, [[s]]σ=I(⊤) Together this yields [[t]]=[[s]]σ=I(⊤).
Assume p=MP(t,ph,pg). Then ph is a proof of h, and pg is a proof of g, and therefore both h and g are valid. Let ν be a valuation in V for some model (A,V) of L. Then [[h]]=I(⊤) and I(⊤)=[[g]]=[[h⇒t]]=I(⇒)([[h]],[[t]])=I(⇒)(I(⊤),[[t]]) This implies [[t]]=I(⊤).
Assume p=ALL(t,x,ps). Then ps is a proof of s, and therefore s is valid. Let ν be a valuation in V for some model (A,V) of L. Let U be the universe of A. Then [[t]]=[[(∀x.s)]]=I(∀)(f) Here f is a unary operation of U where f(u)=[[s]]x:=u for all values u in U. Because ν[x:=u] also belongs to V and s is valid, we have [[s]]x:=u=I(⊤) and therefore f(u)=I(⊤) for all u. This implies [[t]]=I(∀)(f)=I(⊤).
Completeness
An abstraction logic L is called complete if every valid term is also a theorem: L⊨timpliesL⊢t
If L extends LE, then L is complete.
To prove this, we construct the Rasiowa Model (R,V) of L, which is a model for L. The construction relies heavily on the fact that L extends LE. The universe of the Rasiowa Model consists of equivalence classes of terms, where two terms s and t are called equivalent if s=t is a theorem of L. This is a variation of a standard approach to prove completeness, originally due to Lindenbaum and Tarski. We write [t]R for the equivalence class to which a term t belongs. The details of the construction can be found in the original AL paper [7] (the Rasiowa Model is called Rasiowa Abstraction Algebra there, which is a misnomer). In the construction we make essential use of the fact that V does not need to be the class of all valuations, but can be custom tailored as long as we can guarantee that V has the properties required of a valuation space.
Most importantly, there exists a valuation in V with [[t]]=[t]R for any term t, where [[⋅]] evaluates with respect to that valuation. Furthermore, for the interpretation I of R we have I(⊤)=[⊤]R
From this we can see immediately that L is complete. Because assume that t is valid. Then [t]R=[[t]]=I(⊤)=[⊤]R which shows that t=⊤ is a theorem of L, and thus t is a theorem as well.
Consistency
An abstraction logic L is called inconsistent if L⊢∀x.x and consistent if it is not inconsistent.
If an abstraction logic L is inconsistent and an extension of a deduction logic LD, then all models of L are degenerate. This is easy to see. Assume L is inconsistent. Then ∀x.x is a theorem in L. Substituting [x.x] for A in Axiom 4 → of LD and applying modus ponens, it follows that x is a theorem. That means that x is valid in any model (A,V) of L. In particular, for any valuation ν in V we have [[x]]=I(⊤). There is such ν because V is non-empty. For any value u in the universe U of A the valuation ν[x:=u] is in V as well, and therefore u=[[x]]x:=u=I(⊤). That means U consists only of I(⊤), thus any model of L is degenerate.
What about the other direction? If every model of an abstraction logic L is degenerate, then [[∀x.x]]=I(⊤) holds. Therefore L⊨∀x.x But is ∀x.x then a theorem of L? We have seen that this is indeed the case if L extends LE, because then L is complete →.
Therefore, if L extends LE, then inconsistency of an abstraction logic L is equivalent to all of its models being degenerate.
It is thus straightforward to see that LK is consistent. As a model, use a two-element Boolean Algebra [16] extended in the obvious way with operators denoting ∀ and ∃. The axioms of LK can then be checked by simply exhausting all possibilities.
Definitions
A headh for an abstraction a is an abstraction application of the form (ax0…xm−1.t0…tn−1) such that:
Each term ti has the form Fi[xj0,…,xjk−1] where the xj0,…,xjk−1 are distinct variables bound by a.
All Fi are distinct free variables of h.
The shape of a can easily be determined from a head for a. It must be (m;p0,…,pn−1), where pi={j0,…,jk−1} given the above form of ti.
Given a logic L extending LE, a definition has the form h=deft where h is a head for an abstraction a not contained in L, and t is a term of L such that all of t’s free variables are also free variables of h.
A definition results in a logic L′ which is an extension of L. The signature of L′ is that of L extended by the abstraction a. The axioms of L′ are the axioms of L plus the axiom h=t.
Furthermore, L′ is a conservative extension of L, i.e. L′ is not only an extension of L, but if t is a term of L, then t is a theorem of L iff t is a theorem of L′. This is straightforward to prove by induction over the structure of proofs, and considering the more general statement that t′ is a theorem of L iff t is a theorem of L′. Here t′ results from t by rewriting away all occurrences of a using the equation h=t.
Therefore if L′ is inconsistent, then L is inconsistent as well, because ∀x.x is a term of L. Consequently, L is consistent iff L′ is consistent.
We can also allow conditional definitions of the form c⇒h=deft by considering this as an abbreviation for h=def(if.ct) where the abstraction “if” is governed by the single axiom (if.⊤x)=x, which is assumed to be already contained in L. This allows us to derive c⇒h=t in L′.
Rasiowa’s Algebraic Approach
Abstraction logic was born out of the desire to find solid logical foundations for Practical Types [9]. Discussions on the Lean Zulip chat with Mario Carneiro and others left me convinced that while Practical Types were basically the right approach, in order to make a convincing case I needed to be able to explain the logical foundations with a level of detail that I was not able to at that moment.
During my search for logical foundations for Practical Types, I came across the work of
. The algebraic approach to logic [14] as Rasiowa applied it to propositional logic made immediate sense to me. Rasiowa chose to generalise this approach to predicate logic by following Mostowski and interpreting quantifiers as least upper bounds and greatest lower bounds. To me, that does not seem to be the proper way to generalise Rasiowa’s approach. Instead, I believe abstraction logic is. Just like Rasiowa restricts her models by restricting abstract algebras to implicative algebras to ensure that implication satisfies modus ponens, AL meets minimal requirements of implication and universal quantification by restricting abstraction algebras to logic algebras.
Why did Rasiowa miss this generalisation, which now seems obvious with hindsight? The reason may simply be that operators were not part of her normal toolbox, while working with higher-order functions is standard in both programming and mechanised theorem proving today. Another possible reason is that working with a Fregean domain [11] might have felt alien to her. But on the other hand, her models for propositional logic are topological spaces, and it must have felt somewhat of a waste of space to fill them just with mere truth values. As we can see now, these spaces can represent whole universes.
Our approach seems to be conceptually simpler and less involved than Rasiowa’s approach to predicate logics. An important indicator for this is how simple, elegant and without any need to exclude special cases our completeness proof for abstraction logic is (once the Rasiowa Model is constructed). Completeness holds for all abstraction logics from LE on, including intuitionistic abstraction logic, classical abstraction logic, and all logics in between and beyond. On the other hand, completeness for predicate logics based on Rasiowa’s approach is in general complicated territory [17]. Ono concludes his paper [18] from 1995 on the completeness of predicate logics with respect to algebraic semantics with the words
[...] there seems to be a big difference between instantiations in logic and in algebra. In other words, though usually we interpret quantifiers as infinite joins and meets in algebraic semantics, i.e., we deal with quantifiers like infinite disjunctions and conjunctions, this will be not always appropriate. So, one of the most important questions in the study of algebraic semantics for predicate logics would be how to give an appropriate interpretation of quantifiers, in other words, how to extend algebraic semantics.
It seems that abstraction logic just might be this extension of algebraic semantics that Ono was looking for.
Propositional Logic
Given that AL is a generalisation of algebraic semantics for propositional logics, we can formulate all propositional logics from Rasiowa’s book [14] as ALs. As an example, we do this for positive implicative logic.
There is a caveat though: ⊤ and ∀ are part of any AL by definition, but these abstractions are not present in positive implicative logic. We will not be bothered by this, and just assume the following two axioms:
⊤
(∀x.F[x])
The first axiom is obvious. The second axiom effectively renders ∀ a complicated way of writing ⊤.
The remaining axioms of positive implicative logic are:
A⇒(B⇒A)
(A⇒(B⇒C))⇒((A⇒B)⇒(A⇒C))
The semantics of this AL is identical to the semantics given by Rasiowa. Note that in our sense, this AL is of course inconsistent, but this doesn’t mean anything here, because ∀ is essentially meaningless. Rasiowa shows that this logic is both complete and consistent, in the sense that there are terms that are not theorems.
Another approach is to define propositional logic as an abstraction logic such that the universe only consists of propositions. Starting from LI or LK, we add the axiom (A⇔B)⇒(A=B) and obtain intuitionistic propositional AL LIP or classical propositional AL LKP, respectively. Completeness follows from both LIP and LKP extending LE. Consistency can be confirmed as before for LK→.
Paraconsistent Logic
AL will be hard-pressed to implement any paraconsistent logic, as AL assumes there being only a single value representing true. This is at odds with Priest’s 3-valued Logic of Paradox[19], for example, which has an additional value paradoxical which is also “true”. In particular, any AL extending LF has the principle of explosion as a theorem: A⇒((¬A)⇒B) It should be possible to adapt abstraction logic to allow multiple different values all representing “true”. But as no paraconsistent logic provides a widely accepted foundation of mathematics, there seems no need to accommodate paraconsistency at this point.
First-Order Logic
To represent first-order logic (FOL) as an abstraction logic, we should be able to use the same method as for Calculator Logic→. But instead of just numbers “Num”, we embed an entire FOL-universe “Univ
” in our AL-universe U. As for Calculator Logic, we build FOL on top of LK, and we model undefinedness → via Ⅎ. FOL operators would be defined in such a way that illformed FOL formulas evaluate to Ⅎ. We will not flesh out this construction in detail at this point, but leave this for another time.
The Best Logic?
I hope a picture is forming by now why I believe Abstraction Logic to be the best logic. It is doubtful that this claim can ever be proven. But it doesn’t need to be. I mean best logic in a very practical sense, and therefore if the claim is indeed true, Abstraction Logic should be able to form the basis of a theorem proving system which surpasses other systems based on just FOL, set theory, HOL, or dependent types. In fact, it should be able to accommodate all of these foundations, as they all should be axiomatisable in Abstraction Logic.
[17]Petr Cintula, Carles Noguera. (2015). A Henkin-Style Proof of Completeness for First-Order Algebraizable Logics, https://doi.org/10.1017/jsl.2014.19.