Abstraction Logic
Oct 22, 2021
Abstract
Abstraction Logic is introduced as a foundation for
Practical Types [1] and
Practal [2]. It combines the simplicity of first-order logic with direct support for variable binding constants called
abstractions. It also allows free variables to depend on parameters, which means that first-order axiom schemata can be encoded as simple axioms. Conceptually abstraction logic is situated between first-order logic and second-order logic. It is sound and complete with respect to an intuitive and simple algebraic semantics.
Introduction
The goal of
Practical Types [1] is to provide a foundation for formal machine-assisted reasoning that anyone would find empowering who is interested in rigorous thinking and design. This paper provides a foundation on which Practical Types will be built as a particular class of
abstraction logics.
Like first-order logic, abstraction logic has a single universe of discourse, i.e. it is one-sorted. It goes even further than ordinary first-order logic in that truth values are part of this universe, i.e. the universe forms a
Fregean domain [3].
Like higher-order logic and type theory, abstraction logic provides built-in support for variable binding constructs. But variable binding is not achieved indirectly via a function sort or type; instead it is directly modelled via
shape specifications for (possibly) variable binding constructs called
abstractions. Abstractions are simply the constants of abstraction logic. As a consequence
α-equivalence of terms is a central concept in abstraction logic, but
β/η-conversion do not play any role at this level.
Free variables can depend on parameters. Thus where first-order logic needs axiom schemata consisting of infinitely many axioms, abstraction logic can express the same with a single axiom, just like higher-order logic. This is illustrated multiple times when we
present several concrete abstraction logics →.
Although abstraction logic started out as an investigation into the logic of Practical Types, it will be important for the design of
Practal [2] in its own right. The relationship between Practal, Practical Types and abstraction logic is similar to the relationship between the
Isabelle System [4] and its logics HOL and Pure.
Locales [5] are an important mechanism for modular reasoning in Isabelle built on top of Pure. Because abstraction logic is simpler than type theory, but at the same time more versatile, the same should be true for an abstraction logic mechanism for modular reasoning inspired by locales.
The relationship between Practical Types and abstraction logic can also be compared to the relationship between set theory and first-order logic. Similarly, abstraction logic is not limited to Practical Types, it can provide a foundation for many applications where currently first-order logic is used. Furthermore, abstraction logic can be applied in places where previously higher-order logic was needed, as in the realm of interactive theorem proving.
Preliminaries
Let’s fix some general notation first:
- N is the set of natural numbers starting from 0.
- We write n for {0,…,n−1}, in particular 0=∅.
- We write ⟨i0,…,in−1⟩ where i0<i1<…<in−1 for the set {i0,…,in−1}.
- We write U→V for the set of functions from set U to set V.
- We write U↪V for the set of partial functions from set U to set V.
- The function idU:U→U denotes the identity on U.
- We write U×V for the cartesian product of the sets U and V.
- We define Un for n∈N, n≥1 via U1=U and Um+1=Um×U for m≥1.
- We write ∣p∣ for the size of the finite set p.
- For f:U→V, distinct x0,…,xn−1, and arbitrary v0,…,vn−1 in V, we write f[x0:=v0,…,xn−1:=vn−1] for the function g:U∪{x0,…,xn−1}→V defined via g(u)={vif(u)if u=xiif u∈/{x0,…,xn−1}
- Instead of f[xi0:=v[i0],…,xin−1:=v[in−1]] we also write f[xp:=v[p]] where p=⟨i0,…,in−1⟩ and v[x] stands for an arbitrary expression depending on x.
Algebra
A
shape σ prescribes a number
m of
binders and a number
n of
parameters. Furthermore it assigns to each parameter the set of binders it depends on. A shape is therefore written as
(m;p0,…,pn−1) where
m,n∈N, and each
pi is a subset of
m. Furthermore we require that there are no superfluous binders:
i=0⋃n−1pi=m We call
m the
valence of the shape, and
n its
arity. Above condition implies that a shape with arity
0 necessarily has valence
0. It also implies that for a unary shape its sole parameter will depend on all binders.
An
abstraction signature S=(a0:σ0,…,ak−1:σk−1) is a list of
k different
abstractions ai, where each abstraction
ai is associated with its shape
σi.
An
abstraction algebra A=(U;I;S) is a non-empty set
U together with an abstraction signature
S and an
interpretation I which interprets an abstraction
ai as an element
I(ai) of the set
ΛU(σi). The set
ΛU(σi) will be defined shortly.
Examples of abstractions and their shapes are:
- Equality x=y has two parameters x and y and no binders. The shape of equality is therefore σ==(0;{},{}).
- Universal quantification ∀x.P[x] has one binder x and one parameter P, which depends on x. The shape of universal quantification is σ∀=(1;{0}).
- Function abstraction λx:T.A[x] has one binder x and two parameters T and A. The parameter T does not depend on x, but A does. The shape of function abstraction is therefore σλ=(1;{},{0}).
We require an abstraction
a of shape
σ to be an element of
ΛU(σ). We define
ΛU(σ) for
σ=(m;p0,…,pn−1) as follows:
ΛU(σ)={UFmp0(U)×…×Fmpn−1(U)→Uif n=0if n≥1 Here, for
m≥1 and
p⊆m, we define
Fmp(U) as follows:
Fmp(U)={U{f∘πmp(U)∣f∈U∣p∣→U}⊆Um→Uif p=∅if p=∅ The
projections πmp(U):Um→U∣p∣ are defined for
p=⟨i0,…,in−1⟩ via
(u0,…,um−1)↦(ui0,…,uin−1)
In particular, if the valence of
σ is
0 and its arity
n, then
ΛU(σ)={UUn→Uif n=0if n=0
Note that an
abstract algebra as described by
Rasiowa [6] (see also
[7],
[8] →) is thus simply an abstraction algebra where all abstractions have zero valence.
If the valence of
σ is
1, then
ΛU(σ)=F1p0(U)×…×F1pn−1(U)→U, where
F1pi(U) is either
U or
U→U, depending on whether
pi=∅ or
pi={0}, respectively.
Continuing our previous examples, we can now say the following:
- The interpretation of = is a function in U×U→U.
- The interpretation of ∀ is a function in (U→U)→U.
- The interpretation of λ is a function in U×(U→U)→U.
Bounded universal quantification ∀x:T.P[x] has shape
(1;{},{0}), and is therefore interpreted as an element of
U×(U→U)→U.
Another example is defined by
funeqx.Tf[x]g[x]=def∀x:T.f[x]=g[x] The shape of
funeq is
(1;{},{0},{0}), and therefore
funeq:U×(U→U)×(U→U)→U
Syntax
Given an abstraction signature
S=(a0:σ0,…,ak−1:σk−1) and a set of
variables X, we define the set
TX(S) of
terms over
S. We require variables and abstractions not to overlap, of course, i.e.
ai∈/X for all
i∈k. We also require each variable
x∈X to be associated with a fixed arity
xarity∈N.
The set
TX(S) of terms is then defined inductively as follows:
- Given a variable x and terms t0,…,txarity−1, x[t0,…,txarity−1] is a term. We also write just x instead of x[].
- Given an abstraction a of valence m and arity n, terms t0,…,tn−1, and distinct variables x0,…,xm−1 such that (xi)arity=0 for all i∈m, (ax0…xm−1.t0…tn−1) is also a term.
Note that with this definition all terms in
TX(S) are wellformed, although the definition makes only use of the valence and arity of an abstraction instead of its full shape. Of course, the full shape is important: it is needed to determine the
free variables Free(t) of a term
t∈TX(S). We define
Free(t) recursively, following the inductive definition of
TX(S):
Free(x[t0,…,txarity−1])Free((ax0…xm−1.t0…tn−1))=={x}∪Free(t0)∪…∪Free(txarity−1)∪ (Free(t0)∖{xi0,0,…,xi0,∣p0∣−1})∪ (Free(t1)∖{xi1,0,…,xi1,∣p1∣−1})⋮∪ (Free(tn−1)∖{xin−1,0,…,xin−1,∣pn−1∣−1})where σa=(m;p0,…,pn−1)and pj=⟨ij,0,…,ij,∣pj∣−1⟩ ∀j∈n In case of
Free(t)=∅, the term
t is called
closed.
The following table examines our previous examples in terms of
abstraction syntax:
concrete syntax
x=y∀x.P[x]λx:T.A[x]∀x:T.P[x]funeqx.Tf[x]g[x]∀x:T.f[x]=g[x]abstraction syntax
(=.x y)(∀x.P[x])(λx.TA[x])(∀x.TP[x])(funeqx.Tf[x]g[x])(∀x.T(=.f[x]g[x]))free variables
x,yPT,AT,PT,f,gT,f,g Note that
x is a free variable of the expression
λx:x. x Although the variable
x is bound in the second parameter of
λ, it occurs free in the first parameter of
λ, and is therefore a free variable of the entire expression.
Semantics
Given a set
F of variables, and a set
U, a
valuation ν of
F in
U is a mapping that assigns to each
x∈F with
xarity=0 an element
ν(x)∈U, and to each
x∈F with
xarity≥1 a function
ν(x)∈Uxarity→U. We denote the set of valuations of
F in
U by
VAL
(F,U).
Let
S be an abstraction signature, and
A=(U;I;S) an abstraction algebra with signature
S. Given a term
t∈TX(S), any valuation
ν of
Free(t) in
U determines an element
ν[[t]]∈U, recursively defined as follows:
- If t=x[t0,…,txarity−1], then x∈Free(t), and Free(ti)⊆Free(t) for all i∈xarity. Thus we set ν[[t]]={ν(x)ν(x)(ν[[t0]],…,ν[[txarity−1]])if xarity=0if xarity≥1
- If t=(a.t0…tn−1) for an abstraction a with zero valence and arity n, we set ν[[t]]={I(a)I(a)(ν[[t0]],…,ν[[tn−1]])if n=0if n≥1
- Assume now t=(ax0…xm−1.t0…tn−1) for an abstraction a with valence m≥1 and arity n≥1, and of shape σa=(m;p0,…,pn−1). Let us consider a fixed i∈n, and let pi=⟨j0,…,jk−1⟩⊆m. We have Free(ti)⊆Free(t)∪{xj0,…,xjk−1} Therefore, for any (u0,…,um−1)∈Um, the valuation μi=ν[xj0:=uj0,…,xjk−1:=ujk−1] determines a value μi[[ti]]∈U. Defining fi:Um→U by fi:(u0,…,um−1)↦μi[[ti]] it is obvious that fi∈Fmpi(U). Consequently, we set ν[[t]]=I(a)(f0,…,fn−1)
If
t is a closed term, then
ν[[t]] is independent of the choice of the valuation
ν in
U, and we can therefore define
[[t]]A=ν[[t]]
De Bruijn Terms
Terms
TX(S) provide an adequate user interface for humans, but are awkward to handle from a technical point of view. This becomes apparent when formulating more complex operations on terms, like substitution. One reason for this awkwardness is the potential of clashes between bound variables and free variables. Another reason is that two terms which are semantically equivalent, like
(ax.x) and
(ay.y), can nevertheless be syntactically different because of different choices for the names of bound variables.
In actual software implementations of logic, a common approach to deal with this problem is called
locally nameless [9]. The following definitions are inspired by this approach. We first define
de Bruijn preterms BXpre(S):
- For any n∈N, ↑n is a de Bruijn preterm. We implicitly assume everywhere that ↑n∈/X. Given a set N⊂N we write ↑N for the set {↑n∣n∈N}.
- Given a variable x∈X and de Bruijn preterms t0,…,txarity−1, x[t0,…,txarity−1] is a de Bruijn preterm. We also write just x instead of x[].
- Given an abstraction a of arity n, and de Bruijn preterms t0,…,tn−1, (at0…tn−1) is also a de Bruijn preterm.
The idea is that a term like
(axy.(b.xy)(cx.xy)) corresponds to the de Bruijn term
(a(b↑0↑1)(c↑0↑2))
This correspondence is made precise by the mapping
α:TX(S)→BXpre(S) Let
X0={x∈X∣xarity=0}. To define
α, we first define a function
α′:TX(S)×(X0↪N)→BXpre(S) which keeps track of the current mapping of bound variables to de Bruijn indices:
α′(x[t0,…,tn−1],η)α′(x,η)α′((ax0…xm−1.t0…tn−1),η)where the shape of aand for all i∈n, ti′=====x[α′(t0,η),…,α′(tn−1,η)]if η is undefined at x↑η(x)if η is defined at x(at0′…tn−1′)(m;p0,…,pn−1)α′(ti,(η+m)[xpi:=pi]) For any partial function
f:U↪N and any
m∈N we define
f′=f+m in the obvious way such that
f′(u)=f(u)+m holds if
f is defined at
u∈U.
Then we can define
α by
α(t)=α′(t,∅)
Of particular interest are those elements of
BXpre(S) that are in the image of
α. We are also interested in (part of) the image of
α′:
BXh(S)BX(S)=={α′(t,η)∣(t,η)∈TX(S)×(X0↪h)}⋃h=0∞BXh(S) The image of
α therefore is
BX0(S). We call
BX(S) the set of
de Bruijn terms, and
BXh(S) is the set of de Bruijn terms with at most
h holes.
Instead of defining de Bruijn terms indirectly via
α′, it is easy to carve them out of de Bruijn preterms in a more direct fashion. For that we need the notion of
free atoms FreeAtoms(t) of a de Bruijn preterm
t∈BXpre(S):
FreeAtoms(t)FreeAtomsl(↑n)FreeAtomsl(x[t0,…,tn−1])FreeAtomsl((at0…tn−1))====FreeAtoms0(t){∅{n−l}if n<lif n≥l{x}∪⋃i=0n−1FreeAtomsl(ti)⋃i=0n−1FreeAtomsl+m(ti)where m is the valence of a
De Bruijn terms
BX(S) are then those de Bruijn preterms
t such that for every subpreterm of
t of the form
(at0…tn−1) the inclusion
FreeAtoms(ti)∩↑m⊆↑pi holds for all
i∈n. Here the shape of
a is
(m;p0,…,pn−1).
Note that an immediate consequence of this characterisation is that every subterm of a de Bruijn term is also a de Bruijn term.
Furthermore, note that
BXh(S)={t∈BX(S)∣FreeAtoms(t)∩↑N⊆↑h}
α-Equivalence
We say that two terms
s and
t in
TX(S) are
α-equivalent iff
α(s)=α(t)
The importance of
α-equivalence stems from the fact that it is the same as
semantical equivalence. We call
s and
t in
TX(S) semantically equivalent iff for any abstraction algebra
(U;I;S), and for any valuation
ν∈VAL
(X,U), we have
ν(s)=ν(t).
We first show that
α-equivalence implies semantical equivalence.
To this end, we define valuations also on de Bruijn terms. Given a set
F of variables and a subset
N of
N, a
de Bruijn valuation ν of
F∪↑N in
U is a mapping such that
ν is a valuation of
F in
U, and additionally,
ν assigns to each
↑n an element
ν(↑n)∈U for all
n∈N.
Any valuation
ν of
FreeAtoms(t) in
U determines an element
ν[[t]]∈U:
ν[[↑n]]ν[[x[t0,…,tn−1]]]ν[[(at0…tn−1)]]ν[[(at0…tn−1)]]where fi(u0,…,um−1)=====ν(↑n)ν(x)(ν[[t0]],…,ν[[tn−1]])I(a)(ν[[t0]],…,ν[[tn−1]])if the valence of a is 0I′(a)(f0,…,fn−1)if the valence of a is m≥1↑m(ν)[↑0:=u0,…,↑(m−1):=um−1][[ti]]for all (u0,…,um−1)∈Um
Here
I′ is a slight modification of the interpretation
I in that where
I(a) expects an element
e∈U as one of its arguments,
I′(a) will accept a constant function
(u0,…,um−1)↦e instead. Otherwise,
I′ and
I are identical.
The valuation
↑m(ν) is derived from
ν via
↑m(ν)(x)↑m(ν)(↑(m+n))==ν(x)ν(↑n)where x∈Xwhere n∈N In the above the left hand sides are defined iff the right hand sides are defined.
It is not immediately obvious that
ν[[(at0…tn−1)]] is well-defined, because in order to be able to apply
I′(a) to
(f0,…,fn−1), the
fi cannot be just any functions in
Um→U. Assuming that the shape of
a is
(m;p0,…,pn−1), if
pi=∅,
fi must be a constant function, and for
pi=∅,
fi needs to be in
Fmpi(U). But this is easily seen to be the case, given that
ν′[[t]]=ν[[t]] only holds if
ν′ and
ν have a different assignment for some
q∈FreeAtoms(t), and according to our
earlier observation →,
FreeAtoms(ti)∩↑m⊆↑pi.
Given a valuation
ν∈VAL
(X,U), together with an injective function
η:X0↪N, the corresponding de Bruijn valuation
νη is defined as
νη(x)νη(↑n)==ν(x)ν(x)where x∈Xwhere η(x)=n We prove that for any term
t∈TX(S), any valuation
ν∈VAL
(X,U), and any injective
η:X0↪N, we have
νη[[α′(t,η)]]=ν[[t]]
The proof is by induction over
t:
In particular, for
t∈TX(S) and
ν∈VAL
(X,U):
ν[[α(t)]]=ν[[t]] An immediate consequence is that
α-equivalence implies semantical equivalence: Let
s and
t be
α-equivalent terms in
TX(S), i.e.
α(s)=α(t). Then for all valuations
ν∈VAL
(X,U), we have
ν[[s]]=ν[[α(s)]]=ν[[α(t)]]=ν[[t]]
Substitution
Before showing that semantical equivalence implies
α-equivalence in the
next section →, we will first consider how to perform
substitution of (some of) the free atoms in a de Bruijn term.
A
substitution is a function
σ:D→BX(S) with
domain D⊆X∪↑N. Application
t/σ of a substitution
σ:D→BX(S) to a de Bruijn term
t∈BX(S) is recursively defined as follows:
↑n/σ↑n/σx[t0,…,tn−1]/σx[t0,…,tn−1]/σ(at0…tn−1)/σ=====σ(↑n)if ↑n is in the domain of σ↑nif ↑n is not in the domain of σ(σ(x)/∅[↑0:=t0′,…,↑(n−1):=tn−1′])−nif x is in the domain of σ;here ∅ denotes the empty substitution,and ti′=(ti/σ)+nx[t0/σ,…,tn−1/σ]if x is not in the domain of σ(at0′…tn−1′) where ti′=ti/↑m(σ),given the valence of a is m In the above definition we used the expression
t+z for
t∈BX(S) and
z∈Z. This expression is only defined if for all
↑n∈FreeAtoms(t), we have
n+z≥0. Then:
t+z↑n+lz↑n+lzx[t0,…,tn−1]+lz(at0…tn−1)+lz=====t+0z↑n↑(n+z)x[t0+lz,…,tn−1+lz](a(t0+l+mz)…(tn−1+l+mz))where the valence of a is mif n<lif n≥l
Furthermore, we define
↑m(σ) for a substitution
σ:D→BX(S) and
m∈N by
↑m(σ)(x)↑m(σ)(↑(m+n))==σ(x)+xaritymσ(↑n)+mfor x∈X∩Dfor↑n∈D Then
↑m(σ) is a substitution with domain
(X∩D)∪{↑(m+n)∣↑n∈D}.
We define the
free atoms of a substitution σ:D→BX(S) by
FreeAtoms(σ)=∪⋃↑n∈DFreeAtoms(σ(↑n))⋃x∈D∩X{q−xarity∣q∈FreeAtoms(σ(x))∖xarity)}
We call a substitution
σ:D→BX(S) regular if
FreeAtoms(σ)⊆X.
For regular
σ we have
σ(x)∈BXxarity(S) for all
x∈D∩X and
σ(↑n)∈BX0(S) for all
↑n∈D.
Note that if
σ is regular, then
↑m(σ) is regular, too.
Given a de Bruijn valuation
ν of
F in
U, any regular substitution
σ:D→BX(S) with
FreeAtoms(σ)⊆F induces a de Bruijn valuation
νσ of
F∪D in
U via