Good afternoon, and thank you for having me here today. The topic for my talk today is extremely broad, and as a result will also be necessarily shallow; we're going to touch on a lot of things, but unfortunately none of them in any great detail. For roughly the last ten years, my research has primarily been oriented around characterising the logic of one particular type of intended computation: input handling. Not by coincidence, it's turned out that incorrect input handling is the single most prolific origin of software vulnerabilities. Indeed, the point where language-theoretic security really took off was in 2009, when Len Sassaman, Dan Kaminsky, and I demonstrated, by characterising the parse tree differential attack, that where precision in input handling thwarts exploitation, *im*precision in input handling opens the door to it. Over the course of our mission to provide developers with better tools for handling inputs precisely and definitely, we've arrived at a formal view of *correct* input handling as the sound and complete parsing of serialised data into well-typed objects. However, our formal characterisation of "weird machines" -- the unintended behaviour which constitutes exploitation -- has lagged behind our characterisation of intended behaviour. In part, this is because the universe of unintended behaviours is far vaster than that of intended ones. Not every crash is exploitable, a fact the automated systems of the DARPA Cyber Grand Challenge had to contend with. But just as in the early days of langsec, we have mathematics as our compass. Along with the universe of unintended behaviour, there's an entire universe of logics to explore. Let's begin. -- We're going to need to start with some history of mathematical logic, some of which you'll already be familiar with, but much of which may be new. From there, we'll talk about one key property that the type systems you're familiar with have -- and what happens when we *relax* that property. There are actually a *lot* of possible things that can happen when we relax it, so we'll spend some time talking about that range as well. And finally, I'll toss out some spitballs about where the overall research direction of langsec might go as it expands into not just semantics, but pragmatics. -- Our story today starts in the mid-1800s, when George Boole began his work on symbolic logic. Aristotelian propositional logic had of course been around since antiquity, but Boole's interest was in systematising and formalising it, particularly as an *algebra*. By "algebra" here we mean an algebraic structure, which is a set -- in the case of the typical Boolean algebra, the two-element set whose elements are true and false -- with two binary operators on elements in that set, and one unary operator. (There are atypical Boolean algebras, but we'll get into that later.) Usually the binary operators are called multiplication and addition, but in the case of Boolean algebra, they're called conjunction and disjunction, or AND and OR for short. Similarly, the unary operator is usually called "scalar multiplication," but in Boolean algebra, it's called negation, or NOT. Now, characterising propositional logic as an algebra doesn't necessarily sound like the most exciting thing in the world. We already knew the laws of logic, again thanks to Aristotle; isn't the fact that they also form an algebra just a nifty coincidence? Maybe, but it's a coincidence that ended up bearing quite a lot of mathematical fruit.
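To make the "algebraic structure" framing a little more concrete, here's a minimal sketch in Haskell -- my own illustration, with invented names, not anything of Boole's:

    -- A Boolean algebra sketched as a type class: a carrier type with
    -- two binary operators and one unary operator.
    class BooleanAlgebra a where
      conj :: a -> a -> a  -- conjunction, AND
      disj :: a -> a -> a  -- disjunction, OR
      neg  :: a -> a       -- negation, NOT

    -- The typical two-element Boolean algebra over True and False.
    instance BooleanAlgebra Bool where
      conj = (&&)
      disj = (||)
      neg  = not

The two-element algebra is the typical one; the subsets of any set, under union, intersection, and complement, form another perfectly good instance, which is a hint at those "atypical" Boolean algebras I just mentioned.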
After Boole you get Peirce and Frege, who independently came up with ways of handling logic with quantifiers, which ends up being a major motivation for Russell and Whitehead's _Principia Mathematica_ and eventually the development of type theory. So you can already see, even though we're talking about the nineteen-teens here, how we're barrelling straight in the direction of reasoning about computation. But I talk about Bertrand Russell all the time, so let's talk about a different logician. -- Let's talk about Luitzen Egbertus Jan Brouwer. Brouwer was a Dutch logician and topologist who was born toward the end of the 19th century, about 17 years after Boole died. He's probably best known for sparring with David Hilbert over the very foundations of mathematics and logic themselves. Hilbert was a formalist; to him, truth was an abstract concept, to be arrived at by applying string manipulation rules to formally encoded axioms and propositions. Brouwer was an intuitionist. In intuitionistic logic, whose formalisation is actually the work of Brouwer's student Arend Heyting, the question of what is *true* is actually a question of what mathematical objects can be *constructed* and what claims about them are *justified*. One major difference between classical logic and intuitionistic logic, then, is that intuitionistic logic doesn't admit the law of the excluded middle. That is to say, "A *and* not-A" is still always false, but unlike in classical logic, "A *or* not-A" is not necessarily always true. Or, more accurately, justified. Another way of saying this is that you might not be able to find a proof of either A or not-A. Which means that intuitionistic logic isn't complete! Instead, we say it's *paracomplete*. Negation is also a little bit different. Whereas in classical logic, asserting "not B" means you're asserting that B is false, in intuitionistic logic, "not B" means that there's a *counterexample* for B. This also means that "not not B" asserts that there's a counterexample for *not B*, which is not the same thing as asserting B. A few other properties also fall out of this; for example, one direction of De Morgan's laws -- the inference from "not (A and B)" to "not-A or not-B" -- no longer holds in intuitionistic logic. So already you can see that there are some interesting implications around the question "what happens to systems of logic when we relax certain classical axioms?" But in order to connect the dots between logic and computation, first we'll have to talk about the Curry-Howard isomorphism. -- To me, the Curry-Howard correspondence is one of the most beautiful equivalences in mathematics. There are three key observations that go into it. The first one is of a link between types and propositions: Haskell Curry was interested in how to eliminate quantified variables from predicate logic, and came up with a formalism that he called combinatory logic. Is this anything like parser combinators? Why yes -- both parser combinators and the combinators of combinatory logic are higher-order functions that can be assembled and applied for a particular purpose. In combinatory logic, the basic combinators are called S and K, and these can be composed in such a way as to produce any computable function. They can also be composed into other combinators, such as the famous Y combinator. And what Curry noticed is that the *types* of these logical combinators corresponded to axiom schemas in implicational intuitionistic logic.
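To see what Curry was looking at, here's a quick sketch in Haskell of K and S and their inferred types -- my own rendering, nothing canonical:

    -- K corresponds to the axiom schema  a -> (b -> a)
    k :: a -> b -> a
    k x _ = x

    -- S corresponds to the schema  (a -> (b -> c)) -> ((a -> b) -> (a -> c))
    s :: (a -> b -> c) -> (a -> b) -> a -> c
    s f g x = f x (g x)

    -- Composition recovers other combinators; for instance the identity
    -- combinator I = S K K, whose type is the theorem  a -> a.
    i :: a -> a
    i = s k k

Read the arrows as implication and the type variables as propositions, and each of those types is a tautology of the implicational fragment. The Y combinator, by contrast, can't be given a type in this simply typed setting at all, which is one reason simply typed calculi are total rather than Turing-complete.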
Later, Curry was able to show that the types of the S and K combinators correspond syntactically to the first two axiom schemes of the Hilbert system for classical propositional logic. And finally, William Alvin Howard completed the puzzle by demonstrating a syntactic correspondence between the rules of natural deduction and the typing rules of the simply typed lambda calculus. Proving a proposition is exactly the same thing as constructing a term of the appropriate type. "Programs are proofs, types propositions; that is all / Ye know on earth and all ye need to know." I'm using a little poetic license there, but really it's hard to overstate the implications of unifying logics and type theories in this way. And I do mean to use the plural there, because what we end up seeing is different logics giving rise to different kinds of type theories. Propositional logic gives us function types, as in Alonzo Church's simply typed lambda calculus. Predicate logic, or first-order logic, which adds quantified variables, gives us dependent types. Second-order logic, which also allows quantification over relations, gives us parametric polymorphism. But I'm getting a little ahead of myself -- for example, what do I mean by "dependent types?" -- To answer that, we have to look at the work of Per Martin-Löf. Howard and Nicolaas de Bruijn had already developed the concept of dependent *functions*, whose return types depend on the values of their arguments and which correspond to the universal quantifier, and dependent *pairs*, where the type of the second element in the pair is determined by the value of the first element. By this point in history, Jean-Yves Girard had formalised a second-order typed lambda calculus that allows types to be universally quantified -- or, in other words, parametrically polymorphic -- which he called System F. Martin-Löf was trying to generalise System F, but what he actually came up with was intuitionistic type theory. And this is where things really start to get computational. I said earlier that the type-theory equivalent of proving a proposition true is demonstrating that the type is *inhabited* -- that you can construct an element in it. Intuitionistic type theory uses the Brouwer-Heyting-Kolmogorov interpretation of intuitionistic logic, which gives structural induction rules for what constitutes a proof of an assertion with a particular connective or quantifier. It happens to be the case that in the BHK interpretation, "what constitutes a proof" is just some function or pair, possibly dependent if there's a quantifier involved. Also, since this is intuitionistic logic, where "not P" is really the same as saying "there's a counterexample for P," negation is taken to mean "P implies an unfulfillable proposition" -- which by the Curry-Howard correspondence is the same as saying that P implies an uninhabitable type. And Martin-Löf picked this up and ran with it. I could go into much more detail on Martin-Löf type theory, but in the interest of time, I want to emphasize that MLTT was the mathematical innovation that made formal verification possible. All the proof assistants and dependently typed languages that enable us to write verified software -- like Coq, Isabelle, HOL, Agda, Idris, and so on -- exist because of intuitionistic type theory. The same is true of functional programming languages like ML and Haskell. And we will be coming back to it, because it's this very mechanisation of intuitionistic type theory that serves as a defensive force multiplier.
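Before we move on, here's a small sketch of that "negation as an uninhabitable type" reading -- my own toy definitions, in plain Haskell rather than a proper dependently typed language:

    -- Void is the uninhabited type; "Not p" says that p implies it.
    import Data.Void (Void)

    type Not p = p -> Void

    -- Double-negation *introduction* is constructible: given a proof of p,
    -- any claimed refutation of p can be turned against itself.
    doubleNegIntro :: p -> Not (Not p)
    doubleNegIntro proof refutation = refutation proof

    -- Double-negation *elimination* is not: there is no total function
    --   doubleNegElim :: Not (Not p) -> p
    -- because refuting a refutation gives you no way to actually
    -- construct an element of p.

That's the same asymmetry from earlier: "not not B" is not the same as asserting B once negation means "implies the uninhabited type."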
This mechanisation is how we write software that's correct by construction. But as I said at the beginning, we're here to talk about the *other* kind. -- The very first langsec workshop, five years ago, actually had a wonderful illustration of a proof assistant helping to find a counterexample for an incorrect-by-construction program. Andreas Bogk and Marco Schoepl had decided to write a verified PDF parser, and the tool they chose for the job was Coq. They were able to implement a terminating parser for a subset of PDF, but they weren't able to implement the entire specification and still have it be terminating. The reason for this is that the PDF specification allows the length field of an object to be an indirect reference to any object -- including self-reference, or two objects mutually referencing each other. Cross-reference tables are also allowed to have cyclic references. When they attempted to introduce this behaviour, per the specification, suddenly the termination proof no longer went through! And in fact they were able to construct a counterexample, in the form of a PDF that caused literally every PDF reader in existence at the time to hang. That PDF is a proof of unwanted behaviour. Nontermination, right there, in practice. Programs are proofs, and sometimes what they prove is that you underspecified your problem. You didn't ask the right questions. -- Asking the right questions is really what formal verification is all about. Rice's theorem tells us that we can't have a general process for proving nontrivial properties of arbitrary programs, but what we *can* do is prove nontrivial properties of *individual* programs the hard way. Which is to say, by construction. Julien Vanegue, at the second langsec workshop in 2015, presented some preliminary work characterising heap vulnerabilities in the context of the DARPA Cyber Grand Challenge. He proposed a formal semantics for heap allocation, both on the physical and logical levels, in such a way that many common heap vulnerabilities turn out to violate the constraints that the resulting transition system imposes. In a sense, this is just adding a layer of abstraction, but crucially, it does so in a way that facilitates reasoning about particular kinds of problems. So this is all useful, but it's also all very tentative. We've had some success in formally identifying when a program is underspecified, and we've made some progress in rethinking how we reason about *good* behaviour such that *if* bad behaviour is possible, it surfaces while we're reasoning about it during development, rather than at runtime in production. But if we want to talk about bad behaviour with any kind of richness or nuance that we don't explicitly construct ourselves, we're going to need logical formalisms that don't crash and burn when they hit a contradiction. And that's the real meat of this talk. -- Now, there's one thing that all these logics we've been discussing so far -- classical, intuitionistic, predicate, and second-order -- have in common. They all uphold the principle of explosion, or _ex contradictione quodlibet_: "out of a contradiction, you can prove anything." In a logic that upholds this principle, consistency is non-negotiable, because a single contradiction makes every proposition provable and collapses the whole system into triviality. And indeed that's how all the logics we've been talking about so far behave. -- But we can *suspend* the principle of explosion, in exactly the same way that intuitionistic logic and other paracomplete logics suspend the law of the excluded middle. When we do this, we obtain a family of logics known as the *paraconsistent* logics.
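Before we look at what suspending it buys us, here's what the principle of explosion looks like under Curry-Howard -- again a sketch of mine, in plain Haskell:

    import Data.Void (Void, absurd)

    -- A contradiction is a proof of p together with a refutation of p.
    -- From one, absurd lets us conclude absolutely anything.
    explosion :: (p, p -> Void) -> anything
    explosion (proof, refutation) = absurd (refutation proof)

Which is exactly why consistency is so precious in these logics: the moment a single contradiction sneaks in, every type is inhabited and every proposition is provable.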
These paraconsistent logics allow us to reason beyond a contradiction in a variety of ways, depending on what the logic in question allows for. There are paraconsistent classical logics, paraconsistent predicate logics, and so on; moreover, different paraconsistent logics go about suspending the principle of explosion in different ways, leading to different consequences. If we just suspend the law of non-contradiction but keep the law of the excluded middle and the rest of classical logic, then we get what's called co-intuitionistic logic or dual intuitionistic logic, which is complete but paraconsistent. Where intuitionistic logic is a logic of assertions, co-intuitionistic logic is a logic of hypotheses: assertions are justified by conclusive evidence in favour of them, and hypotheses are justified by even the slightest amount of evidence in their favour so long as there is not conclusive evidence against them. The same way that intuitionistic logic has an implication connective, co-intuitionistic logic has a co-implication connective, which is usually called something like "subtraction" or "exclusion," but is difficult to describe much more intuitively than by its sequent-calculus definition. Its concept of negation is also a little bit different. As I said earlier, intuitionistic negation is basically saying "if A is true, then absurdity," as opposed to saying "A is not true" like in classical logic. There is far less agreement on what paraconsistent negation even means, or whether it even really constitutes a kind of negation at all. (It may be wiser to dial our expectations back and just think of it as *some* unary operator.) That said, the inference rule that Prolog calls "negation as failure," where the consequence of a failed search to prove P is that you can assert not-P, has been shown to be paraconsistent. So there are some wide-open questions where even the connectives of paraconsistent logics are concerned, but we keep turning up clues in the realm of computation. It's a topic that deserves much more thorough study. What if we want to have our cake and eat it too, and suspend both the law of non-contradiction and the law of the excluded middle? We can do that, and there's more than one way to do it. There are three examples of this that I'd like to get into today -- linear logic, bi-intuitionistic logic, and subtractive logic -- but this certainly isn't all of them. Before I dive into those, however, I'd like to touch on some of the other correspondences that pop up as consequences of Curry-Howard. Phil Wadler talks about a lot of these in his landmark paper "Propositions as Types," and they go way beyond just terms and functions. For example, control flow. In 1990, Tim Griffin observed that the type of the call-with-current-continuation control operator corresponds -- syntactically, just like Curry and Howard observed -- to an axiom called Peirce's law, which is basically a way of expressing the law of the excluded middle using only implication. And that's classical logic -- it upholds the law of the excluded middle, so it can't be intuitionistic. In other words, continuation-based control flow can be reasoned about type-theoretically, because call-with-current-continuation can be given a type -- just not one that's inhabited in intuitionistic type theory.
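Here's a rough sketch of Griffin's observation using the continuation monad from Haskell's mtl library; the example itself is my own toy, not Griffin's formalism:

    import Control.Monad.Cont (Cont, callCC, runCont)

    -- Peirce's law:   ((a -> b) -> a) -> a
    -- callCC's type:  ((a -> Cont r b) -> Cont r a) -> Cont r a
    -- Same shape, with every proposition wrapped in Cont r.

    -- A toy use: bail out of the middle of a computation by invoking
    -- the captured continuation.
    escapeDemo :: Int -> Int
    escapeDemo n = runCont (callCC body) id
      where
        body escape = do
          _ <- if n < 0 then escape 0 else return n
          return (n * 2)

    -- escapeDemo 3    ==> 6
    -- escapeDemo (-5) ==> 0

The control jump is the computational content of that classically valid, intuitionistically unprovable type.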
There are similar correspondences where other fundamental CS concepts are concerned. The pi-calculus is a well-known process calculus for modelling processes that compute concurrently and communicate over named channels. And it happens to be the case that the behaviour of the pi-calculus corresponds rather cleanly to the properties of co-intuitionistic logic. We can say much the same for codata, or data structures that are defined by the mathematical dual of structural induction, such as streams (there's a little stream sketch at the end of this section) -- and their types correspond syntactically to the axioms that describe how coinduction decomposes an infinite object into simpler objects. -- With that in mind, let's dive more deeply into these logics at the bottom of the diamond. We'll start with bi-intuitionistic logic, which brings intuitionistic and co-intuitionistic logic together. It has all the connectives you'd expect from intuitionistic logic, but it also incorporates co-implication and paraconsistent negation from co-intuitionistic logic. And what this means is that suddenly we have logical tools for reasoning both about the computation that goes on *within* a process -- the part intuitionistic logic handles -- and about the computation that goes on *concurrently*, or *between* processes. And it's exactly co-implication and paraconsistent negation that allow us to do this.
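And as promised, here's that little stream sketch -- plain Haskell, my own toy names, nothing from the coinduction literature:

    -- An infinite stream: codata defined by how it is observed rather
    -- than by how it is built up from base cases.
    data Stream a = Cons a (Stream a)

    -- The two observations, duals of the constructors of finite data.
    shead :: Stream a -> a
    shead (Cons x _) = x

    stail :: Stream a -> Stream a
    stail (Cons _ xs) = xs

    -- Corecursion: build an infinite object one observation at a time,
    -- the dual of structural recursion over finite data.
    unfold :: (s -> (a, s)) -> s -> Stream a
    unfold step seed =
      let (x, rest) = step seed
      in  Cons x (unfold step rest)

    -- Example: the stream of natural numbers, consumed lazily.
    naturals :: Stream Integer
    naturals = unfold (\n -> (n, n + 1)) 0

Haskell's laziness lets you get away with this directly; in a total, dependently typed setting like Agda you would mark the stream type as coinductive, and the checker would insist that every definition is productive.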