Good afternoon, and thank you for having me here today. The topic for my talk today is extremely broad, and as a result will also be necessarily shallow; we're going to touch on a lot of things, but unfortunately none of them in any great detail. For roughly the last ten years, my research has primarily been oriented around characterising the logic of one particular type of intended computation: input handling. Not by coincidence, it's turned out that incorrect input handling is the single most prolific origin of software vulnerabilities. Indeed, the point where language-theoretic security really took off was in 2009, when Len Sassaman, Dan Kaminsky, and I demonstrated, by characterising the parse tree differential attack, that where precision in input handling thwarts exploitation, *im*precision in input handling opens the door to it. Over the course of our mission to provide developers with better tools for handling inputs precisely and definitely, we've arrived at a formal view of *correct* input handling as the sound and complete parsing of serialised data into well-typed objects. However, our formal characterisation of "weird machines" -- the unintended behaviour which constitutes exploitation -- has lagged behind our characterisation of intended behaviour. In part, this is because the universe of unintended behaviours is far vaster than that of intended ones. Not every crash is exploitable, a fact the automated systems of the DARPA Cyber Grand Challenge had to contend with. But just as in the early days of langsec, we have mathematics as our compass. Along with the universe of unintended behaviour, there's an entire universe of logics to explore. Let's begin. -- We're going to need to start with some history of mathematical logic, some of which you'll already be familiar with, but much of which may be new. From there, we'll talk about one key property that the type systems you're familiar with have -- and what happens when we *relax* that property. There are actually a *lot* of possible things that can happen when we relax it, so we'll spend some time talking about that range as well. And finally, I'll toss out some spitballs about where the overall research direction of langsec might go as it expands into not just semantics, but pragmatics. -- Our story today starts in the mid-1800s, when George Boole began his work on symbolic logic. Aristotelian propositional logic had of course been around since antiquity, but Boole's interest was in systematising and formalising it, particularly as an *algebra*. By "algebra" here we mean an algebraic structure, which is a set -- in the case of the typical Boolean algebra, the two-element set whose elements are true and false -- with two binary operators on elements in that set, and one unary operator. (There are atypical Boolean algebras, but we'll get into that later.) Usually the binary operators are called multiplication and addition, but in the case of Boolean algebra, they're called conjunction and disjunction, or AND and OR for short. Similarly, the unary operator is usually called "scalar multiplication," but in Boolean algebra, it's called negation, or NOT. Now, characterising propositional logic as an algebra doesn't necessarily sound like the most exciting thing in the world. We already knew the laws of logic, again thanks to Aristotle; isn't the fact that they also form an algebra just a nifty coincidence? Maybe, but it's a coincidence that ended up bearing quite a lot of mathematical fruit.
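To make the "algebraic structure" framing a little more concrete, here's a minimal sketch in Haskell -- my own illustration, with invented names, not anything of Boole's:

    -- A Boolean algebra sketched as a type class: a carrier type with
    -- two binary operators and one unary operator.
    class BooleanAlgebra a where
      conj :: a -> a -> a  -- conjunction, AND
      disj :: a -> a -> a  -- disjunction, OR
      neg  :: a -> a       -- negation, NOT

    -- The typical two-element Boolean algebra over True and False.
    instance BooleanAlgebra Bool where
      conj = (&&)
      disj = (||)
      neg  = not

The two-element algebra is the typical one; the subsets of any set, under union, intersection, and complement, form another perfectly good instance, which is a hint at those "atypical" Boolean algebras I just mentioned.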
After Boole you get Peirce and Frege, who independently came up with ways of handling logic with quantifiers, which ends up being a major motivation for Russell and Whitehead's _Principia Mathematica_ and eventually the development of type theory. So you can already see, even though we're talking about the nineteen-teens here, how we're barrelling straight in the direction of reasoning about computation. But I talk about Bertrand Russell all the time, so let's talk about a different logician. -- Let's talk about Luitzen Egbertus Jan Brouwer. Brouwer was a Dutch logician and topologist who was born toward the end of the 19th century, about 17 years after Boole died. He's probably best known for sparring with David Hilbert over the very foundations of mathematics and logic themselves. Hilbert was a formalist; to him, truth was an abstract concept, to be arrived at by applying string manipulation rules to formally encoded axioms and propositions. Brouwer was an intuitionist. In intuitionistic logic, whose formalisation is actually the work of Brouwer's student Arend Heyting, the question of what is *true* is actually a question of what mathematical objects can be *constructed* and what claims about them are *justified*. One major difference between classical logic and intuitionistic logic, then, is that intuitionistic logic doesn't admit the law of the excluded middle. That is to say, "A *and* not-A" is still always false, but unlike in classical logic, "A *or* not-A" is not necessarily always true. Or, more accurately, justified. Another way of saying this is that you might not be able to find a proof of either A or not-A. Which means that intuitionistic logic isn't complete! Instead, we say it's *paracomplete*. Negation is also a little bit different. Whereas in classical logic, asserting "not B" means you're asserting that B is false, in intuitionistic logic, "not B" means that there's a *counterexample* for B. This also means that "not not B" asserts that there's a counterexample for *not B*, which is not the same thing as asserting B. A few other properties also fall out of this; for example, one direction of De Morgan's laws -- the inference from "not (A and B)" to "not-A or not-B" -- no longer holds in intuitionistic logic. So already you can see that there are some interesting implications around the question "what happens to systems of logic when we relax certain classical axioms?" But in order to connect the dots between logic and computation, first we'll have to talk about the Curry-Howard isomorphism. -- To me, the Curry-Howard correspondence is one of the most beautiful equivalences in mathematics. There are three key observations that go into it. The first one is of a link between types and propositions: Haskell Curry was interested in how to eliminate quantified variables from predicate logic, and came up with a formalism that he called combinatory logic. Is this anything like parser combinators? Why yes -- both parser combinators and the combinators of combinatory logic are higher-order functions that can be assembled and applied for a particular purpose. In combinatory logic, the basic combinators are called S and K, and these can be composed in such a way as to produce any computable function. They can also be composed into other combinators, such as the famous Y combinator. And what Curry noticed is that the *types* of these logical combinators corresponded to axiom schemas in implicational intuitionistic logic.
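To see what Curry was looking at, here's a quick sketch in Haskell of K and S and their inferred types -- my own rendering, nothing canonical:

    -- K corresponds to the axiom schema  a -> (b -> a)
    k :: a -> b -> a
    k x _ = x

    -- S corresponds to the schema  (a -> (b -> c)) -> ((a -> b) -> (a -> c))
    s :: (a -> b -> c) -> (a -> b) -> a -> c
    s f g x = f x (g x)

    -- Composition recovers other combinators; for instance the identity
    -- combinator I = S K K, whose type is the theorem  a -> a.
    i :: a -> a
    i = s k k

Read the arrows as implication and the type variables as propositions, and each of those types is a tautology of the implicational fragment. The Y combinator, by contrast, can't be given a type in this simply typed setting at all, which is one reason simply typed calculi are total rather than Turing-complete.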
Later, Curry was able to show that the types of the S and K combinators correspond syntactically to the first two axiom schemes of the Hilbert system for classical propositional logic. And finally, William Alvin Howard completed the puzzle by demonstrating a syntactic correspondence between the rules of natural deduction and the typing rules of the simply typed lambda calculus. Proving a proposition is exactly the same thing as constructing a term of the appropriate type. "Programs are proofs, types propositions; that is all / Ye know on earth and all ye need to know." I'm using a little poetic license there, but really it's hard to overstate the implications of unifying logics and type theories in this way. And I do mean to use the plural there, because what we end up seeing is different logics giving rise to different kinds of type theories. Propositional logic gives us function types, as in Alonzo Church's simply typed lambda calculus. Predicate logic, or first-order logic, which adds quantified variables, gives us dependent types. Second-order logic, which also allows quantification over relations, gives us parametric polymorphism. But I'm getting a little ahead of myself -- for example, what do I mean by "dependent types?" -- To answer that, we have to look at the work of Per Martin-Löf. Howard and Nicolaas de Bruijn had already developed the concept of dependent *functions*, whose return types depend on the values of their arguments and which correspond to the universal quantifier, and dependent *pairs*, where the type of the second element in the pair is determined by the value of the first element. By this point in history, Jean-Yves Girard had formalised a second-order typed lambda calculus that allows types to be universally quantified -- or, in other words, parametrically polymorphic -- which he called System F. Martin-Löf was trying to generalise System F, but what he actually came up with was intuitionistic type theory. And this is where things really start to get computational. I said earlier that the type-theory equivalent of proving a proposition true is demonstrating that the type is *inhabited* -- that you can construct an element in it. Intuitionistic type theory uses the Brouwer-Heyting-Kolmogorov interpretation of intuitionistic logic, which gives structural induction rules for what constitutes a proof of an assertion with a particular connective or quantifier. It happens to be the case that in the BHK interpretation, "what constitutes a proof" is just some function or pair, possibly dependent if there's a quantifier involved. Also, since this is intuitionistic logic, where "not P" is really the same as saying "there's a counterexample for P," negation is taken to mean "P implies an unfulfillable proposition" -- which by the Curry-Howard correspondence is the same as saying that P implies an uninhabitable type. And Martin-Löf picked this up and ran with it. I could go into much more detail on Martin-Löf type theory, but in the interest of time, I want to emphasize that MLTT was the mathematical innovation that made formal verification possible. All the proof assistants and dependently typed languages that enable us to write verified software -- like Coq, Isabelle, HOL, Agda, Idris, and so on -- exist because of intuitionistic type theory. The same is true of functional programming languages like ML and Haskell. And we will be coming back to it, because it's this very mechanisation of intuitionistic type theory that serves as a defensive force multiplier.
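Before we move on, here's a small sketch of that "negation as an uninhabitable type" reading -- my own toy definitions, in plain Haskell rather than a proper dependently typed language:

    -- Void is the uninhabited type; "Not p" says that p implies it.
    import Data.Void (Void)

    type Not p = p -> Void

    -- Double-negation *introduction* is constructible: given a proof of p,
    -- any claimed refutation of p can be turned against itself.
    doubleNegIntro :: p -> Not (Not p)
    doubleNegIntro proof refutation = refutation proof

    -- Double-negation *elimination* is not: there is no total function
    --   doubleNegElim :: Not (Not p) -> p
    -- because refuting a refutation gives you no way to actually
    -- construct an element of p.

That's the same asymmetry from earlier: "not not B" is not the same as asserting B once negation means "implies the uninhabited type."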
This mechanisation is how we write software that's correct by construction. But as I said at the beginning, we're here to talk about the *other* kind. -- The very first langsec workshop, five years ago, actually had a wonderful illustration of a proof assistant helping to find a counterexample for an incorrect-by-construction program. Andreas Bogk and Marco Schoepl had decided to write a verified PDF parser, and the tool they chose for the job was Coq. They were able to implement a terminating parser for a subset of PDF, but they weren't able to implement the entire specification and still have it be terminating. The reason for this is that the PDF specification allows the length field of an object to be an indirect reference to any object -- including self-reference, or two objects mutually referencing each other. Cross-reference tables are also allowed to have cyclic references. When they attempted to introduce this behaviour, per the specification, suddenly the termination proof no longer went through! And in fact they were able to construct a counterexample, in the form of a PDF that caused literally every PDF reader in existence at the time to hang. That PDF is a proof of unwanted behaviour. Nontermination, right there, in practice. Programs are proofs, and sometimes what they prove is that you underspecified your problem. You didn't ask the right questions. -- Asking the right questions is really what formal verification is all about. Rice's theorem tells us that we can't have a general process for proving nontrivial properties of arbitrary programs, but what we *can* do is prove nontrivial properties of *individual* programs the hard way. Which is to say, by construction. Julien Vanegue, at the second langsec workshop in 2015, presented some preliminary work characterising heap vulnerabilities in the context of the DARPA Cyber Grand Challenge. He proposed a formal semantics for heap allocation, both on the physical and logical levels, in such a way that many common heap vulnerabilities turn out to violate the constraints that the resulting transition system imposes. In a sense, this is just adding a layer of abstraction, but crucially, it does so in a way that facilitates reasoning about particular kinds of problems. So this is all useful, but it's also all very tentative. We've had some success in formally identifying when a program is underspecified, and we've made some progress in rethinking how we reason about *good* behaviour such that *if* bad behaviour is possible, it surfaces while we're reasoning about it during development, rather than at runtime in production. But if we want to talk about bad behaviour with any kind of richness or nuance that we don't explicitly construct ourselves, we're going to need logical formalisms that don't crash and burn when they hit a contradiction. And that's the real meat of this talk. -- Now, there's one thing that all these logics we've been discussing so far -- classical, intuitionistic, predicate, and second-order -- have in common. They all uphold the principle of explosion, or _ex contradictione quodlibet_: "out of a contradiction, you can prove anything." In a logic that upholds this principle, consistency is non-negotiable, because a single contradiction makes every proposition provable and collapses the whole system into triviality. And indeed that's how all the logics we've been talking about so far behave. -- But we can *suspend* the principle of explosion, in exactly the same way that intuitionistic logic and other paracomplete logics suspend the law of the excluded middle. When we do this, we obtain a family of logics known as the *paraconsistent* logics.
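Before we look at what suspending it buys us, here's what the principle of explosion looks like under Curry-Howard -- again a sketch of mine, in plain Haskell:

    import Data.Void (Void, absurd)

    -- A contradiction is a proof of p together with a refutation of p.
    -- From one, absurd lets us conclude absolutely anything.
    explosion :: (p, p -> Void) -> anything
    explosion (proof, refutation) = absurd (refutation proof)

Which is exactly why consistency is so precious in these logics: the moment a single contradiction sneaks in, every type is inhabited and every proposition is provable.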
These paraconsistent logics allow us to reason beyond a contradiction in a variety of ways, depending on what the logic in question allows for. There are paraconsistent classical logics, paraconsistent predicate logics, and so on; moreover, different paraconsistent logics go about suspending the principle of explosion in different ways, leading to different consequences. If we just suspend the law of non-contradiction but keep the law of the excluded middle and the rest of classical logic, then we get what's called co-intuitionistic logic or dual intuitionistic logic, which is complete but paraconsistent. Where intuitionistic logic is a logic of assertions, co-intuitionistic logic is a logic of hypotheses: assertions are justified by conclusive evidence in favour of them, and hypotheses are justified by even the slightest amount of evidence in their favour so long as there is not conclusive evidence against them. The same way that intuitionistic logic has an implication connective, co-intuitionistic logic has a co-implication connective, which is usually called something like "subtraction" or "exclusion," but is difficult to describe much more intuitively than by its sequent-calculus definition. Its concept of negation is also a little bit different. As I said earlier, intuitionistic negation is basically saying "if A is true, then absurdity," as opposed to saying "A is not true" like in classical logic. There is far less agreement on what paraconsistent negation even means, or whether it even really constitutes a kind of negation at all. (It may be wiser to dial our expectations back and just think of it as *some* unary operator.) That said, the inference rule that Prolog calls "negation as failure," where the consequence of a failed search to prove P is that you can assert not-P, has been shown to be paraconsistent. So there are some wide-open questions where even the connectives of paraconsistent logics are concerned, but we keep turning up clues in the realm of computation. It's a topic that deserves much more thorough study. What if we want to have our cake and eat it too, and suspend both the law of non-contradiction and the law of the excluded middle? We can do that, and there's more than one way to do it. There are three examples of this that I'd like to get into today -- linear logic, bi-intuitionistic logic, and subtractive logic -- but this certainly isn't all of them. Before I dive into those, however, I'd like to touch on some of the other correspondences that pop up as consequences of Curry-Howard. Phil Wadler talks about a lot of these in his landmark paper "Propositions as Types," and they go way beyond just terms and functions. For example, control flow. In 1990, Tim Griffin observed that the type of the call-with-current-continuation control operator corresponds -- syntactically, just like Curry and Howard observed -- to an axiom called Peirce's law, which is basically a way of expressing the law of the excluded middle using only implication. And that's classical logic -- it upholds the law of the excluded middle, so it can't be intuitionistic. In other words, continuation-based control flow can be reasoned about type-theoretically, because call-with-current-continuation can be given a type -- just not one that's inhabited in intuitionistic type theory.
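Here's a rough sketch of Griffin's observation using the continuation monad from Haskell's mtl library; the example itself is my own toy, not Griffin's formalism:

    import Control.Monad.Cont (Cont, callCC, runCont)

    -- Peirce's law:   ((a -> b) -> a) -> a
    -- callCC's type:  ((a -> Cont r b) -> Cont r a) -> Cont r a
    -- Same shape, with every proposition wrapped in Cont r.

    -- A toy use: bail out of the middle of a computation by invoking
    -- the captured continuation.
    escapeDemo :: Int -> Int
    escapeDemo n = runCont (callCC body) id
      where
        body escape = do
          _ <- if n < 0 then escape 0 else return n
          return (n * 2)

    -- escapeDemo 3    ==> 6
    -- escapeDemo (-5) ==> 0

The control jump is the computational content of that classically valid, intuitionistically unprovable type.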
There are similar correspondences where other fundamental CS concepts are concerned. The pi-calculus is a well-known process calculus for modelling processes that compute concurrently and communicate over named channels. And it happens to be the case that the behaviour of the pi-calculus corresponds rather cleanly to the properties of co-intuitionistic logic. We can say much the same for codata, or data structures that are defined by the mathematical dual of structural induction, such as streams (there's a little stream sketch at the end of this section) -- and their types correspond syntactically to the axioms that describe how coinduction decomposes an infinite object into simpler objects. -- With that in mind, let's dive more deeply into these logics at the bottom of the diamond. We'll start with bi-intuitionistic logic, which brings intuitionistic and co-intuitionistic logic together. It has all the connectives you'd expect from intuitionistic logic, but it also incorporates co-implication and paraconsistent negation from co-intuitionistic logic. And what this means is that suddenly we have logical tools for reasoning both about the computation that goes on *within* a process -- the part intuitionistic logic handles -- and about the computation that goes on *concurrently*, or *between* processes. And it's exactly co-implication and paraconsistent negation that allow us to do this.
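And as promised, here's that little stream sketch -- plain Haskell, my own toy names, nothing from the coinduction literature:

    -- An infinite stream: codata defined by how it is observed rather
    -- than by how it is built up from base cases.
    data Stream a = Cons a (Stream a)

    -- The two observations, duals of the constructors of finite data.
    shead :: Stream a -> a
    shead (Cons x _) = x

    stail :: Stream a -> Stream a
    stail (Cons _ xs) = xs

    -- Corecursion: build an infinite object one observation at a time,
    -- the dual of structural recursion over finite data.
    unfold :: (s -> (a, s)) -> s -> Stream a
    unfold step seed =
      let (x, rest) = step seed
      in  Cons x (unfold step rest)

    -- Example: the stream of natural numbers, consumed lazily.
    naturals :: Stream Integer
    naturals = unfold (\n -> (n, n + 1)) 0

Haskell's laziness lets you get away with this directly; in a total, dependently typed setting like Agda you would mark the stream type as coinductive, and the checker would insist that every definition is productive.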