Posted on 2018-01-12 by nbloomf

This post is literate Haskell; you can load the source into GHCi and play along.

Proving that programs are correct is hard. For one thing, determining what it means for a program to be correct is not trivial – it always has a certain output? never has an error? doesn’t do unnecessary work? what is the execution model? But programs can be very useful and expensive things, and any assurance that they do what we think they do is valuable.

Comparatively, proving theorems in mathematics is easy. In algebra especially lots of proofs can be written in the so-called “equational” style, which is straightforward to the point of boringness. In this style of proof we start with a list of identities; equalities between two expressions involving both constants and variables that we assume to be true for any values we might substitute for the variables. We then deduce more identities using a substitution rule. For example, we can think of the symbols $$+$$, $$2$$, and $$3$$ as constants, the symbols $$a$$ and $$b$$ as variables, and the expression $$a + b = b + a$$ as an identity. From this identity we can deduce the identity $$2 + 3 = 3 + 2$$, by making the substitution $$a = 2$$ and $$b = 3$$. This is a simple, one-step example, but the basic strategy is widely applicable – the Metamath project aims to formalize large chunks of mathematics using (more or less) this strategy. What’s more, verifying that such proofs are valid can be done by pattern matching, using the identities as rewrite rules.

So we have on the one hand some programs we’d like to prove things about, and on the other a simple proof technique. With some careful thought, we can apply one to the other – we just have to get used to thinking of programs as arithmetic in an appropriate algebra.

This is an old idea, and many books have been written about it. In this series of posts I’ll be exploring this idea for myself. In a nutshell, different kinds of data structures can be thought of as elements of an algebra with a universal property. The prototype for this point of view is the natural numbers; the universal property is essentially the principle of mathematical induction, and we can use it to define the usual arithmetic on numbers. This process generalizes to other kinds of structures, and “arithmetic” generalizes to programs. And just like induction is the power tool for proving things about the natural numbers, generalized induction can be used to prove things in the equational style.

We can build an algebra of programs, and doing so has some interesting benefits.

• In algebra, interesting objects are often defined in terms of a universal property. Instead of defining an object in terms of what it is, a universal property characterizes an object by how it behaves. It’s a very declarative way of thinking. This gives a simple, prepackaged way to detect when two programs are equivalent, such as a slow-but-obviously-correct one and a fast-but-not-obviously-correct one.
• Just as in ordinary arithmetic, many theorems come in the form of universally quantified equations. These are tailor made for property-based or generative testing.
• Equational proofs come with a simple strategy for verification: term rewriting. If we’re careful about how we write proofs, individual steps can be mechanically verified by a simple tool.

We could do this using a language designed specifically for formal verification, but I’d like to stay as close to English as possible. At the same time, in any big list of proofs there’s the danger that some of them are wrong. To help mitigate this I’ll use two different kinds of checks. First, we’ll implement our definitions in an executable language and include automated tests for as many theorems as possible. And second, as much as possible, we’ll use a term rewriting tool to check that the steps in our equational proofs are correct. If you see a blue equals sign in an equational proof, that signifies a link to the previous theorem or definition which justifies the equality. But more than that, the blue equals signs are verified by an automated tool.

## Property Testing

I’ve chosen to write my executable definitions in Haskell, because that’s what I’m most comfortable with, but that’s just a choice – many other languages would do. Many theorems will have a stackish flavor; something like Factor or J might also work well.

We’ll use the QuickCheck library to make our theorems testable. This is not the same as making our proofs machine-checkable, but can still be a useful tool for finding bugs and checking assumptions. This module reexports just enough of QuickCheck for our needs.

{-# LANGUAGE NoImplicitPrelude #-}
module Testing
( Test, runTest, testName, withTypeOf, TypeName(..)
, labelTestArgs1
, testLabel0, testLabel1, testLabel2, testLabel3, testArgs

, Equal(..)

, module Prelude
, module Test.QuickCheck
, module Test.QuickCheck.Test
, module Data.Proxy
, module Data.Typeable
) where

import Prelude
( Show(show), IO, Int, undefined, concat, Bool(..)
, putStrLn, (>>), (>>=), return, (++), String, (.), ($), Integer ) import Test.QuickCheck ( Testable(..), Args(..), Arbitrary(..), CoArbitrary(..) , quickCheckWithResult, stdArgs, variant ) import Test.QuickCheck.Test (isSuccess) import Data.Proxy (Proxy(..)) import Data.Typeable (Typeable, typeRep) import Text.Show.Functions () import System.Exit The Test type, with testName, is a shorthand for writing named tests. type Test prop = (String, prop) testName :: String -> prop -> Test prop testName name prop = (name, prop) runTest runs a named test. runTest :: Testable prop => Args -> Test prop -> IO () runTest args (name, prop) = do putStrLn ("\x1b[1;34m" ++ name ++ "\x1b[0;39;49m") result <- quickCheckWithResult args prop if isSuccess result then return () else putStrLn (show result) >> exitFailure TypeName, testLabel, and friends are used to print headers for test suites. class TypeName t where typeName :: t -> String instance TypeName Bool where typeName _ = "Bool" instance TypeName () where typeName _ = "()" testLabel :: String -> IO () testLabel msg = putStrLn ("\n\x1b[1;32m" ++ msg ++ "\x1b[0;39;49m") testLabel0 :: String -> IO () testLabel0 = testLabel labelTestArgs1 :: (Typeable a) => String -> Proxy a -> IO () labelTestArgs1 str a = testLabel$ concat
[ str, ": ", show $typeRep a ] testLabel1 :: (TypeName a) => String -> a -> IO () testLabel1 str a = testLabel$ concat
[ str, ": ", typeName a ]

testLabel2 :: (TypeName a, TypeName b)
=> String -> a -> b -> IO ()
testLabel2 str a b = testLabel $concat [ str, ": ", typeName a, ", ", typeName b ] testLabel3 :: (TypeName a, TypeName b, TypeName c) => String -> a -> b -> c -> IO () testLabel3 str a b c = testLabel$ concat
[ str, ": ", typeName a, ", ", typeName b, ", ", typeName c ]

testArgs :: Int -> Int -> Args
testArgs size cases = stdArgs
{ maxSuccess = cases, maxSize = size }

withTypeOf is used to enforce type constraints in tests. It makes more sense when you see some examples.

withTypeOf :: a -> a -> a
withTypeOf x _ = x

## Equality

To write tests, we also need a notion of “equality” for values. This is a little out of order – we’ll define the boolean truth values in a later post – but we need equality now, so I’ll put the definition here. Typically I think of equality (as in the $$=$$ symbol) as a metalanguage expression anyway.

class Equal a where
eq :: a -> a -> Bool

(Why not use the built in Eq class? No good reason.) We’ll implement Equal for new types as we encounter them. For now we need two: Bool and ().

instance Equal Bool where
eq True  True  = True
eq True  False = False
eq False True  = False
eq False False = True

instance Equal () where
eq () () = True

All our instances of Equal will be assumed to satisfy the following.

Let $$A$$ be a set.

1. For all $$a \in A$$, we have $$\beq(a,a) = \btrue$$.
2. For all $$a,b \in A$$, we have $$\beq(a,b) = \beq(b,a)$$.