Generative Testing

Chapter Goals

In this chapter, we will see a particularly elegant application of type classes to the problem of testing. Instead of testing our code by telling the compiler how to test, we simply assert what properties our code should have. Test cases can be generated randomly from this specification, using type classes to hide the boilerplate code of random data generation. This is called generative testing (or property-based testing), a technique made popular by the QuickCheck library in Haskell.

The quickcheck package is a port of Haskell's QuickCheck library to PureScript, and for the most part, it preserves the types and syntax of the original library. We will see how to use quickcheck to test a simple library, using Spago to integrate our test suite into our development process.

Project Setup

This chapter's project adds quickcheck as a dependency.

In a Spago project, test sources should be placed in the test directory, and the main module for the test suite should be named Test.Main. The test suite can be run using the spago test command.

Writing Properties

The Merge module implements a simple function merge, which we will use to demonstrate the features of the quickcheck library.

merge :: Array Int -> Array Int -> Array Int

merge takes two sorted arrays of integers and merges their elements so that the result is also sorted. For example:

> import Merge
> merge [1, 3, 5] [2, 4, 5]

[1, 2, 3, 4, 5, 5]

In a typical test suite, we might test merge by generating a few small test cases like this by hand and asserting that the results were equal to the appropriate values. However, everything we need to know about the merge function can be summarized by this property:

If xs and ys are sorted, then merge xs ys is the sorted result of both arrays appended together.

quickcheck allows us to test this property directly by generating random test cases. We state the properties we want our code to have as functions. In this case, we have a single property:

main = do
  quickCheck \xs ys ->
    eq (merge (sort xs) (sort ys)) (sort $ xs <> ys)

When we run this code, quickcheck will attempt to disprove the properties we claimed by generating random inputs xs and ys and passing them to our functions. If our function returns false for any inputs, the property will be incorrect, and the library will raise an error. Fortunately, the library is unable to disprove our properties after generating 100 random test cases:

$ spago test

Installation complete.
Build succeeded.
100/100 test(s) passed.
...
Tests succeeded.

If we deliberately introduce a bug into the merge function (for example, by changing the less-than check for a greater-than check), then an exception is thrown at runtime after the first failed test case:

Error: Test 1 failed:
Test returned false

As we can see, this error message is not very helpful, but it can be improved with a little work.

Improving Error Messages

To provide error messages along with our failed test cases, quickcheck provides the <?> operator. Simply separate the property definition from the error message using <?>, as follows:

quickCheck \xs ys ->
  let
    result = merge (sort xs) (sort ys)
    expected = sort $ xs <> ys
  in
    eq result expected <?> "Result:\n" <> show result <> "\nnot equal to expected:\n" <> show expected

This time, if we modify the code to introduce a bug, we see our improved error message after the first failed test case:

Error: Test 1 (seed 534161891) failed:
Result:
[-822215,-196136,-116841,618343,887447,-888285]
not equal to expected:
[-888285,-822215,-196136,-116841,618343,887447]

Notice how the input xs and ys were generated as arrays of randomly-selected integers.

Exercises

(Easy) Write a property that asserts that merging an array with an empty one does not modify the original array. Note: This new property is redundant since this situation is already covered by our existing property. We're just trying to give readers a simple way to practice using quickCheck.
(Easy) Add an appropriate error message to the remaining property for merge.

Testing Polymorphic Code

The Merge module defines a generalization of the merge function, called mergePoly, which works not only with arrays of numbers, but also arrays of any type belonging to the Ord type class:

mergePoly :: forall a. Ord a => Array a -> Array a -> Array a

If we modify our original test to use mergePoly in place of merge, we see the following error message:

No type class instance was found for

  Test.QuickCheck.Arbitrary.Arbitrary t0

The instance head contains unknown type variables.
Consider adding a type annotation.

This error message indicates that the compiler could not generate random test cases because it did not know what type of elements we wanted our arrays to have. In these sorts of cases, we can use type annotations to force the compiler to infer a particular type, such as Array Int:

quickCheck \xs ys ->
  eq (mergePoly (sort xs) (sort ys) :: Array Int) (sort $ xs <> ys)

We can alternatively use a helper function to specify the type, which may result in cleaner code. For example, if we define a function ints as a synonym for the identity function:

ints :: Array Int -> Array Int
ints = id

then we can modify our test so that the compiler infers the type Array Int for our two array arguments:

quickCheck \xs ys ->
  eq (ints $ mergePoly (sort xs) (sort ys)) (sort $ xs <> ys)

Here, xs and ys have type Array Int since the ints function has been used to disambiguate the unknown type.

Exercises

(Easy) Write a function bools that forces the types of xs and ys to be Array Boolean, and add additional properties that test mergePoly at that type.
(Medium) Choose a pure function from the core libraries (for example, from the arrays package), and write a QuickCheck property for it, including an appropriate error message. Your property should use a helper function to fix any polymorphic type arguments to either Int or Boolean.

Generating Arbitrary Data

Now we will see how the quickcheck library can randomly generate test cases for our properties.

Those types whose values can be randomly generated are captured by the Arbitrary type class:

class Arbitrary t where
  arbitrary :: Gen t

The Gen type constructor represents the side-effects of deterministic random data generation. It uses a pseudo-random number generator to generate deterministic random function arguments from a seed value. The Test.QuickCheck.Gen module defines several useful combinators for building generators.

Gen is also a monad and an applicative functor, so we have the usual collection of combinators at our disposal for creating new instances of the Arbitrary type class.

For example, we can use the Arbitrary instance for the Int type, provided in the quickcheck library, to create a distribution on the 256-byte values, using the Functor instance for Gen to map a function from integers to bytes over arbitrary integer values:

newtype Byte = Byte Int

instance Arbitrary Byte where
  arbitrary = map intToByte arbitrary
    where
    intToByte n | n >= 0 = Byte (n `mod` 256)
                | otherwise = intToByte (-n)

Here, we define a type Byte of integral values between 0 and 255. The Arbitrary instance uses the map function to lift the intToByte function over the arbitrary action. The type of the inner arbitrary action is inferred as Gen Int.

We can also use this idea to improve our test for merge:

quickCheck \xs ys ->
  eq (numbers $ mergePoly (sort xs) (sort ys)) (sort $ xs <> ys)

In this test, we generated arbitrary arrays xs and ys, but had to sort them, since merge expects sorted input. On the other hand, we could create a newtype representing sorted arrays and write an Arbitrary instance that generates sorted data:

newtype Sorted a = Sorted (Array a)

sorted :: forall a. Sorted a -> Array a
sorted (Sorted xs) = xs

instance (Arbitrary a, Ord a) => Arbitrary (Sorted a) where
  arbitrary = map (Sorted <<< sort) arbitrary

With this type constructor, we can modify our test as follows:

quickCheck \xs ys ->
  eq (ints $ mergePoly (sorted xs) (sorted ys)) (sort $ sorted xs <> sorted ys)

This may look like a small change, but the types of xs and ys have changed to Sorted Int instead of just Array Int. This communicates our intent in a clearer way – the mergePoly function takes sorted input. Ideally, the type of the mergePoly function itself would be updated to use the Sorted type constructor.

As a more interesting example, the Tree module defines a type of sorted binary trees with values at the branches:

data Tree a
  = Leaf
  | Branch (Tree a) a (Tree a)

The Tree module defines the following API:

insert    :: forall a. Ord a => a -> Tree a -> Tree a
member    :: forall a. Ord a => a -> Tree a -> Boolean
fromArray :: forall a. Ord a => Array a -> Tree a
toArray   :: forall a. Tree a -> Array a

The insert function inserts a new element into a sorted tree, and the member function can query a tree for a particular value. For example:

> import Tree

> member 2 $ insert 1 $ insert 2 Leaf
true

> member 1 Leaf
false

The toArray and fromArray functions can convert sorted trees to and from arrays. We can use fromArray to write an Arbitrary instance for trees:

instance (Arbitrary a, Ord a) => Arbitrary (Tree a) where
  arbitrary = map fromArray arbitrary

We can now use Tree a as the type of an argument to our test properties whenever there is an Arbitrary instance available for the type a. For example, we can test that the member test always returns true after inserting a value:

quickCheck \t a ->
  member a $ insert a $ treeOfInt t

Here, the argument t is a randomly-generated tree of type Tree Int, where the type argument disambiguated by the identity function treeOfInt.

Exercises

(Medium) Create a newtype for String with an associated Arbitrary instance which generates collections of randomly-selected characters in the range a-z. Hint: use the elements and arrayOf functions from the Test.QuickCheck.Gen module.
(Difficult) Write a property that asserts that a value inserted into a tree is still a member of that tree after arbitrarily many more insertions.

Testing Higher-Order Functions

The Merge module defines another generalization of the merge function – the mergeWith function takes an additional function as an argument to determine the order in which elements should be merged. That is, mergeWith is a higher-order function.

For example, we can pass the length function as the first argument to merge two arrays already in length-increasing order. The result should also be in length-increasing order:

> import Data.String

> mergeWith length
    ["", "ab", "abcd"]
    ["x", "xyz"]

["","x","ab","xyz","abcd"]

How might we test such a function? Ideally, we would like to generate values for all three arguments, including the first argument, which is a function.

There is a second type class that allows us to create randomly-generated functions. It is called Coarbitrary, and it is defined as follows:

class Coarbitrary t where
  coarbitrary :: forall r. t -> Gen r -> Gen r

The coarbitrary function takes a function argument of type t and a random generator for a function result of type r. It uses the function argument to perturb the random generator. That is, it uses the function argument to modify the random output of the random generator for the result.

In addition, there is a type class instance that gives us Arbitrary functions if the function domain is Coarbitrary and the function codomain is Arbitrary:

instance (Coarbitrary a, Arbitrary b) => Arbitrary (a -> b)

In practice, we can write properties that take functions as arguments. In the case of the mergeWith function, we can generate the first argument randomly, modifying our tests to take account of the new argument.

We cannot guarantee that the result will be sorted – we do not even necessarily have an Ord instance – but we can expect that the result be sorted with respect to the function f that we pass in as an argument. In addition, we need the two input arrays to be sorted concerning f, so we use the sortBy function to sort xs and ys based on comparison after the function f has been applied:

quickCheck \xs ys f ->
  let
    result =
      map f $
        mergeWith (intToBool f)
                  (sortBy (compare `on` f) xs)
                  (sortBy (compare `on` f) ys)
    expected =
      map f $
        sortBy (compare `on` f) $ xs <> ys
  in
    eq result expected

Here, we use a function intToBool to disambiguate the type of the function f:

intToBool :: (Int -> Boolean) -> Int -> Boolean
intToBool = id

In addition to being Arbitrary, functions are also Coarbitrary:

instance (Arbitrary a, Coarbitrary b) => Coarbitrary (a -> b)

This means that we are not limited to just values and functions – we can also randomly generate higher-order functions, or functions whose arguments are higher-order functions, and so on.

Writing Coarbitrary Instances

Just as we can write Arbitrary instances for our data types by using the Monad and Applicative instances of Gen, we can write our own Coarbitrary instances as well. This allows us to use our own data types as the domain of randomly-generated functions.

Let's write a Coarbitrary instance for our Tree type. We will need a Coarbitrary instance for the type of the elements stored in the branches:

instance Coarbitrary a => Coarbitrary (Tree a) where

We have to write a function that perturbs a random generator given a value of type Tree a. If the input value is a Leaf, then we will return the generator unchanged:

  coarbitrary Leaf = id

If the tree is a Branch, then we will perturb the generator using the left subtree, the value, and the right subtree. We use function composition to create our perturbing function:

  coarbitrary (Branch l a r) =
    coarbitrary l <<<
    coarbitrary a <<<
    coarbitrary r

Now we can write properties whose arguments include functions taking trees as arguments. For example, the Tree module defines a function anywhere, which tests if a predicate holds on any subtree of its argument:

anywhere :: forall a. (Tree a -> Boolean) -> Tree a -> Boolean

Now we can generate the predicate function randomly. For example, we expect the anywhere function to respect disjunction:

quickCheck \f g t ->
  anywhere (\s -> f s || g s) t ==
    anywhere f (treeOfInt t) || anywhere g t

Here, the treeOfInt function is used to fix the type of values contained in the tree to the type Int:

treeOfInt :: Tree Int -> Tree Int
treeOfInt = id

Testing Without Side-Effects

For the purposes of testing, we usually include calls to the quickCheck function in the main action of our test suite. However, there is a variant of the quickCheck function, called quickCheckPure which does not use side-effects. Instead, it is a pure function that takes a random seed as an input and returns an array of test results.

We can test quickCheckPure using PSCi. Here, we test that the merge operation is associative:

> import Prelude
> import Merge
> import Test.QuickCheck
> import Test.QuickCheck.LCG (mkSeed)

> :paste
… quickCheckPure (mkSeed 12345) 10 \xs ys zs ->
…   ((xs `merge` ys) `merge` zs) ==
…     (xs `merge` (ys `merge` zs))
… ^D

Success : Success : ...

quickCheckPure takes three arguments: the random seed, the number of test cases to generate, and the property to test. If all tests pass, you should see an array of Success data constructors printed to the console.

quickCheckPure might be useful in other situations, such as generating random input data for performance benchmarks or sample form data for web applications.

Exercises

(Easy) Write Coarbitrary instances for the Byte and Sorted type constructors.
(Medium) Write a (higher-order) property which asserts associativity of the mergeWith f function for any function f. Test your property in PSCi using quickCheckPure.
(Medium) Write Arbitrary and Coarbitrary instances for the following data type:
```
data OneTwoThree a = One a | Two a a | Three a a a
```
Hint: Use the oneOf function defined in Test.QuickCheck.Gen to define your Arbitrary instance.
(Medium) Use all to simplify the result of the quickCheckPure function – your new function should have the type List Result -> Boolean and should return true if every test passes and false otherwise.
(Medium) As another approach to simplifying the result of quickCheckPure, try writing a function squashResults :: List Result -> Result. Consider using the First monoid from Data.Maybe.First with the foldMap function to preserve the first error in case of failure.

Conclusion

In this chapter, we met the quickcheck package, which can be used to write tests in a declarative way using the paradigm of generative testing. In particular:

We saw how to automate QuickCheck tests using spago test.
We saw how to write properties as functions and how to use the <?> operator to improve error messages.
We saw how the Arbitrary and Coarbitrary type classes enable generation of boilerplate testing code and how they allow us to test higher-order properties.
We saw how to implement custom Arbitrary and Coarbitrary instances for our own data types.

PureScript by Example