A Guide to Writing Properties of Pure Functions - John Hughes at Lambda Days 2020

John Hughes, co-designer of Haskell and Quickcheck, recently discussed property-based testing at Lambda Days 2020. Hughes presented five different strategies in his talk for determining properties and compared their effectiveness. Metamorphic and model-based properties reportedly showed high effectiveness.

Hughes first discussed a common problem faced by developers using property-based testing (PBT) to test pure functions. With example-based testing, the expected output of the function under test can be computed ahead of time and compared to the actual output of the function. Property-based testing, however, typically generates (semi-randomly) a large number of test sequences. Computing the expected outputs of the function under test for a large set of values that are not known ahead of time is not easier than implementing the function in the first place. This challenge is called the test oracle problem and is described by Prof. Tsong Yueh Chen as follows:

The oracle problem refers to situations where it is extremely difficult, or impossible, to verify the test result of a given test case

Developers may end up replicating the original code in the tests, and as such introduce the same bugs in the testing code as in the tested code.

To solve this conundrum, Hughes recommended identifying and checking properties of the output of the function under test. One such property for the `reverse`

function, which takes a list and returns the list in reverse, is:

` ``reverse (reverse xs) = xs `

The challenges now turn to finding a good-enough set of properties which allows developers to either find bugs, or ensure the correctness of the function implementation. Hughes demonstrated that using only one property was often not enough by giving the following erroneous `reverse`

implementation:

` ``reverse xs = xs `

Hughes thus proceeded to give five systematic ways of generating properties and analyzed their effectiveness. The first method consists in identifying invariants of the function under test. To illustrate that method, Hughes considered a binary search tree (BST) data structure, endowed with the operations `find`

, `insert`

, `delete`

, `union`

. Such data structure has the property that the insertion of a key/value pair in a valid tree results in a valid tree. The same applies to the other deletion and union operations. A tree itself is considered valid if it adheres to the contracts of the binary search tree (in particular those referring to the ordering of keys).

The second method consists of finding a postcondition verified by the output of the function under test. For instance, Hughes gave the following post-condition for the BST’s `find`

operation:

After calling

`insert`

, we should be able to`find`

the key inserted, and any other keys present beforehand.

The third method relies on metamorphic properties. Metamorphic properties relate the outputs produced by the system under test when fed related inputs. Hughes gave the following metamorphic property for the BST’s `insert`

operation:

` ``insert k v (insert k' v' t) = insert k' v' (insert k v t) `

The previous property simply means that the resulting tree from inserting two key/value pairs does not depend on the order of the insertions. Metamorphic testing is an active research area. A thorough examination of the technique can be found in Prof. Tsong Yueh Chen’s *Metamorphic testing: A review of challenges and opportunities* paper.

The fourth method is based on what Hughes termed *inductive properties*. Scott Wlaschin, senior software architect, explained in one of his talls on PBT:

These kinds of properties are based on

structural induction– that is, if a large thing can be broken into smaller parts, and some property is true for these smaller parts, then you can often prove that the property is true for a large thing as well. (…)Induction properties are often naturally applicable to recursive structures such as lists and trees.