Functions and Records

Chapter Goals

This chapter will introduce two building blocks of PureScript programs: functions and records. In addition, we'll see how to structure PureScript programs, and how to use types as an aid to program development.

We will build a simple address book application to manage a list of contacts. This code will introduce some new ideas from the syntax of PureScript.

The front-end of our application will be the interactive mode PSCi, but it would be possible to build on this code to write a front-end in JavaScript. In fact, we will do exactly that in later chapters, adding form validation and save/restore functionality.

Project Setup

The source code for this chapter is contained in the file src/Data/AddressBook.purs. This file starts with a module declaration and its import list:

module Data.AddressBook where

import Prelude

import Control.Plus (empty)
import Data.List (List(..), filter, head)
import Data.Maybe (Maybe)

Here, we import several modules:

  • The Prelude module, which contains a small set of standard definitions and functions. It re-exports many foundational modules from the purescript-prelude library.
  • The Control.Plus module, which defines the empty value.
  • The Data.List module, provided by the lists package, which can be installed using Spago. It contains a few functions that we will need for working with linked lists.
  • The Data.Maybe module, which defines data types and functions for working with optional values.

Notice that the imports for these modules are listed explicitly in parentheses (except for Prelude, which is typically imported as an open import). This is generally a good practice, as it helps to avoid conflicting imports.

Assuming you have cloned the book's source code repository, the project for this chapter can be built using Spago, with the following commands:

$ cd chapter3
$ spago build

Simple Types

PureScript defines three built-in types corresponding to JavaScript's primitive types: numbers, strings, and booleans. These are defined in the Prim module, which is implicitly imported by every module. They are called Number, String, and Boolean, respectively, and you can see them in PSCi by using the :type command to print the types of some simple values:

$ spago repl

> :type 1.0
Number

> :type "test"
String

> :type true
Boolean

PureScript defines other built-in types: integers, characters, arrays, records, and functions.

Integers are differentiated from floating point values of type Number by the lack of a decimal point:

> :type 1
Int

Character literals are wrapped in single quotes, unlike string literals which use double quotes:

> :type 'a'
Char

Arrays correspond to JavaScript arrays, but unlike in JavaScript, all elements of a PureScript array must have the same type:

> :type [1, 2, 3]
Array Int

> :type [true, false]
Array Boolean

> :type [1, false]
Could not match type Int with type Boolean.

The last example shows an error from the type checker, which failed to unify (i.e., make equal) the types of the two elements.

Records correspond to JavaScript's objects, and record literals have the same syntax as JavaScript's object literals:

> author = { name: "Phil", interests: ["Functional Programming", "JavaScript"] }

> :type author
{ name :: String
, interests :: Array String
}

This type indicates that the specified object has two fields: a name field with the type String and an interests field with the type Array String, i.e., an array of Strings.

Fields of records can be accessed using a dot, followed by the label of the field to access:

> author.name
"Phil"

> author.interests
["Functional Programming","JavaScript"]

PureScript's functions correspond to JavaScript's functions. Functions can be defined at the top-level of a file by specifying arguments before the equals sign:

import Prelude -- bring the (+) operator into scope

add :: Int -> Int -> Int
add x y = x + y

Alternatively, functions can be defined inline using a backslash character followed by a space-delimited list of argument names. To enter a multi-line declaration in PSCi, we can enter "paste mode" using the :paste command. In this mode, declarations are terminated using the Control-D key sequence:

> import Prelude
> :paste
… add :: Int -> Int -> Int
… add = \x y -> x + y
… ^D

Having defined this function in PSCi, we can apply it to its arguments by separating the two arguments from the function name by whitespace:

> add 10 20
30

Notes On Indentation

PureScript code is indentation-sensitive, just like Haskell, but unlike JavaScript. This means that the whitespace in your code is not meaningless, but rather is used to group regions of code, just like curly braces in C-like languages.

If a declaration spans multiple lines, any lines except the first must be indented past the indentation level of the first line.

Therefore, the following is a valid PureScript code:

add x y z = x +
  y + z

But this is not a valid code:

add x y z = x +
y + z

In the second case, the PureScript compiler will try to parse two declarations, one for each line.

Generally, any declarations defined in the same block should be indented at the same level. For example, in PSCi, declarations in a let statement must be indented equally. This is valid:

> :paste
… x = 1
… y = 2
… ^D

But this is not:

> :paste
… x = 1
…  y = 2
… ^D

Certain PureScript keywords introduce a new block of code, in which declarations must be further-indented:

example x y z =
  let
    foo = x * y
    bar = y * z
  in
    foo + bar

This doesn't compile:

example x y z =
  let
    foo = x * y
  bar = y * z
  in
    foo + bar

If you want to learn more (or encounter any problems), see the Syntax documentation.

Defining Our Types

A good first step when tackling a new problem in PureScript is to write out type definitions for any values you will be working with. First, let's define a type for records in our address book:

type Entry =
  { firstName :: String
  , lastName  :: String
  , address   :: Address
  }

This defines a type synonym called Entry – the type Entry is equivalent to the type on the right of the equals symbol: a record type with three fields – firstName, lastName, and address. The two name fields will have the type String, and the address field will have the type Address, defined as follows:

type Address =
  { street :: String
  , city   :: String
  , state  :: String
  }

Note that records can contain other records.

Now let's define a third type synonym for our address book data structure, which will be represented simply as a linked list of entries:

type AddressBook = List Entry

Note that List Entry differs from Array Entry, which represents an array of entries.

Type Constructors and Kinds

List is an example of a type constructor. Values do not have the type List directly, but rather List a for some type a. That is, List takes a type argument a and constructs a new type List a.

Note that just like function application, type constructors are applied to other types simply by juxtaposition: the type List Entry is, in fact, the type constructor List applied to the type Entry – it represents a list of entries.

If we try to incorrectly define a value of type List (by using the type annotation operator ::), we will see a new type of error:

> import Data.List
> Nil :: List
In a type-annotated expression x :: t, the type t must have kind Type

This is a kind error. Just like values are distinguished by their types, types are distinguished by their kinds, and just like ill-typed values result in type errors, ill-kinded types result in kind errors.

There is a special kind called Type which represents the kind of all types which have values, like Number and String.

There are also kinds for type constructors. For example, the kind Type -> Type represents a function from types to types, just like List. So the error here occurred because values are expected to have types with kind Type, but List has kind Type -> Type.

To find out the kind of a type, use the :kind command in PSCi. For example:

> :kind Number
Type

> import Data.List
> :kind List
Type -> Type

> :kind List String
Type

PureScript's kind system supports other interesting kinds, which we will see later in the book.

Quantified Types

For illustration purposes, let's define a primitive function that takes any two arguments and returns the first one:

> :paste
… constantlyFirst :: forall a b. a -> b -> a
… constantlyFirst = \a b -> a
… ^D

Note that if you use :type to ask about the type of constantlyFirst, it will be more verbose:

: type constantlyFirst
forall (a :: Type) (b :: Type). a -> b -> a

The type signature contains additional kind information, which explicitly notes that a and b should be concrete types.

The keyword forall indicates that constantlyFirst has a universally quantified type. It means we can substitute any types for a and bconstantlyFirst will work with these types.

For example, we might choose the type a to be Int and b to be String. In that case, we can specialize the type of constantlyFirst to

Int -> String -> Int

We don't have to indicate in code that we want to specialize a quantified type – it happens automatically. For example, we can use constantlyFirst as if it had this type already:

> constantlyFirst 3 "ignored"

3

While we can choose any types for a and b, the return type of constantlyFirst has to be the same as the type of the first argument (because both of them are "tied" to the same a):

:type constantlyFirst true "ignored"
Boolean

:type constantlyFirst "keep" 3
String

Displaying Address Book Entries

Let's write our first function, which will render an address book entry as a string. We start by giving the function a type. This is optional, but good practice, since it acts as a form of documentation. In fact, the PureScript compiler will give a warning if a top-level declaration does not contain a type annotation. A type declaration separates the name of a function from its type with the :: symbol:

showEntry :: Entry -> String

This type signature says that showEntry is a function that takes an Entry as an argument and returns a String. Here is the code for showEntry:

showEntry entry = entry.lastName <> ", " <>
                  entry.firstName <> ": " <>
                  showAddress entry.address

This function concatenates the three fields of the Entry record into a single string, using the showAddress function to turn the record inside the address field into a String. showAddress is defined similarly:

showAddress :: Address -> String
showAddress addr = addr.street <> ", " <>
                   addr.city <> ", " <>
                   addr.state

A function definition begins with the name of the function, followed by a list of argument names. The result of the function is specified after the equals sign. Fields are accessed with a dot, followed by the field name. In PureScript, string concatenation uses the diamond operator (<>), instead of the plus operator like in JavaScript.

Test Early, Test Often

The PSCi interactive mode allows for rapid prototyping with immediate feedback, so let's use it to verify that our first few functions behave as expected.

First, build the code you've written:

$ spago build

Next, load PSCi, and use the import command to import your new module:

$ spago repl

> import Data.AddressBook

We can create an entry by using a record literal, which looks just like an anonymous object in JavaScript.

> address = { street: "123 Fake St.", city: "Faketown", state: "CA" }

Now, try applying our function to the example:

> showAddress address

"123 Fake St., Faketown, CA"

Let's also test showEntry by creating an address book entry record containing our example address:

> entry = { firstName: "John", lastName: "Smith", address: address }
> showEntry entry

"Smith, John: 123 Fake St., Faketown, CA"

Creating Address Books

Now let's write some utility functions for working with address books. We will need a value representing an empty address book: an empty list.

emptyBook :: AddressBook
emptyBook = empty

We will also need a function for inserting a value into an existing address book. We will call this function insertEntry. Start by giving its type:

insertEntry :: Entry -> AddressBook -> AddressBook

This type signature says that insertEntry takes an Entry as its first argument, an AddressBook as a second argument, and returns a new AddressBook.

We don't modify the existing AddressBook directly. Instead, we return a new AddressBook, which contains the same data. As such, AddressBook is an example of an immutable data structure. This is an important idea in PureScript – mutation is a side-effect of code and inhibits our ability to reason effectively about its behavior, so we prefer pure functions and immutable data where possible.

To implement insertEntry, we can use the Cons function from Data.List. To see its type, open PSCi and use the :type command:

$ spago repl

> import Data.List
> :type Cons

forall (a :: Type). a -> List a -> List a

This type signature says that Cons takes a value of some type a, takes a list of elements of type a, and returns a new list with entries of the same type. Let's specialize this with a as our Entry type:

Entry -> List Entry -> List Entry

But List Entry is the same as AddressBook, so this is equivalent to

Entry -> AddressBook -> AddressBook

In our case, we already have the appropriate inputs: an Entry, and an AddressBook, so can apply Cons and get a new AddressBook, which is exactly what we wanted!

Here is our implementation of insertEntry:

insertEntry entry book = Cons entry book

This brings the two arguments entry and book into scope – on the left-hand side of the equals symbol – and then applies the Cons function to create the result.

Curried Functions

Functions in PureScript take exactly one argument. While it looks like the insertEntry function takes two arguments, it is an example of a curried function. In PureScript, all functions are considered curried.

Currying means converting a function that takes multiple arguments into a function that takes them one at a time. When we call a function, we pass it one argument, and it returns another function that also takes one argument until all arguments are passed.

For example, when we pass 5 to add, we get another function, which takes an int, adds 5 to it, and returns the sum as a result:

add :: Int -> Int -> Int
add x y = x + y

addFive :: Int -> Int
addFive = add 5

addFive is the result of partial application, which means we pass less than the total number of arguments to a function that takes multiple arguments. Let's give it a try:

Note that you must define the add function if you haven't already:

> import Prelude
> :paste
… add :: Int -> Int -> Int
… add x y = x + y
… ^D
> :paste
… addFive :: Int -> Int
… addFive = add 5
… ^D

> addFive 1
6

> add 5 1
6

To better understand currying and partial application, try making a few other functions, for example, out of add. And when you're done, let's return to the insertEntry.

insertEntry :: Entry -> AddressBook -> AddressBook

The -> operator (in the type signature) associates to the right, which means that the compiler parses the type as

Entry -> (AddressBook -> AddressBook)

insertEntry takes a single argument, an Entry, and returns a new function, which in turn takes a single AddressBook argument and returns a new AddressBook.

This means we can partially apply insertEntry by specifying only its first argument, for example. In PSCi, we can see the result type:

> :type insertEntry entry

AddressBook -> AddressBook

As expected, the return type was a function. We can apply the resulting function to a second argument:

> :type (insertEntry entry) emptyBook
AddressBook

Note though, that the parentheses here are unnecessary – the following is equivalent:

> :type insertEntry entry emptyBook
AddressBook

This is because function application associates to the left, which explains why we can specify function arguments one after the other, separated by whitespace.

The -> operator in function types is a type constructor for functions. It takes two type arguments: the function's argument type and the return type – the left and right operands, respectively.

Note that in the rest of the book, I will talk about things like "functions of two arguments". However, it is to be understood that this means a curried function, taking a first argument and returning a function that takes the second.

Now consider the definition of insertEntry:

insertEntry :: Entry -> AddressBook -> AddressBook
insertEntry entry book = Cons entry book

If we explicitly parenthesize the right-hand side, we get (Cons entry) book. That is, insertEntry entry is a function whose argument is just passed along to the (Cons entry) function. But if two functions have the same result for every input, then they are the same! So we can remove the argument book from both sides:

insertEntry :: Entry -> AddressBook -> AddressBook
insertEntry entry = Cons entry

But now, by the same argument, we can remove entry from both sides:

insertEntry :: Entry -> AddressBook -> AddressBook
insertEntry = Cons

This process, called eta conversion, can be used (along with other techniques) to rewrite functions in point-free form, which means functions defined without reference to their arguments.

In the case of insertEntry, eta conversion has resulted in a very clear definition of our function – "insertEntry is just cons on lists". However, it is arguable whether the point-free form is better in general.

Property Accessors

One common pattern is to use a function to access individual fields (or "properties") of a record. An inline function to extract an Address from an Entry could be written as:

\entry -> entry.address

PureScript also allows property accessor shorthand, where an underscore acts as the anonymous function argument, so the inline function above is equivalent to:

_.address

This works with any number of levels or properties, so a function to extract the city associated with an Entry could be written as:

_.address.city

For example:

> address = { street: "123 Fake St.", city: "Faketown", state: "CA" }
> entry = { firstName: "John", lastName: "Smith", address: address }
> _.lastName entry
"Smith"

> _.address.city entry
"Faketown"

Querying the Address Book

The last function we need to implement for our minimal address book application will look up a person by name and return the correct Entry. This will be a nice application of building programs by composing small functions – a key idea from functional programming.

We can filter the address book, keeping only those entries with the correct first and last names. Then we can return the head (i.e., first) element of the resulting list.

With this high-level specification of our approach, we can calculate the type of our function. First, open PSCi, and find the types of the filter and head functions:

$ spago repl

> import Data.List
> :type filter

forall (a :: Type). (a -> Boolean) -> List a -> List a

> :type head

forall (a :: Type). List a -> Maybe a

Let's pick apart these two types to understand their meaning.

filter is a curried function of two arguments. Its first argument is a function, which takes an element of the list and returns a Boolean value. Its second argument is a list of elements, and the return value is another list.

head takes a list as its argument and returns a type we haven't seen before: Maybe a. Maybe a represents an optional value of type a, and provides a type-safe alternative to using null to indicate a missing value in languages like JavaScript. We will see it again in more detail in later chapters.

The universally quantified types of filter and head can be specialized by the PureScript compiler, to the following types:

filter :: (Entry -> Boolean) -> AddressBook -> AddressBook

head :: AddressBook -> Maybe Entry

We know that we will need to pass the first and last names that we want to search for as arguments to our function.

We also know that we will need a function to pass to filter. Let's call this function filterEntry. filterEntry will have type Entry -> Boolean. The application filter filterEntry will then have type AddressBook -> AddressBook. If we pass the result of this function to the head function, we get our result of type Maybe Entry.

Putting these facts together, a reasonable type signature for our function, which we will call findEntry, is:

findEntry :: String -> String -> AddressBook -> Maybe Entry

This type signature says that findEntry takes two strings: the first and last names, takes an AddressBook, and returns an optional Entry. The optional result will contain a value only if the name is found in the address book.

And here is the definition of findEntry:

findEntry firstName lastName book = head (filter filterEntry book)
  where
    filterEntry :: Entry -> Boolean
    filterEntry entry = entry.firstName == firstName && entry.lastName == lastName

Let's go over this code step by step.

findEntry brings three names into scope: firstName and lastName, both representing strings, and book, an AddressBook.

The right-hand side of the definition combines the filter and head functions: first, the list of entries is filtered, and the head function is applied to the result.

The predicate function filterEntry is defined as an auxiliary declaration inside a where clause. This way, the filterEntry function is available inside the definition of our function, but not outside it. Also, it can depend on the arguments to the enclosing function, which is essential here because filterEntry uses the firstName and lastName arguments to filter the specified Entry.

Note that, just like for top-level declarations, it was unnecessary to specify a type signature for filterEntry. However, doing so is recommended as a form of documentation.

Infix Function Application

Most functions discussed so far used prefix function application, where the function name was put before the arguments. For example, when using the insertEntry function to add an Entry (john) to an empty AddressBook, we might write:

> book1 = insertEntry john emptyBook

However, this chapter has also included examples of infix binary operators, such as the == operator in the definition of filterEntry, where the operator is put between the two arguments. These infix operators are defined in the PureScript source as infix aliases for their underlying prefix implementations. For example, == is defined as an infix alias for the prefix eq function with the line:

infix 4 eq as ==

Therefore entry.firstName == firstName in filterEntry could be replaced with the eq entry.firstName firstName. We'll cover a few more examples of defining infix operators later in this section.

In some situations, putting a prefix function in an infix position as an operator leads to more readable code. One example is the mod function:

> mod 8 3
2

The above usage works fine but is awkward to read. A more familiar phrasing is "eight mod three", which you can achieve by wrapping a prefix function in backticks (`):

> 8 `mod` 3
2

In the same way, wrapping insertEntry in backticks turns it into an infix operator, such that book1 and book2 below are equivalent:

book1 = insertEntry john emptyBook
book2 = john `insertEntry` emptyBook

We can make an AddressBook with multiple entries by using multiple applications of insertEntry as a prefix function (book3) or as an infix operator (book4) as shown below:

book3 = insertEntry john (insertEntry peggy (insertEntry ned emptyBook))
book4 = john `insertEntry` (peggy `insertEntry` (ned `insertEntry` emptyBook))

We can also define an infix operator alias (or synonym) for insertEntry. We'll arbitrarily choose ++ for this operator, give it a precedence of 5, and make it right associative using infixr:

infixr 5 insertEntry as ++

This new operator lets us rewrite the above book4 example as:

book5 = john ++ (peggy ++ (ned ++ emptyBook))

The right associativity of our new ++ operator lets us get rid of the parentheses without changing the meaning:

book6 = john ++ peggy ++ ned ++ emptyBook

Another common technique for eliminating parens is to use apply's infix operator $, along with your standard prefix functions.

For example, the earlier book3 example could be rewritten as:

book7 = insertEntry john $ insertEntry peggy $ insertEntry ned emptyBook

Substituting $ for parens is usually easier to type and (arguably) easier to read. A mnemonic to remember the meaning of this symbol is to think of the dollar sign as being drawn from two parens that are also being crossed-out, suggesting the parens are now unnecessary.

Note that $ isn't a special syntax hardcoded into the language. It's simply the infix operator for a regular function called apply, which is defined in Data.Function as follows:

apply :: forall a b. (a -> b) -> a -> b
apply f x = f x

infixr 0 apply as $

The apply function takes another function (of type (a -> b)) as its first argument and a value (of type a) as its second argument, then calls that function with that value. If it seems like this function doesn't contribute anything meaningful, you are absolutely correct! Your program is logically identical without it (see referential transparency). The syntactic utility of this function comes from the special properties assigned to its infix operator. $ is a right-associative (infixr), low precedence (0) operator, which lets us remove sets of parentheses for deeply-nested applications.

Another parens-busting opportunity for the $ operator is in our earlier findEntry function:

findEntry firstName lastName book = head $ filter filterEntry book

We'll see an even more elegant way to rewrite this line with "function composition" in the next section.

If you'd like to use a concise infix operator alias as a prefix function, you can surround it in parentheses:

> 8 + 3
11

> (+) 8 3
11

Alternatively, operators can be partially applied by surrounding the expression with parentheses and using _ as an operand in an operator section. You can think of this as a more convenient way to create simple anonymous functions (although in the below example, we're then binding that anonymous function to a name, so it's not so anonymous anymore):

> add3 = (3 + _)
> add3 2
5

To summarize, the following are equivalent definitions of a function that adds 5 to its argument:

add5 x = 5 + x
add5 x = add 5 x
add5 x = (+) 5 x
add5 x = 5 `add` x
add5   = add 5
add5   = \x -> 5 + x
add5   = (5 + _)
add5 x = 5 `(+)` x  -- Yo Dawg, I herd you like infix, so we put infix in your infix!

Function Composition

Just like we were able to simplify the insertEntry function by using eta conversion, we can simplify the definition of findEntry by reasoning about its arguments.

Note that the book argument is passed to the filter filterEntry function, and the result of this application is passed to head. In other words, book is passed to the composition of the functions filter filterEntry and head.

In PureScript, the function composition operators are <<< and >>>. The first is "backwards composition", and the second is "forwards composition".

We can rewrite the right-hand side of findEntry using either operator. Using backwards-composition, the right-hand side would be

(head <<< filter filterEntry) book

In this form, we can apply the eta conversion trick from earlier, to arrive at the final form of findEntry:

findEntry firstName lastName = head <<< filter filterEntry
  where
    ...

An equally valid right-hand side would be:

filter filterEntry >>> head

Either way, this gives a clear definition of the findEntry function: "findEntry is the composition of a filtering function and the head function".

I will let you decide which definition is easier to understand, but it is often useful to think of functions as building blocks in this way: each function executes a single task, and solutions are assembled using function composition.

Exercises

  1. (Easy) Test your understanding of the findEntry function by writing down the types of each of its major subexpressions. For example, the type of the head function as used is specialized to AddressBook -> Maybe Entry. Note: There is no test for this exercise.
  2. (Medium) Write a function findEntryByStreet :: String -> AddressBook -> Maybe Entry which looks up an Entry given a street address. Hint reusing the existing code in findEntry. Test your function in PSCi and by running spago test.
  3. (Medium) Rewrite findEntryByStreet to replace filterEntry with the composition (using <<< or >>>) of: a property accessor (using the _. notation); and a function that tests whether its given string argument is equal to the given street address.
  4. (Medium) Write a function isInBook that tests whether a name appears in a AddressBook, returning a Boolean value. Hint: Use PSCi to find the type of the Data.List.null function, which tests whether a list is empty or not.
  5. (Difficult) Write a function removeDuplicates which removes "duplicate" address book entries. We'll consider entries duplicated if they share the same first and last names, while ignoring address fields. Hint: Use PSCi to find the type of the Data.List.nubByEq function, which removes duplicate elements from a list based on an equality predicate. Note that the first element in each set of duplicates (closest to the list head) is the one that is kept.

Conclusion

In this chapter, we covered several new functional programming concepts and learned how to:

  • Use the interactive mode PSCi to experiment with functions and test ideas.
  • Use types as both a correctness tool and an implementation tool.
  • Use curried functions to represent functions of multiple arguments.
  • Create programs from smaller components by composition.
  • Structure code neatly using where expressions.
  • Avoid null values by using the Maybe type.
  • Use techniques like eta conversion and function composition to refactor code into a clear specification.

In the following chapters, we'll build on these ideas.