Functions and Records
Chapter Goals
This chapter will introduce two building blocks of PureScript programs: functions and records. In addition, we'll see how to structure PureScript programs, and how to use types as an aid to program development.
We will build a simple address book application to manage a list of contacts. This code will introduce some new ideas from the syntax of PureScript.
The front-end of our application will be the interactive mode PSCi, but it would be possible to build on this code to write a front-end in JavaScript. In fact, we will do exactly that in later chapters, adding form validation and save/restore functionality.
Project Setup
The source code for this chapter is contained in the file src/Data/AddressBook.purs
. This file starts with a module declaration and its import list:
module Data.AddressBook where
import Prelude
import Control.Plus (empty)
import Data.List (List(..), filter, head)
import Data.Maybe (Maybe)
Here, we import several modules:
- The
Prelude
module, which contains a small set of standard definitions and functions. It re-exports many foundational modules from thepurescript-prelude
library. - The
Control.Plus
module, which defines theempty
value. - The
Data.List
module, provided by thelists
package, which can be installed using Spago. It contains a few functions that we will need for working with linked lists. - The
Data.Maybe
module, which defines data types and functions for working with optional values.
Notice that the imports for these modules are listed explicitly in parentheses (except for Prelude
, which is typically imported as an open import). This is generally a good practice, as it helps to avoid conflicting imports.
Assuming you have cloned the book's source code repository, the project for this chapter can be built using Spago, with the following commands:
$ cd chapter3
$ spago build
Simple Types
PureScript defines three built-in types corresponding to JavaScript's primitive types: numbers, strings, and booleans. These are defined in the Prim
module, which is implicitly imported by every module. They are called Number
, String
, and Boolean
, respectively, and you can see them in PSCi by using the :type
command to print the types of some simple values:
$ spago repl
> :type 1.0
Number
> :type "test"
String
> :type true
Boolean
PureScript defines other built-in types: integers, characters, arrays, records, and functions.
Integers are differentiated from floating point values of type Number
by the lack of a decimal point:
> :type 1
Int
Character literals are wrapped in single quotes, unlike string literals which use double quotes:
> :type 'a'
Char
Arrays correspond to JavaScript arrays, but unlike in JavaScript, all elements of a PureScript array must have the same type:
> :type [1, 2, 3]
Array Int
> :type [true, false]
Array Boolean
> :type [1, false]
Could not match type Int with type Boolean.
The last example shows an error from the type checker, which failed to unify (i.e., make equal) the types of the two elements.
Records correspond to JavaScript's objects, and record literals have the same syntax as JavaScript's object literals:
> author = { name: "Phil", interests: ["Functional Programming", "JavaScript"] }
> :type author
{ name :: String
, interests :: Array String
}
This type indicates that the specified object has two fields: a name
field with the type String
and an interests
field with the type Array String
, i.e., an array of String
s.
Fields of records can be accessed using a dot, followed by the label of the field to access:
> author.name
"Phil"
> author.interests
["Functional Programming","JavaScript"]
PureScript's functions correspond to JavaScript's functions. Functions can be defined at the top-level of a file by specifying arguments before the equals sign:
import Prelude -- bring the (+) operator into scope
add :: Int -> Int -> Int
add x y = x + y
Alternatively, functions can be defined inline using a backslash character followed by a space-delimited list of argument names. To enter a multi-line declaration in PSCi, we can enter "paste mode" using the :paste
command. In this mode, declarations are terminated using the Control-D key sequence:
> import Prelude
> :paste
… add :: Int -> Int -> Int
… add = \x y -> x + y
… ^D
Having defined this function in PSCi, we can apply it to its arguments by separating the two arguments from the function name by whitespace:
> add 10 20
30
Notes On Indentation
PureScript code is indentation-sensitive, just like Haskell, but unlike JavaScript. This means that the whitespace in your code is not meaningless, but rather is used to group regions of code, just like curly braces in C-like languages.
If a declaration spans multiple lines, any lines except the first must be indented past the indentation level of the first line.
Therefore, the following is a valid PureScript code:
add x y z = x +
y + z
But this is not a valid code:
add x y z = x +
y + z
In the second case, the PureScript compiler will try to parse two declarations, one for each line.
Generally, any declarations defined in the same block should be indented at the same level. For example, in PSCi, declarations in a let statement must be indented equally. This is valid:
> :paste
… x = 1
… y = 2
… ^D
But this is not:
> :paste
… x = 1
… y = 2
… ^D
Certain PureScript keywords introduce a new block of code, in which declarations must be further-indented:
example x y z =
let
foo = x * y
bar = y * z
in
foo + bar
This doesn't compile:
example x y z =
let
foo = x * y
bar = y * z
in
foo + bar
If you want to learn more (or encounter any problems), see the Syntax documentation.
Defining Our Types
A good first step when tackling a new problem in PureScript is to write out type definitions for any values you will be working with. First, let's define a type for records in our address book:
type Entry =
{ firstName :: String
, lastName :: String
, address :: Address
}
This defines a type synonym called Entry
– the type Entry
is equivalent to the type on the right of the equals symbol: a record type with three fields – firstName
, lastName
, and address
. The two name fields will have the type String
, and the address
field will have the type Address
, defined as follows:
type Address =
{ street :: String
, city :: String
, state :: String
}
Note that records can contain other records.
Now let's define a third type synonym for our address book data structure, which will be represented simply as a linked list of entries:
type AddressBook = List Entry
Note that List Entry
differs from Array Entry
, which represents an array of entries.
Type Constructors and Kinds
List
is an example of a type constructor. Values do not have the type List
directly, but rather List a
for some type a
. That is, List
takes a type argument a
and constructs a new type List a
.
Note that just like function application, type constructors are applied to other types simply by juxtaposition: the type List Entry
is, in fact, the type constructor List
applied to the type Entry
– it represents a list of entries.
If we try to incorrectly define a value of type List
(by using the type annotation operator ::
), we will see a new type of error:
> import Data.List
> Nil :: List
In a type-annotated expression x :: t, the type t must have kind Type
This is a kind error. Just like values are distinguished by their types, types are distinguished by their kinds, and just like ill-typed values result in type errors, ill-kinded types result in kind errors.
There is a special kind called Type
which represents the kind of all types which have values, like Number
and String
.
There are also kinds for type constructors. For example, the kind Type -> Type
represents a function from types to types, just like List
. So the error here occurred because values are expected to have types with kind Type
, but List
has kind Type -> Type
.
To find out the kind of a type, use the :kind
command in PSCi. For example:
> :kind Number
Type
> import Data.List
> :kind List
Type -> Type
> :kind List String
Type
PureScript's kind system supports other interesting kinds, which we will see later in the book.
Quantified Types
For illustration purposes, let's define a primitive function that takes any two arguments and returns the first one:
> :paste
… constantlyFirst :: forall a b. a -> b -> a
… constantlyFirst = \a b -> a
… ^D
Note that if you use
:type
to ask about the type ofconstantlyFirst
, it will be more verbose:: type constantlyFirst forall (a :: Type) (b :: Type). a -> b -> a
The type signature contains additional kind information, which explicitly notes that
a
andb
should be concrete types.
The keyword forall
indicates that constantlyFirst
has a universally quantified type. It means we can substitute any types for a
and b
– constantlyFirst
will work with these types.
For example, we might choose the type a
to be Int
and b
to be String
. In that case, we can specialize the type of constantlyFirst
to
Int -> String -> Int
We don't have to indicate in code that we want to specialize a quantified type – it happens automatically. For example, we can use constantlyFirst
as if it had this type already:
> constantlyFirst 3 "ignored"
3
While we can choose any types for a
and b
, the return type of constantlyFirst
has to be the same as the type of the first argument (because both of them are "tied" to the same a
):
:type constantlyFirst true "ignored"
Boolean
:type constantlyFirst "keep" 3
String
Displaying Address Book Entries
Let's write our first function, which will render an address book entry as a string. We start by giving the function a type. This is optional, but good practice, since it acts as a form of documentation. In fact, the PureScript compiler will give a warning if a top-level declaration does not contain a type annotation. A type declaration separates the name of a function from its type with the ::
symbol:
showEntry :: Entry -> String
This type signature says that showEntry
is a function that takes an Entry
as an argument and returns a String
. Here is the code for showEntry
:
showEntry entry = entry.lastName <> ", " <>
entry.firstName <> ": " <>
showAddress entry.address
This function concatenates the three fields of the Entry
record into a single string, using the showAddress
function to turn the record inside the address
field into a String
. showAddress
is defined similarly:
showAddress :: Address -> String
showAddress addr = addr.street <> ", " <>
addr.city <> ", " <>
addr.state
A function definition begins with the name of the function, followed by a list of argument names. The result of the function is specified after the equals sign. Fields are accessed with a dot, followed by the field name. In PureScript, string concatenation uses the diamond operator (<>
), instead of the plus operator like in JavaScript.
Test Early, Test Often
The PSCi interactive mode allows for rapid prototyping with immediate feedback, so let's use it to verify that our first few functions behave as expected.
First, build the code you've written:
$ spago build
Next, load PSCi, and use the import
command to import your new module:
$ spago repl
> import Data.AddressBook
We can create an entry by using a record literal, which looks just like an anonymous object in JavaScript.
> address = { street: "123 Fake St.", city: "Faketown", state: "CA" }
Now, try applying our function to the example:
> showAddress address
"123 Fake St., Faketown, CA"
Let's also test showEntry
by creating an address book entry record containing our example address:
> entry = { firstName: "John", lastName: "Smith", address: address }
> showEntry entry
"Smith, John: 123 Fake St., Faketown, CA"
Creating Address Books
Now let's write some utility functions for working with address books. We will need a value representing an empty address book: an empty list.
emptyBook :: AddressBook
emptyBook = empty
We will also need a function for inserting a value into an existing address book. We will call this function insertEntry
. Start by giving its type:
insertEntry :: Entry -> AddressBook -> AddressBook
This type signature says that insertEntry
takes an Entry
as its first argument, an AddressBook
as a second argument, and returns a new AddressBook
.
We don't modify the existing AddressBook
directly. Instead, we return a new AddressBook
, which contains the same data. As such, AddressBook
is an example of an immutable data structure. This is an important idea in PureScript – mutation is a side-effect of code and inhibits our ability to reason effectively about its behavior, so we prefer pure functions and immutable data where possible.
To implement insertEntry
, we can use the Cons
function from Data.List
. To see its type, open PSCi and use the :type
command:
$ spago repl
> import Data.List
> :type Cons
forall (a :: Type). a -> List a -> List a
This type signature says that Cons
takes a value of some type a
, takes a list of elements of type a
, and returns a new list with entries of the same type. Let's specialize this with a
as our Entry
type:
Entry -> List Entry -> List Entry
But List Entry
is the same as AddressBook
, so this is equivalent to
Entry -> AddressBook -> AddressBook
In our case, we already have the appropriate inputs: an Entry
, and an AddressBook
, so can apply Cons
and get a new AddressBook
, which is exactly what we wanted!
Here is our implementation of insertEntry
:
insertEntry entry book = Cons entry book
This brings the two arguments entry
and book
into scope – on the left-hand side of the equals symbol – and then applies the Cons
function to create the result.
Curried Functions
Functions in PureScript take exactly one argument. While it looks like the insertEntry
function takes two arguments, it is an example of a curried function. In PureScript, all functions are considered curried.
Currying means converting a function that takes multiple arguments into a function that takes them one at a time. When we call a function, we pass it one argument, and it returns another function that also takes one argument until all arguments are passed.
For example, when we pass 5
to add
, we get another function, which takes an int, adds 5 to it, and returns the sum as a result:
add :: Int -> Int -> Int
add x y = x + y
addFive :: Int -> Int
addFive = add 5
addFive
is the result of partial application, which means we pass less than the total number of arguments to a function that takes multiple arguments. Let's give it a try:
Note that you must define the
add
function if you haven't already:> import Prelude > :paste … add :: Int -> Int -> Int … add x y = x + y … ^D
> :paste
… addFive :: Int -> Int
… addFive = add 5
… ^D
> addFive 1
6
> add 5 1
6
To better understand currying and partial application, try making a few other functions, for example, out of add
. And when you're done, let's return to the insertEntry
.
insertEntry :: Entry -> AddressBook -> AddressBook
The ->
operator (in the type signature) associates to the right, which means that the compiler parses the type as
Entry -> (AddressBook -> AddressBook)
insertEntry
takes a single argument, an Entry
, and returns a new function, which in turn takes a single AddressBook
argument and returns a new AddressBook
.
This means we can partially apply insertEntry
by specifying only its first argument, for example. In PSCi, we can see the result type:
> :type insertEntry entry
AddressBook -> AddressBook
As expected, the return type was a function. We can apply the resulting function to a second argument:
> :type (insertEntry entry) emptyBook
AddressBook
Note though, that the parentheses here are unnecessary – the following is equivalent:
> :type insertEntry entry emptyBook
AddressBook
This is because function application associates to the left, which explains why we can specify function arguments one after the other, separated by whitespace.
The ->
operator in function types is a type constructor for functions. It takes two type arguments: the function's argument type and the return type – the left and right operands, respectively.
Note that in the rest of the book, I will talk about things like "functions of two arguments". However, it is to be understood that this means a curried function, taking a first argument and returning a function that takes the second.
Now consider the definition of insertEntry
:
insertEntry :: Entry -> AddressBook -> AddressBook
insertEntry entry book = Cons entry book
If we explicitly parenthesize the right-hand side, we get (Cons entry) book
. That is, insertEntry entry
is a function whose argument is just passed along to the (Cons entry)
function. But if two functions have the same result for every input, then they are the same! So we can remove the argument book
from both sides:
insertEntry :: Entry -> AddressBook -> AddressBook
insertEntry entry = Cons entry
But now, by the same argument, we can remove entry
from both sides:
insertEntry :: Entry -> AddressBook -> AddressBook
insertEntry = Cons
This process, called eta conversion, can be used (along with other techniques) to rewrite functions in point-free form, which means functions defined without reference to their arguments.
In the case of insertEntry
, eta conversion has resulted in a very clear definition of our function – "insertEntry
is just cons on lists". However, it is arguable whether the point-free form is better in general.
Property Accessors
One common pattern is to use a function to access individual fields (or "properties") of a record. An inline function to extract an Address
from an Entry
could be written as:
\entry -> entry.address
PureScript also allows property accessor shorthand, where an underscore acts as the anonymous function argument, so the inline function above is equivalent to:
_.address
This works with any number of levels or properties, so a function to extract the city associated with an Entry
could be written as:
_.address.city
For example:
> address = { street: "123 Fake St.", city: "Faketown", state: "CA" }
> entry = { firstName: "John", lastName: "Smith", address: address }
> _.lastName entry
"Smith"
> _.address.city entry
"Faketown"
Querying the Address Book
The last function we need to implement for our minimal address book application will look up a person by name and return the correct Entry
. This will be a nice application of building programs by composing small functions – a key idea from functional programming.
We can filter the address book, keeping only those entries with the correct first and last names. Then we can return the head (i.e., first) element of the resulting list.
With this high-level specification of our approach, we can calculate the type of our function. First, open PSCi, and find the types of the filter
and head
functions:
$ spago repl
> import Data.List
> :type filter
forall (a :: Type). (a -> Boolean) -> List a -> List a
> :type head
forall (a :: Type). List a -> Maybe a
Let's pick apart these two types to understand their meaning.
filter
is a curried function of two arguments. Its first argument is a function, which takes an element of the list and returns a Boolean
value. Its second argument is a list of elements, and the return value is another list.
head
takes a list as its argument and returns a type we haven't seen before: Maybe a
. Maybe a
represents an optional value of type a
, and provides a type-safe alternative to using null
to indicate a missing value in languages like JavaScript. We will see it again in more detail in later chapters.
The universally quantified types of filter
and head
can be specialized by the PureScript compiler, to the following types:
filter :: (Entry -> Boolean) -> AddressBook -> AddressBook
head :: AddressBook -> Maybe Entry
We know that we will need to pass the first and last names that we want to search for as arguments to our function.
We also know that we will need a function to pass to filter
. Let's call this function filterEntry
. filterEntry
will have type Entry -> Boolean
. The application filter filterEntry
will then have type AddressBook -> AddressBook
. If we pass the result of this function to the head
function, we get our result of type Maybe Entry
.
Putting these facts together, a reasonable type signature for our function, which we will call findEntry
, is:
findEntry :: String -> String -> AddressBook -> Maybe Entry
This type signature says that findEntry
takes two strings: the first and last names, takes an AddressBook
, and returns an optional Entry
. The optional result will contain a value only if the name is found in the address book.
And here is the definition of findEntry
:
findEntry firstName lastName book = head (filter filterEntry book)
where
filterEntry :: Entry -> Boolean
filterEntry entry = entry.firstName == firstName && entry.lastName == lastName
Let's go over this code step by step.
findEntry
brings three names into scope: firstName
and lastName
, both representing strings, and book
, an AddressBook
.
The right-hand side of the definition combines the filter
and head
functions: first, the list of entries is filtered, and the head
function is applied to the result.
The predicate function filterEntry
is defined as an auxiliary declaration inside a where
clause. This way, the filterEntry
function is available inside the definition of our function, but not outside it. Also, it can depend on the arguments to the enclosing function, which is essential here because filterEntry
uses the firstName
and lastName
arguments to filter the specified Entry
.
Note that, just like for top-level declarations, it was unnecessary to specify a type signature for filterEntry
. However, doing so is recommended as a form of documentation.
Infix Function Application
Most functions discussed so far used prefix function application, where the function name was put before the arguments. For example, when using the insertEntry
function to add an Entry
(john
) to an empty AddressBook
, we might write:
> book1 = insertEntry john emptyBook
However, this chapter has also included examples of infix binary operators, such as the ==
operator in the definition of filterEntry
, where the operator is put between the two arguments. These infix operators are defined in the PureScript source as infix aliases for their underlying prefix implementations. For example, ==
is defined as an infix alias for the prefix eq
function with the line:
infix 4 eq as ==
Therefore entry.firstName == firstName
in filterEntry
could be replaced with the eq entry.firstName firstName
. We'll cover a few more examples of defining infix operators later in this section.
In some situations, putting a prefix function in an infix position as an operator leads to more readable code. One example is the mod
function:
> mod 8 3
2
The above usage works fine but is awkward to read. A more familiar phrasing is "eight mod three", which you can achieve by wrapping a prefix function in backticks (`):
> 8 `mod` 3
2
In the same way, wrapping insertEntry
in backticks turns it into an infix operator, such that book1
and book2
below are equivalent:
book1 = insertEntry john emptyBook
book2 = john `insertEntry` emptyBook
We can make an AddressBook
with multiple entries by using multiple applications of insertEntry
as a prefix function (book3
) or as an infix operator (book4
) as shown below:
book3 = insertEntry john (insertEntry peggy (insertEntry ned emptyBook))
book4 = john `insertEntry` (peggy `insertEntry` (ned `insertEntry` emptyBook))
We can also define an infix operator alias (or synonym) for insertEntry.
We'll arbitrarily choose ++
for this operator, give it a precedence of 5
, and make it right associative using infixr
:
infixr 5 insertEntry as ++
This new operator lets us rewrite the above book4
example as:
book5 = john ++ (peggy ++ (ned ++ emptyBook))
The right associativity of our new ++
operator lets us get rid of the parentheses without changing the meaning:
book6 = john ++ peggy ++ ned ++ emptyBook
Another common technique for eliminating parens is to use apply
's infix operator $
, along with your standard prefix functions.
For example, the earlier book3
example could be rewritten as:
book7 = insertEntry john $ insertEntry peggy $ insertEntry ned emptyBook
Substituting $
for parens is usually easier to type and (arguably) easier to read. A mnemonic to remember the meaning of this symbol is to think of the dollar sign as being drawn from two parens that are also being crossed-out, suggesting the parens are now unnecessary.
Note that $
isn't a special syntax hardcoded into the language. It's simply the infix operator for a regular function called apply
, which is defined in Data.Function
as follows:
apply :: forall a b. (a -> b) -> a -> b
apply f x = f x
infixr 0 apply as $
The apply
function takes another function (of type (a -> b)
) as its first argument and a value (of type a
) as its second argument, then calls that function with that value. If it seems like this function doesn't contribute anything meaningful, you are absolutely correct! Your program is logically identical without it (see referential transparency). The syntactic utility of this function comes from the special properties assigned to its infix operator. $
is a right-associative (infixr
), low precedence (0
) operator, which lets us remove sets of parentheses for deeply-nested applications.
Another parens-busting opportunity for the $
operator is in our earlier findEntry
function:
findEntry firstName lastName book = head $ filter filterEntry book
We'll see an even more elegant way to rewrite this line with "function composition" in the next section.
If you'd like to use a concise infix operator alias as a prefix function, you can surround it in parentheses:
> 8 + 3
11
> (+) 8 3
11
Alternatively, operators can be partially applied by surrounding the expression with parentheses and using _
as an operand in an operator section. You can think of this as a more convenient way to create simple anonymous functions (although in the below example, we're then binding that anonymous function to a name, so it's not so anonymous anymore):
> add3 = (3 + _)
> add3 2
5
To summarize, the following are equivalent definitions of a function that adds 5
to its argument:
add5 x = 5 + x
add5 x = add 5 x
add5 x = (+) 5 x
add5 x = 5 `add` x
add5 = add 5
add5 = \x -> 5 + x
add5 = (5 + _)
add5 x = 5 `(+)` x -- Yo Dawg, I herd you like infix, so we put infix in your infix!
Function Composition
Just like we were able to simplify the insertEntry
function by using eta conversion, we can simplify the definition of findEntry
by reasoning about its arguments.
Note that the book
argument is passed to the filter filterEntry
function, and the result of this application is passed to head
. In other words, book
is passed to the composition of the functions filter filterEntry
and head
.
In PureScript, the function composition operators are <<<
and >>>
. The first is "backwards composition", and the second is "forwards composition".
We can rewrite the right-hand side of findEntry
using either operator. Using backwards-composition, the right-hand side would be
(head <<< filter filterEntry) book
In this form, we can apply the eta conversion trick from earlier, to arrive at the final form of findEntry
:
findEntry firstName lastName = head <<< filter filterEntry
where
...
An equally valid right-hand side would be:
filter filterEntry >>> head
Either way, this gives a clear definition of the findEntry
function: "findEntry
is the composition of a filtering function and the head
function".
I will let you decide which definition is easier to understand, but it is often useful to think of functions as building blocks in this way: each function executes a single task, and solutions are assembled using function composition.
Exercises
- (Easy) Test your understanding of the
findEntry
function by writing down the types of each of its major subexpressions. For example, the type of thehead
function as used is specialized toAddressBook -> Maybe Entry
. Note: There is no test for this exercise. - (Medium) Write a function
findEntryByStreet :: String -> AddressBook -> Maybe Entry
which looks up anEntry
given a street address. Hint reusing the existing code infindEntry
. Test your function in PSCi and by runningspago test
. - (Medium) Rewrite
findEntryByStreet
to replacefilterEntry
with the composition (using<<<
or>>>
) of: a property accessor (using the_.
notation); and a function that tests whether its given string argument is equal to the given street address. - (Medium) Write a function
isInBook
that tests whether a name appears in aAddressBook
, returning a Boolean value. Hint: Use PSCi to find the type of theData.List.null
function, which tests whether a list is empty or not. - (Difficult) Write a function
removeDuplicates
which removes "duplicate" address book entries. We'll consider entries duplicated if they share the same first and last names, while ignoringaddress
fields. Hint: Use PSCi to find the type of theData.List.nubByEq
function, which removes duplicate elements from a list based on an equality predicate. Note that the first element in each set of duplicates (closest to the list head) is the one that is kept.
Conclusion
In this chapter, we covered several new functional programming concepts and learned how to:
- Use the interactive mode PSCi to experiment with functions and test ideas.
- Use types as both a correctness tool and an implementation tool.
- Use curried functions to represent functions of multiple arguments.
- Create programs from smaller components by composition.
- Structure code neatly using
where
expressions. - Avoid null values by using the
Maybe
type. - Use techniques like eta conversion and function composition to refactor code into a clear specification.
In the following chapters, we'll build on these ideas.