A Brief Introduction to Haskell
-
A brief introduction to Haskell
So, what I really wanted to do was to make a post about how functional programming concepts are useful after a discussion that came up in the shoutbox over my suggested use of a function as the second argument of
string.replacein @Scuzz’s omdb plugin thread, but I realized that some of the more subtle examples are hard to show without rigorous syntax. So, I thought I would use this post to outline a bit of Haskell syntax which will touch on some functional concepts and lay the ground work for understanding examples in a later post arguing the utility of functional programming. DISCLAIMER: I’m not an expert in Haskell, nor have I built any real software with it, so forgive me if I remain in the conceptual domainSo, here we go!
First, if you want to play around with Haskell, it seems the go to is the Glasgow Haskell Compiler (GHC). Check out the Installation page at haskell.org
(P.S. on mac
brew install ghci)Maths
Arithmetic
As with all of our favorite interpreted languages, we can sit at the interpreter and do math all day:
> 2 + 4 6 > 2 * 4 8 > 2 ** 4 -- 2 ^ 4 (line comments in Haskell begin with --) 16.0Great, so we know we have some predefined types and operations (technically speaking in the manner we’ve presented above, those literals are not primitives; they are actually boxed, but we’ll get there in time).
To exit the interpreter, use
:quitVariables
Like other languages, we can bind an expression to a name.
> let x = 10 > x 10Variables MUST begin with a lowercase letter.
Unlike other languages, we cannot modify the value of x in a literal sense. We can rebind x, but this will create a new object rather than updating the old one. This leads to an important result:
> let x = 10 > x 10 > let x = x + 10 > xIn our favorite imperative language, we would expect to see the value of
xupdated to 20. However in Haskell it doesn’t evaluate.
Why? because we have definedxin terms of itself.
If we try to expand it, we getx + 10. Expand that, we getx + 10 + 10.If that doesn’t make sense, think about it this way:
what we said is “where ever you seexreplace it withx + 10”.
Now, if I asked you “what isx?” here, you can only answer “x + 10”
As you can see, this can never resolve to a value.Functions
Of course, we can write our own math functions
> let addOne x = x + 1 > addOne 1 2The syntax for defining a function is
let <function name> <variables> = <expression>
The syntax for calling is just
<function name> variables.Function names MUST begin with a lowercase letter.
The interesting thing in Haskell (as well as other functional languages) is that functions are first-class citizens, meaning they can be treated like any other data. This allows us to do things like pass functions as arguments.
> let square x = x * x > let applyTwice f x = f (f x) > square 2 4 > applyTwice square 2 -- (2^2)^2 16This is obviously a contrived example, but it shows that we can pass a function and then call that function.
Loading Haskell files
Ok, before we go any further, lets get off the cumbersome interpreter. A lot of future example code will be in files. When you see the ‘>’ you’ll know I’ve switched back to the interpreter, but just to make it clear, I’ll always show myself loading the file in question
e.g.-- add.hs add x y = x + y -- Function declaration syntax in a file (no need for let)> :load add -- load syntax (leave off the .hs) [1 of 1] Compiling Main ( add.hs, interpreted ) Ok, modules loaded: Main. > add 2 4 6Data
Custom Data Types
So at this point, we know how to
- do stuff with built in types
- declare variables
- declare and invoke functions
- load code from a file
Next, we would like to be able declare our own data types, much like being able to define our own classes.
For this section, we will be implementing our own
Booltype which we’ll callLoobwith valuesYesandNo-- loob.hs data Loob = Yes | Nodata types MUST begin with a capital letter, like how functions must be lowercase.
What did this actually do, @Almost?
This defines a new data type,Loob
with 2 data constructors,YesandNo.There isn’t a good equivalent to a data constructor in imperative languages, but the easiest way to think about it is kind of like an enum. We will see later how this notion isn’t quite right.
enum Loob { Yes, No }Let’s take a look at our new type.
> :load loob > let x = Yes > x Yes > :type x -- Gets type information from a variable x :: Loob -- Format: <name> :: <type>The
let x = Yesstatement instructs the interpreter to use theYesdata constructor to instantiate a newLooband assign it to the variablex.
Meaning the data ofxis nowYesand the type ofxis nowLoob.In imperative space this is kind of like
Loob x = Loob.YesGoing back to loob.hs, let’s define our own
andfunction which we’ll calldna.-- loob.hs data Loob = Yes | No dna Yes Yes = Yes dna No _ = No dna _ No = No> load loob.hs > dna Yes Yes Yes > dna Yes No No > dna No Yes No > dna No No NoThis example shows off a couple of important points in Haskell.
Functions can take arbitrary parameters
the_in the function definition means “any argument” in can be any type, any value, we don’t care. We know thatNoand Anything is alwaysNo, so we don’t even need to look at the second argument. This does have the side effect thatdna No 1234is defined asNowhich may or may not be desirable. Since we don’t give_a name, we can’t use it in the body of our function. It literally means “I don’t need to know, nor even care what this data is”Functions can have multiple definitions over the same argument data type
It might be hard to see what I mean due to the usage of_but we could have written out all the cases explicitly:dna Yes Yes = Yes dna No Yes = No dna Yes No = No dna No No = NoIn an imperative world, this should be alarming. We just overloaded
dna4 times and each one takes 2Loobarguments! If I were to calls:> let x = No > let y = Yes > dna x yHow should it know which of the 4 functions to use?
This is where our enum notion falls apart.
Haskell uses the data constructor for the variables to perform pattern matching to find the first definition that matches the input.
therefore our example above explicitly matches the 2nd definition for
dna, so that one will be used.While defining our own pseudo primitives is nice and all, can we define actually useful types, like say
Pair?
Sure. While we didn’t take advantage of it before, our data constructors can act as a tag for other data:-- pair.hs data Pair t = Pair t t -- Declare data type Pair of t (where t is an arbitrary type). You might write this as Pair<T> in C++ or Java left (Pair x y) = x right (Pair x y) = yDon’t be confused here. I’ve named both the data type and data constructor “Pair”.
in our functions,leftandright, the argument is weird. That looks like how we construct aPairdoesn’t it?That’s exactly what it is. Haskell’s pattern matching basically says “Could I use this syntax to construct the object that you gave me? If yes then we have a match, if not we don’t have a match”
To try it out:
> :load pair > let p = Pair 1 2 > left p 1 > right p 2Lists
Haskell’s list syntax is similar to that of python:
> [1,2,3,4] [1,2,3,4]Haskell, however, has syntax to create a list over a range:
> [1..10] [1,2,3,4,5,6,7,8,9,10]One interesting fact about Haskell lists, due to the fact that Haskell performs lazy evaluation, an infinite list is completely legal and will not cause a crash or hang unless you attempt to evaluate the whole list:
> let x = [1..] > -- no crash :)Accessing an element of the list can be done with the
!!operator.> let x = [1..] -- infinite list > x !! 9 10 -- 0 indexedRemember, the lazy evaluation means that only as much as is actually used is evaluated which means that indexing an infinite list is completely legal.
Given a list, you can prepend it with the
:operator> let x = [2,3,4] > 1:x [1,2,3,4]This is handy since we can use this list building syntax as a pattern in a function (as we discussed in the last section)
> let head (x:xs) = x -- x is the first element, xs is the rest of the list > head [1,2,3] 1 > let tail(x:xs) = xs > tail [1,2,3] [2,3]What we’re saying here is that the argument of
headandtailis “Something (x) appended to a list (xs)”An aside, Haskell represents strings as lists of characters
> head "Almost" 'A':type
Simple types
I brought up:typediscreetly in the custom data types section. You can learn a lot about how Haskell works by examining the output of:type.> :type 1 1 :: Num a => aRead:
1 has type (Num a) which reduces to aWhat this actually tells us is that:
1is not a primitive, but rather of typeaais restricted to those types in the TypeclassNum
That second point is similar to the concept of requiring that a generic type extends some known type in Java:
public class Foo<? extends Collection>Other simple examples:
> :t True -- :t works in place of :type True :: Bool > :t "Almost" "Almost" :: [Char] > :load loob -- from a previous example > :t Yes Yes :: LoobFunction types
we see some new syntax when we look at functions
> let square x = x * x > :t square square :: Num a => a -> aLet’s ignore the point that
amust be in theNumType class and focus on thea -> a
You can read this asa goes to a
Meaning, square takes a variable of typeaand returns a typea.
As a concrete example,squarecan take an integer and return an integer.What’s the type of a function that takes 2 variables?
> let add x y = x + y > :t add add :: Num a => a -> a -> aHmm.
a -> a -> ahuh?
“a goes to a goes to a?” I don’t get it.
well, let’s apply some grouping to help us out.
Let’s read it asa -> (a -> a)For now, trust me that this is better than(a -> a) -> a. I’ll explain why this is the case later.
In this view, we read “a goes to (a goes to a).”
Still a little fuzzy? Well, remember that
a -> ameans that it’s a function that takes anaand returns ana.
Therefore
a -> (a -> a)should mean that it’s a function that takes anaand returns a function that takes anaand returns ana.But that doesn’t make sense. Add takes 2 numbers and returns one number. It doesn’t take 1 number and return a function.
Or does it?> :t (add 1) (add 1) :: Num a => a -> aWhy did that work? Add takes two arguments god damn it! That should have given us an error!
In imperative languages we’re used to the fact that a function that has two arguments requires that you supply two arguments when you call it (or at least that you specify default in the case of optional values such as
int myFunc(int myInt, bool myBool = false)).Haskell has no such requirement.
When we evaluate
add 1 2anywhere we seexwe replace it with1and anywhere we seeywe replace it with2.
Thereforeadd 1 2is equivalent to1 + 2
But what forces us to replace bothxandy? Why can’t we replace justx?
Why shouldn’t we calladd 1and get back1 + y?
> Because you can’t evaluate1 + y, @Almost…
Exactly. We can’t evaluate it until we are supplied with ay. So instead of trying to evaluate it, we just give that expression back and say “give me ayand I’ll evaluate this.”Gee, that sounds like a function, doesn’t it?
Rather than just looking at the type, let’s try giving that new function a name
> let addOne = add 1 > addOne 2 3Woah. So it really does give us back a function.
This magic process of partially evaluating functions is called Currying (Named after one Haskell Curry. The guy was a bit of a narcissist: he named both a programming language and a mathematical concept after himself.)
Currying has a very useful property in that it naturally supports code reuse and encapsulation
A simple contrived example:
> let pow x y = y ** x --Yes, this is a bit backwards from how we usually think of pow > let square = pow 2 > let cube = pow 3As you can see, it gives us a really simple way to build software without reinventing the wheel or writing many many lines of code. On a philosophical level, this makes it easy for us to show that
cubeis just a specific case of the more generalpow. We don’t get such an obvious connection if we writecubeascube x = x * x *x.Now. An astute reader might be thinking: isn’t it kind of weird that we have 2 ways to evaluation functions: fully and partially?
Actually this is not the case. There is only one way to evaluate a function: By supplying one argument.
add 1 2is not passing 2 arguments toadd, rather it is actually evaluating as(add 1) 2- apply1toaddand the apply2to the resulting function.What implication does this have on how Haskell treats functions?
In Haskell a function has exactly one input and exactly one output.
Either or both of the input and output can be functions, but there cannot be more than one input or output.Now, before we wrap up our section on function types and currying, let’s talk about why we chose
a -> (a -> a)instead of(a -> a) -> a.
How do we read the second one?
“addtakes a function and returns ana”.This should be valid then:
> add addclearly that doesn’t make sense, therefore (with all the handwaving I have in me :P) this is not a valid way to read that type signature.
In the case where the argument truly is a function, Haskell will indicate that to us
>let applyToOne f = f 1 > :t applyToOne applyToOne :: Num a => (a -> t) -> tDon’t be scared by
t. This just tells us that the return type of the functionfas well asapplyToOneisn’t necessarily the same type as the type o the input tofe.g.
> let equalsOne x = x == 1 > :t equalsOne equalsOne:: (Eq a, Num a) => a -> boolConclusion
tl;dr There is no tl;dr. Go read it.
Whew. Rushed is probably a better word than brief. I hope that was reasonably easy to follow, and if not, please ask questions. I want to both help you understand and improve my own technical communication.
If some sections make absolutely no sense even with some of my clarification, don’t worry too much. This stuff is very different from the usual “state-machine” style of programming that most of us are familiar with. But just because it’s different, don’t count it out. A lot of the concepts introduced in functional programming are making their way back into mainstream programming languages (e.g. Lambda Expressions, Nullable Types, LINQ).
If you’re interested in learning more about Haskell, Learn You a Haskell for Great Good is the standard Textbook. It’s free online legally, so no need to go searching for it.
Hope you enjoyed that. Help me improve this!
<3 Almost