The bottom of the Haskell Pyramid
2021-03-16
Inspired by Kowainik's Knowledge Map, which tries to cover the many Haskell concepts, libraries and tools a Haskeller will probably run into along their journey, I've decided to make a more modest list of the things one should be familiar with to be productive with Haskell: the bottom of The Haskell Pyramid, if you will. I hope it will help focus learners of the language who might get lost in the vast amount of information.
Installing and running a Haskell toolchain
This could be a bit overwhelming to beginners as there are many choices. Should they install the toolchain from their package manager? Download ghc binaries? Install stack? ghcup? Maybe even nix?
In my opinion installing Haskell using stack or ghcup probably leads to the best results.
A new user should be able to run ghc and ghci when starting; the rest can come later.
Defining values and functions
The basis of everything. A Haskell module is made of definitions.
Expressions, operators and applying functions
What goes on the right-hand side of the equals sign in a definition?
Things like "Hello world", 1 + 1, and min (max 43212 12345) 43251.
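To make the last two sections concrete, here is a minimal sketch of a module made of definitions. The names (greeting, two, clamped, double) are made up for illustration; only the expressions come from the text above.

```haskell
-- A value definition: a name, an equals sign, and an expression.
greeting :: String
greeting = "Hello world"

two :: Int
two = 1 + 1

-- Function application is written with spaces; parentheses only group.
clamped :: Int
clamped = min (max 43212 12345) 43251

-- A function definition: parameters go between the name and the equals sign.
double :: Int -> Int
double x = x * 2

main :: IO ()
main = do
  putStrLn greeting
  print two
  print clamped
  print (double two)
```

Loading this file in ghci and typing the names one by one is a nice way to poke at each expression.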
Indentation rules
How does Haskell know where a definition begins and where it ends? Which arguments are applied to which functions? Haskell uses indentation (with spaces) to figure that out. It is best to learn how it works quickly to avoid subtle bugs and cryptic errors.
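A small sketch of the layout rules in action. The function circleArea and its helper names are hypothetical, chosen only to show where indentation matters.

```haskell
-- The bindings under `where` belong to circleArea because they are
-- indented further than it, and they must line up with each other.
circleArea :: Double -> Double
circleArea radius = piApprox * squared
  where
    piApprox = 3.14159
    squared = radius * radius

-- A new top-level definition starts again at the left margin.
main :: IO ()
main = do
  -- All statements in a `do` block start in the same column.
  let area = circleArea 2
  print area
```

Try shifting squared one space to the left or right and see what the compiler says; it is a cheap way to get a feel for the rules.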
Compiler errors
The compiler will notify us when our program doesn't make sense. The amount of feedback Haskell provides about our program saves us from having to figure out everything ourselves, so it is worth getting used to Haskell errors: they are here to help. But it also means we have to learn what they mean to be effective.
First, try to read them carefully so you get used to them. At some point you won't even need to read them, just know they are there.
If expressions & Recursion
Recursion is very foundational to Haskell. It is used for iteration, managing state, and more. It is best to get good control over it as it will appear many times.
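As a rough illustration, here is a sum written with an if expression and recursion instead of a loop. The function names are my own; the second version shows one common way of carrying "state" in an extra argument.

```haskell
-- Sum the integers from 1 to n by recursion.
sumTo :: Int -> Int
sumTo n =
  if n <= 0
    then 0                  -- base case: nothing left to add
    else n + sumTo (n - 1)  -- recursive case: a smaller problem

-- The same idea, with the running total passed along as an argument.
sumToAcc :: Int -> Int -> Int
sumToAcc acc n =
  if n <= 0
    then acc
    else sumToAcc (acc + n) (n - 1)

main :: IO ()
main = do
  print (sumTo 10)
  print (sumToAcc 0 10)
```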
The evaluation model, substitution, equational reasoning and referential transparency
A lot of big words that describe how to "understand Haskell code".
Understanding how Haskell evaluates our program is probably the most important tool available when we need to debug our programs (and when writing them, of course).
Debug tracing, stack traces
... but often programs can be quite big and we need debug traces and stack traces to help guide us in the right direction of the bug.
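A minimal sketch of debug tracing using trace from Debug.Trace (part of base). The factorial function is just a stand-in for "some code we want to peek into".

```haskell
import Debug.Trace (trace)

-- trace prints its message (to stderr) when the wrapped expression
-- is evaluated, then returns that expression unchanged.
factorial :: Integer -> Integer
factorial n
  | n <= 1 = 1
  | otherwise =
      trace ("factorial called with " ++ show n) (n * factorial (n - 1))

main :: IO ()
main = print (factorial 5)
```

Because of lazy evaluation the messages appear when values are demanded, which may not be the order you expect, so treat this as a debugging aid only.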
Total vs partial functions
How to make sure all input cases are taken into account? What does it mean when they aren't? How can Haskell help us make sure our program won't crash during runtime because we didn't think of a certain input? How can it guide our API design to make it more robust?
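One way to picture the difference, as a sketch: two hypothetical functions that return the first element of a list. The partial one has no answer for the empty list; the total one says so in its type.

```haskell
-- Partial: there is no equation for [], so calling it with an empty
-- list crashes at runtime. Compiling with -Wall makes GHC warn about
-- the missing case.
firstPartial :: [a] -> a
firstPartial (x : _) = x

-- Total: every input is handled, and the Maybe in the type tells
-- callers that there might be no result.
firstTotal :: [a] -> Maybe a
firstTotal [] = Nothing
firstTotal (x : _) = Just x

main :: IO ()
main = do
  print (firstTotal [1, 2, 3 :: Int])
  print (firstTotal ([] :: [Int]))
```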
Parametric Polymorphism
Also known as "generics" in other circles (but not in Haskell, as Generics means something else). Helps us write functions that are more reusable, composable and precise.
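For instance, a sketch with two made-up functions whose types mention only type variables, so they work for any types we plug in:

```haskell
-- Works for any pair, whatever the element types are.
swapPair :: (a, b) -> (b, a)
swapPair (x, y) = (y, x)

-- Works for a list of anything; the type promises the elements
-- themselves are never inspected.
lastOrDefault :: a -> [a] -> a
lastOrDefault def [] = def
lastOrDefault _ (x : xs) = lastOrDefault x xs

main :: IO ()
main = do
  print (swapPair (1 :: Int, "one"))
  print (lastOrDefault 'z' "abc")
```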
Higher order functions
Higher order functions are also foundational. They help us avoid repeating ourselves and help us extract the essence of algorithms such as traversing or sorting a structure, without having to go into the details of "what to do with the elements when traversing" or "how should I compare two elements when sorting".
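A short sketch of both ideas using map and sortBy from base. The helper names (doubleAll, byLength, descending) are mine; the point is that the "what to do" and "how to compare" parts are passed in as functions.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- "What to do with the elements" is the function given to map.
doubleAll :: [Int] -> [Int]
doubleAll = map (* 2)

-- "How to compare two elements" is the function given to sortBy.
byLength :: [String] -> [String]
byLength = sortBy (comparing length)

descending :: [Int] -> [Int]
descending = sortBy (flip compare)

main :: IO ()
main = do
  print (doubleAll [1, 2, 3])
  print (byLength ["ccc", "a", "bb"])
  print (descending [2, 3, 1])
```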
Lists and tuples
We use them all the time, maybe too much to be honest.
Annotating types
In Haskell annotating types isn't mandatory in most cases, but a type annotation can serve as a tiny summary of the code, and it also provides a safety net that checks that our summary matches our code, so we are more likely to implement the thing we wanted to implement.
Using and chaining IO operations
To create a useful program that interacts with the world we often need to use IO.
Does that break referential transparency? (spoiler: no)
Functional core imperative shell
This is a pattern for designing programs that interact with the world. In Haskell we like to keep our functions uneffectful because they are more composable, more reusable and easier to test, but we still need some effects. One way to keep effects to a minimum is to design a program in layers: one thin imperative and effectful layer that does the interacting, and a core layer, the logic of the program, that doesn't do effects.
An example of mixing effects and logic: when making a fizzbuzz program, write a loop that starts from an integer, checks which bucket it falls into, prints the result, increments the number, and repeats the process until we reach the final number to check.
To write the same program with "functional core imperative shell" in mind, we could write the core logic that generates a list of results, and then write a function that writes the results on screen (and that function looks like this):
writeFizzBuzzResults :: [String] -> IO ()
writeFizzBuzzResults results = mapM_ putStrLn results
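To round out the example, here is a sketch of what the pure core might look like next to that shell. The names fizzBuzz and fizzBuzzResults are my own guesses at a reasonable shape, not something prescribed by the pattern.

```haskell
-- Functional core: pure functions, easy to test in isolation.
fizzBuzz :: Int -> String
fizzBuzz n
  | n `mod` 15 == 0 = "FizzBuzz"
  | n `mod` 3 == 0 = "Fizz"
  | n `mod` 5 == 0 = "Buzz"
  | otherwise = show n

fizzBuzzResults :: Int -> [String]
fizzBuzzResults upTo = map fizzBuzz [1 .. upTo]

-- Imperative shell: the only part that performs effects.
writeFizzBuzzResults :: [String] -> IO ()
writeFizzBuzzResults results = mapM_ putStrLn results

main :: IO ()
main = writeFizzBuzzResults (fizzBuzzResults 15)
```

All the interesting decisions live in fizzBuzz, which can be checked in ghci or in a test without printing anything.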
Modules
Learn about organizing code and how to import functions and other things from other modules.
Using hoogle
A wonderful tool for finding functions and types that many other languages lack.
ADTs
Let's make our programs safer, easier to understand and more modular.
Pattern matching
Let's write our own control flow over ADTs.
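A small sketch covering both sections: a made-up Shape type with two alternatives, and a function that uses pattern matching to decide what to do for each one.

```haskell
-- An ADT: a Shape is either a circle with a radius,
-- or a rectangle with a width and a height.
data Shape
  = Circle Double
  | Rectangle Double Double
  deriving (Show)

-- Pattern matching picks a branch based on the constructor
-- and binds its fields to names at the same time.
area :: Shape -> Double
area shape =
  case shape of
    Circle r -> pi * r * r
    Rectangle w h -> w * h

main :: IO ()
main = do
  print (area (Circle 1))
  print (area (Rectangle 2 3))
```

If we later add a Triangle constructor and forget to handle it in area, compiling with -Wall will point at the missing case.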
Installing editor tools & Ghcid
In my opinion having fast feedback dramatically increases a programmer's productivity.
Some people will say Ghcid is all you need, but I think editor features like error-at-point provide faster feedback than Ghcid when working on a single module, so those are preferable. Though when making a big change across many modules it's hard to beat Ghcid at the moment.
newtypes
They are like a special case of data but have some benefits over it, so they are worth knowing as they are widely used.
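One common use, sketched with hypothetical UserId and OrderId types: wrapping the same underlying type in different newtypes so the compiler stops us from mixing them up.

```haskell
-- Both wrap an Int, but they are distinct types to the compiler.
newtype UserId = UserId Int
  deriving (Show, Eq)

newtype OrderId = OrderId Int
  deriving (Show, Eq)

describeUser :: UserId -> String
describeUser (UserId n) = "user #" ++ show n

main :: IO ()
main = do
  putStrLn (describeUser (UserId 7))
  -- describeUser (OrderId 7) would be rejected at compile time.
  print (OrderId 7)
```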
Common data types: Maybe, Either
We use them a lot (and often overuse them).
Either especially can be used for handling errors in uneffectful code instead of IO exceptions.
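A minimal sketch of both types. The error type DivideError and the function names are invented for the example; lookup comes from the Prelude.

```haskell
-- A hypothetical error type; Left carries the failure, Right the result.
data DivideError = DivideByZero
  deriving (Show, Eq)

safeDivide :: Int -> Int -> Either DivideError Int
safeDivide _ 0 = Left DivideByZero
safeDivide x y = Right (x `div` y)

-- Maybe says "there might be no answer" without saying why.
phoneOf :: String -> Maybe String
phoneOf name = lookup name [("alice", "555-0100"), ("bob", "555-0199")]

main :: IO ()
main = do
  print (safeDivide 10 2)
  print (safeDivide 10 0)
  print (phoneOf "alice")
  print (phoneOf "carol")
```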
A package manager
How can I generate random numbers? How do I parse a json string? Send a file over the network? Write a graphical snake game?
The easiest way to accomplish these goals is to use an external library (packages in Haskell), and the easiest way to import a library, build it, and use it with ghc is through a package manager.
There are many package managers to choose from; stack and cabal-install are the most common ones.
Make sure to read the docs. If you go the cabal-install path, make sure you use the v2 version of the commands. Note that you are not tied to one package manager whichever you choose; you can always switch or even use both if you'd like.
Finding packages
Finding a proper Haskell package for your use case can be tricky. Most packages can be found on Hackage, and some can be found on Stackage.
It's possible to google the use case plus Haskell, for example "Haskell json", click on a few links, and see if the documentation makes sense and if the package is popular. But sometimes it's easier to ask in Haskell circles.
Using the right data structures: text, containers, bytestring, vectors
A lot of Haskell programs are slower than one would expect because of choosing the wrong algorithm or the wrong data structures, such as using String instead of Text, using lists when one needs fast lookup or indexing instead of using HashMap, Map or Vector, or using Set when one needs fast insertion (and order doesn't matter), or using nub to remove duplicate elements in a list instead of Set.toList . Set.fromList.
Remember that using the right data structure is also useful for understanding the code, as the choice of data structure reveals which algorithms are important for the program.
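A quick sketch of two of these choices, assuming the containers package is available (it ships with GHC, but in a stack or cabal project it still has to be listed as a dependency). The names dedupSlow, dedupFast and ages are made up for the example.

```haskell
import Data.List (nub)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- nub compares every element with every other one (quadratic time),
-- but keeps the first occurrences in their original order.
dedupSlow :: Eq a => [a] -> [a]
dedupSlow = nub

-- Going through a Set is much faster on big lists, needs Ord,
-- and returns the elements in sorted order.
dedupFast :: Ord a => [a] -> [a]
dedupFast = Set.toList . Set.fromList

-- A Map gives fast lookup by key, unlike a list of pairs.
ages :: Map.Map String Int
ages = Map.fromList [("alice", 31), ("bob", 27)]

main :: IO ()
main = do
  print (dedupSlow [3, 1, 3, 2, 1 :: Int])
  print (dedupFast [3, 1, 3, 2, 1 :: Int])
  print (Map.lookup "bob" ages)
```

Note that the two deduplication functions are not interchangeable when the original order matters.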
Parse, don't validate
Parse, don't validate is an important pattern that helps write code that is safer and more predictable.
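My rough reading of the idea, sketched with NonEmpty from base: instead of checking that a list is not empty and then passing the plain list along, we convert it once into a type that cannot be empty. The function names are hypothetical.

```haskell
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NonEmpty

-- Parsing step: the check happens once, and the result type
-- remembers what we learned.
parseNonEmpty :: [a] -> Maybe (NonEmpty a)
parseNonEmpty = NonEmpty.nonEmpty

-- Downstream code no longer needs to handle the empty case.
firstElement :: NonEmpty a -> a
firstElement = NonEmpty.head

main :: IO ()
main = do
  print (fmap firstElement (parseNonEmpty [1, 2, 3 :: Int]))
  print (fmap firstElement (parseNonEmpty ([] :: [Int])))
```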
Typeclasses and deriving typeclasses instances
One of the defining features of Haskell. Adds ad-hoc polymorphism and gives us the ability to create interfaces and abstractions over groups of types.
With typeclasses we can write functions like sort :: Ord a => [a] -> [a], which can sort a list of any type as long as its internal order is defined.
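A small sketch of deriving instances and of writing a typeclass by hand. Priority and Describable are invented for the example; sort is the real function from Data.List.

```haskell
import Data.List (sort)

-- Deriving gives us Show, Eq and Ord instances for free;
-- the derived order follows the order of the constructors.
data Priority = Low | Medium | High
  deriving (Show, Eq, Ord)

-- A hypothetical typeclass: an interface any type can implement.
class Describable a where
  describe :: a -> String

instance Describable Priority where
  describe Low = "can wait"
  describe Medium = "do it soon"
  describe High = "do it now"

instance Describable Bool where
  describe True = "yes"
  describe False = "no"

main :: IO ()
main = do
  print (sort [High, Low, Medium])
  putStrLn (describe High)
  putStrLn (describe True)
```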
Numbers typeclasses
Working with numbers can be a bit complicated because number groups (such as integers, floating points, etc) are both similar and different. The numbers typeclass hierarchy tries to capture their similarities and differences.
Common Typeclasses: Show, Read, Eq, Ord, Semigroup, Monoid
How do we display a value for debugging purposes as a String? How do we convert a string to a value? How do we compare values? How do we append two values?
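Each of those questions maps to one of the typeclasses. A sketch with a made-up Color type; readMaybe comes from Text.Read in base.

```haskell
import Text.Read (readMaybe)

data Color = Red | Green | Blue
  deriving (Show, Read, Eq, Ord)

-- Read, but returning Nothing on bad input instead of crashing.
parseColor :: String -> Maybe Color
parseColor = readMaybe

main :: IO ()
main = do
  putStrLn (show Green)               -- Show: value to String
  print (parseColor "Blue")           -- Read: String to value
  print (parseColor "Purple")
  print (Red == Red, Red < Blue)      -- Eq and Ord: comparing
  print ([1, 2] <> [3 :: Int])        -- Semigroup: appending
  print (mconcat ["ab", "cd", "ef"])  -- Monoid: appending many
```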
Kinds
Kinds are to types what types are to values. They create a separation between different types and define where we can use different kinds of types.
Common Typeclasses: Foldable, Traversable, Functor, Applicative, Monad
These typeclasses are the most common ones that define abstraction around certain types and their selected operations. You will find them at the core API of many data structures, concurrency libraries, parsing, and more.
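A taste of each, sketched on Maybe and lists since those need nothing beyond base. The function parseAll is hypothetical; the operations (fmap, sum, traverse, <*>, >>=) are the standard ones.

```haskell
import Text.Read (readMaybe)

-- Traversable: run a possibly failing step on every element and
-- get all the results, or Nothing if any step fails.
parseAll :: [String] -> Maybe [Int]
parseAll = traverse readMaybe

main :: IO ()
main = do
  -- Functor: apply a function inside a structure.
  print (fmap (+ 1) (Just (1 :: Int)))
  -- Foldable: collapse a structure into a summary value.
  print (sum [1, 2, 3 :: Int])
  -- Applicative: combine independent results.
  print ((+) <$> Just (1 :: Int) <*> Just 2)
  -- Monad: let a later step depend on an earlier result.
  print (Just (10 :: Int) >>= \x -> if x > 5 then Just (x * 2) else Nothing)
  print (parseAll ["1", "2", "3"])
  print (parseAll ["1", "oops"])
```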
Reader, State, Except
These modules help reduce boilerplate by emulating imperative style programming using their various typeclass interfaces (such as Functor and Monad).
They are frequently used in Haskell code in the wild so they're worth knowing.
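As one example, a sketch using State from Control.Monad.State. This assumes the mtl package is available (it ships with GHC, but a stack or cabal project needs to list it). The functions freshLabel and threeLabels are made up; they hand out numbered labels without passing the counter around by hand.

```haskell
import Control.Monad.State (State, evalState, get, put)

-- Read the counter, bump it, and return a label built from the old value.
freshLabel :: State Int String
freshLabel = do
  n <- get
  put (n + 1)
  pure ("label" ++ show n)

-- The counter is threaded between the calls behind the scenes.
threeLabels :: State Int [String]
threeLabels = do
  a <- freshLabel
  b <- freshLabel
  c <- freshLabel
  pure [a, b, c]

main :: IO ()
main = print (evalState threeLabels 0)
```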
Language extensions
The GHC compiler provides extra (often experimental) language features beyond what standard Haskell provides. Some are nice to have, such as LambdaCase and NumericUnderscores, some are very useful but can make valid Haskell code invalid, such as OverloadedStrings, some provide really important functionality, such as FlexibleInstances and BangPatterns, some are sometimes incredibly useful, such as TypeFamilies, but often aren't, and some should probably not be used, such as NPlusKPatterns.
It's probably better to run into language extensions organically and learn when you need to rather than try to learn them all.
The GHC2021 proposal classifies Haskell extensions; it might be useful to go over it at some point.
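To show what turning on an extension looks like, here is a sketch using the two "nice to have" ones mentioned above. The function and value names are hypothetical.

```haskell
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE NumericUnderscores #-}

-- LambdaCase: pattern match on the argument without naming it.
describeCount :: Int -> String
describeCount = \case
  0 -> "none"
  1 -> "one"
  _ -> "many"

-- NumericUnderscores: underscores make big literals easier to read.
oneMillion :: Int
oneMillion = 1_000_000

main :: IO ()
main = do
  putStrLn (describeCount 1)
  print oneMillion
```

The pragmas go at the very top of the file, before the module header if there is one.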
Tests: tasty, hspec, hunit, doctest, etc
Ever heard that you don't need to write tests for Haskell? Forget that. Choose a unit testing framework, write tests and enjoy the guarantees that come from both strong static typing and tests!
Also learn about property-based testing and QuickCheck; it's beautiful.
Monad transformers and mtl
While not strictly necessary, many programs and libraries in the wild use monad transformers (types that provide a monadic interface and can be combined with one another) and mtl, which is a library of interfaces of capabilities.
Extras
forkIO, async, STM, TVar
Want to do some concurrency? Haskell has got you covered.
Simple lenses
It took me maybe 3-4 years to finally think about learning lenses, and even then I've only ever scratched the surface with them. While very powerful and expressive, they aren't really necessary to build programs, but many programs and libraries use basic lenses, so if (or when) you run into them in the wild come back and learn them.
In my experience view, get, over, ., lens, and makeLenses are all you need to know for most cases.
Profiling
At some point the average Haskeller will write a program that won't perform well enough. It's a good idea to learn how to figure out what makes the program slow. But this can be postponed to later.
Template Haskell
Template Haskell (a Haskell metaprogramming framework) might creep up on you sometimes as a user, but learning to write Template Haskell isn't usually necessary.
Lambda Calculus
You might have heard about the lambda calculus. It is a minimalistic functional language that is at the heart of Haskell. Learning the lambda calculus means learning about the power of functions, how functions can be used to represent data, and how to evaluate expressions using pen and paper.
I found it to be very interesting and eye opening, seeing the core of Haskell naked and exposed without syntactic sugar, but it's not required to use or understand Haskell.
Summary
Haskell has a really high ceiling of concepts, but one doesn't need to master Haskell to be productive with it. Yes, there might be additional concepts to learn in order to use certain libraries, but many applications and libraries can be (and have been) built on these concepts alone.
Don't believe me? Here are a few applications that (as far as I could tell - corrections are welcomed!) don't use anything more sophisticated than what I mentioned in this article:
- Aura - A secure, multilingual package manager for Arch Linux and the AUR.
- Elm-compiler - Compiler for Elm, a functional language for reliable webapps.
- Haskellweekly.news - Publishes curated news about the Haskell programming language.
- Patat - Terminal-based presentations using Pandoc.
- vado - A demo web browser engine written in Haskell.