A TypeFamilies Primer
Last updated: Dec 11, 2022
TypeFamilies
is a GHC extension that lets you create formerly-impossible
abstractions in a very straightforward way. It took me several tries before
they clicked for me though, so this is the introduction to TypeFamilies
that
I wish I had read first (although I just found Brent Yorgey’s, which would have done the trick).
I’m treating the subject very narrowly for most of this post, and try to round things out a little at the very end.
Primal Ignorance
If this isn’t the first thing you’ve read about TypeFamilies
, it might be
helpful to forget a few things. The question “what precisely is a type family?”
isn’t going to be very helpful; in general, the terminology for the
constellation of constructions that TypeFamilies
gives you is a huge mess,
with multiple partially-overlapping terms in the wild, none of which are
helpful for developing an intuition about what all of this is about.
I also found various analogies I’d read to be useless, so forget those too.
Type synonyms as type-level functions
Consider the familiar type synonym:
type PairOf a = (a,a)
Normally this is presented as syntactic sugar, with little to do with the type system.
A more interesting way of thinking about PairOf
is as a function (as
suggested by the =
), where evaluation involves substituting occurrences of
the left hand side (LHS) with the right, in the usual way. These functions are
evaluated in your type signatures at compile time.
The analogous regular term-level function would of course be:
pairOf a = (a,a)
Simple enough. Now let’s think about a simple term-level function, and see what an analogous type-level type synonym/function might look like:
last (a: []) = a
last (a: (b:bs)) = last (b:bs)
For our type-level Last
we need something like lists at the type-level, so
we’ll use the common nested tuple representation of (,)
as cons and ()
as
the empty list, e.g.:
x :: (Int,(Int,(Int,()))) -- like [1,2,3]
Hopefully I didn’t just lose you. Remember for now we just care about using this list-like tuple thing in our type signatures.
If you were to charge ahead and try to define Last
using type synonyms
treated as full blown functions, you might come up with:
-- this isn't okay:
type Last (a, ()) = a
type Last (a, (b,bs)) = Last (b,bs)
Unfortunately the compiler will laugh at you. Type synonyms can only have abstract variable arguments on the LHS where above we have tried to deconstruct them using pattern matching, and to define a different RHS for both cases. Further we’ve made the definition recursive. None of that is okay.
In fact the humble type synonym is only a very simple sort of function (a natural transformation or something close) which is very easy to evaluate, but also very limited.
Finally, Enter TypeFamilies
The TypeFamilies
extension lets us define Last
successfully almost exactly
as we did above.
{-# LANGUAGE TypeFamilies #-}
-- we have to "declare" `Last` separately, and the "family"
-- here distinguishes the syntax from a normal type synonym:
type family Last l
-- ...and then can define our "cases":
type instance Last (a,()) = a
type instance Last (a,(b,bs)) = Last (b,bs)
At this point when the type-checker sees Last (a,(b,bs))
in a type signature
it will replace it with Last (b,bs)
, and continue until all of these “type
functions” are evaluated. I may be fudging things a bit but that’s the
general idea.
Since these are a more general sort of type function, they can even be used to replace traditional type synonyms:
type family PairOf a
type instance PairOf a = (a,a)
But what is that good for?
It would be neat to be able to work with “lists” that look like e.g.
(1,('a',("hello",())))
; they are heterogeneous, operations like head
would
be type-safe, etc. So imagine we want to define a last
on types of this
list-like tuple sort of data.
What would the type of last
look like? We know it has to be polymorphic,
since its arguments might look like ()
or (1,(2,()))
, different types of
course. So we’ll need a type-class (and a couple other standard extensions):
{-# LANGUAGE FlexibleInstances, FlexibleContexts #-}
class LastOfTupleList l where
last' :: l -> Last l -- < we use our `Last` type function
Our instances are trivial:
instance LastOfTupleList (a,()) where
last' (a,()) = a -- < the type-checker can see that, indeed, `a`
-- is equal to `Last (a,())`
instance (LastOfTupleList (b, bs))=> LastOfTupleList (a,(b,bs)) where
last' (a,(b,bs)) = last' (b,bs)
Letting us do:
>>> last' (1,(2,()))
2
>>> last' (1,('a',("hello",())))
"hello"
Notice how our instances of Last
and last
have almost the same structure;
this is very common.
Functions… but open!
If you’re a programmer type you were probably irritated by my initial clumsy
definition of last
; why not:
last (a:[]) = a
last (a:as) = last as -- `as` matches whatever can fall through the pattern
-- above; in this case only non-empty lists
Well, I was being sneaky because the type-level analogue isn’t allowed!
type instance Last (a,()) = a
type instance Last (a,as) = Last as -- BAD!
This is because unlike functions, type families are open meaning, like typeclasses, a new instance can be added at any moment. Therefore there’s no way to define a “default” instance to use after all other matches fail, you simply get illegal overlapping synonym instances such as the one above; the order in which we defined the two doesn’t matter.
For some use cases this is what we need, for others (such as our last'
) we’d
really prefer that type synonym families be closed so that we can pattern
match in the usual way.
This feature is apparently
coming soon.
Other details
That should give you an intuition. At this point you might want to stop and read through the following documentation, or continue reading below before coming back to these links for a more refined understanding and additional details:
Associated type synonyms
Since we so often define define a type class in terms of one or more type families, we’re given a simplified syntax for combining them in one place.
class LastOfTupleList l where
type Last l -- an *associated* type family
last' :: l -> Last l
instance LastOfTupleList (a,()) where
type Last (a,()) = a
last' (a,()) = a
When people say “associated types” they mean type functions that are associated with a typeclass using the syntax above.
Injectivity
Type synonym family instances are said to be not injective, meaning two different type functions can map to the same type on the RHS, e.g.
type instance F Int = Bool
type instance F Char = Bool
It’s easy to forget this when building new abstractions, and assume that the
typechecker will infer from the RHS (e.g. Bool
above) the argument passed in
to the type function (Int
or Char
).
Data families
I’ve completely focused on the type synonym flavor of TypeFamilies
above,
but there is also a data/newtype
flavor in which, for each instance
definition the RHS is a brand new type declaration, rather than mapping to an
existing type
-- from http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/type-families.html
data family T a
data instance T Int = T1 Int | T2 Bool -- new constructors T1 and T2 defined here
newtype instance T Char = TC Bool
Because each instance maps to a unique type, data families are injective allowing the type checker to infer the LHS of the equation knowing the right.
More other things
TypeFamilies
provides the syntaxa ~ b
to indicate type equality constraints; this is especially useful with type synonym functions, but can be useful on its own as well.- kind signatures are required for type functions on types taking arguments, e.g.
Maybe