simple-actors 0.4.0: a few interesting design bits
Last updated: Dec 11, 2022
simple-actors is my library for more structured concurrent programming based on the Actor model. It’s been a fun vehicle for exploring concurrent semantics, and an opportunity to solve some tricky API design problems in pretty clever ways. I want to present a handful of these problems, and the solutions I came up with, below.
Short digression: I love distributed systems, but this library has nothing to do with distributed programming. Also performance.
Goals and Constraints
To frame things, these were more or less the goals of the library design:
- create a friendly, light-weight (hopefully intuitive), non-leaky, non-brittle eDSL for concurrent algorithms
- base functionality on existing typeclasses and abstractions wherever possible
- employ an economy of concepts; avoid creating new things requiring names and explanations, unless absolutely necessary
And here are the three little case-studies.
Eschew channel abstraction
The first challenge was how to keep the abstraction for message routing as
minimal as possible. Ideally we would like our spawn
function to return
a token (we call it a “Mailbox”) that is used by other actors as a reference
for sending messages to the spawn
-ed actor:
spawn :: Behavior a -> Action (Mailbox a)
But then how do you handle two actors sending messages to each other? Or an actor sending itself a message for that matter?
do a <- spawn $ senderTo b
b <- spawn $ senderTo a
c <- spawn $ senderTo c
send a ...
The sequencing of monadic actions in a do
block make scoping an issue here,
so what can we do?
Should we insist that actors like a
and b
pass their mailboxes explicitly
in messages? But then we’d have to build implicit access to an actor’s
own Mailbox into our Behavior abstraction since we can’t even define a
Behavior
closed over it’s own Mailbox
, as in c
above.
What about separating channel creation and spawning into two distinct functions,
where we spawn a Behavior
listening on a channel? But then we have to
decide what should happen when two actors are spawned on the same channel, etc.
Let’s just not do any of that.
The solution turns out to be a matter of knowing the right classes, in
particular the exotic MonadFix
(read up on it
here). Combined with GHC’s lovely
DoRec
extension, our crisis resolves itself, allowing:
{-# LANGUAGE DoRec #-}
do rec a <- spawn $ senderTo b
b <- spawn $ senderTo a
c <- spawn $ senderTo c
send a ...
IMHO this looks much tastier than what erlang has to offer.
Mailbox should support a rich set of transformations
A second design challenge has been to try to realize the full potential of
our internal “chan” pair type (of which Mailbox
only is exposed) in terms
of CT-ish transformations supported.
Some background: a concurrent Chan
separated into a pair of “read” and
“write” sides is attractive, because it suggests the possibility for
the “read” end to have a Functor
instance, while the “write” side suggests
it could be a contravariant functor, supporting an operation:
contramap :: (Contravariant f)=> (b -> a) -> f a -> f b
Consider how Control.Concurrent.Chan
doesn’t permit that possibility.
Initial versions of chan-split (used internally) defined Functor and Contravariant instances, supported with a clumsy GADT representation which looked like, e.g.
data InChan i where
InChan :: (i -> a) -> C.Chan a -> InChan i
This was removed when I realized I could support the operations (“read” / “write”)
and powerful transformations that weren’t possible before, by defining Mailbox
as simply a wrapper around writeChan c
itself. Likewise, our internal “write
end” becomes a wrapped readChan c
action:
newtype Mailbox a = Mailbox (a -> IO ())
newtype Messages a = Messages (IO a)
Here are the nice transformations we’ve defined so far; more are possible. N.B. that these don’t add anything in terms of expressiveness to the actor model, i.e. we could envision doing the same sort of thing trivially with actors.
Getting more expressive with join patterns
The final and most recent design bit I wanted to share address the problem of “synchronization”.
Consider how you would go about trying to definean actor that “pairs up” inputs received from a pair of actors; your solution would involve an actor that kept a possibly-ever-expanding buffer of unmatched messages. This is a limitation inherent in the actor model, and leads to needless tragedy like erlang’s “selective receive”.
What we want is to be able to define an actor that can block on multiple channels. How do we do that without weird channel abstractions creeping into our API?
I’d been mulling around the idea of creating a completely new beast, sort of dual to an Actor that could read from arbitrary chans, and feed inputs one-at-a-time to an actor. A “Reducer” or something.
Luckily I had a better idea while reading about process calculi, in particular the join calculus. The solution makes the library formally more expressive while removing complexity from the UI and leaving the semantics of behaviors and message-passing unchanged!
The trick was to come up with a class that would allow the spawn function to
introduce assymetry between spawned Behavior
inputs and Mailbox
es, i.e.
we create a spawn
that can return multiple Mailbox
es which are “joined”
to a single Behavior
input in the background:
sumTuple :: Behavior (Int, Int)
do b <- spawn sumTuple
send b (4, 1)
-- or. like magic...
(b1, b2) <- spawn sumTuple
send b1 4
send b2 1
See the docs
for a few other examples of the new behavior of spawn
. The Sources
class
works using TypeFamilies and an associated Joined
type that is a function
of the return type of spawn
. Use the source.
Meta
Is this sort of post interesting, or is this kind of case study too esoteric or domain-specific to be useful? Let me know. And if you have ideas of your own or want to help with performance testing, do a
git clone https://github.com/jberryman/simple-actors.git
and play with it.