code / art / projects

Simple-actors 0.4.0: A Few Interesting Design Bits

simple-actors is my library for more structured concurrent programming based on the Actor model. It’s been a fun vehicle for exploring concurrent semantics, and an opportunity to solve some tricky API design problems in pretty clever ways. I want to present a handful of these problems, and the solutions I came up with, below.

Short digression: I love distributed systems, but this library has nothing to do with distributed programming. Also performance.

Goals and Constraints

To frame things, these were more or less the goals of the library design:

  • create a friendly, light-weight (hopefully intuitive), non-leaky, non-brittle eDSL for concurrent algorithms
  • base functionality on existing typeclasses and abstractions wherever possible
  • employ an economy of concepts; avoid creating new things requiring names and explanations, unless absolutely necessary

And here are the three little case-studies.

Eschew channel abstraction

The first challenge was how to keep the abstraction for message routing as minimal as possible. Ideally we would like our spawn function to return a token (we call it a “Mailbox”) that is used by other actors as a reference for sending messages to the spawn-ed actor:

spawn :: Behavior a -> Action (Mailbox a)

But then how do you handle two actors sending messages to each other? Or an actor sending itself a message for that matter?

do a <- spawn $ senderTo b
   b <- spawn $ senderTo a
   c <- spawn $ senderTo c
   send a ...

The sequencing of monadic actions in a do block make scoping an issue here, so what can we do?

Should we insist that actors like a and b pass their mailboxes explicitly in messages? But then we’d have to build implicit access to an actor’s own Mailbox into our Behavior abstraction since we can’t even define a Behavior closed over it’s own Mailbox, as in c above.

What about separating channel creation and spawning into two distinct functions, where we spawn a Behavior listening on a channel? But then we have to decide what should happen when two actors are spawned on the same channel, etc. Let’s just not do any of that.

The solution turns out to be a matter of knowing the right classes, in particular the exotic MonadFix (read up on it here). Combined with GHC’s lovely DoRec extension, our crisis resolves itself, allowing:

{-# LANGUAGE DoRec #-}
do rec a <- spawn $ senderTo b
       b <- spawn $ senderTo a
       c <- spawn $ senderTo c
   send a ...

IMHO this looks much tastier than what erlang has to offer.

Mailbox should support a rich set of transformations

A second design challenge has been to try to realize the full potential of our internal “chan” pair type (of which Mailbox only is exposed) in terms of CT-ish transformations supported.

Some background: a concurrent Chan separated into a pair of “read” and “write” sides is attractive, because it suggests the possibility for the “read” end to have a Functor instance, while the “write” side suggests it could be a contravariant functor, supporting an operation:

contramap :: (Contravariant f)=> (b -> a) -> f a -> f b

Consider how Control.Concurrent.Chan doesn’t permit that possibility.

Initial versions of chan-split (used internally) defined Functor and Contravariant instances, supported with a clumsy GADT representation which looked like, e.g.

data InChan i where
    InChan :: (i -> a) -> C.Chan a -> InChan i

This was removed when I realized I could support the operations (“read” / “write”) and powerful transformations that weren’t possible before, by defining Mailbox as simply a wrapper around writeChan c itself. Likewise, our internal “write end” becomes a wrapped readChan c action:

newtype Mailbox a = Mailbox (a -> IO ())
newtype Messages a = Messages (IO a)

Here are the nice transformations we’ve defined so far; more are possible. N.B. that these don’t add anything in terms of expressiveness to the actor model, i.e. we could envision doing the same sort of thing trivially with actors.

Getting more expressive with join patterns

The final and most recent design bit I wanted to share address the problem of “synchronization”.

Consider how you would go about trying to definean actor that “pairs up” inputs received from a pair of actors; your solution would involve an actor that kept a possibly-ever-expanding buffer of unmatched messages. This is a limitation inherent in the actor model, and leads to needless tragedy like erlang’s “selective receive”.

What we want is to be able to define an actor that can block on multiple channels. How do we do that without weird channel abstractions creeping into our API?

I’d been mulling around the idea of creating a completely new beast, sort of dual to an Actor that could read from arbitrary chans, and feed inputs one-at-a-time to an actor. A “Reducer” or something.

Luckily I had a better idea while reading about process calculi, in particular the join calculus. The solution makes the library formally more expressive while removing complexity from the UI and leaving the semantics of behaviors and message-passing unchanged!

The trick was to come up with a class that would allow the spawn function to introduce assymetry between spawned Behavior inputs and Mailboxes, i.e. we create a spawn that can return multiple Mailboxes which are “joined” to a single Behavior input in the background:

sumTuple :: Behavior (Int, Int)

do b <- spawn sumTuple
   send b (4, 1) 
   -- or. like magic...
   (b1, b2) <- spawn sumTuple
   send b1 4
   send b2 1

See the docs for a few other examples of the new behavior of spawn. The Sources class works using TypeFamilies and an associated Joined type that is a function of the return type of spawn. Use the source.


Is this sort of post interesting, or is this kind of case study too esoteric or domain-specific to be useful? Let me know. And if you have ideas of your own or want to help with performance testing, do a

git clone

and play with it.