Translating some stateful bit-twiddling to haskell

I just started implementing SipHash in hashabler and wanted to share a nice way I found to translate stateful bit-twiddling code in C (which makes heavy use of bitwise assignment operators) to haskell.

I was working from the reference implementation. As you can see statefulness and mutability are an implicit part of how the algorithm is defined, as it modifies the states of the v variables.

#define SIPROUND                                        \
  do {                                                  \
    v0 += v1; v1=ROTL(v1,13); v1 ^= v0; v0=ROTL(v0,32); \
    v2 += v3; v3=ROTL(v3,16); v3 ^= v2;                 \
    v0 += v3; v3=ROTL(v3,21); v3 ^= v0;                 \
    v2 += v1; v1=ROTL(v1,17); v1 ^= v2; v2=ROTL(v2,32); \
  } while(0)

int  siphash( uint8_t *out, const uint8_t *in, uint64_t inlen, const uint8_t *k )
{

  /* ... */

  for ( ; in != end; in += 8 )
  {
    m = U8TO64_LE( in );
    v3 ^= m;

    TRACE;
    for( i=0; i<cROUNDS; ++i ) SIPROUND;

    v0 ^= m;
  }

I wanted to translate this sort of code as directly as possible (I’d already decided if it didn’t work on the first try I would burn my laptop and live in the woods, rather than debug this crap).

First we’ll use name shadowing to “fake” our mutable variables, making it easy to ensure we’re always dealing with the freshest values.

{-# OPTIONS_GHC -fno-warn-name-shadowing #-}

We’ll also use RecordWildCards to make it easy to capture the “current state” of these values, through folds and helper functions.

{-# LANGUAGE RecordWildCards #-}

And finally we use the trivial Identity monad (this trick I learned from Oleg) which gets us the proper scoping we want for our v values:

import Data.Functor.Identity

Here’s a bit of the haskell:

siphash :: Hashable a => SipKey -> a -> Word64
siphash (k0,k1) = \a-> runIdentity $ do
    let v0 = 0x736f6d6570736575
        v1 = 0x646f72616e646f6d
        v2 = 0x6c7967656e657261
        v3 = 0x7465646279746573

    ...

    v3 <- return $ v3 `xor` k1;
    v2 <- return $ v2 `xor` k0;
    v1 <- return $ v1 `xor` k1;
    v0 <- return $ v0 `xor` k0;

    ...

    -- Initialize rest of SipState:
    let mPart          = 0
        bytesRemaining = 8
        inlen          = 0
    SipState{ .. } <- return $ hash (SipState { .. }) a

    let !b = inlen `unsafeShiftL` 56

    v3 <- return $ v3 `xor` b
    -- for( i=0; i<cROUNDS; ++i ) SIPROUND;
    (v0,v1,v2,v3) <- return $ sipRound v0 v1 v2 v3
    (v0,v1,v2,v3) <- return $ sipRound v0 v1 v2 v3
    v0 <- return $ v0 `xor` b

    ...

    (v0,v1,v2,v3) <- return $ sipRound v0 v1 v2 v3

    return $! v0 `xor` v1 `xor` v2 `xor` v3

If you were really doing a lot of this sort of thing, you could even make a simple quasiquoter that could translate bitwise assignment into code like the above.