Brandon.Si(mmons)

code / art / projects

Hacking Around on an FV-1 Based Guitar Pedal

This is a diary-style post where I chronicle what I learn digging into a digital guitar effects pedal for the first time. I’m not sure how useful or interesting it will be to others.


I picked up an oddball modulation pedal from a “boutique” maker (being coy here) and decided there were a bunch of things I could improve about it, so I thought I’d try to tweak it. I know almost nothing about electronics.

I took the guts out of the enclosure and first googled the chips on the board. The largest was labeled “SpinSemiconductor FV-1”. This is a “batteries included” DSP chip and the only real product from Spin.

IYI (IF YOU’RE INTERESTED): KEITH BARR AND THE FV-1:

One of the pleasures of this project has been learning a little about the designer of the FV-1, Keith Barr. He passed away in 2010 and some of his contributions are summarized in this tribute: cofounder of MXR and creator of the Phase 90, founder of Alesis, pioneer in bringing digital recording and then digital reverb to the masses, etc.

The FV-1 was developed and released in the mid-2000s and is responsible for the boom in “boutique” reverb and (along with the PT2399) non-BBD delay pedals, being used by the likes of: Old Blood Noise, Catalinbread, Neunaber, Earthquaker, Red Panda, Keeley, Dr. Scientist, Walrus Audio, etc, etc.

I’m new to the space but it seems like Keith was the only person doing this sort of work (cheap DSP chip that acts like another analog component, accessible to hobbyists) and with his death there is sadly no “next-gen” FV-1 coming along.

I also noticed what turned out to be an EEPROM chip and a spot marked “pgm” where a 5-pin connector header could be soldered, presumably for programming. I guessed the EEPROM was where the code itself was stored (having read this was a capability).

I traced the “pgm” solder pads carefully (using the setting on my multimeter that beeps when a circuit is formed), and found they all connected to the EEPROM. I drew up a little picture showing how the header and EEPROM chip relate.

At this point I was feeling pretty hopeful that I might be able to both write and possibly dump/decompile code from the EEPROM and tweak the functionality to my liking (maybe… eventually).

Dumping the dang ROM

I didn’t have any kind of special programmer, but found that reading and writing to an EEPROM is easily done with an arduino.

However I got concerned looking at the hardware setup: they connect 5v to pin A2, but all of mine are grounded.

I looked at the data sheet for the EEPROM and was surprised to find it really readable even for someone without much embedded background! I learned (sec 5.0 of the manual) these pins were for configuring the I²C address for the chip (this is clear in the tutorial, but I didn’t get it at first). So my EEPROM was hardwired to communicate at a different address (0x50).
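Here’s a minimal sketch of that addressing rule in Python (it is not part of the tutorial’s sketches, and the function name is made up for illustration). It assumes the usual layout for this EEPROM family: a fixed 1010 control code in the upper four bits, followed by the levels of pins A2, A1 and A0.

```python
# Sketch: 7-bit I2C address of a 24LCxx-style EEPROM from its address pins.
# Assumption: fixed control code 0b1010 followed by the A2, A1, A0 pin levels.
CONTROL_CODE = 0b1010


def eeprom_i2c_address(a2: int, a1: int, a0: int) -> int:
    """Return the 7-bit I2C address for the given pin levels (each 0 or 1)."""
    for pin in (a2, a1, a0):
        if pin not in (0, 1):
            raise ValueError("pin levels must be 0 or 1")
    return (CONTROL_CODE << 3) | (a2 << 2) | (a1 << 1) | a0


if __name__ == "__main__":
    print(hex(eeprom_i2c_address(0, 0, 0)))  # all three pins grounded, as on this pedal
    print(hex(eeprom_i2c_address(1, 0, 0)))  # A2 tied to 5v, as in the tutorial wiring
```

With all three pins grounded this gives 0x50, which is the value EEPROM_ADR needs to be set to.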

Another thing that confused me for a second: the manual states the EEPROM’s write-protect pin needs to be connected to either Vcc or Vss (ground), but it didn’t seem to be connected to either (according to my multimeter). But after looking closer I found it was connected to Vcc via a 1k resistor (I guess too much resistance for my multimeter to see continuity). I think the idea is that the corresponding pin from the header could be connected to ground and current would flow there, overriding the write-protect. I’m sure this is probably a common technique.

I have an older “Duemilanove” arduino and had to look up the SDA and SCL pins on my board (pretty easy to find on the arduino site).

I hoped the pedal’s circuitboard would have the necessary pullup resistors soldered on so I didn’t have to use a breadboard at all but that didn’t seem to be the case. The arduino’s ATmega’s built-in pull-up resistors also apparently won’t work reliably for i2c communication, so I just wired things up like the SparkFun article above shows.

Here’s what that looked like (mostly for my own future reference); mystery pedal on the left, Duemilanove in the background:

[Photo: wired up for flashing EEPROM]

(the dangling green wire can be connected to Gnd on the arduino to make the EEPROM writeable).

I used the arduino sketches from the sparkfun tutorial above, just tweaking some basics:

  • modified EEPROM_ADR (zero out the three least significant bits, since all three address pins on my EEPROM were grounded, i.e. hardwired to 000)
  • loop over 32k bits with #define EEPROM_SIZE 4096

For developing and uploading the arduino sketch I used this vim plugin. I just had to add to my .vimrc:

let g:arduino_board = 'arduino:avr:diecimila'

…then an :ArduinoUpload in VIM builds and uploads the sketch to the board.

And the following seemed to be the simplest way to listen on the arduino serial port for the dumped EEPROM bytes. Note the baud passed to -s here should match Serial.begin(<baud>) in the arduino code:

$ busybox microcom -s 115200 /dev/ttyUSB0 > pedal.rom

After running this for the first time and inspecting the output with hexdump the project went off the rails for a while: the output repeated every 512 bytes. So was this actually a 4096 bit EEPROM? Microchip’s docs describe it as “a single block of 4K x 8-bit memory”.

IYI: in fact “4k” here means 4096; they also write “kbit” to mean 1024 bits, which is not what I understand the term to mean. Bafflingly nowhere in the datasheet are they unambiguous about the actual size of the EEPROM. Also AFAIU there is no way to determine the size of an EEPROM without writing to it and reading back.

Or was it communicating with the arduino’s internal EEPROM on the same i2c address or something? (I unplugged one of the lines going to the EEPROM and got garbage, so ruled this out)

Or is the reader program actually buggy and skipping every 8 bytes, and then wrapping?

After inspecting the code I became more and more convinced the latter was what was happening:

  • Arduino’s Wire library doesn’t look like i2c at all (in fact it’s communicating with a separate chip that does i2c asynchronously (although the Wire library doesn’t take advantage of this))
  • the library is an undocumented mess, though this helped a little
  • most of the EEPROM code I tried didn’t actually match the documented spec as far as I could tell (e.g. no repeated START)

In short, fertile ground for wasting a bunch of time chasing red herrings…

What was really happening is the chip actually contained 8 banks of identical programs, which is what the FV-1 in fact expects (they can be eight different programs obviously). Had I done a little more initial basic research about the FV-1, or taken the time to quickly rule out the idea that the EEPROM dump was correct despite the fact that I thought that was unlikely (easily done by writing to the last byte and reading again, which is what I eventually did), I would have saved myself a lot of time. This is like Bayesian reasoning and it’s really easy to not do.
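In hindsight, a few lines of Python would have shown the bank structure right away. This is only a sketch: the function names are invented here, and it assumes a 4096-byte dump laid out as 8 banks of 512 bytes (128 FV-1 instructions of 4 bytes each).

```python
# Sketch: check whether an FV-1 EEPROM dump is N copies of the same program.
# Assumption: 8 banks, each 128 instructions x 4 bytes = 512 bytes.
from typing import List

BANK_SIZE = 128 * 4
NUM_BANKS = 8


def split_banks(rom: bytes) -> List[bytes]:
    """Split a full dump into its program banks."""
    if len(rom) != BANK_SIZE * NUM_BANKS:
        raise ValueError(f"expected {BANK_SIZE * NUM_BANKS} bytes, got {len(rom)}")
    return [rom[i:i + BANK_SIZE] for i in range(0, len(rom), BANK_SIZE)]


def distinct_banks(rom: bytes) -> int:
    """Number of different programs stored across the banks."""
    return len(set(split_banks(rom)))


if __name__ == "__main__":
    # Synthetic stand-in for a dump; with a real file you would use
    # open("pedal.rom", "rb").read() instead.
    fake_rom = bytes(range(256)) * 2 * NUM_BANKS
    print(distinct_banks(fake_rom))
```

A result of 1 means the dump really is the same 512 bytes repeated, rather than a reader bug that wraps around.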

Also, oops. I had 5v running to the EEPROM (connected to the FV-1), although FV-1 takes 3.3v. Luckily this didn’t seem to fry the board…

Assembling / Disassembling

Someone named Igor has written an FV1 decompiler presumably as a PHP script. Unfortunately the code doesn’t seem to be open source, and I wasn’t immediately sure how to contact the author. After uploading my ROM I got back a legitimate looking but at-this-point-totally-meaningless-to-me bunch of annotated assembly, like:

   CHO     RDAL, SIN0      ; { ACC = SIN0 } ; Get LFO value
   WRAX    REG7 , 0        ; { REG7 = ACC; ACC = 0 * ACC ; PACC = ACC } ; * Note: C=0(16 bit) - s1.14 format
   OR      0x20000         ; { ACC = ACC | %00000000000000100000000000000000 }
   RDAX    REG7 , 1        ; { ACC = ACC + REG7 * 1 ; PACC = ACC } ; * Note: C=1(16 bit) - s1.14 format

So that’s cool. Now I need to get an assembler and I’m off to the races.

The standard SpinASM assembler is Windows only (I haven’t tried it with Wine yet), so I tried an alternate one github user ndf-zz has written in python called asfv1:

$ pip3 install asfv1
$ asfv1 -b pedal.rom.disassembled pedal.rom.reassembled

I got some errors which seemed related to a documented quirk that I didn’t totally understand. After adding decimals to the problematic literals I was able to successfully assemble my disassembled program!

Unfortunately comparing the first 512 bytes dumped from the EEPROM with my disassembled/reassembled output from asfv1 showed some differences. I uploaded the new ROM to the disassembler service again and looked at a diff and it appeared many immediate values of 1 were turned into e.g. 6.103515625E-5 or 0.001953125 after going through asfv1 (and the mystery blackbox disassembler).

I re-read the asfv1 README more carefully (I’d read “fixed-point” as “floating point”), did a little research and looked at a diff of the hexdumps of the roms and what was happening was pretty obvious:

asfv1 compiled the first line of assembly below to 0x0001, while the binary output for the original dumped ROM was achieved by using the decimal literal, as in the second line below:

                 `1` as unsigned 16-bit int

                            -
                          /   \
RDAX   ADCL , 1           00 01 02 84 ...etc
RDAX   ADCL , 1.0         40 00 02 84 ...etc
                          \   /
                            - 

               `1` as s1.14 fixed point value:
                   0b0100000000000000
                       \ fractional /

I wasn’t familiar with the notation “s1.14” used in the asfv1 and FV-1 ASM docs, but it quite simply means a fixed-point real value represented by: a sign bit, followed by 1 integer bit, followed by 14 fractional bits (totalling 16 bits).
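To make the notation concrete, here’s a small Python sketch of converting between a real value and a raw s1.14 word. The helper names are made up, and it assumes the word is two’s complement (so the range is -2.0 up to just under 2.0); it is not code from asfv1.

```python
# Sketch: s1.14 fixed point = sign bit, 1 integer bit, 14 fractional bits.
# Assumption: two's-complement encoding of the 16-bit word.
FRACTION_BITS = 14


def to_s1_14(x: float) -> int:
    """Encode a real value as a raw 16-bit s1.14 word."""
    raw = round(x * (1 << FRACTION_BITS))
    if not -(1 << 15) <= raw < (1 << 15):
        raise ValueError("value out of range for s1.14")
    return raw & 0xFFFF


def from_s1_14(word: int) -> float:
    """Decode a raw 16-bit s1.14 word back to a real value."""
    if word & 0x8000:  # sign bit set, so undo the two's-complement wrap
        word -= 1 << 16
    return word / (1 << FRACTION_BITS)


if __name__ == "__main__":
    print(hex(to_s1_14(1.0)))    # the real literal 1.0
    print(from_s1_14(0x0001))    # the raw bit pattern 0x0001 read as s1.14
```

The second line is where the odd-looking 6.103515625E-5 comes from: it is 0x0001 interpreted as s1.14, i.e. 2^-14.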

I dug into the asfv1 code and tweaked things so that, for real arguments, we treat decimal literals as real literals and values entered in hexadecimal or binary notation as a raw bit value.

With my fork I successfully assembled the output I got from the blackbox disassembler, and miraculously the output from the new asfv1 matches the original program we dumped from the EEPROM (head -c 512 pedal.rom)!

$ md5sum * | sort
38092c4673ff63f561ad3413c732db43  pedal.rom.reassembled_with_my_asfv1_fork
38092c4673ff63f561ad3413c732db43  pedal.rom.1
9d13dcb79754603e85eca19cbf405c4a  pedal.rom.reassembled
...

I did a quick hack job on asfv1 without understanding the code or SpinASM very deeply, so beware, but if you want to try out the fork you can do:

$ pip3 install git+https://github.com/jberryman/asfv1.git

Building SpinCAD Designer

I get the impression many if not most pedal makers are developing their algorithms with the GPL’d drag-and-drop editor SpinCAD Designer.

It seems difficult to build, but luisfcorreia has a fork where they’ve made it buildable with maven. Here’s what I had to do to get it to build and run successfully on my debian stretch box:

$ git clone https://github.com/HolyCityAudio/SpinCAD-Designer.git
$ cd SpinCAD-Designer
$ sudo apt install maven
$ git checkout -b maven
$ git pull https://github.com/luisfcorreia/SpinCAD-Designer.git dev
$ cd spincad-designer  # seems this work was done on a copy of the codebase to a different dir in the repo...?
$ _JAVA_OPTIONS=-Djdk.net.URLClassPath.disableClassPathURLCheck=true  mvn package
$ java -classpath ./target/spincad-1.0-SNAPSHOT.jar:./lib/elmGen-0.5.jar  com.holycityaudio.SpinCAD.SpinCADFrame

SpinCAD is able to import “Spin Hex” which apparently means the Intel HEX encoded ROM data. This is I guess a common format to feed to EEPROM writers, programmers, etc.

After some trial and error I was able to convert the binary image into HEX in such a way that “File > Open Hex” in SpinCAD didn’t choke:

$ sudo apt install srecord
$ srec_cat pedal.rom.1 -binary -output pedal.rom.1.hex -Intel --line-length=19
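For reference, here’s roughly what that conversion does, as a Python sketch. It is not a byte-for-byte reimplementation of srec_cat (which may emit extra records), and the function names are invented. It writes plain Intel HEX data records with 4 data bytes each, which is what a line length of 19 characters works out to: 1 for the colon, 8 for byte count, address and record type, 8 for the data, 2 for the checksum.

```python
# Sketch: encode a small binary image as Intel HEX data records.
# Assumption: image fits in 16-bit addresses (true for a 4096-byte EEPROM).


def ihex_record(address: int, record_type: int, data: bytes) -> str:
    """Build one Intel HEX line: ':' + count, address, type, data, checksum."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]) + data
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def to_intel_hex(image: bytes, bytes_per_record: int = 4) -> str:
    """Encode an image as data records followed by the end-of-file record."""
    if len(image) > 0x10000:
        raise ValueError("image too large for plain 16-bit addressing")
    lines = [
        ihex_record(offset, 0x00, image[offset:offset + bytes_per_record])
        for offset in range(0, len(image), bytes_per_record)
    ]
    lines.append(ihex_record(0, 0x01, b""))  # end-of-file record
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # One FV-1 instruction (4 bytes), taken from the hexdump shown earlier.
    print(to_intel_hex(bytes([0x00, 0x01, 0x02, 0x84])), end="")
```

Each data line then carries exactly one 32-bit FV-1 instruction.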

I was curious if SpinCAD would be able to disassemble and recognize the idioms from a spin ROM but unsurprisingly it does not. I probably won’t be using SpinCAD for this project, but the library of “modules” in the source code might be really valuable to learn from, or maybe I’ll build a visual “disassembler” myself using them some day.


Appendix: links, etc

A nice write-up/experience report from a first-time FV-1 developer, with good advice and links:

Easy DIY dev boards for FV-1 (why aren’t there more of these? What is the cheapest commercial FV-1 based pedal available?)

Alternatives to the FV-1 and some history:

Great resources for DIY pedal building generally are to be found here. I found discussions of the FV-1 on all of these, though most builders focus on analog electronics:

Choosing a Binary-to-text Encoding

I had an occasion to think about text-friendly binary encoding schemes for the first time at work. The obvious choice is Base64, but my immediate subsequent thought was “there must be something much more efficient for our purposes”, and a quick google led here in which OP echoes the same feeling:

It seems to me that we can greatly improve since on my keyboard I already see 94 easily distinguishable printable keys.

But of course if our goal is to “greatly improve” over Base64, then with a little thought we might conclude that the answer is “no, we can’t”. In Base64 we use 2^6 = 64 tokens, each of which represents a 6-bit string. Since those tokens are ASCII they take up 8 bits. So with Base64 we’re already at 75% efficiency (or “25% bloat”, or “33% larger than binary”), which seems… not so bad.

You can read about other binary encoding schemes here. From that table, we can see that Base85 which @rene suggests is modestly-better at 80% efficient. Base122 (the best on the list that can reasonably be called a “text encoding”) is 86% efficient.
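The arithmetic behind those percentages is quick to check. A small Python sketch (the helper names are made up), using the fixed block sizes of each scheme: hex turns 1 byte into 2 characters, Base64 turns 3 bytes into 4, and Base85 turns 4 bytes into 5.

```python
# Sketch: efficiency of fixed-ratio binary-to-text encodings,
# counting each output character as one 8-bit ASCII byte.


def efficiency(input_bytes: int, output_chars: int) -> float:
    """Fraction of the encoded output that is actual payload."""
    return input_bytes / output_chars


def expansion(input_bytes: int, output_chars: int) -> float:
    """How much larger the encoded form is than the raw binary."""
    return output_chars / input_bytes - 1.0


SCHEMES = {"hex": (1, 2), "Base64": (3, 4), "Base85": (4, 5)}

if __name__ == "__main__":
    for name, (n_in, n_out) in SCHEMES.items():
        print(f"{name}: {efficiency(n_in, n_out):.0%} efficient, "
              f"{expansion(n_in, n_out):.0%} larger than binary")
```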

Decision criteria

So you can make your messages ~13% smaller by ditching Base64 for the most exotic Base122, but what about other considerations?

Things you really want in a binary-to-text encoding, aside from size-efficiency are:

  • correct cross-platform compatible encoding/decoding; easy to get right
  • fast to encode/decode
  • compresses well (gzip is everywhere)

Other things that would be nice are that the encoding make some sense to the naked eye, be easily diffable, maybe even support some operations without requiring decoding (a binary search say).

It’s possible that all of these are more important to you than the efficiency of the raw encoding. With that in mind let’s consider a third (in addition to Base64 and BaseExoticInteger): the venerable hex encoding.

Hexadecimal (base-16) encoding requires two ASCII characters to represent a byte, so it’s 50% efficient. But as we’ll see it’s arguably better than these other two according to every one of our other criteria!

Base64 is not sensible to the naked eye

Base64 encodes 6 bits per character. This means 3 octets (bytes) of input become 4 characters of the output. In a world where our data is overwhelmingly based on bytes and words our Base64 encoding is horribly out of phase!

When we see the two strings:

MTEyMlhYWVk=
WFhZWTExMjI=

…our senses don’t tell us anything. Whereas in hex the lizard brain is able to perceive patterns, symmetry and magnitude right away:

3131323258585959
5858595931313232

There must be value to being able to use our eyes (especially when it’s the only sense we haven’t abandoned for the work we’re doing). The former might represent an obscured bug in an encryption routine for instance.
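The strings above are easy to reproduce with the Python standard library; the inputs are the ASCII strings 1122XXYY and XXYY1122 (the helper name is just for illustration).

```python
# Sketch: the same two byte strings in Base64 and in hex.
import base64
from typing import Tuple


def encodings(raw: bytes) -> Tuple[str, str]:
    """Return (Base64, hex) text encodings of the same bytes."""
    return base64.b64encode(raw).decode("ascii"), raw.hex()


if __name__ == "__main__":
    for raw in (b"1122XXYY", b"XXYY1122"):
        b64, hexed = encodings(raw)
        print(f"{b64:<14} {hexed}")
```

In the hex column the two halves visibly swap places; in the Base64 column almost every character changes, because the 3-byte groups no longer line up with the 4-byte halves.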

Interestingly a Base85 encoding is also superior to Base64 in this respect: every 5 characters represent 4 bytes of the input, so we retain the ability to recognize and compare 32- and 64-bit word chunks.

Base85 is tricky, but Base64 is the worst kind of tricky

It’s a nearly-alphanumeric encoding, which reserves for the (in some cases, more rare) last two code words the + and / characters. Furthermore the choice of these last two characters varies among implementations. I have no doubt that this has caused bugs, e.g. a validation regex that assumed an alphanumeric encoding.

Similarly, the encoding must itself be url-encoded if the + and / scheme is used, which has certainly caused bugs. Same story for the = padding rule (quite possible to misunderstand, fail to test against, or never observe in examples).
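A tiny Python illustration of both pitfalls, using two input bytes chosen to hit the last two code words (the function name is invented for this example):

```python
# Sketch: the standard and URL-safe Base64 alphabets disagree on the last two
# code words, and the standard form needs percent-encoding inside a URL.
import base64
import re
from typing import Tuple
from urllib.parse import quote


def both_alphabets(raw: bytes) -> Tuple[str, str]:
    """Return (standard, URL-safe) Base64 encodings of the same bytes."""
    return (base64.b64encode(raw).decode("ascii"),
            base64.urlsafe_b64encode(raw).decode("ascii"))


if __name__ == "__main__":
    standard, url_safe = both_alphabets(bytes([0xFB, 0xFF]))
    print(standard, url_safe)
    # A validation regex that assumes "alphanumeric" rejects both forms:
    print(bool(re.fullmatch(r"[A-Za-z0-9]+", standard)))
    # What the standard form turns into when placed in a URL:
    print(quote(standard, safe=""))
```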

Base85 schemes are of course more complicated (and likely slower). We’d hope to find well-tested implementations on all the platforms we require but even so we should be prepared for the possibility that we’d need to implement it ourselves in the future.

More compact encodings compress worse

Much of the data flying around the internet is gzipped at some layer of a protocol. Because Base64/85 etc. are out of phase with bytes, and word sizes, they tend to frustrate compression schemes by obscuring patterns in block oriented data. Here are examples of gzip applied to the same tortured Hobbes quote (270 bytes of ASCII text, compressing to 194 bytes):

Encoding | Original size | Compressed size
-------- | ------------- | ---------------
hex      | 541           | 249
Base64   | 361           | 289
Base85   | 342           | 313

So for uncompressed binary data we can probably expect a more compact encoding to result in more bloat over the wire in a gzipped payload.
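If you want to repeat the experiment on your own data, here’s a rough Python sketch. It uses the standard library’s gzip and its particular Base85 flavor, and the sample text is a placeholder rather than the quote used for the table above, so the numbers will differ.

```python
# Sketch: compare raw and gzipped sizes of the same bytes under three encodings.
# Assumption: Python's base64.b85encode stands in for "Base85"; other flavors
# use different alphabets and framing.
import base64
import gzip
from typing import Dict, Tuple


def encoded_sizes(raw: bytes) -> Dict[str, Tuple[int, int]]:
    """Map encoding name to (encoded size, gzipped encoded size) in bytes."""
    encoded = {
        "hex": raw.hex().encode("ascii"),
        "Base64": base64.b64encode(raw),
        "Base85": base64.b85encode(raw),
    }
    return {name: (len(text), len(gzip.compress(text)))
            for name, text in encoded.items()}


if __name__ == "__main__":
    sample = (b"Placeholder text standing in for the quote used above; "
              b"substitute whatever payload you actually care about. ") * 3
    for name, (size, compressed) in encoded_sizes(sample).items():
        print(f"{name:<8} {size:>5} {compressed:>5}")
```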

Two other things that were interesting to me:

  • all of the compression tools I tried did worse on the hex encoded string than on the original ascii. Maybe that’s due to the size required for the decoding table? We could test on larger strings
  • gzip was able to compress 361 bytes of Base64-encoded data drawn from /dev/urandom to 316 bytes, so it’s clear Base64 doesn’t wholly obscure the structure of the data to our compression algorithm

Other alternatives and conclusions

It probably doesn’t matter, so just use Base64. If size is the only thing that matters then I’d suggest zipping first and then using the most gnarly encoding you can stomach. But maybe you should be using a proper binary protocol in that case.

In a world where it was totally ubiquitous I would suggest using either the terminal-friendly ZeroMQ Base85 flavor or a simple hex encoding.

I also like that encodings like this one exist though. It’s worth stepping back, doing some quick math, and making sure that you’re optimizing for the right thing.

Almost Inline ASM in Haskell With Foreign Import Prim

With help from Reid Barton in questions here and here I discovered it’s pretty easy to call assembly from GHC haskell with minimal overhead, so I cleaned up an example of this technique and posted it here:

https://github.com/jberryman/almost-inline-asm-haskell-example

This is especially useful if you want to return multiple values from a foreign procedure, where otherwise with the traditional FFI approach you would have to do some allocation and stuff the values into a struct or something. I find the above more understandable in any case.

Here’s an example of the dumped ASM from the Main in the example above:

...
    call newCAF
    addq $8,%rsp
    testq %rax,%rax
    je _c73k
_c73j:
    movq $stg_bh_upd_frame_info,-16(%rbp)
    movq %rax,-8(%rbp)
    movq $block_info,-24(%rbp)
    movl $4,%edi
    movl $3,%esi
    movl $2,%r14d
    movl $1,%ebx
    addq $-24,%rbp
    jmp sipRound_s_x3
_c73z:
    movq $104,904(%r13)
    movq $block_info,-32(%rbp)
    movq %r14,-24(%rbp)
    movq %rsi,-16(%rbp)
    movq %rdi,-8(%rbp)
    movq %rbx,(%rbp)
    addq $-32,%rbp
...

You can see we just prepare argument registers, do whatever with the stack pointer, do a jump, and then push the return values onto the stack. For my purposes this was almost too much overhead to make this worthwhile (you can look at notes in the code).

I thought about sketching out a ghc proposal about a way to formalize this, maybe make it safer, and maybe somehow more efficient but I don’t have the time right now and don’t really have the expertise to know if this is even a good idea or how it could work.

Echo

K

echo

Announcing: Unagi-bloomfilter

I just released a new Haskell library called unagi-bloomfilter that is up now on hackage. You can install it with:

$ cabal install unagi-bloomfilter

The library uses the bloom-1 variant from “Fast Bloom Filters and Their Generalization” by Yan Qiao, et al. I’ll try to write more about it when I have the time. Also I just gave a talk on things I learned working on the project last night at the New York Haskell User Group:

http://www.meetup.com/NY-Haskell/events/233372271/

It was quite rough, but I was happy to hear from folks that found some interesting things to take away from it.

Thanks to Gershom for inviting me to speak, to my company Signal Vine for sponsoring my trip out, and to Yan Qiao for generously answering my silly questions and helping me understand the paper.

P.S. We’re hiring haskell developers

Signal Vine is an awesome group of people, with interesting technology and problems to solve, and we’re looking to grow the small development team. If you have some experience with haskell (you don’t have to be a guru) and are interested, please reach out to Jason or me at:

brandon@signalvine.com
jason@signalvine.com

Announcing: Hashabler 1.0. Now Even More Hashy With SipHash

I’ve just released version 1.0 of a haskell library for principled, cross-platform & extensible hashing of types. It is available on hackage, and can be installed with:

cabal install hashabler

(see my initial announcement post which has some motivation and pretty pictures)

You can see the CHANGELOG but the main change is an implementation of SipHash. It’s about as fast as our implementation of FNV-1a for bytestrings of length fifty and slightly faster when you get to length 1000 or so, so you should use it unless you’re wanting a hash with a simple implementation.

If you’re implementing a new hashing algorithm or hash-based data structure, please consider using hashabler instead of hashable.

Translating Some Stateful Bit-twiddling to Haskell

I just started implementing SipHash in hashabler and wanted to share a nice way I found to translate stateful bit-twiddling code in C (which makes heavy use of bitwise assignment operators) to haskell.

I was working from the reference implementation. As you can see statefulness and mutability are an implicit part of how the algorithm is defined, as it modifies the states of the v variables.

#define SIPROUND                                        \
  do {                                                  \
    v0 += v1; v1=ROTL(v1,13); v1 ^= v0; v0=ROTL(v0,32); \
    v2 += v3; v3=ROTL(v3,16); v3 ^= v2;                 \
    v0 += v3; v3=ROTL(v3,21); v3 ^= v0;                 \
    v2 += v1; v1=ROTL(v1,17); v1 ^= v2; v2=ROTL(v2,32); \
  } while(0)

int  siphash( uint8_t *out, const uint8_t *in, uint64_t inlen, const uint8_t *k )
{

  /* ... */

  for ( ; in != end; in += 8 )
  {
    m = U8TO64_LE( in );
    v3 ^= m;

    TRACE;
    for( i=0; i<cROUNDS; ++i ) SIPROUND;

    v0 ^= m;
  }

I wanted to translate this sort of code as directly as possible (I’d already decided if it didn’t work on the first try I would burn my laptop and live in the woods, rather than debug this crap).

First we’ll use name shadowing to “fake” our mutable variables, making it easy to ensure we’re always dealing with the freshest values.

{-# OPTIONS_GHC -fno-warn-name-shadowing #-}

We’ll also use RecordWildCards to make it easy to capture the “current state” of these values, through folds and helper functions.

{-# LANGUAGE RecordWildCards #-}

And finally we use the trivial Identity monad (this trick I learned from Oleg) which gets us the proper scoping we want for our v values:

import Data.Functor.Identity

Here’s a bit of the haskell:

siphash :: Hashable a => SipKey -> a -> Word64
siphash (k0,k1) = \a-> runIdentity $ do
    let v0 = 0x736f6d6570736575
        v1 = 0x646f72616e646f6d
        v2 = 0x6c7967656e657261
        v3 = 0x7465646279746573

    ...

    v3 <- return $ v3 `xor` k1;
    v2 <- return $ v2 `xor` k0;
    v1 <- return $ v1 `xor` k1;
    v0 <- return $ v0 `xor` k0;

    ...

    -- Initialize rest of SipState:
    let mPart          = 0
        bytesRemaining = 8
        inlen          = 0
    SipState{ .. } <- return $ hash (SipState { .. }) a

    let !b = inlen `unsafeShiftL` 56

    v3 <- return $ v3 `xor` b
    -- for( i=0; i<cROUNDS; ++i ) SIPROUND;
    (v0,v1,v2,v3) <- return $ sipRound v0 v1 v2 v3
    (v0,v1,v2,v3) <- return $ sipRound v0 v1 v2 v3
    v0 <- return $ v0 `xor` b

    ...

    (v0,v1,v2,v3) <- return $ sipRound v0 v1 v2 v3

    return $! v0 `xor` v1 `xor` v2 `xor` v3

If you were really doing a lot of this sort of thing, you could even make a simple quasiquoter that could translate bitwise assignment into code like the above.
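For comparison, here’s the same “return fresh values instead of mutating” shape sketched in Python. This is not the hashabler code: it is a plain SipHash-2-4 over bytes, written from the SIPROUND macro above and the reference algorithm, with names chosen for this example. Python integers are unbounded, so the 64-bit wraparound has to be done by hand with a mask.

```python
# Sketch: SipHash-2-4 with the round written as a pure function on (v0..v3).
MASK = (1 << 64) - 1


def rotl(x: int, b: int) -> int:
    """Rotate a 64-bit value left by b bits."""
    return ((x << b) | (x >> (64 - b))) & MASK


def sip_round(v0, v1, v2, v3):
    """One SIPROUND; each line mirrors a line of the C macro."""
    v0 = (v0 + v1) & MASK; v1 = rotl(v1, 13); v1 ^= v0; v0 = rotl(v0, 32)
    v2 = (v2 + v3) & MASK; v3 = rotl(v3, 16); v3 ^= v2
    v0 = (v0 + v3) & MASK; v3 = rotl(v3, 21); v3 ^= v0
    v2 = (v2 + v1) & MASK; v1 = rotl(v1, 17); v1 ^= v2; v2 = rotl(v2, 32)
    return v0, v1, v2, v3


def siphash24(key: bytes, data: bytes) -> int:
    """SipHash-2-4 of data under a 16-byte key, as a 64-bit integer."""
    if len(key) != 16:
        raise ValueError("key must be 16 bytes")
    k0 = int.from_bytes(key[:8], "little")
    k1 = int.from_bytes(key[8:], "little")
    v0 = k0 ^ 0x736F6D6570736575
    v1 = k1 ^ 0x646F72616E646F6D
    v2 = k0 ^ 0x6C7967656E657261
    v3 = k1 ^ 0x7465646279746573

    full = len(data) - len(data) % 8
    for offset in range(0, full, 8):
        m = int.from_bytes(data[offset:offset + 8], "little")
        v3 ^= m
        for _ in range(2):  # cROUNDS
            v0, v1, v2, v3 = sip_round(v0, v1, v2, v3)
        v0 ^= m

    # Final block: leftover bytes, with the input length in the top byte.
    b = ((len(data) & 0xFF) << 56) | int.from_bytes(data[full:], "little")
    v3 ^= b
    for _ in range(2):
        v0, v1, v2, v3 = sip_round(v0, v1, v2, v3)
    v0 ^= b

    v2 ^= 0xFF
    for _ in range(4):  # dROUNDS
        v0, v1, v2, v3 = sip_round(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3


if __name__ == "__main__":
    print(hex(siphash24(bytes(range(16)), bytes(range(15)))))
```

Rebinding the tuple on every call plays the same role as the shadowed `(v0,v1,v2,v3) <- return $ sipRound v0 v1 v2 v3` lines in the Haskell.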

Announcing Hashabler: Like Hashable Only More So

I’ve just released the first version of a haskell library for principled, cross-platform & extensible hashing of types, which includes an implementation of the FNV-1a algorithm. It is available on hackage, and can be installed with:

cabal install hashabler

hashabler is a rewrite of the hashable library by Milan Straka and Johan Tibell, having the following goals:

  • Extensibility; it should be easy to implement a new hashing algorithm on any Hashable type, for instance if one needed more hash bits

  • Honest hashing of values, and principled hashing of algebraic data types (see e.g. #30)

  • Cross-platform consistent hash values, with a versioning guarantee. Where possible we ensure morally identical data hashes to identical values regardless of processor word size and endianness.

  • Make implementing identical hash routines in other languages as painless as possible. We provide an implementation of a simple hashing algorithm (FNV-1a) and make an effort to define Hashable instances in a way that is well-documented and sensible, so that e.g. one can (hopefully) easily implement a string hashing routine in JavaScript that will match the way we hash strings here.
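To give a sense of how small such a port can be, here is the generic 32-bit FNV-1a algorithm over raw bytes, sketched in Python. This is only the published algorithm; it does not reproduce how hashabler’s Hashable instances feed bytes into it (that is what the instance documentation is for), so don’t expect matching values for arbitrary Haskell types.

```python
# Sketch: generic 32-bit FNV-1a over a byte string.
FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """XOR in each byte, then multiply by the FNV prime, modulo 2^32."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h


if __name__ == "__main__":
    print(hex(fnv1a_32(b"a")))
```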

Motivation

I started writing a fast concurrent bloom filter variant, but found none of the existing libraries fit my needs. In particular hashable was deficient in a number of ways:

  • The number of hash bits my data structure requires can vary based on user parameters, and possibly be more than the 64-bits supported by hashable

  • Users might like to serialize their bloomfilter and store it, pass it to other machines, or work with it in a different language, so we need

    • hash values that are consistent across platforms
    • some guarantee of consistency across library versions

I was also very concerned about the general approach taken for algebraic types, which results in collision, the use of “hashing” numeric values to themselves, dubious combining functions, etc. It wasn’t at all clear to me how to ensure my data structure wouldn’t be broken if I used hashable. See below for a very brief investigation into hash goodness of the two libraries.

There isn’t interest in supporting my use case or addressing these issues in hashable (see e.g. #73, #30, and #74) and apparently hashable is working in practice for people, but maybe this new package will be useful for some other folks.

Hash goodness of hashable and hashabler, briefly

Hashing-based data structures assume some “goodness” of the underlying hash function, and may depend on the goodness of the hash function in ways that aren’t always clear or well-understood. “Goodness” also seems to be somewhat subjective, but can be expressed statistically in terms of bit-independence tests, and avalanche properties, etc.; various things that e.g. smhasher looks at.

I thought for fun I’d visualize some distributions, as that’s easier for my puny brain to understand than statistics. We visualize 32-bit hashes by quantizing by 64x64 and mapping that to a pixel following a hilbert curve to maintain locality of hash values. Then when multiple hash values fall within the same 64x64 pixel, we darken the pixel, and finally mark it red if we can’t go any further to indicate clipping.
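Here’s a rough Python sketch of the mapping step, under my reading of the description above: each pixel collects 64x64 = 4096 consecutive hash values, which for 32-bit hashes gives 2^20 pixels, i.e. a 1024x1024 image, and the pixel index is placed along a Hilbert curve. The function names and those sizes are assumptions for illustration, not the code used to make the images.

```python
# Sketch: map a 32-bit hash to a pixel along a Hilbert curve.
# Assumptions: 4096 hash values per pixel, 1024x1024 image.
from typing import Tuple


def hilbert_d2xy(side: int, d: int) -> Tuple[int, int]:
    """Convert distance d along a Hilbert curve to (x, y); side is a power of 2."""
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate/flip the quadrant so sub-curves join up
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hash_to_pixel(h32: int, side: int = 1024) -> Tuple[int, int]:
    """Quantize a 32-bit hash into a pixel index and place it on the curve."""
    bucket = h32 >> 12  # 2**12 = 64 * 64 hash values share one pixel
    return hilbert_d2xy(side, bucket)


if __name__ == "__main__":
    print([hilbert_d2xy(2, d) for d in range(4)])
    print(hash_to_pixel(0xDEADBEEF))
```

The point of the curve is that hashes that are numerically close land on nearby pixels, so clumping in the hash values shows up as visible blotches.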

It’s easy to cherry-pick inputs that will result in some bad behavior by hashable, but below I’ve tried to show some fairly realistic examples of strange or less-good distributions in hashable. I haven’t analysed these at all. Images are cropped ¼ size, but are representative of the whole 32-bit range.

First, here’s a hash of all [Ordering] of size 10 (~59K distinct values):

Hashabler:

Hashable:

Next here’s the hash of one million (Word8,Word8,Word8) (having a domain ~ 16 mil):

Hashabler:

Hashable:

I saw no difference when hashing english words, which is good news as that’s probably a very common use-case.

Please help

If you could test the library on a big endian machine and let me know how it goes, that would be great. See here.

You can also check out the TODOs scattered throughout the code and send pull requests. I may not be able to get to them until June, but will be very grateful!

P.S. hire me

I’m always open to interesting work or just hearing about how companies are using haskell. Feel free to send me an email at brandon.m.simmons@gmail.com