Sandbox Minus John Dillinger Equals What?
Last updated: Jan 2, 2026
It didn’t look like anyone had ever tried to answer the question, so I thought I would have a go. This post probably won’t be interesting to folks already familiar with embeddings or linear algebra. It represents my rough, non-expert understanding of the subject matter.
Huh. What could it mean to subtract two words, and how
would you even do that? If we just consider the how, it’s not too hard to
come up with a scheme. You could map a word to its place in the dictionary,
say, then subtract the numbers and interpret the result N to mean the Nth
word. But that’s not interesting, because alphabetical order (the basis of our
mapping) is totally unrelated to a word’s meaning.
Well, it turns out there are ways1 to map words to numbers that capture meaning, and that even produce meaningful results when we do arithmetic over them: a word embedding.2
This is just a dictionary that maps a word to a vector, which is concretely
just a row of numbers of some fixed length; e.g. [3,7,-5] is a
three-dimensional vector. You add and subtract vectors by adding or subtracting
corresponding elements, so e.g. [3,7,-5] minus [2,7,1] equals [1,0,-6].
Because of how the embedding dictionary is trained3, these numbers end up
something like coordinates in a semantic space. Think of [3,7,-5]
as [x,y,z] coordinates, but in a space of meaning rather than physical space.
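In code, with numpy doing the element-wise work (these toy vectors are just for illustration, not real embedding vectors):
>>> import numpy as np
>>> np.array([3, 7, -5]) - np.array([2, 7, 1])
array([ 1,  0, -6])
>>> np.array([3, 7, -5]) + np.array([2, 7, 1])
array([ 5, 14, -4])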
Okay, so each word is associated with a row of numbers (a “vector”) that captures meaning, and these can be subtracted (and added, presumably?). What does that do? How should we interpret the result of that arithmetic?
Addition and Subtraction
Subtraction of words can actually be kind of intuitive. Take the canonical
example, king - man. If we remove the “adult maleness” from the concept of
king, we’re left with something like… royalness? And what if we add that
royalness to woman? We get… queen, right? Another name for this type of
move is analogy.
Playing around with it
Here’s the scratch Python I’m using, if you want to follow along. The rest of what follows will be in a repl.
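(If you’d rather not open the file: the heart of `analogize` is more or less the below. It’s a sketch, not the exact scratch code; gensim’s `most_similar` does the `(x - y) + z` arithmetic and the nearest-neighbor search for us.)
import gensim.downloader

model = gensim.downloader.load("word2vec-google-news-300")

def analogize(x, y, z, topn=5):
    # "x is to y as ____ is to z": computes (x - y) + z, then returns the
    # words closest to the result, each with a closeness score.
    return model.most_similar(positive=[x, z], negative=[y], topn=topn)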
First, let’s try the canonical example:
>>> # This function does our `(x - y) + z` arithmetic, plus some boring stuff,
>>> # returning the closest matches, along with a closeness score.
>>> # So this is read like "king is to man as _______ is to woman"
>>> play.analogize("king", "man", "woman")
[('queen', '0.71182'), ('monarch', '0.61897'), ('princess', '0.59024'), ('crown_prince', '0.54995'), ('prince', '0.53773')]
That seemed to work! We get “queen” as our closest match. Let’s try a couple others:
>>> play.analogize("France", "Paris", "Rome")
[('Italy', 0.714356005191803), ('Italians', 0.5599663257598877), ('Sicily', 0.55470210313797), ('Flaminio_Stadium', 0.5299075245857239), ('Spain', 0.5046595335006714)]
>>> play.analogize("doctor", "hospital", "school")
[('guidance_counselor', 0.5969595313072205), ('teacher', 0.5755364298820496), ('eighth_grade', 0.5226408243179321), ('schoolers', 0.5168289542198181), ('elementary', 0.5085657238960266)]
>>> play.analogize("good", "better", "worse")
[('bad', 0.7338365912437439), ('terrible', 0.6607799530029297), ('horrible', 0.6248188018798828), ('awful', 0.5503717660903931), ('dreadful', 0.5382158756256104)]
But they don’t always work. In fact, disappointingly, relationships that are even a little bit abstract seem never to work:
>>> play.analogize("composer", "symphony", "blueprint") # architect
[('roadmap', 0.4200657904148102), ('outline', 0.41546958684921265), ('proposes', 0.38996848464012146), ('envisions', 0.3866680860519409), ('proposals', 0.38342446088790894)]
>>> play.analogize("microscope", "small", "distant") # telescopt
[('microscopy', 0.5025578737258911), ('microscopes', 0.49063798785209656), ('magnification', 0.3960527181625366), ('scanning', 0.3829149007797241), ('confocal', 0.3811110556125641)]
>>> play.analogize("trial", "verdict", "government") # election?
[('Government', 0.46542492508888245), ('governments', 0.45814740657806396), ('goverment', 0.455424964427948), ('administration', 0.43941164016723633), ('govenrment', 0.4306579828262329)]
Where it fails, you just stay in the semantic neighborhood of one of the input words6. This isn’t looking promising…
So… sandbox minus John Dillinger equals what?
As mentioned in the note above, we can’t answer this directly; the type of thing we got from subtraction, a displacement, can’t be looked up in our dictionary of embeddings. But maybe we can get enough of a feel for this displacement to give it a name by adding it to some other words and sitting with them for a bit.
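Mechanically, that plan looks something like this (the probe words here are just examples of my choosing, and `similar_by_vector` is gensim’s lookup-by-raw-vector):
import gensim.downloader

model = gensim.downloader.load("word2vec-google-news-300")

# The displacement itself: not a word, just a direction in the space.
displacement = model["sandbox"] - model["Dillinger"]

# Apply the same displacement to a few probe words and look at
# the nearest neighbors of wherever we land.
for probe in ["swimming_pool", "prison", "child"]:
    print(probe, model.similar_by_vector(model[probe] + displacement, topn=3))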
Let’s start with something sandbox-like for the first new jumping-off point:
>>> play.analogize("sandbox", "Dillinger", "swimming_pool")
[('playground', '0.56098'), ('sandpit', '0.55073'), ('wading_pool', '0.51528'), ('swimming_pools', '0.51434'), ('sandboxes', '0.50598')]
Okay, that… didn’t work. I expected sandbox minus Dillinger, a “big
displacement”, to shoot us from some third word (“swimming pool”) to a
fourth word seemingly unrelated to the third. Instead, it seems not to move us
much at all.
My hazy intuition/guess for why consists of:
- High-dimensional space is much, much emptier than we imagine? We can move far in a line from a cluster of words, but that cluster we left is still the closest thing to us. (There’s a quick numerical gut-check of this after the list.)
- When two words have no semantic relationship (“Dillinger” and “sandbox”) and we subtract them, the displacement is going to be an uncoordinated twiddling of whatever target vector we add it to (“pool”). We’re just randomly perturbing it, without coordinated effort along any real semantic axis, so we stay in the same neighborhood, more or less.
- It also seems like this gimmick with arithmetic might only work locally. Like, I didn’t see it working when we used analogies where the antecedents were themselves not closely related, as they are in “king” and “queen”.
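Here’s the gut-check on that first bullet, in plain numpy with no embeddings involved: random directions in 300 dimensions are almost always nearly orthogonal to each other, i.e. a random displacement points “away from” essentially everything.
import numpy as np

rng = np.random.default_rng(0)
# 1,000 random 300-dimensional vectors, normalized to unit length.
a = rng.standard_normal((1000, 300))
a /= np.linalg.norm(a, axis=1, keepdims=True)
# Dot products of unit vectors are cosine similarities.
sims = a[:500] @ a[500:].T
print(sims.mean(), sims.std())  # mean near 0, std near 1/sqrt(300) ≈ 0.058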
Conclusion
I hoped we would be able to look at sandbox - Dillinger applied to a bunch of
different words and intuit something from that set of mappings, like maybe a
chain of two or three concepts that get us from one to the other. Instead we’ve
done some combination of:
- measured objectively that the question is meaningless and inscrutable (the effect Brautigan was aiming for), and
- showed that this word-arithmetic thing doesn’t work well, and maybe isn’t very interesting or useful
And a couple papers that I might dig into if I want to try to understand this better, but probably won’t:
- Analogies Explained: Towards Understanding Word Embeddings by Allen and Hospedales
- Essays on Understanding Analogies and Associations in Word Embedding Spaces by Kawin Ethayarajh
- I used the Python `gensim` library and tried out three models: `word2vec-google-news-300`, `glove-wiki-gigaword-300`, and `conceptnet-numberbatch-17-06-300`. First I had to make sure it had an understanding of “John Dillinger”, which involved a case-insensitive search of the vocabulary to find potential matches (these usually were just “dillinger”, capitalized or not depending on the model), then looking at its neighbors with e.g. `model.most_similar("dillinger", topn=10)` to make sure it looked like we were in a solidly “gangster-ish” space, rather than e.g. a Jamaican-DJ-ish space. `conceptnet` had absolutely the cleanest understanding of both “sandbox” (the others had lots of tech/gaming conceptual bleed) and “Dillinger”. Unfortunately it doesn’t do the arithmetic trick. (Womp womp.) I use the word2vec model in the examples here. ↩︎
- I first learned about word embeddings from Allison Parrish’s inspiring Strange Loop talk (IRL!). Honestly just skip this and watch that… ↩︎
- This involves some form of: an algorithm iterates through a huge amount of text (like all of Wikipedia) playing a game of Guess the Word From Context (like: “Last night a blue thing, the smoke itself, from our _______ drifted down the valley, entering into”); it tries to predict the word by number-crunching a bunch of “weights”; after each guess it nudges the weights such that it gets better and better at the game. ↩︎
- Technically, “close” in this system is defined not as close in space to a particular point, but close to a ray emanating from the origin and passing through that point; the distance from the origin of a point (word) ends up being mostly noise. ↩︎
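(A tiny illustration of ray-closeness, with a helper of my own rather than anything from the scratch file: cosine similarity can’t tell a vector from a scaled copy of itself.)
import numpy as np

def closeness(u, v):
    # Cosine similarity: how aligned two rays from the origin are,
    # ignoring how far out along each ray the points sit.
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

u = np.array([3.0, 7.0, -5.0])
print(closeness(u, 2 * u))  # ≈ 1.0: twice as far from the origin, same ray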
- You’re already familiar with this sort of thing, even if you haven’t realized it. Consider how the combination of points-in-time and durations works. What is June 22, 1903 minus June 21, 1903? One day: a duration of one day. What’s January 30, 1935 plus one day? January 31, 1935. What’s one day plus two days? Three days. Now, what’s January 30, 1935 plus June 22, 1903…? I think you’ll agree the latter is meaningless. So in an algebra of durations and points-in-time, we can subtract points-in-time but not add them. And when we subtract points-in-time, the result is a duration. Durations can be added to and subtracted from each other, and you can add a duration to a point-in-time to get another point-in-time. So you have to be mindful of the unit of the numeric (time-like, in this case) thing you’re holding. This is the same thing going on geometrically with our word vectors, in high dimensions instead of the one dimension here (a line, as in “timeline”). The name I know for this flavor of thing is an affine space. ↩︎
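(Python’s datetime has exactly this algebra, if seeing it in a repl helps; nothing here touches the embeddings.)
>>> from datetime import date, timedelta
>>> date(1903, 6, 22) - date(1903, 6, 21)   # point minus point = duration
datetime.timedelta(days=1)
>>> date(1935, 1, 30) + timedelta(days=1)   # point plus duration = point
datetime.date(1935, 1, 31)
>>> date(1935, 1, 30) + date(1903, 6, 22)   # point plus point = nonsense
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'datetime.date' and 'datetime.date'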
- Specifically, it seems like the middle word is basically “ignored”. I played with this in my `hm()` function, and depressingly (considering the time I’ve sunk into this stupid post) you can get similar results to many of our “successful” examples by ignoring the second word entirely. ↩︎