Skip to content

Teaching Clojure to beginning programmers: Functions before “variables”

2016 August 5
by Richard Harrington

tl;dr While creating a brief Clojure tutorial for people with zero programming experience — A Gentle Introduction to Clojure Through Turtles — we realize that we should be teaching functions all the way down.

JavaScript: Variables before functions

A couple of weeks ago, I finished teaching a ten-week, part-time JavaScript course at General Assembly in New York. There was a range of experience levels in the class, and while most of the students knew some HTML and CSS, the majority were completely new to the concept of programming — of bending a computer to their will.

As in probably all such courses in any language normally taught to beginners, the concept of a “variable” was introduced very early on in the curriculum:

var favoriteNumber = 5;

And look, we can change it (hence the term “variable” — watch it vary!):

// => 5
favoriteNumber = 6;
// => 6

We didn’t get to the topic of functions — and creating our own functions — until several more lessons had passed, and we’d already gone over every single data type and control flow structure in the JavaScript language (or ES5, anyway).

This was certainly the order in which I learned these language features X years ago when I first taught myself BASIC and Pascal, so I didn’t think to question it — even when most of the beginners displayed noticeable pain and confusion at having the equals sign they had learned in second grade repurposed as this strange new thing called the “assignment operator”.

It was one of many humbling eye-openers for me, as I watched people grappling with things I learned a long time ago in a galaxy far, far away.

As will probably be clear by now, I had never taught a class before.

In the end everybody lived happily ever after, everybody learned to love variables and functions in JavaScript, and I like to think we all (including me!) learned a little bit more than we did before about how to solve problems with code. And we may also have learned a tiny bit about the evils of mutating data willy-nilly, although that’s a more elusive concept to get across.

Clojure: functions before vars

The weekend after the class was over, I volunteered with ClojureBridge — an organization working to increase diversity in the community around the Clojure programming language, by facilitating one-day beginner-friendly workshops in Clojure, for women.

(It’s based on its successful predecessor, RailsBridge, whose purpose is to… well, you get the idea.)

Our instance of the workshop — ClojureBridge NYC — was organized by Stuart Sierra of Cognitect, who did a great job putting it together and organizing the volunteers. I and Brian Gruber (an engineer at Meetup) were assigned to lead the beginner group — the people with no programming experience whatsoever.

I was very curious to see how this would go, as Clojure is a language dear to my heart but not one that’s normally taught to beginners. I think one of the primary reasons for that is that it’s not normally used by beginners, so the tooling and the ecosystem around it are mostly for experienced programmers. But we’d be introducing it with a recently-developed beginner-friendly IDE, Nightcode, so I had high hopes.

Brian and I decided that while the extensive slides provided by the ClojureBridge organization are really good for people who were at least a little bit familiar with another programming language and wanted an overview of Clojure — they go over the basics of the language, the syntax, and a few of the main data types, and culminate in some turtle graphics using a small app written for the occasion — we needed to come up with a supplemental lesson plan for total beginners. Ideally, this lesson plan would accomplish two goals in the few hours that we had with the workshop participants: 1) Show them how to do something cool, and 2) Give them a reason why they might want to continue studying Clojure (or programming of any kind) in the future.

To that end, we decided to skip the list of datatypes, and skip even talking about what a datatype is, and jump straight to the turtles, in the interest of providing immediate visual feedback.

Brian put together a basic outline, and my plan was to flesh it out into a README. Here was the outline:


  1. Have fun with turtles
  2. Learn about programming
  3. Learn about turtles


  1. Hello World — print something
  2. Turtles — forward, right, left, make some shapes
  3. Arithmetic — +, -, *, / — then use arithmetic in turtle commands (stealth intro to function composition!)
  4. def — assign names to values
  5. defn — assign names to values which also happen to be functions
  6. More fun with turtles — reuse code using dotimes
  7. Culminating exercise — make a function to draw a polygon of arbitrary size and arbitrary number of sides.

It looked great to me, and I fleshed out a basic README for the students, but when we looked at my end result, we realized something — it wasn’t doing a very good job of explaining why anyone would ever want to use def, in these very simple examples. The best I could come up with was “in case you forget what angle 90 degrees is”:

;; make a square
(def right-angle 90)
(forward 50)
(right right-angle)
(forward 50)
(right right-angle)
(forward 50)
(right right-angle)
(forward 50)
(right right-angle)

And indeed, when I later showed something like this code sample to beginners in the workshop, their reaction was usually along the lines of, but 90 is shorter to type. In other words, why would you want to give a second name to something (“right-angle”), when you already have its original name (“90”) that more clearly indicates the nature of the value in question? (Of course, it would be different if we were dealing with long strings instead of short numbers, but that wasn’t the case here.)

Functions, on the other hand — the thing that is taught much later in most programming classes — were quite easy to explain the purpose of. Even functions that are purely used for side effects and take no arguments are obviously useful to eliminate code repetition:

;; make a square
(defn draw-square []
  (forward 50)
  (right 90)
  (forward 50)
  (right 90)
  (forward 50)
  (right 90)
  (forward 50)
  (right 90))

And then when you start parameterizing them, their power becomes obvious:

;; make a square of any size!
(defn draw-square [size]
  (forward size)
  (right 90)
  (forward size)
  (right 90)
  (forward size)
  (right 90)
  (forward size)
  (right 90))

And I think this is the best way to start introducing the concept of binding names to values. Function arguments seem much more obviously useful to a person on their first day of programming than the top-level vars that most closely resemble global variables in JavaScript.

How did ClojureBridge go?

As it turned out, we had a particularly excellent volunteer turnout, so after I gave an hour-long lecture from the existing curriculum to everyone — beginners and experienced programmers alike — for the rest of the day every student pretty much had an individual tutor.

As someone who had just taught a ten-week course to people of wildly different experience levels, to all of whom I was supposed to have taught the same material to by the end, this was a dream. Everybody got to work on exactly what they wanted to, at their own pace, with as little or as much guidance as they wanted and with multiple mentors hanging around to answer questions at all times. A good time was had by all.

As for the beginner’s supplement that we wrote, I was actually scrambling to finish it during the workshop, so it never got used in its entirety, but we used enough of the concepts in practice with the workshop participants to confirm that we had been right in our intuition that defn should come before def. Here is the end result, slightly fixed up since the workshop:

A Gentle Introduction to Clojure Through Turtles

Help out ClojureBridge

Want to increase diversity in the Clojure community by volunteering with ClojureBridge? You can go here to find out what’s already going on in your city, or to find out how to organize your own ClojureBridge event!

Disrupt to Bullshit

2014 July 15
by Richard Harrington

In the past few months I’ve been endlessly entertained by a browser plugin called “Cloud to Butt.” There are a couple different versions on the Interwebs, but the most recent one takes every instance of the word “cloud” on any website you’re looking at and turns it into “butt,” and in addition, “the cloud” becomes “my butt.” You can find the code for the latest version here, and the Chrome extension here.

I recently decided to provide the same treatment for a word I’ve seen too much recently in tech and business articles, on book covers, and on Oakley ads in the Union Square subway station: “disrupt.” For those who want to cut straight to the chase, here are the links:

Disrupt to Bullshit: the Firefox add-on
Disrupt to Bullshit: the Chrome extension
Disrupt to Bullshit: the source code

I’m quite happy with the outcome now, but I did not immediately settle on the word “bullshit” as the replacement word. My friends and I discussed a number of other possibilities early on, including “fart,” “poop,” “shit,” “eviscerate,” and “disembowel.” The word had to retain some of the juvenile humor of “butt” (which “fart” and especially “poop” had in spades), while also capturing some of the unnecessary, unpleasant, and hilariously out-of-place violence of the word “disrupt” when used in articles about startups that want to help you organize your email (best represented by words like “disembowel”).

In the end, we found “bullshit” to be funnier than all the others, as it connotes the best mix of immaturity and aggression. It also has the same number of syllables as “disrupt,” which generally helps the affected web content to retain some of the flow that the authors originally intended, albeit with a different meaning.

The word “bullshit” is not, of course, perfect. Companies that get called “bullshit” in websites transformed by the plugin are unfairly maligned if they are doing good in the world, and they are let off the hook if they are doing bad (in the latter case because calling their efforts “bullshit”, i.e. pointless and illusory, is inaccurate if they are accomplishing real harm).

It is also important to point out that Disrupt to Bullshit is not a commentary on any particular theory of business cycles. I didn’t even know until recently that the concept of disruptive innovation is pretty much owned by one guy, a Harvard Business School professor named Clayton Christensen, who’s been talking about it for almost twenty years. He seems like a nice person, but from what I’ve read so far, his attempts to apply the theory of disruptive innovation to the non-profit and governmental sector, or even to transactions between large enterprises (as opposed to economic activity involving individual consumers), seem misguided and possibly dangerous to society.

But I am, as of yet, not fully informed on this topic. I might get back to this in a blog post after I’ve done some more research and read a couple of his books. In the meantime, this plugin is not about him or his work. In fact, I’ve seen Christensen agree in an interview that the word “disrupt” has gotten totally out of hand. So I think he might enjoy this plugin.

And I hope you all enjoy it as well. I offer it for the entertainment of anyone who wishes to install it in the privacy of their own browser, in the hope that they may take full advantage of the comedic opportunities afforded to us all by the current over-saturation of the word ‘disrupt.’

And in a later blog post, I’ll get into some of the technical challenges involved in making the plugin. Turns out verbs (like disrupt) are a lot harder than nouns (like cloud)!

traceroute is a hack! (the meandering portion of Week 8 of Hacker School)

2013 July 28
by Richard Harrington

Like many of my weeks in Hacker School, this one started out meandering and got more focused towards the end. Some of my fellow students have told me they fall into the exact opposite pattern — they start out strong and then peter out — but I tend to get my wind at the end of the week and try to do as much as I can over the weekend.

That’s not to say that I in any way regret puttering around. One of my favorite things to do at Hacker School is to get into a conversation with someone about something I know nothing about, and then end up working on the project with them for a little while. Often I end up following some tangent that turns into a project idea for later in the summer, that I never would have thought of.

For instance, on Monday morning, my fellow Hacker Schooler Sunil Abraham got into a long conversation across the table from me with Jessica McKellar, our resident for the week. One of her specialties is networking, and Sunil was having her explain a few things, because one of his goals for the summer was to learn more about networking and the protocols governing the internet. One of his other goals was to learn a language like OCaml or Haskell, so he figured he would follow Jessica’s suggestion to re-implement the Unix utility ping, but he wanted to do it in OCaml.

So I figured I’d help him out, since I knew a little bit about SML, a forerunner to OCaml, and since I’d never even thought of implementing a network utility until that morning, so I was curious to see if it could be done.

Two mornings later, after having switched from trying to implement ping to trying to implement traceroute instead (for reasons too obscure to go into in this blog post), we wisely abandoned this project, but I came out of it knowing quite a few little tidbits I never knew before, like:

  • How to get OCaml running in VIM and piped to a repl using the utility tmux, which I’d never used before,
  • How to use the OCaml package manager (opam),
  • How to access C networking utilities using the OCaml socket library,
  • and last but not least, that traceroute is a total hack!

Let me explain that last point.

Before this week, I had a big misconception about what traceroute actually does. I assumed that it traces a specific route, meaning that a packet leaves my computer and sends back messages from every node along the way until it reaches its target, but that’s not what happens at all. What actually happens is that for each hop, three packets are sent to the target address with a special field called a TTL (Time To Live) set to expire immediately, and return messages to the originator. Then three more packets are sent out to the target, but set to expire after two hops, at which point the nodes two hops away send back messages. Then three more, set to expire after three hops, and so on, until the target is reached or a node is encountered which blocks traceroute (this happens a lot). So the path may or may not represent a path that any particular packet actually took, but it’s likely to be a close approximation, unless something major happens to the internet in between when the originator sends one set of packets and when it sends the next.

In case you’ve never seen a traceroute output, this is what it looks like. Notice that some nodes have multiple addresses, meaning that the three different packets went to more than one place, but that there are always three times listed, for each line number:

Harrington-MacBook-Pro:~ richardharrington$ traceroute
traceroute: Warning: has multiple addresses; using
traceroute to (, 64 hops max, 52 byte packets
 1 (  1.044 ms  0.901 ms  0.704 ms
 2 (  12.072 ms  9.925 ms  13.181 ms
 3 (  12.111 ms  13.007 ms  14.814 ms
 4 (  15.769 ms  20.855 ms  19.957 ms
 5 (  18.767 ms  20.741 ms  13.818 ms
 6 (  12.291 ms  16.625 ms  20.841 ms
 7 (  14.191 ms  15.702 ms (  14.615 ms
 8 (  17.473 ms  13.600 ms (  13.725 ms
 9 (  15.267 ms (  11.700 ms (  14.945 ms
10 (  17.421 ms  19.263 ms  14.790 ms
11 (  13.693 ms  17.985 ms  16.313 ms

Interestingly, its use in traceroute is not the main purpose of the TTL field, which was intended as a way for unreliable communications protocols to signal a message failure.

Communication protocols that guarantee delivery of packets, like TCP (the protocol layer that underlies web pages and email, among other things), use very robust communication channels, including requests for redelivery of packets that fail. The originator certainly knows very quickly when a packet has not reached its destination. But there are other protocols, such as UDP (used for streaming video, among other things) and ICMP (used for all kinds of things, particularly message-passing), which are much cheaper and more efficient by virtue of the fact that they’re not actually guaranteed to reach their destination. Packets sent via these protocols need some simple way to alert the sender that the packet has failed, though, and they normally do this by having their TTL field set to a high number (the default is 64, I believe) and decremented by one at each hop. If the TTL reaches zero at any node short of its destination, the node drops the packet and sends an ICMP “Time To Live exceeded in transit” message back to the originator of the packet.

So traceroute takes advantage of this system by sending out messages with TTLs of 1, then 2, then 3, etc., corresponding to the line numbers above. The contents of the message are irrelevant; only the TTL fields themselves matter.

So it’s a hack, but as my fellow Hacker Schooler Martin Törnwall points out, it’s a very elegant hack. And something I may try to implement in Clojure or ClojureScript later in the summer. If I have time.

Miscellaneous Fun Things from Week 7 of Hacker School

2013 July 28
by Richard Harrington

I wrote a previous post about what I spent the majority of Week 7 doing — working on the Robotwar project — but here’s a couple more items, in the interest of giving people an idea of all the little things that go on there:

  • Pairing on an implementation of Conway’s Game of Life in Clojure with fellow Hacker Schooler Jimmy Li on Thursday. The main algorithm was twelve lines of code, and probably could be a bit shorter. Compare that to the JavaScript version I made a couple years ago, which is about two hundred. Most of that discrepancy can be attributed to a general increase in my programming ability without respect to any specific language, but a fair bit is due to the elegance of Clojure itself. I’m really starting to like that language.
  • A back-and-forth e-mail exchange over the weekend between fellow Hacker Schooler Sebastian Bauersfeld and myself over what is the best algorithm in Clojure to generate the powerset of a set, that is, the set of all subsets of that set (including the full set itself and the empty set). My first version was quite clever but somewhat convoluted and, it turned out, totally wrong — it only worked for sets of three items or less, which is all I had tested it on:

    (defn powerset [s]
      (let [range-len (range (count s))]
        (set (for [rot range-len
                   n range-len
                   sub-seq (split-at n (apply concat (reverse (split-at rot (seq s)))))]
               (set sub-seq)))))

    Sebastian’s was better — instead of doing weird stuff with rotation and splitting sets into two pieces as above, it had the correct idea, which was to progressively fold each element of the original set into the growing list of subsets:

    (defn powerset [s]
      (if (empty? s)
        (let [e (first s)
              power-e (powerset (disj s e))]
            (set (map #(conj % e) power-e))))))

    But then he showed me this one that he found on the internets:

    (defn power-set [s]
      (reduce (fn [ss x] 
                (concat ss (map #(conj % x) ss))) 

    which put us both to shame. Actually, Sebastian’s was pretty close — it just used direct recursion instead of the `reduce` function. Mine was very far off.

    But that last one is so simple! So short! So obvious in retrospect! (although, to be fair, it returns a sequence, not a set, but making it return a set would just require one more function call).

    Sometimes I feel as though I have a mountain of things to learn, but then I’ll see that I was able to figure out how this guy’s powerset function works almost immediately, which I would NOT have been able to do a month ago. So although I don’t always see it day to day, occasionally I see inklings of the possibility that I will look back on this summer and realize that I learned an enormous amount.

On Quitting While You’re Ahead

2013 July 26
by Richard Harrington

In the early part of Hacker School, I was involved in a study group making its way through the book, Types and Programming Languages, by Benjamin Pierce. It’s a very interesting book, and I started reading it not only because I felt I needed a better grounding in lambda calculus and type theory, but because I thought I would actually enjoy it. I’d really loved a class I took on Coursera last spring, Dan Grossman’s Programming Languages course, and the material in TAPL seemed to be a deeper treatment of some of the theory brought up in that course, in the same way that my college course in abstract algebra (which I loved) was a re-examination of grade-school arithmetic. I’m not comparing Grossman’s course to grade school, but you know what I mean — first you learn how to use the stuff, and then you really delve into how and why it works in exhaustive detail.

In the third week of Hacker School, we had our second meeting of the group in the “fishbowl,” which is what we called the central room in the Hacker School space that has windows on all four sides looking out on… the rest of Hacker School.

I’d already started to spend a lot more of my time coding and a lot less time reading the book, and I was mostly unprepared for the meeting. Once the group got past the part of the chapter on lambda calculus that I had read and understood, I was completely lost. I wasn’t exactly sure when I would catch up, either, as I was feeling less and less inclined to spend my summer poring over a graduate level textbook in type theory, not having ever taken a CS course in my life.

At a certain point during the study group meeting I was reminded of my friend Krista, who grew up in Manitoba and was a star cross-country runner in high school, winning province-wide awards. Then one day she decided to stop doing cross country, and followed through on her decision immediately — in the middle of a race. I’ve never watched cross country myself, but apparently, unlike in track races, it’s quite common to find yourself alone for stretches of time, and it was during one of these stretches that she thought to herself, “I don’t want to do this any more.” She slowed down, walked to the finish line, and never ran another race.

Inspired by her example, I decided to leave the fishbowl and go work on my tic tac toe project in Clojure. I felt pretty good about it. I still think it’s incredibly interesting material, and I have greatly benefited since then in my Clojure studies from the refresher course on set theory and the definition of a function in the first chapter of the book, but I never looked back. I’ve spent my summer since then making things and helping other people make things. Some day, when I’ve made enough things, I’ll go back and look at how exactly they’re made, from the ground up.

This experience has now repeated itself a number of times over the summer. Being as unstructured as it is, Hacker School is an exercise in knowing when you need to keep persevering on a project, and when you’ve learned as much as you’re going to learn and need to move on, even if you’re not “finished” in any traditional sense.

For instance, this past week, fellow Hacker Schooler Sunil Abraham and I spent a couple mornings trying to re-implement the Unix utility traceroute in the OCaml language, not knowing much about OCaml or about networking. Eventually we wisely gave up on that. I now know slightly more about networking and the traceroute utility, but that’s the subject of another post…

Week 7 of Hacker School (July 15 to July 21)

2013 July 21
by Richard Harrington

This week I made slow but steady progress on compiling the Robotwar code down to the virtual machine code that came with the original game. I paired with a couple of the facilitators and residents (one — Kevin Lynagh — over Skype) and reworked the code to the point where I liked it, but I spent perhaps too much time refactoring and not enough time moving forward. The question of how much time to spend making one’s code look elegant and perfect is a tough one, and I think I may have been erring too much on the side of endless tinkering, because it’s Hacker School, not a workplace, and no one’s going to stop me. For instance, here are four versions of a parsing function that I created in consultation with a few different people, each of whom had different opinions about how it should be done.

; The parser-priority list is a sequence of pairs. The first of each pair 
; is a function which takes a token-string and returns a parsed value, 
; e.g. "-6" returns the integer 6. The second of each pair is the associated 
; type which will be added onto the token as it moves forward through the 
; parsing and compiling process. 
(def parser-priority
  [[registers  :register]
   [commands   :command]
   [str->int   :number]
   [valid-word :label]
   [return-err :error]])
; And here are the parsing functions. They all take a token and run it 
; through the parser-priority list, running the parsing functions in 
; order until they find one that returns a non-nil value, at which point 
; they take that value, and the corresponding type from the parser-priority 
; list, and use those two things (and the pos value from the original token) 
; to return a new token-map:
(defn parse-token-original
 [{:keys [token-str pos]}]
  (fn [[parser token-type]]
   (when-let [token-val (parser token-str)]
    {:val token-val, :type token-type, :pos pos}))
(defn parse-token-with-for-comprehension
 [{:keys [token-str pos]}]
 (first (for [[parser token-type] parser-priority 
         :let [token-val (parser token-str)] 
         :when token-val]
         {:val token-val, :type token-type, :pos pos})))
(defn parse-token-hybrid
 [{:keys [token-str pos]}]
 (some identity (for [[parser token-type] parser-priority]
                 (when-let [token-val (parser token-str)]
                  {:val token-val, :type token-type, :pos pos}))))
(defn parse-token-loop-recur
 [{:keys [token-str pos]}]
 (loop [[[parser token-type] & tail] parser-priority]
  (if-let [token-val (parser token-str)]
   {:val token-val, :type token-type, :pos pos}
   (recur tail))))

The first one uses an anonymous function, the second two use ‘for’ comprehension, and the last one uses loop-recur. I’ve been getting the feeling that directly using the loop-recur structure is less Clojure-y, but I could be totally wrong about that. In some ways it’s the Wild West, and best practices are still being developed.

In the end I think all these versions of the parsing function are fine, with the possible exception of the loop-recur one, because it has no provision for dealing with the case for when you get to the end of the list (is that called the null case?). In fact, it relies upon the fact that the last function in the parser-priority list actually always returns an error string no matter what its input, but this function shouldn’t have to know that.

These are all fine, really. In the end, I think I prefer the one I had in the first place, but only slightly.

As one of the facilitators pointed out, I should arguably not even be reducing this Robotwar source code to its object code, because they’re so similar (see here for an example), and why not just write an interpeter, and have it start doing things already? Like having robots fight? I do wonder sometimes why I’m sticking to the exact form of the original game, but it’s been fun so far, and I’ve also had this idea recently that I want to write a compiler from Scheme down to this Robotwar VM code. We’ll see if I have time for that!

The “trie” data structure

2013 July 15
by Richard Harrington

Every Monday night at Hacker School we have a lecture of some kind, usually by one of the residents. Last Monday night it was a super-interesting one by our resident for the last two weeks, Peter Norvig, about natural language processing. Turns out with a few simple lines of Python and a large enough corpus of words (like a book), you can do an amazing amount of things, such as correct bad spelling mistakes and determinewhichwordsshouldbeseparatedwhenthey’rewrittenwithoutspaces.

On a mostly unrelated note, after I got home from the lecture I took a look at some code that two of my fellow Hacker Schoolers Doron Roberts-Kedes and David Lichtenberg had put together that day: an implementation of the “trie” data structure in JavaScript. It’s a variation on a basic key-value store, where each key is represented by a path in a tree. Each node on the tree has pointers to zero or more other nodes, and each traversal of the tree consists of an ordered series of these pointers. Usually the pointers are letters, and thus each traversal forms a string.

“Trie” is short for “retrieval,” and the person who first came up with it pronounced it like “tree,” but people since then have pronounced it like “try” in order to distinguish it from other kinds of trees. For more fun facts and a nice diagram, see the Wikipedia article.

Each node can optionally hold a value as well, but I’m told by fellow Hacker Schooler Sunil Abraham that one of the most obvious uses of this kind of data structure is for auto-completion. If someone has already typed in “at”, and the only other words in your corpus that they might be searching for that begin with those two letters are “ate” and “atelier”, you can start your search very efficiently for words to put into a drop-down menu, starting at the “at” node in the trie.

I got kind of obsessive about it and stayed up far too late, but it was fun being able to do a code review with Doron and David the next morning, learning about data structures from them and teaching them about JavaScript idioms. That’s the best of Hacker School.

The feature set of my final version of the code is kind of a hybrid of what I thought a trie was supposed to be for (an efficient key-value store), what Doron and Michael originally focused on (a way to count the frequencies of words), and what Sunil told me it’s mainly used for (auto-completion). Here it is:

(function(exports) {
    var splitWord = function(word, emptyCb, nonEmptyCb) {
        if (word.length === 0) {
            return emptyCb();
        else {
            return nonEmptyCb(word[0], word.substring(1));
    function Trie(){
        // will be set with properties "val" and/or "children"
    Trie.prototype = {
        walk: function(cb) {
            function recur(node, prefix) {
                cb(prefix, node.val);
                for (var letterKey in node.children) {
                    recur(node.children[letterKey], prefix + letterKey);
            recur(this, "");
        getNode: function(word) {
            var self = this;
            return splitWord(word, function() {
                return self;
            }, function(firstLetter, restOfWord) {
                var children = self.children;
                return children && children[firstLetter] && children[firstLetter].getNode(restOfWord);
        setNode: function(word, cb) {
            var self = this;
            splitWord(word, function() {
                if (cb) {
            }, function(firstLetter, restOfWord) {
                self.children = self.children || {};
                self.children[firstLetter] = self.children[firstLetter] || new Trie();
                self.children[firstLetter].setNode(restOfWord, cb);
        setNodes: function(words, cb) {
            var self = this;
            words.forEach(function(word) {
                self.setNode(word, cb);
        getValue: function(word) {
            var node = this.getNode(word);
            return node && node.val;
        setValue: function(word, val) {
            this.setNode(word, function(node) {
                node.val = val;
        setValues: function(obj) {
            for (var key in obj) {
                this.setValue(key, obj[key]);
        getKeyValuePairsFromPrefix: function(prefix) {
            var subTrie = this.getNode(prefix);
            var results = {};
            subTrie.walk(function(word, val) {
                if (val !== undefined) {
                    results[prefix + word] = val;
            return results;
        getSuffixesFromPrefix: function(prefix) {
            var subTrie = this.getNode(prefix);
            var results = [];
            subTrie.walk(function(suffix, val) {
                if (val !== undefined) {
            return results;
        getKeysFromPrefix: function(prefix) {
            return this.getSuffixesFromPrefix(prefix).map(function(suffix) {
                return prefix + suffix;
        inspect: function(){
            return this.getKeyValuePairsFromPrefix("");
        inspectKeys: function() {
            return this.getKeysFromPrefix("");
    // Now for WordCountStore
    var addOrIncrementVal = function(node) {
        node.val = (node.val? node.val : 0) + 1;
    var WordCountStore = function(){
        this.trie = new Trie();
    WordCountStore.prototype = {
        insertWord: function(word) {
            this.trie.setNode(word, addOrIncrementVal);
        insertWords: function(words) {
            this.trie.setNodes(words, addOrIncrementVal);
        getWords: function() {
            return this.trie.inspectKeys();
        getWordCounts: function() {
            return this.trie.inspect();
    // And for an auto-completer, the basic functionality would just be to get a list using
    // the "getKeysFromPrefix" method from Trie. Other stuff could be added,
    // but it's not currently implemented as a separate class.
    exports.Trie = Trie;
    exports.WordCountStore = WordCountStore;
}(typeof exports === 'undefined' ? this : exports));

(This code can also be found on github.)

Hacker School, Weeks 4, 5 and 6

2013 July 14
by Richard Harrington

tl;dr It’s been a bit of a whirlwind these past few weeks, but I’m back on track with my Robotwar project.

Before the end of the 4th week of Hacker School, I went to Ottawa, Ontario for five days to perform in a comedy I wrote with my friend Chris Kauffman, at a theater festival there (see It was a lot of fun and I was able to reconnect with some old friends from the Canadian fringe theater circuit, a part of my life that seems very far away this summer. But I had to spend more time preparing for it than I expected, since I hadn’t done the show in two years.

I returned on Monday, July 1, and the 4th of July was the following Thursday, so my 5th week of Hacker School ended up being only two days long.

Because of all this, I got a little off my stride in terms of the big project I was theoretically going to work on over the course of the Hacker School batch: the reverse engineering of the Robotwar game. I started to feel a bit adrift again, as I had in the beginning of June. I wasn’t sure I was appropriately equipped to thrive in the unstructured environment of Hacker School, and I was nervous that I wouldn’t accomplish a tenth of what I’d hoped to accomplish. By the end of the 5th week, I had only barely gotten started on Robotwar.

But late in the 6th week (which just ended), I got my second wind with the help of a long pairing session with one of the facilitators, Zach Allaun, who helped me tease apart the various steps in the compilation process for the Robotwar language, making each one more sensible.

The end result of the compilation step (which itself is only the first of many things I have to do to make a functioning game) will be to turn this:

     AIM+5 TO AIM

into this:

0      ,      AIM
1      +      5
2      TO     AIM
3      ,      AIM
4      TO     RADAR
5      IF     RADAR
6      <      0
7      GOSUB  FIRE
8      GOTO   SCAN
9      ,      0
10     -      RADAR
11     TO     SHOT
12     ENDSUB

And that begins with lexing. For those of you (like myself before this summer) who have never heard of the term "lex," it means "lexical analysis." It's the first step in the process of translating human-written source code into machine-readable code, and it involves separating the strings of source code into words and other meaningful chunks. It's also the most notoriously difficult and tedious step, because a lot of syntactic forms look obvious to us but are baffling to a computer.

This Robotwar language is actually pretty simple in that respect -- the only tricky part is determining whether "-" characters are binary subtraction operator or unary minus sign operators.

So here's the version of the lexer I had before I worked with Zach. It separates strings into tokens and also adds an index property to each token, which will be useful information later when we add error reporting to the program. Note the nasty minus sign logic, which checks to see whether the previous token is a numeric value (either a number literal or a register name).

Also, note the use of the core.match module to do pattern matching. Since it's not binding any variables in the matches, one could argue core.match is overkill here, but I feel it does prevent me from having to use a series of suspiciously similar "if" expressions -- that is, if we're in the middle of parsing a token do one thing, otherwise do something else -- for each of the cases.

(use '[clojure.core.match :only (match)])
(def registers (set (concat (map #(-> % char str) (range (int \A) (inc (int \Z))))
                            ["AIM" "SHOT" "RADAR" "DAMAGE" "SPEEDX" "SPEEDY" "RANDOM" "INDEX"])))
(def operators #{\= \< \> \# \+ \- \* \/})
(defn digit?
  (re-find #"\d" (str ch)))
(defn conj-with-metadata 
  [coll s n] 
    (conj coll {:token s, :pos n}))
(defn lex-line
    [line initial-line
     partial-token []
     saved-pos 0
     result []]
    (let [close-partial-token (fn [] (conj-with-metadata result (apply str partial-token) saved-pos))
          current-pos (- (count initial-line) (count line))
          previous-token (:token (last result) "")
          parsing-token? (not (empty? partial-token))
          ; a binary operator is any character on the operator list,
          ; unless it's a minus sign that is actually a unary operator
          ; because the token before it could not be a number
          binary-op? #(and (operators %)
                           (or (not= % \-)
                               (digit? (last previous-token))
                               (registers previous-token)))
          head (first line)
          tail (rest line)]
      (match [head parsing-token?]
        [(:or \; nil) true ]          (close-partial-token)
        [(:or \; nil) false]          result
        [(:or \space \t) true ]       (recur tail [] nil (close-partial-token))
        [(:or \space \t) false]       (recur tail [] nil result)
        ; if it's an operator of any kind and we're currently parsing a token, 
        ; close the partial token and recur on the same character.
        [(_ :guard operators) true]   (recur line [] nil (close-partial-token))
        ; we don't have to check here if the binary operator is false because
        ; if it weren't, it would have been caught by the less restrictive line above.
        [(_ :guard binary-op?) _]     (recur tail 
                                             (conj-with-metadata result (str head) current-pos))
        [_ true ]                     (recur tail (conj partial-token head) saved-pos result)
        [_ false]                     (recur tail (conj partial-token head) current-pos result)))))
(lex-line "IF RADAR < -5 GOTO SHOOT   ; some comments")
; -> [{:token "IF", :pos 0} 
;     {:token "RADAR", :pos 3} 
;     {:token "<", :pos 9} 
;     {:token "-5", :pos 11} 
;     {:token "GOTO", :pos 14} 
;     {:token "SHOOT", :pos 19}]

I was thinking of having one parsing step after the lexing step in the above code, and then a compiling step down to the Robotwar virtual machine code. But after this week I realized that I should have as many steps as I need to make it easy, and I also realized that there's no reason for this first step to have any logic in it at all. I ended up separating a couple of things out into their own functions: stripping out the comments (which happens before the lexing step now) and dealing with the ambiguity of the "-" character (which happens after the lexing step). The minus sign issue turns out to be pretty easy to do in the next step, the parsing step, once we've isolated all the operators and words in the lexing step.

Last but not least, I jettisoned core.match for the time being, since I realized that by getting rid of that minus sign nonsense, I was down to a level of simplicity where I could use regular expressions (it's possible I could have used them before, but I suspect not, and I wasn't quite enough of a regexp master to attempt that one). Here's the new code for the lexer, which is drastically shorter while only sacrificing a tiny bit of functionality that is better off elsewhere in the program anyway:

(def operators "-+*/=#<>")
(defn re-seq-with-pos
  "like re-seq, but returns a sequence of 2-element sequences
  in the form [match pos], where pos is the index of match in s"
  [re s]
  (let [matches (re-seq re s)
        indexes (reduce (fn [idxs match]
                          (conj idxs (.indexOf s match (inc (or (last idxs) -1)))))
    (map list matches indexes)))
(defn build-lex-metadata 
  [s n] 
  {:token-str s, :pos n})
(def lex-re (re-pattern (str "[" operators "]|[^" operators "\\s]+")))
(defn lex-line
  (map #(apply build-lex-metadata %) 
       (re-seq-with-pos lex-re line)))
(lex-line "IF RADAR < -5 GOTO SHOOT")
; -> ({:token-str "IF", :pos 0} 
;     {:token-str "RADAR", :pos 3} 
;     {:token-str "<", :pos 9} 
;     {:token-str "-", :pos 11} 
;     {:token-str "5", :pos 12} 
;     {:token-str "GOTO", :pos 14} 
;     {:token-str "SHOOT", :pos 19})

All that remains now is parsing, compiling to the virtual machine object code, writing an interpreter for it, and making a graphics engine to show robots fighting!

Now that I feel a little more grounded and solidly positioned to make some headway this coming week, I look over the last few weeks and I can see that I actually learned a great deal helping others on their projects, learning Clojure (particularly by doing problems on, implementing a "trie" data structure in JavaScript, having a thousand fascinating conversations about programming, and slowly coalescing ideas for another semi-major project I might work on if I have the time: abstracting out the generic parts of my tic-tac-toe-playing AI, so that you could pass in a game description and have it play any game -- tic-tac-toe, connect four, checkers, chess, whatever. (More on that in a later blog post.)

Last but not least, the above code can be found on github in the following locations:

the main repository
the version that was current when I wrote this blog post
the version from the first code block above

Hacker School, Week 3

2013 July 1
by Richard Harrington

tl;dr Tic-tac-toe, and more Tic-tac-toe

I’d gotten started on my Tic-tac-toe AI (artificial intelligence — a pretty fancy word for a Tic-tac-toe algorithm, but that’s the term people use) in Clojure at the very end of the previous week, and my plan for Week 3 was to wrap that up by the end of Monday, then do an implementation of the Game of Life in Clojure, then start working on my Robotwar project (see my blog post from Week 2). By the end of the week, I was still working on Tic-tac-toe. This was a little discouraging, but I was learning a new language, so I tried to give myself a break.

I actually got a basic Tic-tac-toe AI going fairly easily, using a heuristic algorithm, which means a series of basic rules for picking good moves. Mine picked them in the following order of priority:

  1. Pick a square that will be a winning move.
  2. Pick a square that will prevent the opponent from making a winning move on their next turn.
  3. Pick the center.
  4. Pick a corner.
  5. Pick any other free square.

This pretty much works every time, but as I learned early in the week, there is a more elegant solution, called the minimax algorithm, which can be applied to any zero-sum game like Tic-tac-toe (or really any competitive game, I think, with some modifications and compromises).

Minimax seemed like magic to me when I heard about it. The basic idea is simple to understand: when given any board, either at the beginning of the game or in the middle, have the AI play out every possible combination of moves, figure out who wins at the end of each branch in the tree of all possible moves, and work back recursively to figure out what is the best move for a given player at the given point in the game. (When doing the calculation, the AI makes the assumption that each player will play as well as the AI does.)

There are a number of variations and optimizations for speed, the most important of them being the one where you have to evaluate intermediate game states, because playing every game out until the end would consume too many computing resources (I gather this is true of pretty much any game except Tic-tac-toe). In this version of the algorithm, you set a depth limit, say 10 moves into the future. Then for each branch where the AI ends up calculating 10 moves into the future before it reaches a final board, it has to use some kind of heuristic algorithm to analyze that board 10 moves into the future and figure out which player that board is most favorable to. This means that even in games like Tic-tac-toe where there are only a few outcomes (win, lose, and draw), point values are assigned to intermediate boards so that they can be compared with each other.

The minimax algorithm gets its name from the fact that one player is designated the “maximizing” player, and the other player is designated the “minimizing” player. The maximizer wants the score to be as high as possible, and the minimizer wants the score to be as low as possible. There are a few variations, but a common version is that winning boards are assigned the value positive infinity (if it’s the maximizing player’s turn) or negative infinity (if it’s the minimizing player’s turn), and all boards that must be calculated in an intermediate state receive positive or negative scores, of various values. So in evaluating any given square before the depth limit is reached, its value is the numerical maximum (for the maximizing player) of all the values of the opponent’s next move, or the numerical minimum (for the minimizing player).

If you understand this immediately and can write out the algorithm, you’re way ahead of me. It actually took me a week to get this working. What I kept getting stuck on were the signs, and what it meant for one player to arbitrarily be positive and the other negative.

I ended up finding it helpful to first implement a preliminary version of the algorithm — one without numbers. In a game which has only a few possible outcomes (in the case of Tic-tac-toe, win, lose and draw) and which can also be played out to the end, it is actually not strictly necessary to use numbers for the return values of the recursive function. You can instead use symbols or strings like “win,” “lose,” and “draw,” or better yet, “X”, “O”, and “draw”. The reason it’s better to use “X” and “O” instead of “win” and “lose” is that if you use “win” and “lose”, you have to pass the opposite symbol (“win” instead of “lose”, or vice versa) up the chain at each level of the recursion, because one player’s win is the other player’s loss. If you use “X” or “O”, you don’t have that problem.

But we’re going to start with the most arduous version here, the version that uses no numbers and uses the generic symbols :win and :lose instead of absolute player-specific values like :X and :O. I wrote it because it helped me to work through the logic behind the algorithm, and it also helped me appreciate the numerical version, once I got to it!

Some helper functions not included below include winner, which returns :X, :O, or nil, board-full?, which is self-explanatory, and free-squares, which returns a set of coordinate pairs representing all the free squares on the board. In the first let expression, potential-board is a new board with a marker added. The values of the squares are either :X, :O, or :e for empty, and the values that the minimax function may return are :win, :lose, and :draw.

(defn opposite-marker [marker]
  (case marker
    :X :O
    :O :X))
(defn opposite-result
  (case result
    :win :lose
    :lose :win
    :draw :draw))
(defn value-of-square-minimax-no-numbers
  [board square marker]
  (let [potential-board (assoc-in board square marker)
        winning-player (winner potential-board)
        opponent (opposite-marker marker)]
      (= winning-player marker) :win
      (board-full? potential-board) :draw
      :else (opposite-result
              (reduce (fn [result new-square]
                        (let [val-of-new-square (value-of-square-minimax-no-numbers
                            (or (= val-of-new-square :win) (= result :win)) :win
                            (or (= val-of-new-square :draw) (= result :draw)) :draw
                            :else :lose)))
                      (free-squares potential-board))))))

And now, here’s the eventual version I wrote using the more usual numerical version of the algorithm to compute the value of a square in Tic-tac-toe. It needs to use the Clojure declare form because it involves two mutually recursive functions, but it is more elegant than the one above.

Because I’m still not bothering with intermediate depth calculations — I haven’t gotten to that point in my study of the minimax algorithm — I’m just assigning the arbitrary values 1 and -1 to wins by X and O. I could have actually used any positive and negative numbers.

In the version below, 1, -1, and 0 are actually used to represent both the values of the markers (X, O, and empty, respectively) and the return values of the minimax function (X wins, O wins, and it’s a draw, respectively). The final-score function statically analyzes a board and returns 1 for X winning, -1 for O winning, 0 for a draw, and nil for a game that is not over.

Note the lack of the need for an opposite-result function, and also the lack of a need for a complicated reduce operation (we can just use the min and max functions instead of reduce:

(declare value-of-square-minimax)
(defn aggregate-value-of-free-squares-minimax
  "helper function to use recursively with value-of-square-minimax"
  [board marker]
  (apply (if (= marker 1) max min)
         (map (fn [square]
                (value-of-square-minimax board square marker))
              (free-squares board))))
(defn value-of-square-minimax
  [board square marker]
      (let [potential-board (assoc-in board square marker)]
        (or (final-score potential-board)
              (- marker)))))

That’s about it. All of this code (with slight variations) can be found at the github repo richardharrington/hs-tic-tac-toe. And here is the old version, without numbers.

Why does Dog.prototype.constructor === Dog ?

2013 June 26
by Richard Harrington

Question: In JavaScript, why does a constructor’s prototype’s constructor equal the constructor itself? In other words, why does

Dog.prototype.constructor === Dog


Short Answer: So that this will work:

var Dog = function(){}
var myDog = new Dog();
myDog.constructor === Dog;
=> true

Long Answer:

What confused me when I first started learning this stuff was the use of the word “prototype” as a property name. It seems like the “prototype” property of an object should refer to that object’s prototype, but it doesn’t.* A constructor’s “prototype” property is not a pointer to the object that the constructor looks to when it wants to start looking up its own prototype chain; rather, it is a pointer to the object that the objects created by that constructor look to when they want to start looking up their prototype chains.

Therefore, when one of those objects wants to see which object created it, it accesses its “constructor” property. Since it has no “constructor” property of its own, it starts up its prototype chain to find a “constructor” property. The first place it looks is the “prototype” property of its constructor function, which contains a property called “constructor,” and that “constructor” property should point to the constructor of the object that started the prototype chain lookup.

In order for that to work, every constructor has to have a “prototype” property which has a “constructor” property which points back to itself, the constructor, so that objects created by that constructor will be pointed to it when they start looking up their prototype chains. So that’s why MyObject.prototype.constructor == MyObject.

Keep in mind also that this “constructor” property is totally unsafe and not to be trusted, unless you take care to keep it intact. At the time any function is defined, it gets a “prototype” property which points to an object having exactly one property: “constructor.” After that, it’s fair game. You can reassign the “constructor” property to anything you want, or, more commonly, you can reassign the entire “prototype” property itself, which will eliminate the “constructor” property. This can sometimes be convenient, but if you care about preserving the “constructor” property (and you may not), then you have to reset it manually after reassigning the “prototype” property. For instance:

var Seal = function(){}
Seal.prototype = {
    waveFlippers: function() { ... },
    bark: function() { ... },
    chillOutOnTheSand: function() { ... }
Seal.prototype.constructor = Seal;

* For that, you used to have only the special property __proto__ which was non-standard and only worked in certain browsers, and now in ECMAScript 5 you have Object.getPrototypeOf(object)