Narvius' Blog: 2012

Tuesday 11 December 2012

Messing with probabilities (part 1: The math)

[Edit: This is no longer "part 1", but the whole article. See "How I screwed myself over with side effects".]

In the mini-project I was thinking of recently there are a lot of coin flips. They usually occur in groups of one to eight, and I only want to know the total amount of heads. Now, I could simply flip that digital coins that many times and be done with it, or I could be a bored college student who actually should be doing other things right now, and think of something more sophisticated.

For something so trivial, there must be a way to reduce the several random flips to ONE random number generation. It's just a matter of dividing the [0, 1]-space into the right intervals, to maintain the same probability for every outcome.

Having cursory knowledge of statistics, my thoughts immediately wandered to something called the binomial distribution. In short, a binomial distribution with parameters n and p - denoted as B(n, p) - describes the chance that x out of n trials with chance p to succeed, will succeed. For example, here is the probability distribution function f(x) (or, in this case, probability mass function, since it's discrete) for B(5, 60%).

This chart tells us that the chance that if we try something that has a 60% chance to succed five times, our chance to succeed twice is around 25%.

It's not hard to notice that getting heads on a coin flip is a random trial with a chance to succeed of 50%. Thus, B(n, 50%), where N is the amount of coins flipped, perfectly describes what I'm trying to achieve!

Calculating the PDF for B(n, p) is a fairly trivial matter, just translating a formula. Here's the Clojure:

(defn binomial [n k]

  (letfn [(fact [n] (reduce * 1 (range 2 (inc n))))]

    (/ (fact n) (fact (- n k)) (fact k))))

(defn binom-pdf [n p]

    (fn [x]

    (if (<= 0 x n)

      (* (Math/pow (- 1 p) (- n x)) (Math/pow p x) (binomial n x))

      0)))

So, now we have the PDF for B(n, p). There's another, related function, called the cumulative distribution function. It's a function F(x) that describes the chance that the result will be less than x. Seems like a weird thing to devote an entire function to, but it's surprisingly useful in practice. The CDF for B(5, 60%) looks like this:

And, the PDF again:

The CDF is simply the sum of all the PDF entries for values less than x. For example, in this case, CDF(3) = PDF(0) + PDF(1) + PDF(2) + PDF(3). This is why the CDF is always rising, and why the final value is exactly equal 1 (= 100%).

Using the cumulative distribution function, we can now map one single random number to a final result. Just for reference, the values of this CDF are roughly:
CDF(0) = 0.01
CDF(1) = 0.087
CDF(2) = 0.317
CDF(3) = 0.663
CDF(4) = 0.922
CDF(5) = 1

Thus, we have a set of intervals, that each corresponds to a result:

[0, 0.01) => 0
[0.01, 0.087) => 1
[0.087, 0.317) => 2
[0.317, 0.663) => 3
[0.663, 0.922) => 4
[0.922, 1] => 5

Which solves the original problem: We divided [0, 1] into intervals that maintain the same weight for every possible outcome. Now, we can just generate one random number, pick the interval it belongs to, and we have our final result.

And how to more-or-less elegantly do that in code, I'll show in the next post (actually, the post AFTER the next one - there's another thing I want to interject first).

Monday 6 August 2012

Macros I've been using

Here are a few macros I've been using.

The epic trio of Debug macros:

1. The regular, simple debug:

(defmacro ?? [& form]

  `(let [result# ~form]

     (println "DEBUG:" '~form "=>" result#)

     result#))

This really is the most simple version there is. If you want to know what a particular expression you cannot for some reason simply extract returns, add ?? as the first element of that expression.

(+ 2 3 (* 4 5) (?? - 5 3))

DEBUG: (- 5 3) => 2

=> 27

2. The regular, abridged debug:

(defmacro ? [& form]

  `(let [result# ~form]

     (println "DEBUG:" '~(map #(if (coll? %)

                                 "..."

                                 (str %)) form) "=>" result#)

     result#))

This one will replace sub-forms in the form with "...". This is useful if you don't want your output too cluttered:

(? + 2 3 (* 4 5) (- 5 3))

DEBUG: (+ 2 3 ... ...) => 27

=> 27

3. The deep inspect:

(defmacro ??? [& form]

  `(? ~@(map #(if (coll? %)

                `(??? ~@%)

                %) form)))

This one places an abridged debug "?" in every form in the tree in which it is placed. For example:

(??? + 2 3 (* 4 5) (- 5 3))

  is equivalent to

(? + 2 3 (? * 4 5) (? - 5 3))

Thus giving:

(??? + 2 3 (* 4 5) (- 5 3))

DEBUG: (* 4 5) => 20

DEBUG: (- 5 3) => 2

DEBUG: (+ 2 3 ... ...) => 27

=> 27

So, recapping, add one to three question marks, the more you put, the more info you get.

It would be, of course, trivial to implement ????, which would be equivalent to a tree full of ??'s, but what's the point? Since ??? is intended to be used on trees, not shallow expressions, the verbosity of ?? on every level would be thorough overkill.

Multi-set!:

(defmacro multi-set! [target & pairs]

  (let [x (gensym)] 

    `(let [~x ~target]

       ~@(for [[field value] pairs]

           `(set! (. ~x ~field) ~value)))

       ~x))) 

Due to working with java.awt.GridBagConstraints a lot recently, I got annoyed with how often I have to rewrite the line (.set! (.[some field] c) [some value]). So, being lazy as hell, I'd much rather write

(multi-set! [instance-expr]

  ([field] [value])

  ([field] [value])

  ([field] [value])

  ...)

In fact, I wrote a macro called "constraints" for working with GridBagConstraints, but it's a tad too long to share here. Also, it's boring.

This macro actually uses a fun trick, aka displays a subtle problem with the auto-gensym feature (symbols with # appended are replaced with a unique name, in order to avoid name clashes).

If you were to write...

(defmacro multi-set-wrong! [target & pairs]

  (let [x# ~target]

    ~@(for [[field value] pairs]

        `(set! (. x# ~field) ~value))

    x#))

In this case, the auto-gensymmed symbols x# in the let-form and inside the set!-form are DIFFERENT symbols, because they are within DIFFERENT quotes. So, this version doesn't work as intended. In fact, the generated (set! ...) expressions will throw an exception, because the symbol x# in them could not be found.

Thus, in the actually working version, we create a symbol by hand, and use that symbol throughout the macro. This is a trick I learned from reading the source of the doto macro (I had a hunch I'd find something useful there, because the macro works similarly to mine - I was right!).

Convenient Map Writing Macros:

More than once, I've written macros that simply implement a more convenient syntax for maps. For example, since I'm learning japanese right now, I wrote a program to quiz me on Hiragana. For that, I had to store all corresponding Unicode characters in the program, and map them to the Romaji (latin) transcriptions.

(defmacro defseries [group & chars]

  `(def ~group

     ~(into [] (map (fn [[name code]]

                      (let [code (+ code 0x3000)]

                         {:name name, :code code}))

                     chars))))

While this wasn't the perfect approach in this case (here, I still had to type char => pairs, just with smaller values and less overhead), but it still vastly reduced the typing I had to do. I do things like that quite frequently.

(defseries a ("a" 0x42) ("i" 0x44) ("u" 0x46) ("e" 0x48) ("o" 0x4A))

I typed out roughly 15-16 series like that. So I did save me some typing with just a tad of thinking.

That's it for now.

N.

Tuesday 17 July 2012

Dynamic Bindings

Time for another post... Nothing fancy though, just something simple I wanted to share. Also, yes. Clojure.

Recently, I've been doing something that involved some statistics (...I've been trying various formulae for calculating combat damage. Eh.), as well as some randomness (dice rolls, I admit it. -.-").

More clearly: The formulae include randomness as part of their calculations. I, when making tables and lists of results, do not want any randomness. Instead of modifying the function definition, or adding some hacky additional parameter, the entire thing can be solved in a rather elegant manner via dynamic bindings.

Dynamic bindings are a mechanism that allow you to... aw, shucks. I can't express it in one sentence, at least not clearly. Consider this example instead:

(def foo 2)

(defn bar []

  (let [foo 4]

    (baz)))

(defn baz []

  (println foo))

Rather obviously, the output is 2. After all, the redefinition of "foo" is contained to the lexical context of the "let" within bar. Now consider this:

(def ^:dynamic *foo* 2)

(defn bar []

  (binding [*foo* 4]

    (baz)))

(defn baz []

  (println *foo*))

This will print 4. At first I thought this kinda goes against the idea of pureness, otherwise so prevalent in Clojure. But as with everything, the only important thing is how you use it. As I've said, there are things that can be solved in a more elegant way thanks to this tool.

Consider something as mundane as I/O redirection. Clojure makes, of course, use of streams, and all I/O operations use, per default, the standard input and output streams. Now, sometimes we want to use different streams... a naive way would be then to simply add optional arguments to the I/O functions and be done with it. Except... it's ugly. Really. Wouldn't it be more awesome if we could just say "Those operations? Do them on this stream instead, kthx."? With dynamic bindings, we can do exactly that. clojure.core defines the dynamically bindable var *out*, the value of which is the currently used stream. So, we might do this...

(binding [*out* (some-file-or-whatever)]

  (println ":D:D:D")

In the scope of this binding, all I/O operations that are aware of *out* will act on the new stream instead. Pretty cool, eh?

Anyways, dice, randomness. You might already see what I did there. And either way, I don't think it requires much more explanation...

(def ^:dynamic *dice-mode* :random)

(defn roll-dice [[amount sides modifier]]

  (case *dice-mode*

    :random (reduce + (or modifier 0) (for [i (range amount)] (inc (rand-int sides))))

    :average (int (Math/round (+ (* amount (/ sides 2)) modifier)))

    :zero 0))

Yes. In the part where I do the statistics, I can now either take the average of dice rolls, or simply completely omit them, via a simple binding.

You might be wondering about why I'm using asterisks. It's not per se required to use them. It's just, you are supposed to name dynamic variables this way. Clojure will give you a stern talking to if you use a dynamically bindable var and do NOT assign it a name surrounded with asterisks (or when you use such a name with regular vars). I guess the reason is readability? I'm not sure though. I don't mind either way, so I'm not likely to look it up. You can, though.

N.

Thursday 14 June 2012

VK_VERYLONGKEYNAME (on reflection in Clojure)

So, I'm still writing a game. As it happens with games, I rely on keypresses for the user to tell me what he wants to do. That made me fall back on Java's KeyListener/KeyAdapter facilities.

The basic idea is that I get callbacks with information about the pressed key as argument (WARNING: GROSS OVERSIMPLIFICATION.), among other things the keycode, which is an integer. Now, I'm supposed to dispatch it using constants defined in KeyListener/KeyAdapter/KeyEvent/probably some other places.

Problem?

It's a lot of typing in Clojure. Especially since each constant is called VK_(something). I had large maps looking like that:

{ KeyEvent/VK_UP [some function]

  KeyEvent/VK_DOWN [some function]

  KeyEvent/VK_RIGHT [some function]

... }

You get the idea. Since I have several keymaps, that gets annoying quickly. So I started to define aliases for some frequently used ones in a seperate "keys" namespace. Now I only had to write, for example, "keys/up" instead of "KeyAdapter/VK_UP". That's a lot of saved space (and some gained readability).

But that introduced something else that annoyed me: Some of the more rare keys were still written out fully, and it created an aesthetic discrepancy, which upset my minor OCD. Besides, having to redefine tons of variables by hand isn't particularly elegant in itself.

(Disclaimer: I am aware of clojure.contrib/import-static. But I don't rely on that library enough to warrant including it in my project just for this.)

Sure enough, there is a better way.
The first nugget of information I needed was... how to get a list of all fields in a Java class?

Of course, Stack Overflow helped me there. It's

(clojure.reflect/reflect java.awt.event.KeyEvent)

It spews out a LOT of things like this:

{:bases #{java.awt.event.InputEvent},

 :flags #{:public},

 :members

 #{{:name VK_F17,

    :type int,

    :declaring-class java.awt.event.KeyEvent, 

    :flags #{:static :public :final}}

   {:name VK_DEAD_DOUBLEACUTE,

    :type int,

    :declaring-class java.awt.event.KeyEvent,

    :flags #{:static :public :final}}

...}

This is the abridged pretty-print output of the stuff. We know all the VK_ fields are keycodes. Everything else falls in place itself, basically. Most of what I need I covered in a previous post already.

We want to:
1. From the list of members, filter out everything that is not a keycode.
2. Define the aliases over it.

Simple enough.

(->> (clojure.reflect/reflect java.awt.event.KeyEvent)

     (:members) (map :name) (filter #(.startsWith (str %) "VK_")))

This yields us a list of symbols (which I am grateful for, as it simplifies the next step) of all fields that contain keycodes. Now, as in my earlier post, we doseq over the list and define the aliases via eval in the loop. I also wanted the VK_ dropped and the entire name downcased, but that's just my whim.

(doseq [sym (->> (clojure.reflect/reflect java.awt.event.KeyEvent)

            (:members) (map :name) (filter #(.startsWith (str %) "VK_")))]

  (eval `(def ~(symbol (.toLowerCase (drop 3 (str sym))))

           (. java.awt.event.KeyEvent ~sym)))) 

That's it. Now, there are vars like "a", "up" and so on that hold the corresponding keycodes. I put the entire thing into a new namespace (game.keys) that I require like that:

(ns game.some-module

  (:require [game.keys :as keys])

Another nonproblem solved! :D

Wednesday 13 June 2012

A Simple Solution; Part Two

A Short Solution, Part 2:

In the previous part, I laid out the initial draft of the solution to the problem described there, as well as, in the end, pointed out a few problems with said solution.
In addition, you might have noticed that the definition of generate-node was superfluous; in fact, I factored it out. The (cons chunk...) was added directly inside generate-tree.

Anyways. First we shall deal with dealing with duplicate entries in the chunk list and there not being a value in the root node.

(defn to-tree 

  "Generates a solution tree for the given word and chunk list."

  [word chunks]

  (cons word (generate-tree word (keys (group-by identity chunks)))))

This function will be the one called in the end - it's not done yet, but both the mentioned problems are solved. The value of the root node is now the word being assembled, and
the chunk list is filtered of duplicate entries (I am not sure whether there is some other, more efficient method, this was what I thought of off the top of my head when writing).

Now, for the harder part. Filtering invalid entries out of the tree. Any branch that ends without an entry is invalid - but guessing which ones are invalid before everything is
generated would be impossible (I think?), or at least extremely hard to implement. So I went for the simpler, more functional approach: Filter out invalid entries after everything
is generated. Since the user will never see the unfiltered step inbetween, we can mark invalid branches in some way instead of writing a function that assembles a branch in order
to check it for validity.

So, first we need to modify our tree generation function:

(defn generate-tree

  [word chunks]

  (if-let [nexts (seq (possible-nexts word chunks))]

    (map #(cons % (generate-tree (drop (count %) word) chunks)) nexts)

    (when (seq word) '((:invalid)))))

(This also contains the correction with generate-node.)

Now, when there are no possible nexts, and there still remains some part of the word to be assembled (note again how (seq ...) is used as emptiness check), we add a node with the
value of :invalid. So now we have all invalid branches marked.

Then, the filtering...
First, we need a validity predicate, let's call it "valid?" (how imaginative).
First, let's consider which branches are technically invalid.
Given:

"valid?" ["va" "li" "id" "l" "d" "d?"]

We have:

va \ - l - id - :invalid
\ - li \ - d - :invalid
- d?

From the viewpoint of the "va" node, the "l" branch is invalid, and the "li" branch is valid. That is because all the children of the "l" branch are invalid, but one child of the
"li" branch is valid. Seeing that, we can conclude that nodes are invalid if:

1) It has at least one child and all children are invalid
or
2) The own value is :invalid

Let's code it:

(defn valid?

  "Returns whether a node in the tree is valid."

  [[value & children]]

  (cond (= value :invalid) false

        (not children)     true

        :else (reduce #(or %1 %2) false (map valid? children))))

Fairly simple, is it not? The check is, obviously, recursive. I suppose that if we could traverse the tree from the lowermost nodes instead from the root, the algorithm would be faster
(we could simply hack off invalid branches without having to check the entire depth if a branch is invalid), but our tree nodes are not aware of their parents, nor do we have a list of
branch tips.

So, given this check, we can now filter.

(defn filter-valid

  "Filters invalid branches from the tree."

  [[value & children]]

  (cons value (map filter-valid (filter valid? children))))

Again pretty much one single line.
Now, add this to to-tree...

(defn to-tree 

  "Generates a solution tree for the given word and chunk list."

  [word chunks]

  (filter-valid (cons word (generate-tree word (keys (group-by identity chunks))))))

DONE. Simply call to-tree with the word and the chunks and it spews out the final result. You can run it through clojure.pprint/pprint to make it readable, but that only aesthetics. ;P

Final program:

(defn possible-nexts

  "Given a word and a list of chunks, returns a list containing the chunks that are contained at the beginning of the word."

  [word chunks]

  (filter #(= (take (count %) word) (seq %)) chunks))

(defn generate-tree

  [word chunks]

  (if-let [nexts (seq (possible-nexts word chunks))]

    (map #(cons % (generate-tree (drop (count %) word) chunks)) nexts)

    (when (seq word) '((:invalid)))))

(defn valid?

  "Returns whether a node in the tree is valid."

  [[value & children]]

  (cond (= value :invalid) false

        (not children)     true

        :else (reduce #(or %1 %2) false (map valid? children))))

(defn filter-valid

  "Filters invalid branches from the tree."

  [[value & children]]

  (cons value (map filter-valid (filter valid? children))))

(defn to-tree 

  "Generates a solution tree for the given word and chunk list."

  [word chunks]

  (filter-valid (cons word (generate-tree word (keys (group-by identity chunks))))))

So, that's it. FIVE function ranging from one to three lines of executed code each.

Wednesday 2 May 2012

The Last Remnant

Alright, so far this blog was solely a about programming. Time for a change of pace.

The Last Remnant. Yes. I'm having such a hard-on for this game right now.
It's an jRPG, by Square Enix (the Final Fantasy guys). Like any respectable jRPG it is full of Guide, Dang It! moments, but WHO CARES. Seriously.

This is probably one of the most epic games I have played so far, despite having menu-based combat. Really. The game starts off really fast, having you battle overwhelming numbers of enemies with only three or so squads at your disposal. The epicness is hard to describe, really. And all of this underlined by the AWESOME soundtrack. It is... powerful, yet humble enough as not to overshadow what is happening. It's tremendous. Here, listen for yourself. Or this.

Also, I've enabled the japanese voiceover. It's just funny to hear japanesemen attempt to say "Yes, My Lord!" - it sounds roughly like "YES MA ROD" :D Or there is this large pigman guy thingie named "Blocter", I chuckle everytime I hear "BUROKTA" yelled ingame.

The game was criticised for it's extremelly stupid and childish main character, which incidentally happens to be, y'know, a stupid child. Nevermind the fact that he lived most of his life on a secluded island without parents or whatnot. And, from what little japanese I understand (which is virtually none), I gather that he simply gained a little obnoxiousness in the translation. For example, in the original voiceover, I've heard him thanking "Dave" several times, which is absent in the translation. Really, he has some disregard for authority (not weird in his position), this is commented on in the game (later on only by exasparated sighs and shrugs from David, who happens to be a marquis).

I don't claim I like him, but neither is he the annoying useless character everyone says.

Anyways. This game.
THIS GAME.

Epic. More music.

The battles are really intense for some reason. Unlike other menu-based jRPG's, you don't give exact commands to your troops. Each "Union", consisting of one to five characters, is given a general order by you at the beginning of the turn. Examples include the regular "Attack", "Attack with Combat Arts!", "Keep your HP up!", and others. The regular attack changes its name depending on the situation, which is yet another detail that makes everything more epic. You don't "Attack" bosses, you "Charge" them. If you fight against something significantly stronger, you tell your troops: "Don't be afraid to die!" There are other situational commands: If one union is getting battered, another can be given the command "Save them no matter what!".

There's also a morale bar. It increases when you pull of maneuvers like back attacks, interceptions and so on, or defeat enemy unions, decreases when the enemy does any of that. It influences how well your units fight.

Sometimes, a unit may suddenly disregard your command. This sounds wrong, but actually is really cool - your union is getting hammered apart suddenly, and one of the spellcaster decides on his own to heal the group. Especially during boss fights - the without doubt most intense parts of the game - this adds a lot to the experience.

While picking your commands, all fighting unions sparr with each other, the battlefield never rests - no damage is dealt or anything, but it looks cool.

Really, all this builds up to an extremely immersive and FUN experience.

Buuuuut, getting down from my unrestrained fanboyism for a moment...

This game has flaws. Lots of things are confusing and make no sense unless you read an in-depth guide (up to and including the levelling system). You might find out a lot through experimentation... but that would take thrice the time, I guess. Sometimes, you will be crushed by relatively weak, or equal enemies through no fault of your own (it's rare.) Advancing the story may make things unavailable, a problem for completionists. Sometimes you will run into superpowered enemies who will, again, crush you within seconds - with no warning whatsoever. And maybe at least a dozen other pet peeves.

Despite all this, this game...

Here, have another battle theme.

Despite all this, this game is PURE. Awesome.

Wednesday 25 April 2012

A short solution.

My friend and roommate (and possibly future brother-in-law, heh.), who studies IT, was presented with the following problem/homework:

Given a string and a list of other strings, build a tree of all possible ways of assembling the given string from the strings in the list.

For example, given "tree" and ["tr" "e" "t" "re"], possible branches are ("tr" "e" "e"), ("t" "re" "e"), and so on. Each item from the list can be reused, so "tree" and ["tr" "e"] produces one valid result: ("tr" "e" "e"). A very simple final solution looks like this: (("tr" ("e" ("e")) ("ee"))) (for "tree" and ["tr" "ee" "e"]).

He had to do it in Java. Since I thought the problem was interesting, I decided to implement a solution in Clojure.

Working in a language that builds trees out of nested lists, my choice of datatype was obvious: Nested lists (you probably didn't see that one coming). The first element of each list is the value of the node, whereas the rest of the elements are its children.

So, I though about what I needed.
First of all, a function that filtered the list of strings (lets call them "chunks" here), leaving only the chunks that are included at the start of the word. For example, given the same parameters as above, the function would return ("tr" "t"). Fairly simple to write:

(defn possible-nexts ;; Yes, slightly awkward name.

  "Given a word and a list of chunks, returns a list containing the chunks that are contained at the beginning of the word."

  [word chunks]

  (filter #(= (take (count %) word) (seq %)) chunks))

It's a fairly simple function, and I believe I don't need to explain how it works (and actually, it's largely irrelevant, as long as it produces the output we need).

Now, since we want to generate a tree, the generating function will obviously be recursive. The general idea for the algorithm is this:

1) Given a word and chunk list, obtain a list of possible nexts.
2) For each possible next chunk, recursively call this with the chunk cut off from the beginning of the word (so with "tree" and "tr" the call would get "ee") and the whole chunks list
3) If there are no possible nexts, terminate.

Simple enough.

(defn generate-tree

  "Generates a tree of all possible ways \"word\" can be assembled from the \"chunks\"."

  [word chunks]

  (when-let [nexts (seq (possible-nexts word chunks))]

(map #(generate-tree (drop (count %) word) chunks) nexts))))

This was my original version. Again, a very simple function (though I managed to screw up, see below). when-let is fairly simple and does what it says on the tin, whereas seq coerces a collection into a seq, or nil if the collection is empty. This is great for at least two reasons:
1) You get uniform behaviour for all collection types.
2) Free emptiness-check!
Doing emptiness checks via seq is apparently idiomatic in Clojure. I found it rather confusing in the beginning, but now I like it.

Anyways, I have to admit I failed to store the value in each node. But that was quickly rectified with some minor changes...

(defn possible-nexts ;; Yes, slightly awkward name.

  "Given a word and a list of chunks, returns a list containing the chunks that are contained at the beginning of the word."

  [word chunks]

  (filter #(= (take (count %) word) (seq %)) chunks))

(declare generate-node)

(defn generate-tree

  "Generates a tree of all possible ways \"word\" can be assembled from the \"chunks\"."

  [word chunks]

  (when-let [nexts (seq (possible-nexts word chunks))]

    (map #(generate-node (drop (count %) word) chunks %) nexts))))

(defn generate-node

  "Generates a tree node." 

  [word chunks chunk]

  (cons chunk (generate-tree word chunks)))

That's pretty much it. Less than ten lines total. Or so I (actually, we) thought. There turned out to be a few actual problems, and one aesthetic one.
For one, I didn't have a root node, per se, just a list of subtrees (one for each possible first chunk). I decided the root node shall hold the original word that is being assembled. This is easily solved; call (generate-node word chunks word) instead of (generate-tree word chunks). In light of the other problems, though, I decided to write one last function:

(defn to-tree

  "Nothing much, yet. But this shall be the entry point for the \"program\" shortly."

  [word chunks]

  (generate-node word chunks word))

The other problems are more serious: With certain combinations of chunks, there would be incomplete branches. For example, given "lol" and ["lo" "l"], one branch would be simply... ("l"). We don't want this garbage.
Other than that, if there were duplicate chunks, parts of the tree would be duplicated as well. So, we might want to filter the chunks list.

But I will solve both of those in the next posts, this is getting a bit long. You might have noticed that the "program" mostly consists of one-liners; my original, slightly less verbose (no function descriptions) was only 9 lines (2 whitespace, 3 function "header" and 4 lines of function body). The next post won't add more than ten lines (not counting function descriptions, again), either.

It also performs rather well, it was clearly faster than my friends Java version (though I suspect an adequately experienced Java programmer may write one faster than my nineliner. As might an adequately experienced Clojure programmer, I guess...).

Functional languages amaze me over and over again with how little I need to do to achieve what I want.

Saturday 21 April 2012

Installing Clojure: Linux, Emacs, Swank - Step by Step

(This was posted in 2012. Don't count on it being up to date.)

I had to format my computer recently, so now I also have to go through the entire setup process again.

When I did it the first time, it was an epic pain in the arse, but this time I got through with barely a problem, so I decided I might want to share my fairly simple method. Also, because it annoys me when I find handwritten guides and am too agitated to check the post date for outdatedness, I am writing this post on the 21 April 2012. And minor explanations why I do certain things, because that's another thing I don't like about handwritten guides: Magic. ;P

I'm using Emacs + leiningen under Linux Mint to serve all my needs (any linux distro will do, I suppose. I'm not an expert on distros, or Linux in general, quite the opposite). Leiningen is a truly wonderful tool that is surprisingly comfortable to use, provided the command line does not scare you, of course. Anyways, leiningen provides much useful functionality for handling projects and is fairly configurable. It also, and that is the most important thing, I think, it handles dependencies for your projects. Emacs is of course the editor we'll use.

So, without further ado:

1) Get all the things you need (via "sudo apt-get install [name]"). After a fresh install of Mint 12, that was:
a) emacs23
b) leiningen
Any other dependencies should be handled by apt-get.

2) Install the swank plugin for leiningen.

> lein plugin install swank-clojure 1.4.2

A very useful thing. Really, very useful. Very short version: Get reasonably feature-loaded REPL within Emacs with minimal hassle. Long version: Slime. Slime is a REPL for lisps (it literally stands for "Superior Lisp Interaction Mode for Emacs"), which consists of two parts, the server and the client. This easily allows Emacs to interact with it; the client is integrated into Emacs, with it not caring where the server really is. lein-swank provides the server part, customised for Clojure.

3) Launch emacs once. I usually use emacs -nw, so it runs in the console, but some people might prefer the more interface-y version. This spawns the .emacs.d/ folder in your home directory.

4) Get clojure-mode (and paredit).

First, create and enter ~/.emacs.d/site-lisp **
Paredit is a minor mode for Emacs that makes working with LISPs sane, and also infinitely more pleasurable. It has a few quirks you need to get used to, but saying it is "recommended" when working with a LISP (like Clojure) is like saying that breathing is "recommended" when living.
Obtain it via

> wget http://mumble.net/~campbell/emacs/paredit.el 

** Small plugins for Emacs, like paredit, I usually put into a common site-lisp directory, whereas larger ones, that span multiple files, go into their own directories. You don't have to do it like that, of course, you''ll just have to adjust the load paths (will come up soon) accordingly. It's not hard.

Now, Clojure Mode. I am sad to say my git-fu failed me, and I downloaded it from github via the ZIP button and unzipped it into ~/.emacs.d/clojure-mode/. Oh well, one day I shall be capable of everything from within the command line!
Anyways, Clojure mode. Major mode for Emacs. It provides syntax highlighting, intendation, and other stuff for Clojure.

5) Configure the crap out of Emacs.

This is pretty much the final step. Go to your home directory, and open the file called .emacs - this file will be executed when Emacs is being started up and can be used to configure it.
We'll be using a few commands in Emacs Lisp, a language integrated into Emacs, here.

a) I usually disable the menu bars and stuff. If you want, too, put:

(menu-bar-mode nil)

This is of course entirely optional, but those one or two lines of extra space are more valuable to me than those pesky menu bars. :P

b) Add site-lisp to the load path.

(add-to-list 'load-path "~/.emacs.d/site-lisp")

That way, Emacs knows where to look when you want to load a file. Actually, there are a multitude of directories on the load path already (to check which ones, type "load-path" into the file and press C-x C-e with the curser directly after what you just typed. Just remember to remove that line afterwards :3), and you theoretically could use those, but they are shared, and it's always a good idea to keep personal configurations bound to your account - thus we use the appropriate directories in your home folder.

c) Load paredit

(require 'paredit)

Fairly self-explanatory - we load the paredit.el file. This does not activate it yet, but it supplies the function paredit-mode, which can be called to enable (and disable) paredit. We will use it in the next steps. Emacs knows where to find paredit.el thanks to our adding the site-lisp directory to the load path, of course.

d) Load and configure clojure-mode

(add-to-list 'load-path "~/.emacs.d/clojure-mode")

(require 'clojure-mode) 

Two fairly self-explanatory steps, again. We basically do the same as with paredit. Again, this does not yet activate clojure-mode, it merely loads the file.

(add-hook 'clojure-mode-hook

          (lambda ()

            (paredit-mode 1)))

This is the by far trickiest part, but by no means complicated. add-hook takes two arguments, the hook (I'm not sure what exactly they are, but they are usually [or always?] called [mode-name]-hook), and a function. Then, whenever the mode is enabled, that function is called. Due to this snippet of code, whenever clojure-mode is enabled, paredit goes along with it.

Almost done.

clojure-mode gets enabled automatically when you open a .clj file, though you can manually enable it via M-x clojure-mode any time.

Once you have clojure-mode loaded, you can use M-x clojure-jack-in to open a slime session/REPL (after typing that, wait for a few seconds - booting it up takes some times).

Actually, you could add clojure-jack-in to the function that is hooked onto clojure-mode, so that the REPL is automatically launched, though since clojure-mode is started everytime you open a new .clj file, it tries connecting each time you do so (and succeeds, but I'm not sure if there are any side effects - better to just do it manually, or do some slightly more advanced hacking that checks whether there is a clojure repl buffer and doesn't start a new one if there is).

And a final thing about clojure-jack-in: It is intimatly bound with leiningen. It will only work within leiningen projects, which isn't much of limitation actually. Simply learn to use leiningen - it is worth it.

Aaaaand now really a final thing:

When using swank-clojure 1.4.2, I got an error message about Emacs not being able to start a swank server - it STILL WORKS for some reason. If it doesn't, or the message annoys you, use swank-clojure 1.3.3.

I hope I helped.

Saturday 31 March 2012

Final thing on destructuring.

I haven't been writing in some time, that is because I haven't been doing much coding lately. Now I've started again, and immediately found something interesting.

So, I managed to nuke my environment (Emacs/Slime/Swank) by trying to install a Common Lisp thingie next to it, so I had to setup everything again and... it turned out my previous setup was unnecessarily complicated and hacky. Oh well.

Anyways, I've been expanding a few macros for fun to see how exactly they work (->, to be exact), and expanded let by accident.

So... one quick call to (doc let) later...

`01`	`(defmacro` `let`

`02`	`"binding => binding-form init-expr`

03

`04`	`Evaluates the exprs in a lexical context in which the symbols in`

`05`	`the binding-forms are bound to their respective init-exprs or parts`

`06`	`therein."`

`07`	`{:added` `"1.0",` `:special-form` `true,` `:forms` `'[(let` `[bindings]` `exprs)]}`

`08`	`[bindings` `&` `body]`

`09`	`(assert-args` `let`

`10`	`(vector?` `bindings)` `"a vector for its binding"`

`11`	`(even?` `(count` `bindings))` `"an even number of forms in binding vector")`

`12`	`(`let*` `~(destructure` `bindings)` `~@body))`

Actually, this is taken from clojuredocs.org. Neat site. But I found it out over (doc let).
It turns out let is NOT a special form, let* is. And then...

(destructure bindings)

Yes. Destructuring is actually implemented in the language. clojure.core/destructure, to be exact. You can even pass quoted argument vectors to that function and see the output.

The source is ugly (and long) as hell, but still, it's fascinating this is implemented in the language as well.

Saturday 18 February 2012

Generating functions

Macros in LISPs are fascinating, and from what I gather, they are fairly well implemented in Clojure. As I'm sure I've mentioned before, I'm writing a game in it, and recently I wrote a piece of code I find very interesting.

In that game, I have units, and those units have attributes like health, attack power and so on. I wanted to write a bunch of functions to get, set, modify and reset those values. Thats four functions times seven attributes... 28 functions. Naturally, it would be crazy to spell out ALL of those accessors by hand, especially since they are almost identical to each other in many cases.

What I did is made of three parts, basically.

First, I defined a set of attributes, which will be used later.

Second, I wrote generic accessors, accepting a keyword naming the attribute to work on as first argument, and whatever else they needed as the remaining arguments. I called them set-, get-, change- and reset-.

Last, and this is the most fun part, I wrote a little tidbit of top-level code:

(doseq [stat stats, f ["get-", "set-", "change-", "reset-"]]

(eval `(def ~(symbol (str f (->> stat str rest (apply str))))
(partial ~(symbol f) ~stat))))

While I didn't use defmacro explicitly, I use the macro mechanics here. I just wrote it inline. What happens here isn't very difficult to understand once you grasp the basics of macros, but I will step through it nonetheless.

The first line can basically translated as "for every possible combination of stats and those string, do the following".

eval executes the code it is given. Executing LISP code literally means just running it through eval. Also, it is a special form. But nevermind that right now.

The backtick ` is a very pleasant thing introduced in Clojure to combat common macro problems which aren't important right now, but generally it does the same as an apostrophe ', it turns the expression it precedes into a literal (when writing macros in Clojure, use backticks, never apostrophes - ALWAYS). So while (+ 2 5) evals to 7, '(+ 2 5) evals to... well, '(+ 2 5). The other one, the wave ~ (or however it is called), kind of cancels the quoting. As an example, '(+ ~(+ 1 1) 5) evaluates to '(+ 2 5).

Now it becomes fairly obvious what is happening here. We plug the attribute and function names into a generic def.

As an example, for :attack and "get-"...

`(def ~(symbol (str "get-" (->> :attack str rest (apply str))))
(partial ~(symbol "get-") :attack))))

Which then turns into...

`(def get-attack

(partial get- :attack))

And that is given to eval to finally execute it.

The abominable first line happens due to my choice of using keywords - running a keyword through str returns it together WITH the colon, and I remove it with the str rest (apply str) part. The second line is fairly straightforward. This is why it was important that the name of the attribute be the first argument.

Anyways, now, whenever the file is executed (be it by loading or by... well, running it), 28 functions are generated. I can easily add more attribute accessors by adding entries to the list of attributes, I can easily change the implementations of the functions themselves without having to wade through tons of code and correcting them in several places at once (which is a really error-prone procedure). Also, should I never need to, I can add new types of accessors with a minimum of hassle. And finally - I had fun writing accessors. When was the last time that ever happened to any of you?

I'm am fairly confident I am entitled to say I did an awesome job there.
Now I should go play Guitar Hero "Fury Of The Storm" on hard before my self-esteem gets too high.

PS: I slightly modifed the generic def later. The first line turned into
`(def (->> stat str rest (apply str) (str f) symbol)
It doesn't change anything, but looks more consistent, in my opinion.

PPS: The base functions (get-, set-...) are private. Just saying.

Saturday 4 February 2012

Destructuring: Addendum

Today, while continuing the work on my game, I remembered one thing about destructuring, and found out another.

Now, the one very important thing I remembered:

(let [[a b c :as d] [1 2 3]] (println a b c d))

results in

1 2 3 [1 2 3]

You can use :as in both vector and map destructuring.

The second thing... I tried it on a hunch, really.
In a very abstracted sense, you can treat vectors as maps with ordered integer keys.
So, a function I wrote got input in the form of a seq (the general sequence datatype of Clojure) of vectors. The vectors contained three elements each, but the only one I needed was the second one (index 1, in other words).

So I tried this...

(let [{thing 1} hypothetical-3-element-vector]

  (println thing))

Now, assuming hypothetical-3-element-vector is bound to, say [1 "lol" 2], the result will be...

lol

So not only can YOU hypothetically treat vectors that way, Clojure does so. In a way.
It probably isn't very useful that often, but still a neat thing to know.

[Edit; Afterthoughts resulting from a walk with my dogs:]
Now that I think of it, this implies that destructuring, under the hood, uses get (or at least a similar mechanic). get handles both vectors and maps gracefully (which neatly extends to get-in... on which I rely a lot.)

So... I consulted The Joy of Clojure (get this book.). Apparently, there is so much more awesomeness hidden in destructuring. But won't discuss it here/now. Sadly it doesn't reveal anything about the inner workings, though. Will need to search later.

Monday 23 January 2012

Breaking stuff apart - Destructuring.

So... boooored. I might as well type up whatever I know about destructuring, which is... extremely, EXTREMELY useful, to say the least.

In fact, I know quite a bit about how to use it (not everything, though), but nothing about how it works under the hood. I only recently learned how lazy-seqs work (which, incidentally, is an interesting topic for another time).

So, destructuring.

If you recall, in my previous post I used something along the lines of

(defn reroot [tree [n1 & nr]] ...

Aaaand, here's destructuring. Instead of giving a second name for the second argument, I put [n1 & nr]. The second argument passed will be destructured into n1 and nr, where n1 holds the first element of that vector, and nr the other elements (that's what &, the rest argument is for). The funny thing is, this kind of destructuring can be arbitrarlily nested, which is powerful.

Really.

More examples:

(let [[a b c] [1 2 3]]

  (println a b c))

-> "1 2 3"

(let [[a b & c] [1 2 3]]

  (println a b c))

-> "1 2 [3]"

(let [[[a & b] & c] [[1 2 3 4] 5 6 7 8]]

  (println a b c))

-> "1 [2 3 4] [5 6 7 8]"

I'm currently writing a game, and I have a map that holds the entire game state. Pure functions act on that map, returning a new one with changes to the game world incorporated. There are many uses for destructuring there.

The world is made up of tiles. I use a map of vectors to maps for those tiles; the vector has two elements (x and y position), and the maps holds all data about the given tile. Now, when a map is ran through (seq), it returns a seq of two-element vectors - the first element is the key, the second one the value. Hence when drawing the field, I can do this:

(doseq [[[x y] tile] tiles]

  ...)

Yes, destructuring can be used in doseq and for. Even in let!

See, I destructured the input vector (which is, remember, a key-value pair) into [x y] and tile. Then, the first element is further destructured into x and y. That way I have access to all those important informations without having to manually extract them.

And this is destructuring for vectors only. Yes, there are other kinds.

But before I go into them, let me note one thing first. If, via destructuring, a name should be bound to some element, but there is none, those names are bound to nil:

(let [[a b c] [1 2]]

  (println a "/" b "/" c))

-> "1 / 2 / nil"

Alright. Destructuring MAPS. This is fun as well. See, in my game I rely on maps a lot. Units, for example, are maps holding all necessary information. In many functions, I don't need all of these informations, so some handy way to extract the ones I DO need would be... handy.

Behold.

(def unit

  { :power 1 :endurance 3 :name "Thing" :image nil :owner 1 :range 3 :position [2 2] })

(let [{name :name, owner :owner} unit]

  (println name "belongs to player" owner))

-> "Thing belongs to player 1"

And you can mix both types of destructuring.

(let [{name :name, [x y] :position} unit]

  (println name "is on field" x y))

-> "Thing is on field 2 2"

Again; if a key is not found in the map, nil is bound to the corresponding names.

So. Destructuring is awesome.

Words of warning, though. Overusing this makes code unreadable. Seriously. I usually try to avoid using this in defn parameter vectors, unless the function is trivial, but it really shines in for and doseq, and has it's uses in lets. I rely on it a lot to make those parts more concise.

Mazes and trees (sounds better than Clojure and Emacs, doesn't it?)

Some days ago I wrote a simple maze generator using depth-first search. It's really trivial, took much less space than my age-old Ruby implementation and - most importantly - made me play around with trees in Clojure.

Today, on a three-hour ride on the train I was bored. So my mind drifted off to my happy place (which, incidentally, is programming) and I started thinking about the aforementioned generator. I realized that all generated mazes could be represented as a tree (I... won't confess as to what data structure I used before.), so, upon further thought, I also concluded that path-finding between two points is trivial, assuming that one of the points is the root node... then I thought some more, and thought, "Hey! In the case of the maze it doesn't really matter WHICH node is the root node... so, I should be able to make a different node the root node without changing the structure of the maze at ALL."

Some mililitre of ink, some paper and quite some thoughts later...

It's definately possible. And, actually, not so hard.

1 -> 2

  -> 3

  -> 4 -> 5 -> 6

       -> 7 -> 8

            -> 9

Let's use this beautiful *cough* tree. Let's say we want to find the way from 4 to 3. The way is simple: 4 -> 1 -> 3. But it's not entirely clear from a data-structure point of view. So...

4 -> 5 -> 6

  -> 7 -> 8

       -> 9

  -> 1 -> 2

       -> 3

Upon closer inspection, what happened is very clear. All nodes UP from 4 were appended to the 4 instead. I don't feel like typing out another entire tree, but if there were more nodes above the 4, then all nodes above it would be in reversed order:

1 -> 2 -> 3 -> 4 -> 5

  becomes

3 ->  4 -> 5

  -> 2 -> 1

You see, 1 -> 2 became 2 -> 1.

So, that's about the entire theory - take all nodes up until the root, reverse the order, and append that to the node you want to be the new root.

Let's implement it.

I assume trees built from vectors in the form of:

node => [[object] & nodes]

Which means the first item of each vector is basically anything (in our example these would be numbers), followed by any number of similarly-structured vectors.

[1 [2] [3 [4]]]

is equal to:

1 -> 2

  -> 3 -> 4

Because I was on a train at the time, and I couldn't recall any function to do it (nor use the clojure.repl tools very well) I wrote my own function to remove an indexed item from a vector, called "without". Basically...

(defn without [n v]

  (vec (concat (take n v) (drop (inc n) v))))

Nothing too exciting. Also, probably redundant.

At this point I was kind of stuck. I wrote something along the lines of this...

(defn reroot [tree [n1 & nr]]

  (if n1

    (conj (reroot (tree n1) nr) (without n1 tree))

    tree))

I used destructuring here (which I will go into in more detail in the next post), but here, it's enough to know that (reroot [tree] [1 2 3]) would have n1 = 1 and nr = [2 3].

Anyway, the first argument is the tree itself, obviously, whereas the second argument is the "path" to the new root node.

For example,

(reroot [1 [2] [3 [4]]] [2 1])

should result in

[4 [3 [1 [2]]]] 

since item #1 in item #2 of the original tree is 4.

Anyways, the code snipped above is wrong. I didn't really know why, I only knew I was doing something wrong. I was totally shotgunning it, putting a without there, removing one somewhere else, replacing it with a simple access-by-index... Until I thought - SCREW THIS, let's do it properly.

I wrote another helper function, I called it swap. It would swap (HAH!) the places of the root node and it's child node of index n:

(defn swap [tree n]

  (conj (tree n) (without n tree)))

Trivial enough.

I didn't explain yet why I use the without. Picture this tree:

1 -> 2 -> 3

  -> 4 -> 5

Calling swap 1 on it (index 1 is the 2 -> 3 branch) would result in this without the "without":

2 -> 3

  -> 1 -> 2 -> 3

       -> 4 -> 5

while it should be

2 -> 3

  -> 1 -> 4 -> 5

Now it's quite obvious; I use the "without" to achieve the effect of taking out the branch from the tree I'm appending to it.

Now return to our reroot...

(defn reroot [tree [n1 & nr]]

  (if n1

    (recur (step tree n1) nr)

    tree))

This works as intended.

But, this being Clojure, such a simple function in four lines? No way this is done. We can do it better.

And of course, the pattern is pretty obvious: some value that is modified repeatedly, according to some list passed along. Sounds awfully like reduce.

Hence, the final version:

(defn without [n v]

  (vec (concat (take n v) (drop (inc n) v))))

(defn swap [tree n]

  (conj (tree n) (without n tree)))

(defn reroot [tree path]

  (reduce swap tree path))

If this isn't elegant, I don't know what is*.

*If you ignore the part with that "without" for a moment.