February 28, 2013

Roman numeral conversion in Clojure

(see also part II)

I've been reading Stu Halloway's Programming Clojure (which is excellent by the way). When I reached chapter 6 "Concurrency" I thought I'd better pause and practise some of the basics before venturing into deeper waters!

I once saw a programming challenge posted with a job ad, which asked applicants to write some code to convert numbers from the decimal numeral system to Roman. This seemed like a sufficiently challenging and self-contained thing to try for a first attempt at Clojure.

I started by creating a map of the individual numerals:

(def numerals {\I 1, \V 5, \X 10, \L 50, \C 100, \D 500, \M 1000})

Maps are functions, so invoking a map with a key gives the value, like this:

=> (numerals \X)
10

Next I created a function that maps roman numerals to decimal values (baby steps!):

(defn decimal-values [s]
  (map numerals s))

Testing this at the REPL I get:

=> (decimal-values "MCXVII")
(1000 100 10 5 1 1)

Great, I have a list of the values which I need to compose together. Next I want to combine the values in this list by adding them together.

(defn roman [s]
  (reduce + (decimal-values s)))

Testing again at the REPL I get:

=> (roman "MCXVII")
1117

Whoop! OK, we have our first conversion. Are we done? Unfortunately not - there's a complication.

The "simple" way of writing 4 in Roman numerals is IIII. That repetition is allowed, but not commonly used. The short-hand is to prefix a higher numeral with a lower one, e.g. IV, which means subtract the lower number from the higher. This works for all of the other numerals too, so 999 can be written IM (1000 - 1).

I pondered this for a while, then decided to write a combine-numerals function to use in place of + in the reduce call. I want my roman function to look like this:

(defn roman [s]
  (reduce combine-numerals (decimal-values s)))

Now, combine-numerals needs to account for the more complicated logic described above. Here's what I came up with:

(defn combine-numerals [a b]
  (if (> a b)
    (+ a b)
    (- b a)))

This simply checks that the value seen first is larger than the following value - if it is we add, if it isn't we subtract. Testing that at the REPL gives:

=> (combine-numerals 1 10)
9
=> (combine-numerals 10 1)
11

Perfect, if I see a 1 before a 10 it is subtracted to give 9, otherwise it is added to give 11. Lets try our roman function again:

=> (roman "XVII")
17
=> (roman "MCMXVII")
2117

Hmm, that last one isn't right, it should give 1917. What's going on?

reduce takes each value in the given list and applies the function to it and the reduced result so far. That means that when combine-numerals is called it isn't called with two adjacent values except in the case where the numeral string only has two numerals. Bum.

What I really need is to have the sum so far, the previous numeral seen, and the current numeral being handled, so that I can decide whether to add/subtract the current numeral by comparing it with the previous numeral.

I think I could do this with a recursive function, but I'm struggling to remember the syntax, so I ponder a little more and decide to try running in reverse through the numerals, and add if the total so far is less than 4 times the current total.

This involves changing combine-numerals to multiply the current numeral's value by 4 before comparing, and renaming it to add-numeral:

(defn add-numeral [n t]
  (if (> n (* 4 t))
    (- n t)
    (+ t n)))

I also need to change roman to reduce the list in reverse with the new add-numeral function:

(defn roman [s]
  (reduce add-numeral (map numerals (reverse s))))

Testing at the REPL gives:

=> (roman "IX")
9
=> (roman "MCMXIX")
1919
=> (roman "MIM")
1999
=> (roman "MLMXXXIIII")
1984
=> (roman "MCMLXXXIV")
1984

So, lets see the whole program:

(def numerals {\I 1, \V 5, \X 10, \L 50, \C 100, \D 500, \M 1000})

(defn add-numeral [n t]
  (if (> n (* 4 t))
    (- n t)
    (+ t n)))

(defn roman [s]
  (reduce add-numeral (map numerals (reverse s))))

I confess myself truly amazed that -

I managed to write some Clojure that actually works
I think it is nearly idiomatic Clojure (is it? please comment!)
It is incredibly concise and moderately readable (even to my unfamiliar eyes)

Here's a rough translation of the same algorithm to Java -

import java.util.*;

public class RomanNumerals {

    private static Map&lt;Character, Integer> numerals = 
        new HashMap&lt;Character, Integer>();

    static {
        numerals.put('I', 1);
        numerals.put('V', 5);
        numerals.put('X', 10);
        numerals.put('L', 50);
        numerals.put('C', 100);
        numerals.put('D', 500);
        numerals.put('M', 1000);
    }

    public int convert(String aRoman) {
        int total = 0;      
        for (int i=aRoman.length()-1; i>=0; i--) {
            char c = aRoman.charAt(i);
            total = combine(numerals.get(c), total);
        }
        return total;
    }

    private int combine(int aCurrent, int aRunningTotal) {
        return (aRunningTotal > (aCurrent * 4)) ?
            aRunningTotal - aCurrent : aRunningTotal + aCurrent;
    }

    public static void main(String[] args) {
        System.out.println(new RomanNumerals().convert(args[0]));
    }
}

25 lines of Java (not counting the main method) to 7 lines of Clojure, but I'm not judging (yet).

Java pads out with boiler-plate, and there are big losses in populating the HashMap. I walked the String in reverse rather than actually reversing it. I could have reversed it with new StringBuilder(aRoman).reverse().toString() I suppose, but that's pretty ugly.

Fun stuff! I plan to come back to this exercise over time as I learn more Clojure - pretty sure there are several dozen alternative versions that are more idiomatic/elegant, and I haven't made any attempt at handling bad input, etc.

Because I'll forget it if I don't write it down...

Roman numeral conversion in Clojure

Recent Tweets

Recent Posts