JQuery + HtmlUnit = runtimeError messages galore
For those who don't care about the back-story and just want the solution: switch the user-agent to Firefox by using the WebClient constructor that takes a BrowserVersion parameter.
... One of my colleagues wrote an HtmlUnit test that involved swiftly navigating through a set of pages. The test checked the behaviour was correct and passed just fine, but left behind a very large number of messages like this:
runtimeError: message=
[The data necessary to complete this operation is not yet available.]
sourceName=[http://localhost:10821/js/jquery-1.6.1.js] line=[16]
lineSource=[null] lineOffset=[0]
This was looking pretty nasty in the hudson and mvn build logs, so I investigated a little to see if I could get rid of it.
The first step was to try to figure out what part of the jQuery script was triggering the problem, but of course we were using the minified jQuery script, so it was impossible to find the problem on line 16 (in the minified file, line 16 is essentially the whole of jQuery).
Replacing the minified script with the "source" version I get the same error reported at line 923. It's doing the following check:
// The DOM ready check for Internet Explorer
function doScrollCheck() {
if ( jQuery.isReady ) {
return;
}
try {
// If IE is used, use the trick by Diego Perini
// http://javascript.nwbox.com/IEContentLoaded/
document.documentElement.doScroll("left");
} catch(e) {
setTimeout( doScrollCheck, 1 );
return;
}
// and execute any waiting functions
jQuery.ready();
}
Line 923 is the one inside the try block. Of course, this is an IE-specific check, and the default user-agent mimicked by HtmlUnit is Internet Explorer 7 - and yes, we were using the default.
You can change the default by passing a BrowserVersion parameter to the WebClient constructor. Switch to Firefox 3 or 3.6 and the problem goes away, switch to IE8 and it gets worse (test fails) - although it seems this is for different reasons, not related to the doScrollCheck. Can't even escape browser version troubles when not using a browser :(
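For reference, here's a minimal sketch of the fix (the factory class is just illustrative; BrowserVersion.FIREFOX_3_6 is HtmlUnit's constant for mimicking Firefox 3.6):
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;

public class QuietWebClientFactory {
    // Mimic Firefox 3.6 instead of the default IE7, so jQuery never takes
    // the IE-only doScrollCheck() path that floods the logs.
    public static WebClient createWebClient() {
        return new WebClient(BrowserVersion.FIREFOX_3_6);
    }
}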
Incidentally, I discovered recently that many of the latest browsers and developer plugins can "unminify" JavaScript on the fly.
Unit Testing code written in "Tell, Don't Ask" style
More and more I find myself enjoying the "Tell, Don't Ask" style of programming, to the extent that recently I've been challenging myself to write all my code that way.
This brought up an interesting discussion while working on a prototype with an excellent developer I've known and worked with for years - how do you test code written in this style?
Setting the scene
The idea of the prototype was to play with and test out several Entity-Extraction and Text-Clustering algorithms - some that we had researched, some using existing libraries, and others of our own devising. We collected a few dozen MB of sample data to toy with, and I knocked up a quick harness to feed the test data into the Entity-Extraction and Text-Clustering routines, which just needed to implement a bunch of Java interfaces I'd written (using "Tell, Don't Ask" style). Each of us then proceeded to implement some of the algorithms, which we plugged in Strategy-Pattern style.
Every so often as we were working, my colleague would turn and say something like "I need to add Xxxx to the Yyyy interface, so I can unit-test my implementation". His feeling was that whilst in principle "Tell, Don't Ask" is laudable, it makes testing very awkward and ungainly.
As a result the interfaces drifted away from "Tell, Don't Ask" so that, whilst they still included the callback-style methods, they now presented a more typical "Ask" style API too - a simple example being: implementing Iterable<T> as well as providing each(T) methods.
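To make that drift concrete, here's a minimal sketch (my reconstruction, not the actual project code, using the Thing and ThingCallback types introduced in the next section) of what such a dual-style interface looks like:
import java.util.Iterator;

// Sketch: keeps the "Tell, Don't Ask" callback method, but also exposes a
// typical "Ask" style API by extending Iterable.
public interface Stuff extends Iterable<Thing> {
    // Tell, Don't Ask: push each Thing to the callback.
    void each(ThingCallback callback);

    // Ask style: let callers pull Things out and make their own decisions.
    Iterator<Thing> iterator();
}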
Example Tests
Let's look at a simple example of the kind of tests that appear to be made difficult due to the "Tell, Don't Ask" style. Note that I'm using junit 3.8 style for these examples.
Say we have a Stuff interface, which is an immutable container of Things, and a ThingMaker that creates many Things and returns them packed in a Stuff. Here are the "Tell, Don't Ask" interfaces we might have started off with:
public interface Thing {
    public String getName();
    public void doThings();
}

public interface ThingCallback {
    public void with(Thing thing);
}

public interface Stuff {
    public void each(ThingCallback callback);
}

public interface ThingMaker {
    public Stuff makeThings(int howMany);
}
Pretty straightforward. Now, let's see what happens when we want to test that when we ask the ThingMaker to make two Things, we actually get two non-null Things.
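The tests that follow exercise a SimpleThingMaker, whose implementation never appears here - so by way of a minimal sketch, here is one plausible version (the class name comes from the tests; everything inside it is my assumption):
import java.util.ArrayList;
import java.util.List;

// A plausible SimpleThingMaker: builds the requested number of Things and
// returns a Stuff that replays them to any callback.
public class SimpleThingMaker implements ThingMaker {
    public Stuff makeThings(int howMany) {
        final List<Thing> things = new ArrayList<Thing>();
        for (int i = 0; i < howMany; i++) {
            final String name = "thing-" + i;
            things.add(new Thing() {
                public String getName() { return name; }
                public void doThings() { /* nothing to do in this sketch */ }
            });
        }
        return new Stuff() {
            public void each(ThingCallback callback) {
                for (Thing thing : things) {
                    callback.with(thing);
                }
            }
        };
    }
}
With that in place, here's what the test method might look like: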
public void testMakesCorrectNumberOfThings() {
    ThingMaker tm = new SimpleThingMaker();
    Stuff result = tm.makeThings(2);
    final boolean[] calledBack = new boolean[1];
    final int[] count = new int[1];
    result.each(new ThingCallback() {
        public void with(Thing thing) {
            calledBack[0] = true;
            if (thing != null) {
                count[0]++;
            }
        }
    });
    assertTrue(calledBack[0]);
    assertEquals(2, count[0]);
}
Yikes, that's pretty nasty. What's all that nonsense with the final arrays? Well, given that we're working in an anonymous inner class (the callback) we can't just update a boolean or an int that we've declared in the enclosing scope, the only option we have is to cheat and use final variables that have mutable content - arrays are a cheap way to do that. But I think you'll agree this is hideous.
An alternative is to make the inner class do the checking and counting. To do that we have to raise its profile somewhat, to make it a named inner class:
class TestThingCallback implements ThingCallback {
    boolean called;
    int count;
    public void with(Thing thing) {
        called = true;
        count++;
    }
}

public void testMakesCorrectNumberOfThings() {
    ThingMaker tm = new SimpleThingMaker();
    Stuff result = tm.makeThings(2);
    TestThingCallback cb = new TestThingCallback();
    result.each(cb);
    assertTrue(cb.called);
    assertEquals(2, cb.count);
}
OK, well that's a lot better, but the effort of making these "Test Spy" objects grows very quickly, and although this is more readable it somehow feels less concise, and by moving code outside of the test-method it makes it just that little bit more awkward to read.
I think this shows that "classical" unit testing of "Tell, Don't Ask" style code is actually awkward enough to want to find a better way. But can we do any better? Absolutely ...
Enter, jMock
jMock is an astonishingly useful tool in the testing arsenal. It makes truly unit testing your code possible without masses of work creating Test Doubles, because it does all the grunt work for you. Let's quickly rewrite our test using jMock.
public class TestCaseThingMaker extends MockObjectTestCase {
    public void testMakesCorrectNumberOfThings() {
        final ThingCallback callback = mock(ThingCallback.class);
        checking(new Expectations(){{
            exactly(2).of(callback).with(with(any(Thing.class)));
        }});
        ThingMaker tm = new SimpleThingMaker();
        Stuff result = tm.makeThings(2);
        result.each(callback);
    }
}
Some of that might need a little explanation, so here goes:
The first thing we do inside our test method is create a "mock" instance of the ThingCallback interface. jMock does this for you in the invocation of the mock method.
Next we set up our "expectations" of what will happen to the mock ThingCallback during our test. The slightly funky syntax with the double braces is just an instance initializer on an anonymous inner class. The interesting bit is the declaration of what we expect to happen to our mock object - this is the bit inside those {{ ... }} written in jMock's intuitive internal DSL. To understand it you just have to read the whole line from left to right - we're expecting exactly 2 invocations of callback.with() with any instance of Thing.
Once we've set up our expectations it only remains to build the object under test - SimpleThingMaker - and invoke the methods we want to test. If you are staring at that and wondering how jMock knows that the test is finished and the expectations should have been met (and that it should fail the test if not), notice that I'm using the JUnit-3 integration here - extending MockObjectTestCase - and the behind-the-scenes plumbing takes care of that last step for me.
If you're using the JUnit-4 integration you need to explicitly invoke assertIsSatisfied on the mock object context (org.jmock.Mockery) which supplied your mock objects. jMock takes a little getting used to, as it involves quite a different way of thinking about your tests, and it certainly takes considerable effort to learn to use it well (it's easy to use it badly and end up with tests which are very difficult to understand). Once you get used to it, though, it really does make tests much easier to write and, more importantly, to read.
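For reference, here's a minimal sketch of the same test using the JUnit-4 integration (JUnit4Mockery, checking and assertIsSatisfied are jMock's own API; the rest mirrors the JUnit-3 example above):
import org.jmock.Expectations;
import org.jmock.Mockery;
import org.jmock.integration.junit4.JUnit4Mockery;
import org.junit.Test;

public class ThingMakerTest {
    private final Mockery context = new JUnit4Mockery();

    @Test
    public void makesCorrectNumberOfThings() {
        final ThingCallback callback = context.mock(ThingCallback.class);
        context.checking(new Expectations(){{
            exactly(2).of(callback).with(with(any(Thing.class)));
        }});
        Stuff result = new SimpleThingMaker().makeThings(2);
        result.each(callback);
        // With JUnit 4 we verify the expectations explicitly.
        context.assertIsSatisfied();
    }
}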
I find that jMock really comes into its own when testing Tell, Don't Ask code - the code under test is clean and, by nature of the improved data-hiding, more robust, whilst jMock provides a very neat way to test that code with a minimum of fuss and boiler-plate.
UPDATE: After ruminating on this for a while I came to the conclusion that jMock is so good at testing "Tell, don't ask" code that I was sure the designers of jMock must have set out with that in mind. I found a very nice post from @natpryce (one of the jMock authors) which confirmed my suspicions - Nat describes tell-don't-ask very succinctly:
"...objects tell each other what to do by sending commands to one another, they don't ask each other for information and then make decisions upon the results of those queries.
The end result of this style is that it is easy to swap one object for another that can respond to the same commands but in a different way. This is because the behaviour of one object is not tied to the internal structure of the objects that it talks to."
Nat goes on to say that the only way to test tell-don't-ask code is to see how the state of the world is affected when objects are told to do something (because you can't ask them about their state), and that this is best achieved with mock objects, whereas "train wreck" code (Nat's description of code that is not written tell-don't-ask style, and commonly violates the Law of Demeter) is hard to test with mock objects.
ElasticSearch vs SOLRCloud
For an upcoming work project I need a scalable search platform - scalable to tens or hundreds of millions of documents (news articles), and millions of queries per day. We're a (mostly) Java shop, and have a lot of experience with Lucene, so two solutions that pique my curiosity are SOLRCloud (SOLR + ZooKeeper) and ElasticSearch.
Initial Impressions - ElasticSearch
ElasticSearch is impressive. It's clean, simple, and elegant. For those who are familiar with Compass, ElasticSearch can be considered as Compass 3.0 (quoting Shay Banon, author of Compass). ElasticSearch has been under development for about 9 months at the time of writing, and is currently at version 0.15. It appears to be very actively developed, with new features and fixes flowing steadily.
My main worry at this point is that there appears to be only one "resource" active on the project - Shay Banon (@kimchy) himself, who seems to be architect, developer, documentation-writer, and a prolific commenter on forums.
Noteworthy features include:
- Document-oriented / Schema-free (JSON documents)
- Store, retrieve, index and search multiple versions of documents
- Self-hosting RESTful web-service API (a quick sketch follows this list)
- Exposes the full power of Lucene queries
- Multiple Indexes in one cluster (described as Multi-Tenancy)
- Built from the ground-up with scalability and distributed-operation in mind - supporting distributed search, automatic fail-over and re-balancing, with no single point of failure
- Support for async write/backup to shared storage (Gateway, in ElasticSearch parlance)
- "Percolator" (aka. prospective search)
Initial Impressions - SOLRCloud
SOLR is a project from the same (Apache) stable as Lucene itself, and the projects have recently merged to some degree. SOLRCloud is an extension that integrates ZooKeeper with SOLR with the express aim of "enabling and simplifying the creation and use of Solr clusters."
SOLRCloud is described as "still under development", i.e., not yet a GA release.
Currently proclaimed features include:
- Central configuration of the entire cluster
- Automatic load-balancing and fail-over for queries
- ZooKeeper integration for cluster coordination and configuration (not sure I would have listed that as a feature personally!)
I'll add that SOLRCloud is part of the SOLR code-base, and is being developed by core Lucene and SOLR committers including Mark Miller and Yonik Seeley. This can only be a good thing :). On top of all that, SOLR has been around for a good long time now, so it is battle-tested and there's lots of information available (including numerous books).
That said, I still have two big worries about SOLRCloud:
- Setup/deployment just sounds fiddly - it is recommended not to deploy ZooKeeper embedded with SOLR (though I cannot find any explanation to back up that recommendation), which means you need both a ZooKeeper ensemble - multiple ZooKeeper instances - and a SOLRCloud ... er ... cloud.
- No GA release as yet, and no roadmap that I can find (this is the closest I got).
Next Steps
My next steps are to dive into both technologies, get a feel for which best suits our needs, and see how difficult these things are likely to be to manage in a medium/large-scale deployment.
Progressive Enhancement with GWT, part 3
UPDATE: Since writing this series I have published the source of gwt.progressive at github.
This is the third part in a series, following my thoughts on using GWT in SEO'able web applications. The other parts in the series are part 1 and part 2.
In my previous posts I described an idea for progressive enhancement using GWT - "activating" server-generated html, to combine GWT goodness with an SEO friendly server-generated website, and my findings after some initial trials.
One of the problems I described in that second post was that it would be very difficult to work with these widgets unless nested widgets could be automatically (or at least easily) bound to fields of the enclosing widget.
After a little playing around and learning about GWT Generators I now have what seems like a nice solution, using a Generator to do almost all of the donkey work. Think of it like UiBinder, but with the templates provided at runtime (courtesy of the server). Here's an example class that automatically binds sub-widgets - an Image in this case - to a field of that class:
public class MyWidget extends Widget {
    interface MyActivator extends ElementActivator<MyWidget> {}
    private static MyActivator activator = GWT.create(MyActivator.class);

    @RuntimeUiField(tag="img", cssClass="small") Image small;

    public MyWidget(Element anElement) {
        // this will set our element and bind our image field.
        setElement(activator.activate(this, anElement));
        // now we can play with our fields.
        small.addClickHandler(new ClickHandler() {
            public void onClick(ClickEvent aEvent) {
                Window.alert("clicked!");
            }
        });
    }
}
This class will bind onto any html that has an image tag somewhere in its inner-html, for example:
<div> <!-- Say our MyWidget is bound here -->
  <div>
    <span>
      <!-- will be bound to our Image widget -->
      <img class="small" src="/images/image.jpg">
    </span>
  </div>
</div>
Anyone familiar with UiBinder will recognize the pattern I've used for the "activator":
- Extend an interface with no methods
interface MyActivator extends ElementActivator<MyWidget> {}
- GWT.create() an instance of that interface
private static MyActivator activator = GWT.create(MyActivator.class);
- Then use it to initialize your widget
setElement(activator.activate(this, anElement));
The nice thing about this is we can automatically bind as many widgets as we like onto various sites within the inner-html of our current widget's element. It doesn't mess with the structure (unless you explicitly do so after the binding is done for you), and you can have as much other html within the elements as you like - it will just be left alone, which gives your designers the flexibility to change the layout quite a lot without necessarily needing to re-compile your GWT code.
Currently I have my generator set up to allow your widgets to bind to a choice of tag-name or css-class or both, for example:
// bind to the first <div> found by breadth-first
// search of child elements
@RuntimeUiField(tag="div") Label label;
// bind to first element with class="my-widget"
// found by breadth-first search
@RuntimeUiField(cssClass="my-widget") Label label;
// bind to first <div> with class="my-widget"
// found by breadth-first search
@RuntimeUiField(tag="div", cssClass="my-widget") Label label;
Notice in my examples so far I'm binding standard GWT widgets onto the nested elements. This works for the elements I've used in these examples because they all have a public static Type wrap(Element anElement) method which allows those widgets to be bound onto elements that are already attached to the DOM.
It is also possible to bind widgets of your own making in one of two ways:
- Create a wrap method like public static MyWidget wrap(Element anElement)
- Create a single-argument public constructor that accepts an Element as its argument.
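Here's a minimal sketch showing both approaches in one widget (BadgeWidget is a made-up name; the static wrap method just mirrors the shape of GWT's own Image.wrap):
import com.google.gwt.dom.client.Element;
import com.google.gwt.user.client.ui.Widget;

public class BadgeWidget extends Widget {

    // Way 2: a single-argument public constructor taking the Element.
    public BadgeWidget(Element anElement) {
        setElement(anElement);
    }

    // Way 1: a static wrap method in the style of Image.wrap(Element).
    public static BadgeWidget wrap(Element anElement) {
        return new BadgeWidget(anElement);
    }
}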
Activate-able widgets can be nested within other such widgets - with no limits that I am aware of so far - and it is also possible to assign nested widgets to a List field in the enclosing widget, like this:
@RuntimeUiField(tag="img") List<Image> images;
This will search recursively for any <img> tags inside the enclosing widget's element and bind them all to Image widgets that will be added to the List. The current limitations here are that the List must be declared either as List or ArrayList, and parameterized with a concrete type that meets the criteria defined above (i.e. has a static wrap(Element) method, or a single-arg constructor that takes an Element as the argument).
A remaining question is how to bind the outer-most Widget. Currently I'm doing that using the DOM scanning code I wrote during earlier experiments and which I'm also using in the automatic scanning process set up by the Generator. For example to find the outer-most widgets and kick off the binding process I have something like this in my EntryPoint:
public void onModuleLoad() {
    List<MyWidget> _myWidgets = new ArrayList<MyWidget>();
    for (Element _e : Elements.getByCssClass("outer-most-widget")) {
        _myWidgets.add(new MyWidget(_e));
    }
    // do stuff with our widgets ...
}
I think of this as very similar to the RootPanel situation - "normal" GWT apps kick off by getting a RootPanel (the body tag) or RootPanels (looked up by id), to which everything else is added. It would be nice to hide away some of that scanning code inside a "top-level" widget - much like RootPanel does for the normal case. I can imagine this might look something like:
public void onModuleLoad() {
    Page _page = Page.activate();
    _page.doStuffWithWidgets();
    // ...
}
I still have lots of things to figure out and questions to answer, for example:
- What's the performance like when binding many hundreds of widgets?
- How will this really work when I make ajax requests for more data? (should I make ajax requests for html snippets which I add to the DOM and then bind onto, or switch to json for ajax requests and make my widgets able to replicate themselves from an initial html template?)
- What's the best way to divide labour between developers and designers, and for them to organize their interaction? (Ideally I'd like there to be something of a cycle between them, where the designer can rough-out a page design, agree the componentisation with the developer, the developer knocks out some components and a build which the designer can use to activate their static designs, add fidelity, work on other pages with the same components, etc).
- Where is the sweet-spot between creating high-fidelity html server-side and decorating it client-side using GWT? Should the GWT components really be just for adding dynamism, or is it a good idea to use them to build additional html sweetness? - I mean the server could dish out html that is more of a model than a view (just enough "view" to satisfy SEO), and the GWT layer acts as a client-side controller and view (SOFEA/TSA with a nod to SEO).
I'll try to keep posting as I work things out.
This probably belongs in a separate post, but with reference to that last point on TSA (Thin Server Architecture) - the working group list the following points to define the concept:
1. Do not use server-side templating to create the web page.
2. Use a classical and simple client-server model, where the client runs in the browser.
3. Separate concerns using protocol between client and server and get a much more efficient and less costly development model.
I'm right behind them on (2) and (3), and also on (1) for "enterprise" apps where SEO is a non-goal. However, for an app that needs SEO, (1) is a deal-breaker, so I'd offer this alternative 1st rule instead:
- Use server-side templating to produce a model for the client to consume which minimally satisfies the needs of SEO.
Installing fonts in Ubuntu
Installing fonts in Ubuntu is very easy these days - just open a ttf file and you are presented with a nice sample of the font (quick brown fox style), and a button in the bottom right corner to install the font.
Nice'n'easy, but you're not quite done yet. You'll definitely need to restart running apps before the font becomes available to them, and quite possibly you'll need to rebuild the font cache, which you can do by rebooting (hah!) or:
sudo fc-cache -fv
btw., check out Eurostile. It's about 50 years old, but nonetheless is one of the most gorgeous fonts I've ever seen.