There are many good syntax highlighters available for progressively enhancing code on websites such as my blog. On my old blog I used Alex Gorbachev's, which is exceptionally good.

For this blog, I really wanted to write my own. Why? Well, for the same reason I wrote the site-generation software that builds the blog from Markdown templates - because I love programming, and doing this stuff is FUN!

Having just released my GWT.Progressive library, I had two very good reasons to write a syntax highlighter that works by progressive enhancement:

  1. I need good examples of progressive enhancement written in GWT using GWT.Progressive
  2. Those very same code-samples need syntax-highlighting to aid readability!

For this first attempt I'm not setting out to write the world's greatest syntax-highlighter. My aims are more modest: merely to illustrate the GWT.Progressive library and add basic syntax-highlighting (Java) capabilities to my blog.

The syntax-highlighter will work as follows:

  1. After the page is loaded, identify <pre><code> ... </code></pre> blocks on my blog.
  2. Try to ascertain if the text in the block is Java, or a similar language (e.g. JavaScript).
  3. Mark keywords, literals, comments, and a few other choice elements using inline HTML markup that adds CSS classes, so the actual colours and styles can be chosen independently of the process of identifying the parts to highlight.
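
As a rough illustration of step 3, here is a minimal sketch of wrapping a matched token in a span that carries only a CSS class. The class and method names (SpanMarkupSketch, mark) are made up for this example and are not part of GWT.Progressive; the sketch also assumes the CSS class name contains no `$` or `\` characters, which would be special in a replacement string.

```java
class SpanMarkupSketch {
    // Wraps every match of the regex in a span carrying only a CSS class.
    // The colour itself lives in the stylesheet, not in this code.
    static String mark(String source, String regex, String cssClass) {
        return source.replaceAll(
            "(" + regex + ")",
            "<span class=\"" + cssClass + "\">$1</span>");
    }

    public static void main(String[] args) {
        // prints: <span class="sh-keyword">return</span> 42;
        System.out.println(mark("return 42;", "return", "sh-keyword"));
    }
}
```

Restyling keywords is then purely a stylesheet change to whatever rule targets the sh-keyword class.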

The script that kicks this off is loaded by the very last line in the html body of each page on my blog:

<script 
  type="text/javascript" language="javascript" 
  src="blog/blog.nocache.js">
</script>

The Java class that uses the GWT.Progressive library to scan for <pre><code> blocks is, in full, as follows:

package com.sjl.blog.client.syntax;

import com.google.gwt.core.client.*;
import com.google.gwt.dom.client.*;
import com.knowledgeview.gwt.activator.client.*;
import com.knowledgeview.gwt.activator.client.widgets.*;

@RootBinding(tag="code")
public class Code extends BoundRootPanel {
    interface MyActivator extends ElementActivator<Code>{}
    static MyActivator activator = GWT.create(MyActivator.class);

    static JavaSyntaxHighlighter highlighter = 
        new JavaSyntaxHighlighter();

    public Code(Element anElement) {
        setElement(activator.activate(this, anElement));

        if (isJavaCode(getElement().getInnerText())) {
            getElement().setInnerHTML(
                highlighter.highlight(getElement().getInnerText()));
        }
    }

    private boolean isJavaCode(String aString) {
        return !aString.trim().startsWith("<");
    }
}
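
One consequence of the constructor above is worth spelling out: it reads the block with getInnerText() and writes the result back with setInnerHTML(), so any `<`, `>` or `&` in the sample is parsed as markup the second time around. A minimal sketch of one way to guard against that, assuming a hypothetical helper called escapeHtml (my name for illustration, not something from GWT.Progressive) that would be applied to the text before it is highlighted:

```java
class HtmlEscapeSketch {
    // Hypothetical helper: escapes the three characters that would otherwise
    // be interpreted as markup when the text is written back as HTML.
    // Double quotes are deliberately left alone so that a string-literal
    // regex can still see them afterwards.
    static String escapeHtml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: List&lt;String&gt; a &amp;&amp; b
        System.out.println(escapeHtml("List<String> a && b"));
    }
}
```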

The GWT module descriptor looks like this:

<module rename-to="blog">
  <inherits name="com.google.gwt.user.User"/>
  <inherits name="com.google.gwt.user.Debug"/>

  <inherits name="com.sjl.gwt.progressive.Progressive"/>
  <entry-point class="com.sjl.blog.client.BlogApplication" />

  <!-- allows the script to be used cross domain -->
  <add-linker name="xs" />
</module>

I will take the time to write a better highlighter some time in the future (using ANTLR or similar to generate a parser), but for now I knocked up a simple version using regular expressions.

This highlighter was built by TDD'ing my way through highlighting the different components, starting simple with keywords. The design changed a few times along the way, but in total, including the Code class, the finished highlighter took around 90 minutes to build.

The slowest part, actually, was round-trip testing and fixing "for real" with GWT production mode, since the difference in regular expression engines between development and production mode caught me out.
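
I won't list every difference here, but as one hedged illustration: in production mode String.replaceAll ends up running on the browser's JavaScript RegExp, whereas development mode uses java.util.regex, so it pays to stick to syntax both engines understand. The sketch below marks block comments that span line breaks using `[\s\S]*?` instead of an inline flag such as `(?s)`, which older JavaScript engines reject. The class and method names are invented for this example.

```java
class PortableRegexSketch {
    // Wraps each /* ... */ comment in [[[ ]]] markers.
    // [\s\S] matches any character including newlines in both Java and
    // JavaScript, and the lazy *? stops at the first closing */.
    static String markBlockComments(String source) {
        return source.replaceAll("(/\\*[\\s\\S]*?\\*/)", "[[[$1]]]");
    }

    public static void main(String[] args) {
        System.out.println(markBlockComments("a /* x\n y */ b /* z */"));
    }
}
```

Whatever pattern you settle on, it is still worth checking it in a compiled build, since that is where the behaviour can diverge.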

Here's the Java syntax highlighter in full:

import java.util.ArrayList;
import java.util.List;

public class JavaSyntaxHighlighter implements SyntaxHighlighter {
    private List<Replacement> replacements;

    public JavaSyntaxHighlighter() {
        replacements = new ArrayList<Replacement>();

        // order is important: text matched by an earlier replacement is
        // locked, so later ones cannot touch it (e.g. keywords in comments)
        replacements.add(new CommentReplacement());
        replacements.add(new AnnotationReplacement());
        replacements.add(new LiteralReplacement());
        replacements.add(new KeywordReplacement());
    }

    public String highlight(String anInput) {
        // start with the whole input as one splittable component
        List<Component> _components = new ArrayList<Component>();
        _components.add(new SplittableComponent(anInput));

        for (Replacement _r : replacements) {
            _components = _r.process(_components);
        }

        // join the components back together in order
        StringBuilder _sb = new StringBuilder();
        for (Component _c : _components) {
            _sb.append(_c);
        }
        return _sb.toString();
    }
}

interface Component {
    boolean canSplit();
    String toString();
}

class SplittableComponent implements Component {
    protected String value;

    public SplittableComponent(String aValue) {
        value = aValue;
    }

    public boolean canSplit() {
        return true;
    }

    public String toString() {
        return value;
    }
}

class UnsplittableComponent extends SplittableComponent {
    public UnsplittableComponent(String aValue) {
        super(aValue);
    }

    public boolean canSplit() {
        return false;
    }
}

interface Replacement {
    public List<Component> process(List<Component> anInput);
}

abstract class AbstractReplacement implements Replacement {
    private String cssClass;

    public AbstractReplacement(String aCssClass) {
        cssClass = aCssClass;
    }

    protected abstract String replace(String aString);

    @Override
    public List<Component> process(List<Component> aComponents)
    {
        List<Component> _result = new ArrayList<Component>();
        for (Component _c : aComponents) {
            if (_c.canSplit()) {
                // still plain text: apply this replacement, then re-split
                String _replaced = replace(_c.toString());
                _result.addAll(createComponents(_replaced, cssClass));
            } else {
                // already highlighted by an earlier replacement
                _result.add(_c);
            }
        }
        return _result;
    }

    private List<Component> createComponents(
        String anInput, String aClass) {
        List<Component> _result = new ArrayList<Component>();
        // [[[ and ]]] are the temporary markers emitted by replacement()
        for (String _s : anInput.split("\\[\\[\\[|\\]\\]\\]")) {
            if (_s.startsWith("<span class=\"" + aClass)) {
                _result.add(new UnsplittableComponent(_s));
            } else {
                _result.add(new SplittableComponent(_s));
            }
        }
        return _result;
    }

    protected String replacement(String aClass) {
        return "[[[<span class=\"" + aClass + "\">$1</span>]]]";
    }
}

class KeywordReplacement extends AbstractReplacement
{
    public static String[] KEYWORDS = new String[] {
        "class", "package", "import", "public", "private", "protected", 
        "interface", "extends", "this", "implements", "throws", "try", 
        "catch", "finally", "final", "return", "new", "void", "for", 
        "if", "else", "while", "static", "transient", "synchronized", 
        "byte", "short", "int ", "char ", "long ", "float ", "double ", 
        "boolean ", "true", "false", "abstract", "volatile", "switch", 
        "case"
    };

    public KeywordReplacement() {
        super("sh-keyword");
    }

    protected String replace(String aString) {
        String _result = aString;
        for (String _k : KEYWORDS) {
            _result = _result.replaceAll("(" + _k + ")", 
                replacement("sh-keyword"));
        }
        return _result;
    }
}

class LiteralReplacement extends AbstractReplacement {

    public LiteralReplacement() {
        super("sh-literal");
    }

    protected String replace(String aString) {
        return aString.replaceAll("(\"[^\"]*\")", 
            replacement("sh-literal"));
    }
}

class CommentReplacement extends AbstractReplacement {
    public CommentReplacement() {
        super("sh-comment");
    }

    protected String replace(String aString) {
        String _result = aString.replaceAll("(//.*)", 
            replacement("sh-comment"));
        _result = _result.replaceAll("(\\/\\*[\\*]?.*\\*\\/)", 
            replacement("sh-comment"));
        return _result;
    }
}

class AnnotationReplacement extends AbstractReplacement {
    public AnnotationReplacement() {
        super("sh-annotation");
    }

    protected String replace(String aString) {
        return aString.replaceAll("(@\\w+)", 
            replacement("sh-annotation"));
    }
}
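
To see why the order of the replacements matters, here is the marker-and-split idea boiled down to a self-contained sketch. It is a simplification for illustration, not the code above: the Piece class and the pass method are names invented here, and only two patterns (line comments and the keyword return) are applied.

```java
import java.util.ArrayList;
import java.util.List;

class MarkerSplitSketch {
    // A chunk of output: either finished markup (locked) or plain text
    // that later passes may still highlight.
    static final class Piece {
        final String text;
        final boolean locked;
        Piece(String text, boolean locked) {
            this.text = text;
            this.locked = locked;
        }
    }

    // One highlighting pass: wrap matches in [[[ ]]] markers, split on the
    // markers, and lock every chunk that starts with this pass's span tag.
    static List<Piece> pass(List<Piece> input, String regex, String cssClass) {
        String open = "<span class=\"" + cssClass + "\">";
        List<Piece> out = new ArrayList<Piece>();
        for (Piece p : input) {
            if (p.locked) {
                out.add(p); // highlighted by an earlier pass: leave alone
                continue;
            }
            String marked = p.text.replaceAll(
                "(" + regex + ")", "[[[" + open + "$1</span>]]]");
            for (String s : marked.split("\\[\\[\\[|\\]\\]\\]")) {
                out.add(new Piece(s, s.startsWith(open)));
            }
        }
        return out;
    }

    static String highlight(String source) {
        List<Piece> pieces = new ArrayList<Piece>();
        pieces.add(new Piece(source, false));
        pieces = pass(pieces, "//.*", "sh-comment");   // comments first
        pieces = pass(pieces, "return", "sh-keyword"); // then keywords
        StringBuilder sb = new StringBuilder();
        for (Piece p : pieces) {
            sb.append(p.text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("return x; // return early"));
    }
}
```

Because the comment pass runs first and locks its match, the word return inside the comment is never seen by the keyword pass. Swap the two passes and the comment would end up with a keyword span inside it.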

You can find out more about GWT.Progressive in the release announcement, or at the GitHub repository.
