
unwwwritten

Colour with a U (and a v)

Some of you may consider my spelling of the word colour an abomination, but I hope that more of you understand my insistence on using Canadian (or British) spelling on this blog.

In any event, this post is about my implementation of:

  • syntax colouring for code excerpts

In one of my past efforts at blogging I used SimpleLog. During that experiment, I used SimpleHighlight to colour (highlight) code included in my posts.

When it came time to add the same functionality to my blog this time around, I took a look at what made SimpleHighlight tick - the UltraViolet Syntax Highlighting Engine.

Quoting the RubyForge page, “Ultraviolet is a syntax highlighting engine based on Textpow. Since it uses Textmate syntax files, it offers out of the box syntax highlighting for more than 50 languages and 20 themes.”

The code to support this library follows (and, as you can see, is pretty minimal).

First, I needed a way of specifying code that was to be highlighted. Again, I decided to follow SimpleHighlight's lead and recognize a few extra attributes for the code tag - allowing me to specify a theme and language for the highlighting and also whether or not line numbers are included. For example...

<code language="ruby_on_rails" theme="mac_classic" numbers="numbers">
  insert code here
</code>

To process this I needed to be able to read html... or at least xml. There are a few options for this in Ruby.

  1. REXML xml processor: included in ruby standard library, but REXML... is... slow...

  2. Hpricot: created by why the lucky stiff... probably an order of magnitude faster than REXML, however, maybe a bit HTML-centric if you are trying to work with xml

  3. LibXML-Ruby: long abandoned, recently revived, blisteringly fast - although with a slightly larger learning curve

I went with Hpricot, as libxml wasn't supported on Heroku (at the time - I'm not sure of the current state).
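For comparison, here is a rough sketch of pulling those attributes out of a code tag with REXML (option 1 above), which needs no extra gem. It is not what I ended up using. The SAMPLE markup, the surrounding post tag and the code_tag_options method are made up for illustration.

```ruby
require 'rexml/document'

# Hypothetical excerpt: a code tag carrying the three extra attributes.
SAMPLE = <<~XML
  <post>
    <code language="ruby_on_rails" theme="mac_classic" numbers="numbers">puts 'hi'</code>
  </post>
XML

# Collect the highlighting attributes of every code tag in the document.
# Attributes that are absent from a tag come back as nil.
def code_tag_options(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//code').map do |code|
    {
      :language => code.attributes['language'],
      :theme    => code.attributes['theme'],
      :numbers  => code.attributes['numbers'],
      :source   => code.text
    }
  end
end
```

Note that REXML wants well-formed XML, so a sloppy HTML fragment that Hpricot would happily accept may raise a parse error here.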

Combining Hpricot with UltraViolet, I coded up the following SyntaxHelper and dropped it into lib/syntax_helper.rb...

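# Needed unless the app's gem configuration already loads these.
require 'hpricot'
require 'uv'

module SyntaxHelper
  # Defaults used when neither the code tag nor the options hash says otherwise.
  THEME = 'mac_classic'
  LANGUAGE = 'ruby_on_rails'
  NUMBERS = true

  def syntax(text, options = {})
    doc = Hpricot(text)
    doc.search("//code") do |code|
      # Precedence: tag attribute, then options hash, then constant.
      theme = code.get_attribute(:theme) || options[:theme] || THEME
      language = code.get_attribute(:language) || options[:language] || LANGUAGE
      # numbers is a boolean, so an explicit false must not fall through to the default.
      numbers = case code.get_attribute(:numbers)
                when nil
                  options[:numbers].nil? ? NUMBERS : options[:numbers]
                when 'numbers'
                  true
                else
                  false
                end
      # Keep one leading line break outside the highlighted markup.
      prefix, lines = code.inner_html.match(/(\r?\n)?(.*)/m).to_a[1,2]
      # numbers is already true or false here, so pass it straight through.
      code.swap "#{prefix}#{Uv.parse(lines, 'xhtml', language, numbers, theme)}"
    end
    doc.to_s
  end
end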

By way of explanation, the basic flow of this code is...

  • load the text into Hpricot
  • search for a code tag
  • if found, parse out the theme, language and line-numbering options from the tag attributes
  • for any option missing from the tag attributes, fall back to the value passed into the helper, and then to the hard-coded default if it is missing from the options hash too (juggling the boolean numbers attribute takes a bit of finesse - or is it brute force?!)
  • syntax-colour the code using Uv#parse
  • replace the contents of the code tag with the syntax-coloured version, being careful with leading whitespace so that we don't get a leading blank line
  • return the document as a string
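The two fiddly steps above, the option fallback and the leading line break, can be tried out on their own without Hpricot or UltraViolet installed. What follows is only a sketch: the method names and the DEFAULT_ constants are invented for illustration and simply mirror what the helper does inline.

```ruby
# Hypothetical defaults mirroring the THEME and NUMBERS constants in SyntaxHelper.
DEFAULT_THEME   = 'mac_classic'
DEFAULT_NUMBERS = true

# Strings can fall through with ||: tag attribute, then options hash, then default.
def resolve_theme(attribute, options = {})
  attribute || options[:theme] || DEFAULT_THEME
end

# Booleans cannot use ||, because an explicit false would be skipped,
# so nil is checked for explicitly at each level.
def resolve_numbers(attribute, options = {})
  return attribute == 'numbers' unless attribute.nil?
  options[:numbers].nil? ? DEFAULT_NUMBERS : options[:numbers]
end

# Split off one leading line break so it can be re-emitted outside the
# highlighted markup instead of becoming a blank first line of code.
# Returns [prefix, rest]; prefix is nil when there is no leading line break.
def split_leading_newline(inner_html)
  inner_html.match(/(\r?\n)?(.*)/m).to_a[1, 2]
end
```

The numbers case is the reason the helper uses a case statement rather than a chain of ||: passing :numbers => false in the options hash has to win over the default of true.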

Then, in my ApplicationController, I added the helper as follows...

class ApplicationController < ActionController::Base
  helper :all # include all helpers, all the time
  helper SyntaxHelper
  ...
end

This just gives all our views access to the helper.

Now, to render beautiful, syntax-coloured code, it is as simple as...

<%= syntax(text) %>

As I've mentioned before, this code can easily be pluginized (or is it pluginified? ... or plugified?... hmmm...), but not until I need it. For now, it is in lib, and easily extracted when the time comes.

Thank u for reading.

Cheers.
