
Code Safari: Getting started in HAML

With the recent release of HAML 3.1, I decided to dig into its internals to find out what makes it tick. What beasts lurk in the bowels of the templating system?

HAML is a template language that allows you to write HTML using a concise syntax:

    %article
      %h1 My great article
      %p
        Here is the text of
        my article

Which is compiled into:

    <article>
      <h1>My great article</h1>
      <p>Here is the text of my article</p>
    </article>

This allows for some nifty things, such as inline Ruby blocks that are closed by significant whitespace. No doubt it has some interesting tricks up its sleeve.

Let's go on safari.

Safari Time

As always, start by getting the code:

git clone git://github.com/nex3/haml

I recommend reading it alongside this article.

There are two places I always start when researching a library: the README and the main require file. Unfortunately, most libraries don't include a guide to diving into the code in their README, but it never hurts to check. For HAML, we find very good user documentation, but nothing that points us in the right direction. That's fine, because we are greeted by a very good comment in lib/haml.rb that makes me smile:

    # lib/haml.rb

    # The module that contains everything Haml-related:
    #
    # * {Haml::Engine} is the class used to render Haml within Ruby code.
    # * {Haml::Helpers} contains Ruby helpers available within Haml templates.
    # * {Haml::Template} interfaces with web frameworks (Rails in particular).
    # * {Haml::Error} is raised when Haml encounters an error.
    # * {Haml::HTML} handles conversion of HTML to Haml.

We found our guide! Class and module level headers like this are a godsend. You can write the most beautiful code in the world, but its sheer size can still scare off new developers. Welcome developers to your code.

It looks like Haml::Engine will be our winning ticket, and on opening lib/haml/engine.rb we are greeted with another comment that hits the jackpot.

    # This is the frontend for using Haml programmatically.
    # It can be directly used by the user by creating a
    # new instance and calling {#render} to render the template.
    # For example:
    #
    #     template = File.read('templates/really_cool_template.haml')
    #     haml_engine = Haml::Engine.new(template)
    #     output = haml_engine.render
    #     puts output

Let's play along at home with irb and confirm that the suggested syntax really works. Launch irb from the HAML directory. The -I flag adds a directory to the load path.

    $ irb -Ilib
    irb> require 'haml'
    irb> Haml::Engine.new("%b hello").render
    => "<b>hello</b>"
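If an irb session is already open, roughly the same effect as -Ilib can be had by prepending the directory to $LOAD_PATH by hand. This is a minimal sketch; the relative "lib" path is an assumption that the current directory is the HAML checkout.

```ruby
# Roughly what `irb -Ilib` does: put ./lib at the front of the load path so
# that `require 'haml'` finds the checkout rather than an installed gem.
# Assumes the current working directory is the HAML checkout.
lib_dir = File.expand_path('lib')
$LOAD_PATH.unshift(lib_dir)

puts $LOAD_PATH.first
```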

Search for "def initialize" in lib/haml/engine.rb to find our entry point. There are a lot of lines here. The trick to effective code reading, when you want the gist of a library, is to quickly skip over code that isn't essential to it. This usually means skipping defaults and other setup, and looking for method calls. I also often work from the bottom up, starting with the return value. Methods are usually structured setup-action-return, and for now we are interested in the last two. Most of #initialize is setting variables, but near the end you'll find a very interesting line:

    # lib/haml/engine.rb:124
    compile(parse)

Our first insight! It would seem that HAML separates parsing the document from compiling it to HTML. This is a standard technique that keeps two very different problems apart.
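To make the shape of that line concrete, here is a minimal sketch of the same two-phase structure. TinyEngine and both method bodies are hypothetical and far simpler than HAML; only the compile(parse) call in the constructor mirrors what we just read.

```ruby
# A hypothetical toy engine, not HAML's code. It only mirrors the
# parse-then-compile structure seen in Haml::Engine#initialize.
class TinyEngine
  def initialize(template)
    @template = template
    @output = compile(parse)
  end

  def render
    @output
  end

  private

  # Phase 1: turn the input string into data (here, [tag, text] pairs).
  # Assumes every line looks like "%tag some text".
  def parse
    @template.lines.map do |line|
      tag, text = line.strip.split(' ', 2)
      [tag.delete('%'), text]
    end
  end

  # Phase 2: turn that data into HTML, knowing nothing about the input syntax.
  def compile(pairs)
    pairs.map { |tag, text| "<#{tag}>#{text}</#{tag}>" }.join("\n")
  end
end

puts TinyEngine.new("%b hello\n%i world").render
```

Because compile only ever sees the pairs, the input syntax could change without touching the HTML generation, which is the point of the split.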

Parsing

Parsing is the process of taking one representation (in this case, our HAML template) and preparing it for output in another (HTML). You can find the parsing code in lib/haml/parser.rb, either by searching the project for "def parse" or by noticing the include of Parser at the top of Haml::Engine. Starting at the bottom of the method, we can see that it returns an instance variable, @root. This is handy: since Parser is included as a module in the Engine class, we should be able to inspect this instance variable easily. We can use the instance_eval method to evaluate code in the context of any object, which gives us access even to that object's private methods and instance variables. It's a really bad idea for production code, but it's a great tool for exploration.
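If instance_eval is new to you, here is a minimal sketch of what it does. The Vault class is hypothetical and simply stands in for any object that hides its state.

```ruby
# A hypothetical class standing in for Haml::Engine: it keeps its state in an
# instance variable and exposes no reader for it.
class Vault
  def initialize
    @root = :secret_tree
  end

  private

  def combination
    42
  end
end

vault = Vault.new

# The block runs with `self` set to vault, so @root and private methods resolve.
p vault.instance_eval { @root }        # prints :secret_tree
p vault.instance_eval { combination }  # prints 42

# A gentler alternative when only one variable is needed.
p vault.instance_variable_get(:@root)  # prints :secret_tree
```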

    irb> input = "%article ... sample from above ..."
    irb> Haml::Engine.new(input).instance_eval { @root }
    => (root nil
      (tag {:name=>"article", :value=>nil}
        (tag {:name=>"h1", :value=>"My great article"})
        (tag {:name=>"p", :value=>nil}
          (plain {:text=>"Here is the text of"})
          (plain {:text=>"my article"})))
      (haml_comment {:text=>""}))
    irb> Haml::Engine.new(input).instance_eval { @root }.class
    => Haml::Parser::ParseNode
    irb> Haml::Engine.new(input).instance_eval { @root }.children.map(&:class)
    => [Haml::Parser::ParseNode, Haml::Parser::ParseNode]

    # (I edited out some extra values from the hashes for clarity.)

The parse method builds a tree of Haml::Parser::ParseNode objects, an abstract representation of our document. In other words, this representation does not depend on the fact that our input was a string. This decouples the HAML syntax from the output, which makes for a better architecture. Note that there is always one special root node to which the rest of the tree is attached.

Let's dive into parsing a little more. Skimming the parse method, we get the following basic structure:

    while next_line
      process_indent # decrease nesting if needed
      process_line

      if block_opened?
        increase nesting
      end
    end
    close open tags

There are two main functions here: indentation processing and line parsing. I'll focus on the latter and leave reading the indentation code as an exercise for you (see end of article). Once again, here is a skeleton of process_line:

    case first_char_of_line
    when '%'; push tag(text)
    when '.'; push div(text)
    # ... other cases
    else; push plain(text)
    end

The tag, div, and plain methods return ParseNode objects, and push appends that node to the children of the current parent node.

Making our own

We now have enough of an idea of how HAML parsing works to try writing a parser ourselves. This helps confirm that we have read the code correctly, and it cements everything we have learned. Let's create a simple parser that can convert our sample document from the top of the article into a node tree, starting with a simple case that ignores indentation.

    require 'test/unit'

    class HamlParserTest < Test::Unit::TestCase
      def test_one_line_plain
        tree = HamlParser.new("hello").parse
        assert_equal 1, tree.children.size
        assert_equal :plain, tree.children[0].type
        assert_equal 'hello', tree.children[0].data[:value]
      end

      def test_one_line_tag_with_value
        tree = HamlParser.new("%em hello").parse
        assert_equal 1, tree.children.size
        assert_equal :tag, tree.children[0].type
        assert_equal 'em', tree.children[0].data[:name]
        assert_equal 'hello', tree.children[0].data[:value]
      end
    end

    class HamlParser
      class Node < Struct.new(:type, :data)
        attr_accessor :children
        attr_accessor :parent # Used in next example

        def initialize(*args)
          super
          self.children = []
        end
      end

      def initialize(string)
        @string = string
      end

      def parse
        @root = Node.new(:root, {})
        @root.children = @string.lines.map do |line|
          parse_line(line)
        end
        @root
      end

      def parse_line(line)
        case line[0]
        when ?%
          name, value = line[1..-1].split(' ')
          Node.new(:tag, :name => name, :value => value)
        else
          Node.new(:plain, :value => line)
        end
      end
    end

Test::Unit is a unit testing framework that ships with Ruby. If you run this file, you will see that it automatically runs the tests defined in it. It's a great way to quickly put together a small project like this. I structured the code similarly to the HAML code, with a parse_line method that switches on the first character of the line, and a root node to hold the tree.

To support indentation, we need to rework the parser so that it has a concept of the current node that children are added to (rather than always adding to the root, as in our first example), as well as the current depth. To facilitate this, we'll add a parent accessor to nodes so that we can walk both down and up the tree. This version is actually a bit simpler than the HAML code, but for now it does the job.

    require 'test/unit'

    class HamlParserTest < Test::Unit::TestCase
      def test_tag_with_nested_value
        tree = HamlParser.new("%em\n  hello").parse
        assert_equal 1, tree.children.size
        assert_equal :tag, tree.children[0].type
        assert_equal 'em', tree.children[0].data[:name]
        assert_equal 'hello', tree.children[0].children[0].data[:value]
      end
    end

    class HamlParser
      # Node and initialize as above

      def parse
        @root = Node.new(:root, {})
        @parent = @root
        @depth = 0
        @string.lines.each do |line|
          process_indent(line)
          push parse_line(line.strip)
        end
        @root
      end

      def process_indent(line)
        # Two spaces of leading whitespace count as one level of nesting
        indent = line[/^\s+/].to_s.length / 2
        if indent > @depth
          @parent = @parent.children.last
          @depth = indent
        end
      end

      def push(node)
        @parent.children << node
        node.parent = @parent
      end

      def parse_line(line)
        # ... as above
      end
    end

It's a good start, and it parses our sample document, but there's still a lot to do:

  • Fix process_indent in our example so that it also "dedents" correctly.
  • It's hard to visualize our parser output because the standard Ruby inspect implementation doesn't include the node's child elements. Override Node#inspect to provide good output, as HAML does.
  • The HAML parser actually tracks two lines at a time, not one as our parser does. Read the HAML code to find examples of where this is useful.
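If you get stuck, here is one possible sketch of the first two exercises. It is not how HAML does it: the dedent loop and the inspect format are my own assumptions, and it expects well-formed input with two-space indentation and no blank lines.

```ruby
# One possible take on the first two exercises; not HAML's implementation.
class HamlParser
  class Node < Struct.new(:type, :data)
    attr_accessor :children, :parent

    def initialize(*args)
      super
      self.children = []
    end

    # S-expression style output, loosely modelled on what HAML prints.
    # Each child is rendered recursively and indented by two spaces.
    def inspect
      lines = ["(#{type} #{data.inspect}"]
      children.each { |child| lines << child.inspect.gsub(/^/, '  ') }
      lines.join("\n") + ')'
    end
  end

  def initialize(string)
    @string = string
  end

  def parse
    @root = Node.new(:root, {})
    @parent = @root
    @depth = 0
    @string.lines.each do |line|
      process_indent(line)
      push parse_line(line.strip)
    end
    @root
  end

  private

  def process_indent(line)
    indent = line[/\A */].length / 2
    if indent > @depth
      # Going deeper: the most recently pushed node becomes the parent.
      @parent = @parent.children.last
      @depth = indent
    end
    while indent < @depth
      # Coming back out: climb one parent per level of lost indentation.
      @parent = @parent.parent
      @depth -= 1
    end
  end

  def push(node)
    @parent.children << node
    node.parent = @parent
  end

  def parse_line(line)
    if line.start_with?('%')
      name, value = line[1..-1].split(' ', 2)
      Node.new(:tag, { :name => name, :value => value })
    else
      Node.new(:plain, { :value => line })
    end
  end
end

tree = HamlParser.new("%article\n  %h1 Title\n  %p\n    Some text\n%footer Bye").parse
p tree
```

Climbing one parent per lost level only works if the document never indents by more than one level at a time, which is one reason the real parser is more careful.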

Let us know how you go in the comments. Join me next week as we move on to the second half of the process: the compilation phase.