HTM Light: An HTML generator

I’ve been looking for a clean, simple, readable HTML generator library (for any common server-side language) for perhaps a decade now. There are certainly lots of useful libraries around, and templating and MVC systems abound, but nothing quite appealed. I’ve also tried to create my own, but my efforts have always come unstuck, or I’ve focused only on the big-ticket items (i.e. lists and data tables). I think, though, I may finally have a solution I’m happy with, which should work reasonably well cross-language. Here’s a surprisingly succinct Python version with no dependencies, which I’m calling HTM Light (dedicated to the public domain):

HTM Light

class node(object):
    # 'c' is for content: a string, or a list of strings and nodes
    def __init__(self, tagName, c=None, **attrs):
        self.tagName = tagName
        if c is None:
            c = []
        elif isinstance(c, str):
            c = [c]  # wrap a bare string so content is always a list
        self.content = c
        self.attrs = attrs

    def escape(self, s):
        # Escape the characters that are unsafe inside a quoted attribute value
        return str(s).replace('&', '&amp;').replace('<', '&lt;') \
            .replace('>', '&gt;').replace('"', '&quot;')

    def __str__(self):
        s = "<"+self.tagName

        # Handle attributes
        for k,v in self.attrs.items():
            s += " " + k + '="'+self.escape(v)+'"'
        s += ">"

        # Handle content (strings are emitted as-is, unescaped)
        for el in self.content:
            s += str(el)

        s += "</"+self.tagName+">"

        return s

Remarkably, that should handle everything in HTML5 barring comments and doctypes. Of course, the contents of style and script tags are just stored as strings, and HTM Light never generates singular (void) tags like input and img, which HTML5 allows: it always emits a closing tag. Note also that only attribute values are escaped, while text content is emitted as-is. I’m sure there are other cases I haven’t considered that HTM Light doesn’t handle either. Nonetheless, pretty handy for 20-30 lines of code.
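As a rough illustration of how the missing singular tags might be added, here is a minimal sketch. The names VOID_TAGS and void_aware_node are hypothetical and not part of HTM Light, and the node class is a condensed copy of the one above, repeated only so the sketch runs on its own.

```python
class node(object):
    # Condensed copy of the HTM Light class, so this sketch is self-contained
    def __init__(self, tagName, c=None, **attrs):
        self.tagName = tagName
        self.content = c if c else []
        self.attrs = attrs

    def escape(self, s):
        return str(s).replace('&', '&amp;').replace('<', '&lt;') \
            .replace('>', '&gt;').replace('"', '&quot;')

    def __str__(self):
        s = "<" + self.tagName
        for k, v in self.attrs.items():
            s += " " + k + '="' + self.escape(v) + '"'
        s += ">"
        for el in self.content:
            s += str(el)
        return s + "</" + self.tagName + ">"


# Hypothetical helper: the HTML void elements, which take no closing tag
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class void_aware_node(node):
    # Hypothetical subclass: drops content and the closing tag for void elements
    def __str__(self):
        if self.tagName not in VOID_TAGS:
            return node.__str__(self)
        s = "<" + self.tagName
        for k, v in self.attrs.items():
            s += " " + k + '="' + self.escape(v) + '"'
        return s + ">"


print(void_aware_node("img", src="logo.png"))
print(void_aware_node("p", "Still closed as usual"))
```

Only nodes created as void_aware_node get the new behaviour, so in practice the check would probably be folded into node itself.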

Usage would be as follows:

print(
  node("html", [
    node("body", "Hello, World!")
  ])
)

Or for something marginally more interesting:

print(
  node("html", [
    node("body", [
      node("video", src="helloWorld.webm", Class="myVideoControl",
        c="Video element not supported")
    ])
  ])
)

(Note the casing of ‘Class’ to avoid the ‘class’ keyword in Python.) The nice thing about this approach is that it creates an object model first. Hence, one can do:

doc = \
  node("html", [node("body")])

followed by:

doc.content[0].content.append(
  node("video", src="helloWorld.webm", Class="myVideoControl",
    c="Video element not supported")
)
print(doc)

for the same result. Obviously, a jQuery-like find method would be needed for serious object model work, since indexing into content by position quickly becomes brittle.
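To give a feel for what such a lookup could involve, here is a minimal sketch of a recursive search by tag name. The find function is hypothetical and far simpler than jQuery's selector engine, and the node class is a condensed copy of the one above so that the example runs on its own.

```python
class node(object):
    # Condensed copy of the HTM Light class, so this sketch is self-contained
    def __init__(self, tagName, c=None, **attrs):
        self.tagName = tagName
        self.content = c if c else []
        self.attrs = attrs

    def escape(self, s):
        return str(s).replace('&', '&amp;').replace('<', '&lt;') \
            .replace('>', '&gt;').replace('"', '&quot;')

    def __str__(self):
        s = "<" + self.tagName
        for k, v in self.attrs.items():
            s += " " + k + '="' + self.escape(v) + '"'
        s += ">"
        for el in self.content:
            s += str(el)
        return s + "</" + self.tagName + ">"


def find(root, tagName):
    # Hypothetical helper: collect root and all descendants with a matching
    # tag name, in document order
    matches = []
    if root.tagName == tagName:
        matches.append(root)
    for child in root.content:
        # Plain strings (text content) are skipped; only nodes are searched
        if isinstance(child, node):
            matches.extend(find(child, tagName))
    return matches


doc = node("html", [
    node("body", [
        node("p", "one"),
        node("div", [node("p", "two")])
    ])
])

# Append to the body without relying on doc.content[0]
find(doc, "body")[0].content.append(node("p", "three"))
print([str(p) for p in find(doc, "p")])
```

Matching on id or class attributes would follow the same pattern, with the test on tagName swapped for a test on attrs.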

There are lots of simple ways this can be extended, and I expect to post such extensions in future. Just as one example, adding the following to the end of the __init__ method (and importing the ‘re’ module):

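    # Split e.g. "h2#intro.post" into the tag name, an id and any classes
    m = re.findall(r'[.#]?[\w-]+', self.tagName)
    self.tagName = m.pop(0)
    classes = []
    for attr in m:
        if attr[0]==".":
            classes.append(attr[1:])
        elif attr[0]=="#":
            self.attrs["id"] = attr[1:]
    # Only set 'class' when the shorthand actually named one
    if classes:
        self.attrs["class"] = " ".join(classes)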

allows you to create very nice code like the following:

print(
  node("div.entry", [
    node("h2#intro.post", "Introducing the first post"),
    node("p.important", [
      "Welcome to the first post!"
    ])
  ])
)

Which is almost the way I would like to write HTML itself! The only reason I don’t include this in HTM Light is to keep things clear and avoid the dependency on ‘re’ (or extra code).
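For a sense of what the "extra code" alternative to ‘re’ might look like, here is a minimal sketch that splits the same shorthand with a plain character loop. The function name split_selector and its return shape are assumptions made for illustration, not part of HTM Light.

```python
def split_selector(spec):
    # Hypothetical helper: split "h2#intro.post" into (tag, id, classes)
    # without using the 're' module
    parts = []               # (marker, text) pairs; marker is '', '.' or '#'
    marker, text = "", ""
    for ch in spec:
        if ch in ".#":
            # A new marker ends the piece collected so far
            parts.append((marker, text))
            marker, text = ch, ""
        else:
            text += ch
    parts.append((marker, text))

    tag = parts[0][1]        # the first piece, before any marker, is the tag
    ident = None
    classes = []
    for marker, text in parts[1:]:
        if not text:
            continue         # ignore empty pieces such as the one in "p..x"
        if marker == "#":
            ident = text
        else:
            classes.append(text)
    return tag, ident, classes


print(split_selector("h2#intro.post"))
print(split_selector("div.entry"))
```

Inside __init__, the returned tag would replace self.tagName, and the id and classes would be copied into self.attrs, much as in the regular-expression version.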
