I’ve been looking for a clean, simple, readable HTML generator library (for any common server-side language) for perhaps a decade now. There are certainly lots of useful libraries around, and templating and MVC systems abound, but nothing quite appealed. I’ve also tried to create my own, but my efforts have always come unstuck, or I’ve ended up focusing only on the big-ticket items (i.e. lists and data tables). I think, though, I may finally have a solution I’m happy with, which should work reasonably well cross-language. Here’s a surprisingly succinct Python version with no dependencies, which I’m calling HTM Light (dedicated to the public domain):
HTM Light
    class node(object):
        # 'c' is for content
        def __init__(self, tagName, c=None, **attrs):
            self.tagName = tagName
            self.content = c if c else []
            self.attrs = attrs

        def escape(self, s):
            return str(s).replace('&', '&amp;').replace('<', '&lt;') \
                         .replace('>', '&gt;').replace('"', '&quot;')

        def __str__(self):
            s = "<"+self.tagName
            # Handle attributes
            for k,v in self.attrs.items():
                s += " " + k + '="'+self.escape(v)+'"'
            s += ">"
            # Handle content
            for el in self.content:
                s += str(el)
            s += "</"+self.tagName+">"
            return s
Remarkably, that should handle everything in HTML5 barring comments and doctypes. Of course, the contents of style and script tags are just stored as strings, and HTM Light never generates singular tags like input and img (written without a closing tag), which HTML5 allows. I’m sure there are other cases I haven’t considered that HTM Light doesn’t handle either. Nonetheless, pretty handy for 20-30 lines of code.
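As a rough illustration of how the singular-tag gap might be closed, here is a minimal sketch. The names VOID_TAGS, open_tag and void_node are my own assumptions and not part of HTM Light, and the node class is repeated in compact form only so that the sketch runs on its own.

```python
# Elements the HTML standard lists as void (no content, no closing tag).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class node(object):
    """Compact copy of the HTM Light node, repeated so this runs stand-alone."""
    def __init__(self, tagName, c=None, **attrs):
        self.tagName = tagName
        self.content = c if c else []
        self.attrs = attrs

    def escape(self, s):
        return (str(s).replace('&', '&amp;').replace('<', '&lt;')
                      .replace('>', '&gt;').replace('"', '&quot;'))

    def open_tag(self):
        # Hypothetical helper: the opening tag, split out so it can be reused.
        s = "<" + self.tagName
        for k, v in self.attrs.items():
            s += " " + k + '="' + self.escape(v) + '"'
        return s + ">"

    def __str__(self):
        s = self.open_tag()
        for el in self.content:
            s += str(el)
        return s + "</" + self.tagName + ">"


class void_node(node):
    """Hypothetical subclass: void elements get an opening tag only."""
    def __str__(self):
        if self.tagName.lower() in VOID_TAGS:
            return self.open_tag()  # any content is ignored for void tags
        return node.__str__(self)


print(void_node("br"))                   # <br>
print(void_node("input", type="text"))   # <input type="text">
print(void_node("p", "Still closed"))    # <p>Still closed</p>
```

Keeping this in a subclass leaves the original class untouched; folding the check into node itself would work just as well.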
Usage would be as follows:
    print(
        node("html", [
            node("body", "Hello, World!")
        ])
    )
Or for something marginally more interesting:
    print(
        node("html", [
            node("body", [
                node("video", src="helloWorld.webm", Class="myVideoControl",
                     c="Video element not supported")
            ])
        ])
    )
(Note the casing of ‘Class’ to avoid the ‘class’ keyword in Python.) The nice thing about this approach is that it creates an object model first. Hence, one can do:
    doc = node("html", [node("body")])
followed by:
    doc.content[0].content.append(
        node("video", src="helloWorld.webm", Class="myVideoControl",
             c="Video element not supported")
    )
    print(doc)
For the same result. Obviously, a jQuery-like find method would be needed for serious object model work.
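To give a feel for what such a method could look like, here is a minimal sketch of a recursive search. The function name find and its tag-plus-attributes matching are assumptions of mine, far simpler than jQuery's selectors, and the node class is repeated in compact form only so that the example runs on its own.

```python
class node(object):
    """Compact copy of the HTM Light node, repeated so this runs stand-alone."""
    def __init__(self, tagName, c=None, **attrs):
        self.tagName = tagName
        self.content = c if c else []
        self.attrs = attrs


def find(root, tagName=None, **attrs):
    """Hypothetical helper: list root and its descendants that match.

    A node matches when its tag equals tagName (if given) and every
    keyword argument equals the node's attribute of the same name.
    """
    matches = []
    if ((tagName is None or root.tagName == tagName) and
            all(root.attrs.get(k) == v for k, v in attrs.items())):
        matches.append(root)
    # Content that is a plain string has no child nodes to visit.
    children = [] if isinstance(root.content, str) else root.content
    for child in children:
        if isinstance(child, node):  # skip text mixed in with nodes
            matches.extend(find(child, tagName, **attrs))
    return matches


doc = node("html", [
    node("body", [
        node("video", src="helloWorld.webm", Class="myVideoControl",
             c="Video element not supported")
    ])
])

video = find(doc, "video")[0]
print(video.attrs["src"])                       # helloWorld.webm
print(len(find(doc, Class="myVideoControl")))   # 1
```

With something like this, doc.content[0].content.append(...) could become find(doc, "body")[0].content.append(...), which does not depend on knowing where the body sits in the tree.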
There are lots of simple ways this can be extended, and I expect to post such extensions in future. Just as one example, adding the following to the end of the __init__ method (and importing the ‘re’ module):
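    # Split "tag#id.class1.class2" into the tag name and its shorthand parts
    m = re.findall(r'[\.#]?[\w\d_-]+', self.tagName)
    self.tagName = m.pop(0)
    classes = []
    for attr in m:
        if attr[0] == ".":
            classes.append(attr[1:])
        elif attr[0] == "#":
            self.attrs["id"] = attr[1:]
    # Only add a class attribute when at least one class was given
    if classes:
        self.attrs["class"] = " ".join(classes)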
allows you to create very nice code like the following:
    print(
        node("div.entry", [
            node("h2#intro.post", "Introducing the first post"),
            node("p.important", [
                "Welcome to the first post!"
            ])
        ])
    )
Which is almost the way I would like to write HTML itself! The only reason I don’t include this in HTM Light is to keep things clear and avoid the dependency on ‘re’ (or extra code).
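For anyone who wants to try the shorthand, here is a minimal sketch of how the pieces might fit together in a single class. It assumes the parsing runs at the end of __init__, once self.attrs exists, and it only adds a class attribute when the shorthand actually names one; treat it as one possible arrangement rather than a finished version of HTM Light.

```python
import re


class node(object):
    # 'c' is for content
    def __init__(self, tagName, c=None, **attrs):
        self.content = c if c else []
        self.attrs = attrs
        # Split "tag#id.class1.class2" into the tag name and its shorthand parts.
        parts = re.findall(r'[\.#]?[\w\d_-]+', tagName)
        self.tagName = parts.pop(0)
        classes = []
        for part in parts:
            if part[0] == ".":
                classes.append(part[1:])
            elif part[0] == "#":
                self.attrs["id"] = part[1:]
        if classes:
            self.attrs["class"] = " ".join(classes)

    def escape(self, s):
        return (str(s).replace('&', '&amp;').replace('<', '&lt;')
                      .replace('>', '&gt;').replace('"', '&quot;'))

    def __str__(self):
        s = "<" + self.tagName
        for k, v in self.attrs.items():
            s += " " + k + '="' + self.escape(v) + '"'
        s += ">"
        for el in self.content:
            s += str(el)
        return s + "</" + self.tagName + ">"


print(node("p.important", ["Welcome to the first post!"]))
# <p class="important">Welcome to the first post!</p>
```

One caveat with this arrangement: a Class keyword argument is stored under a different key from the shorthand's class, so using both on the same node would emit two class attributes.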