Obviously I’m not the first to believe in the power of Python to ease XML management. Of course there are DOM and SAX which are meant to provide uniform (whatever language we use) XML APIs. After using these for a little while, I’m now convinced OO features should be more used, especially meta-programming.

Let’s get an example. For a school IT project, I needed to provide an OO (partial) abstraction of a DTD. So I started making a class for each XML item type defined in the DTD. The process was always the same:

  • if item A includes a list of B items, there was a need for get/set methods to access the given list
  • attributes were dealed as Python class variables (__getattr__, __setattr__)

And so on. Then I realized that, the maintenance process would be painfull it the DTD had to be revised. In fact the OO abstraction becomes fully coupled to the DTD.

After that I discovered Python MetaClasses. Python is not the only language to provide meta-programming. But, I find it easier to use than SmallTalk’s one for example. So now we can imagine things like:

class ItemA(XMLObject):
    BItems = XMList(ItemB)
    index = XMLAttribute(type = StringType)

A = ItemA(someXMLData)
print list(A.BItems)
print A.index
A.index = 'foo'

This concept comes directly from SQLObject (Thanks Ian for this great thing). Adapting these design concepts to the (XML, Python) couple would be quite great.