Well, it’s finally official. Chandler’s parcel.xml format is now deprecated and will soon be gone altogether, replaced entirely by simple Python APIs. Some of you may be thinking back to my Python Is Not Java rant, in which I said that using XML for core application functionality like this was, well, unwise. 🙂 At the PyCon Chandler sprint, it was discovered that Chandler’s homegrown XML schema definition language was a terrible hardship on developers, and so I proposed to replace it with a descriptor-based Python API. That migration was completed recently. With that done, the only thing still done in XML was the initialization of data items (such as Chandler’s UI components). So, a few weeks ago, I implemented an experimental API for initializing data items, which quickly became quite popular, with some even pointing out the advantages of being able to factor out repetition.
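To make "descriptor-based Python API" concrete, here is a minimal sketch of the general technique. It is not Chandler's actual schema API: the names `Attribute`, `Item`, and `Contact` are invented for illustration, and it assumes a modern Python (3.6 or later) for `__set_name__`.

```python
class Attribute:
    """Hypothetical descriptor declaring one typed attribute of an item."""

    def __init__(self, type_, default=None):
        self.type = type_
        self.default = default

    def __set_name__(self, owner, name):
        # Called by Python when the owning class body is executed.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # class-level access returns the descriptor itself
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        # Validate on assignment, the job a schema language would otherwise do.
        if not isinstance(value, self.type):
            raise TypeError(
                f"{self.name} must be {self.type.__name__}, "
                f"got {type(value).__name__}"
            )
        obj.__dict__[self.name] = value


class Item:
    """Hypothetical base class: keyword arguments go through the descriptors."""

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)


class Contact(Item):
    # The schema is ordinary class-body code; there is no separate XML file.
    name = Attribute(str)
    email = Attribute(str, default="")


alice = Contact(name="Alice")
```

Because the schema is plain Python, it can be imported, introspected (`Contact.name` returns the descriptor), and refactored with the same tools as the rest of the code.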
For a while, there was also a proposal to create a new XML format just for UI definition. But my counterproposal for using a simple template class and a classmethod instead was met with great rejoicing.
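As a rough illustration of the "template class plus classmethod" idea, the sketch below uses class attributes as default field values and a classmethod to stamp out items. Everything here is hypothetical (`Template`, `install`, the `registry` dict, the button fields); it shows the shape of the approach, not Chandler's real UI API.

```python
class Template:
    """Hypothetical base: public class attributes are default field values."""

    @classmethod
    def fields(cls):
        # Walk the MRO from base to subclass so subclass values win.
        merged = {}
        for klass in reversed(cls.__mro__):
            for key, value in vars(klass).items():
                if key.startswith("_"):
                    continue
                if callable(value) or isinstance(value, (classmethod, staticmethod)):
                    continue
                merged[key] = value
        return merged

    @classmethod
    def install(cls, registry, name, **overrides):
        """Create one data item from this template and store it in `registry`."""
        item = {**cls.fields(), **overrides}
        registry[name] = item
        return item


class Button(Template):
    kind = "button"
    width = 80
    height = 24


class ToolbarButton(Button):
    # Repetition is factored out by subclassing; only the difference is stated.
    height = 32


registry = {}
ToolbarButton.install(registry, "new_item", label="New")
ToolbarButton.install(registry, "delete_item", label="Delete", width=100)
```

Shared settings live in one place, and each item states only what differs, which is the kind of factoring a static XML file makes awkward.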
Many people misunderstood and/or misrepresented my previous position on XML; the case of Chandler should help to clarify it. Chandler still uses XML for WebDAV, for .xrc files, for sharing, and numerous other use cases where it makes at least some sense to do so. The parcel.xml format, however, was pure excise: a verbose additional language to do things that are more cleanly (and efficiently) done in Python code. It was developed to serve a vision of Chandler as a “data-driven” system, and it was supposed to ultimately support things like GUI editors.
Of course, the real sin here was not so much XML per se, as overengineering in advance of requirements. If you’re not developing the feature now, it’s best not to make a bunch of other design decisions based on what you think the feature will need. A little thing like choosing to put data in XML form can result in a wide variety of additional costs like:
- Designing the XML format
- Implementing a parser
- Documenting the format
- Developing a bunch of stuff in the format
- Evolving and fixing the parser to handle more and more complex use cases that weren’t thought of previously
- Productivity losses versus what it would’ve been with Python
- Converting all the data once you decide it was a bad idea, or else paying the ongoing marketing and education costs to get third-party developers over the hump, or the cost of not getting those developers on board
The cost of adding things you don’t need is really, really high. Luckily, OSAF believes that it’s more important to get things right, than it is to keep throwing money down a rathole to justify the money already spent. I’ve certainly worked for organizations where the reverse is true, though, including one that threw away tens of millions of dollars trying to replace a small, well-designed Python application with an expensive piece of “enterprise” crapware. Ah, the things I could’ve done with that budget! Well, probably I just would’ve given everybody raises and maybe hired a few more people. Or maybe spun off my group as a company that would sell the software to other companies. Heck, we could’ve used it to buy free sodas for life for everybody working in the company and got more value for the investors than what was actually done with the money!
But I digress. The point is this: delaying feature investments good, sunk cost fallacy bad. Any questions?
Philip, I’d love to hear your thoughts on the microformats approach to the serialisation issue, and in particular on XOXO.
If by “microformats” you mean kludging crappy markup into HTML, I’m not particularly convinced; I’m waiting to see how that works out in practice. If by microformats you mean stuff more like RSS and Atom, that get generated by applications and uploaded *alongside* HTML, then I’m much more convinced of the utility. For example, I think that CalDAV isn’t a particularly good idea; a better idea would be for people to just upload their calendars to the web, and put the smarts in the client instead of the server. It works great for Blogger, why not for calendars?
I also totally don’t see the point of XOXO, but then I didn’t bother reading that far. Maybe they should’ve included a “motivation” section up front explaining why it’s useful or even sane.
Using Python to define schema is indeed the right way to go.
I am in the process of developing the config layer for an application and initially defined the schema as XML.
Upon reading your posts on the topic I realized the folly of that method and defined it in Python instead.
However, and forgive me if I’ve misunderstood the nature of ‘internal domain-specific languages’, consider the following:
1) Schema for the config file is defined in Python.
2) Reading the config files is relatively easy (I simply import the relevant modules and use the defined classes).
3) My problem is now how to develop a UI utility that can write to this DSL using our Pythonic schema.
What solution do you propose for this?
I really like this idea, and look forward to seeing your strategy for managing these “data classes”, and any patterns that you discover in the process.
I suppose it would be easy enough to render these objects into XML in order to interoperate with the web or statically typed systems if you had to. XML has its purposes but it is best to draw the XML boundary as far as possible from the workings of your system, which in Python, is pretty far indeed!
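A minimal sketch of rendering at the boundary, assuming flat objects with simple attribute values. The `Event` class and `to_xml` helper are hypothetical; only the standard library's `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET


class Event:
    """Hypothetical plain data class used inside the application."""

    def __init__(self, title, start, duration_minutes):
        self.title = title
        self.start = start
        self.duration_minutes = duration_minutes


def to_xml(obj, tag=None):
    """Render an object's instance attributes as child elements.

    Handles flat attributes only; nested objects would need recursion.
    """
    elem = ET.Element(tag or type(obj).__name__.lower())
    for name, value in vars(obj).items():
        child = ET.SubElement(elem, name)
        child.text = str(value)
    return ET.tostring(elem, encoding="unicode")


lunch = Event("Lunch", "2005-04-01T12:00", 60)
xml_text = to_xml(lunch)
print(xml_text)
```

The application works with `Event` objects everywhere; XML appears only in the one function that talks to the outside world.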