Saturday, August 13, 2005

Chandler begins recovery from XML

Well, it's finally official. Chandler's parcel.xml format is now deprecated and will soon be gone altogether, replaced entirely by simple Python APIs. Some of you may be thinking back to my Python Is Not Java rant, in which I said that using XML for core application functionality like this was, well, unwise. :) At the PyCon Chandler sprint, it was discovered that Chandler's homegrown XML schema definition language was a terrible hardship on developers, and so I proposed to replace it with a descriptor-based Python API. That migration was completed recently. With that done, the only thing still being done in XML was initialization of data items (such as Chandler's UI components). So, a few weeks ago, I implemented an experimental API for initializing data items, which quickly became quite popular, with some even pointing out the advantages of being able to factor out repetition.
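
To give a rough idea of what "descriptor-based" means here, below is a minimal sketch of a schema declared with Python descriptors. This is not Chandler's actual API: the names Attribute, Item, and Note are hypothetical, and the code uses present-day Python syntax purely for illustration.

```python
class Attribute:
    """Hypothetical descriptor declaring one typed attribute of an item."""

    def __init__(self, type_, default=None):
        self.type_ = type_
        self.default = default

    def __set_name__(self, owner, name):
        # Called by Python when the owning class is created.
        self.name = name

    def __get__(self, obj, owner=None):
        if obj is None:
            return self  # accessed on the class: return the descriptor itself
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        # The "schema" is enforced at assignment time.
        if not isinstance(value, self.type_):
            raise TypeError(f"{self.name} must be {self.type_.__name__}")
        obj.__dict__[self.name] = value


class Item:
    """Hypothetical base class: keyword arguments become attribute values."""

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)


class Note(Item):
    # The schema is just class attributes, with no separate XML file to parse.
    title = Attribute(str, default="")
    priority = Attribute(int, default=0)
```

With this, Note(title="Groceries") gives an item whose priority falls back to the declared default, and assigning a string to priority raises a TypeError.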

For a while, there was also a proposal to create a new XML format just for UI definition. But my counterproposal, to use a simple template class and a classmethod instead, was met with great rejoicing.
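
For the curious, the "template class plus classmethod" idea can be sketched roughly as follows. Again, this is an illustration under my own assumptions, not the code that went into Chandler: Template, template(), Button, and make_buttons are all hypothetical names.

```python
class Template:
    """Hypothetical base: class attributes are defaults, template() builds items."""

    @classmethod
    def template(cls, name, **overrides):
        item = cls()
        item.name = name
        for key, value in overrides.items():
            # Only allow overriding attributes the class actually declares.
            if not hasattr(cls, key):
                raise AttributeError(f"{cls.__name__} has no attribute {key!r}")
            setattr(item, key, value)
        return item


class Button(Template):
    # Defaults that every button shares unless overridden.
    label = ""
    enabled = True


def make_buttons(labels):
    # Because this is ordinary Python, repetition can be factored out
    # with a loop instead of copy-and-pasted markup.
    return [Button.template(label.lower(), label=label) for label in labels]


buttons = make_buttons(["Save", "Cancel", "Help"])
```

The point of the sketch is the last few lines: three UI items defined in one expression, which is the kind of factoring an XML format would need its own macro or include mechanism to match.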

Many people misunderstood and/or misrepresented my previous position on XML; the case of Chandler should help to clarify it. Chandler still uses XML for WebDAV, for .xrc files, for sharing, and numerous other use cases where it makes at least some sense to do so. The parcel.xml format, however, was pure excise: a verbose additional language to do things that are more cleanly (and efficiently) done in Python code. It was developed to serve a vision of Chandler as a "data-driven" system, and it was supposed to ultimately support things like GUI editors.

Of course, the real sin here was not so much XML per se, as overengineering in advance of requirements. If you're not developing the feature now, it's best not to make a bunch of other design decisions based on what you think the feature will need. A little thing like choosing to put data in XML form can result in a wide variety of additional costs like:
  • Designing the XML format
  • Implementing a parser
  • Documenting the format
  • Developing a bunch of stuff in the format
  • Evolving and fixing the parser to handle more and more complex use cases that weren't thought of previously
  • Productivity losses versus what it would've been with Python
  • Converting all the data once you decide it was a bad idea, or else paying the ongoing marketing and education costs to get third-party developers over the hump, or the cost of not getting those developers on board
The cost of adding things you don't need is really, really high. Luckily, OSAF believes that it's more important to get things right than it is to keep throwing money down a rathole to justify the money already spent. I've certainly worked for organizations where the reverse is true, though, including one that threw away tens of millions of dollars trying to replace a small, well-designed Python application with an expensive piece of "enterprise" crapware. Ah, the things I could've done with that budget! Well, probably I just would've given everybody raises and maybe hired a few more people. Or maybe spun off my group as a company that would sell the software to other companies. Heck, we could've used it to buy free sodas for life for everybody working in the company and gotten more value for the investors than what was actually done with the money!

But I digress. The point is this: delaying feature investments good, sunk cost fallacy bad. Any questions?