It all started a couple days ago, when Ian Bicking posted about his attempt at using generic functions for a simple JSON-ification task.
Then, Rene Dudfield posted comments to the effect that generic functions were a poor fit for the task, and slower to boot. He included a benchmark that was supposed to show that generic functions were 30 times slower than a hand-optimized version of the same operation, although the numbers he posted actually showed only a 23.4 times slowdown.
Well, I didn’t think the benchmark was a very good one, but what the heck. I tried it out for myself, made a couple of minor tweaks, and spent 30 minutes or so writing a C version of one part of RuleDispatch that I’d been meaning to get around to anyway, and got the benchmark down to only a 1.37 times slowdown – a mere 37% slower than the hand-tuned version.
But since it’s still not fair to compare a function you’re supposed to extend by adding new methods, with a hand-tuned version that has all the methods it will ever have, I devised a slightly fairer benchmark. Since Rene proposed that monkeypatching – that is, replacing the original function with a new version – was a better way to implement extensibility, I added a couple of types to his version with monkeypatching, and added a couple of types to the generic function version as well.
And then the worm turned: the generic function version was now 35% faster than the monkeypatched version of the hand-tuned function. I was a bit surprised by that; I thought it would've taken more layers of monkeypatching first. But no: just one extra layer in the typical case made it the same speed as the generic function, and two layers made it slower. (Presumably, additional layers would continue to degrade performance at a linear rate.)
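To make "layers of monkeypatching" concrete, here is a minimal sketch of the pattern being measured (this is not Rene's actual benchmark code; the function and type names are invented for illustration). Each new type wraps and replaces the previous function, so the common cases pay one more isinstance() check and one more call per layer before reaching the code that actually handles them:

def jsonify(obj):
    # Stand-in for the original hand-tuned function.
    if isinstance(obj, (int, float, str, list, dict)):
        return obj
    raise TypeError("can't jsonify %r" % (obj,))

class Money(object):
    def __init__(self, amount):
        self.amount = amount

# Layer 1: add support for Money by wrapping and replacing jsonify.
_prev1 = jsonify
def jsonify(obj):
    if isinstance(obj, Money):
        return {"amount": _prev1(obj.amount)}
    return _prev1(obj)

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

# Layer 2: add support for Point; plain ints and strings now pass
# through two wrappers before reaching the original code.
_prev2 = jsonify
def jsonify(obj):
    if isinstance(obj, Point):
        return {"x": _prev2(obj.x), "y": _prev2(obj.y)}
    return _prev2(obj)

print(jsonify(42))              # pays two extra checks and calls
print(jsonify(Point(1, 2)))     # handled by the newest layer

A generic function avoids the wrapping: each type's method is reached through the dispatcher's index rather than through a chain of wrappers, which is why the extra layers hurt the monkeypatched version but not the RuleDispatch one.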
Now, before anybody gets the wrong idea, I don't promote monkeypatching in the general case, and in the specific case of a framework function like Ian's jsonify(), it would be crazy to recommend that people monkeypatch it. Monkeypatching as a recommended extension technique is little short of lunacy: it's trivial to accidentally break it or change its semantics through a change in import order, you can't import the function normally (e.g. from jsonify import jsonify), and as Rene's own benchmark shows, it doesn't even come close to being scalable from a performance perspective.
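To see why the from-import problem bites, here is a tiny, hypothetical illustration (the "module" is simulated in-process with types.ModuleType; none of these names are from Ian's actual package). A client that does from jsonify import jsonify binds the original function object directly, so a monkeypatch applied to the module afterwards never reaches that client:

import types

# Simulate a 'jsonify' module (hypothetical stand-in).
jsonify_module = types.ModuleType("jsonify")

def _original_jsonify(obj):
    return {"value": obj}
jsonify_module.jsonify = _original_jsonify

# A client that ran `from jsonify import jsonify` now holds a direct
# reference to the original function object:
client_jsonify = jsonify_module.jsonify

# Some other code later monkeypatches the module attribute...
def _patched_jsonify(obj):
    return {"patched": True, "value": obj}
jsonify_module.jsonify = _patched_jsonify

# ...but the client's early binding still calls the unpatched original.
print(client_jsonify(1))          # {'value': 1}
print(jsonify_module.jsonify(1))  # {'patched': True, 'value': 1}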
But thanks to that benchmark, RuleDispatch users everywhere can now benefit from my speeding up of isinstance() checks, without making any changes to their code. Perhaps other people can now design and post other bogus benchmarks, so that I can then go ahead and spend a few minutes making those cases faster too. Ah, the wonders of blogging and open source. 😉
Seriously, though, I do want to thank Rene for his comments, despite the fact that I think he's still quite thoroughly missing the point, which is that generic functions are for people creating extensible libraries and application platforms, not writing one-off scripts or applications. Nonetheless, if he hadn't taken the time to write his comments, I still wouldn't have gotten around to writing that bit of C code, and RuleDispatch wouldn't now be so much faster for isinstance() dispatching. And I wouldn't be feeling quite as smug right now about how little monkeypatching overhead is required before RuleDispatch kicks some serious ass!
Interesting thread. Your link to Ian's post led me to his SQLObject examples [1 2] of using generic dispatch, and then, via Anthony's and Rene's comments, to this description of Lisp generic functions and David Mertz's PEAK article respectively.
By which time I finally “got” (I think) the basics of Generic Functions and RuleDispatch.
It's now tempting to write all my Python programs with classes that are more-or-less structs, with all the method resolution handled by generic functions defined separately from the classes, e.g.:
import dispatch

class MyClass:
    # attributes only, no methods
    pass

@dispatch.generic()
def doX(obj, params):  # ...plus any other parameters
    """Generic doX(); implementations are added with doX.when() below."""

@doX.when("isinstance(obj, MyClass)")
def x_for_MyClass(obj, params):
    # ...implementation for MyClass instances
    pass
While a bit verbose for the simple case, the benefits of generic functions seem well worth it.
But you get nothing for nothing; there must be a catch.
What are the tradeoffs, the limits?
This is fascinating stuff. I feel like you've provided a laser torch; so how do I avoid cutting off my arm?
The title “RuleDispatch Mojo Kicks Monkeypatching’s Ass” gets my +1 for Python quote of the week. 🙂
RuleDispatch is excellent. Nice work, Philip.
Is there some incompatibility between the latest setuptools, RuleDispatch, and Python 2.4.1? On my Mac, when building RuleDispatch, I keep receiving an error from build_ext.py (part of the bundled distutils):
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/distutils/command/build_ext.py", line 442, in build_extension
    sources = self.swig_sources(sources, ext)
TypeError: swig_sources() takes exactly 2 arguments (3 given)
The problem you’re describing is actually an issue with Pyrex on Python 2.4. See this post for more details.
I'm planning to add a monkeypatch to setuptools that will work around this problem, so if all else fails, wait for the next setuptools release. 🙁