Wednesday, January 05, 2005

Rich Comparison is Tricky

People sometimes complain about new Python features as being too tricky; sometimes I think they should look at what's already in Python.

As of version 2.3, the sort algorithm changed to use rich comparisons, specifically the __lt__ form. This unexpectedly broke part of my generic function dispatch system in an obscure corner case involving range tables and classic class instances, when I recently upgraded to using Python 2.3 for new development and testing.

Specifically, the dispatch tables used special "Min" and "Max" objects that were intended to compare less or greater than all other objects, respectively. However, they were implemented using the __cmp__ special method, which only works correctly with the cmp() builtin, or when paired with new-style class instances, when Python is using a rich comparison. (Try saying that three times fast!)

I believe earlier versions of the Python builtin sort used the equivalent of cmp(), thus leading to the current breakage when I moved to 2.3. (Although it's possible the problem existed in 2.2, and I just never exercised exactly this particular corner case before. Whatever the situation, it's getting added to my unit tests posthaste.)

Anyway, the point is that getting rich comparisons right can be very, very tricky. The "Min" and "Max" objects were based on code cribbed from PEP 326, which in turn was from code contributed on the Python-Dev mailing list. It looks like to be completely correct, you have to define __lt__, __le__, __gt__, __ge__, __eq__, and __ne__, in order to guarantee that rich comparisons with classic instances turn out correct, at least as of Python 2.3.4. This is a bit of a pain, but I've figured out I can have all these methods delegate back to __cmp__, which will be slow, but correct.

For a while there, the problem was really spooking me; it showed up as "impossible" behavior in a generic function, because the function's internal indexes were shot to hell by the incorrect comparisons.

Decorators are actually pretty simple, when "compared" to this. :)