Wednesday, July 26, 2006

Symbols in Python

Jeff Shell follows up on my DSL post:

There is something about that that just looks kindof... nice. :all. In my editor, Symbols are colorized differently than strings, which helps them stand out even more. find :all. Not findAll() or find(all=True) or find('all') like one might have in Python (all of which are OK solutions, but man.. those Symbols).

Evidently, he's missed the availability of my SymbolType package via the Cheese Shop.  ;-)

Anyway, Python can do symbols easily enough without any language changes.  That's why I highlighted function application and block syntax as the important features of Ruby where DSLs are concerned.  And of the two, block syntax is by far the more critical.
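To make the "no language changes" point concrete, here is a minimal sketch of an interned symbol type in plain Python. It is not the SymbolType package's API; the `Symbol` class, the `ALL` and `FIRST` constants, and the `find` function are all hypothetical names made up for illustration.

```python
class Symbol:
    """A hypothetical interned symbol: one unique object per name."""

    _registry = {}

    def __new__(cls, name):
        # Reuse the existing instance so identity checks (`is`) work.
        try:
            return cls._registry[name]
        except KeyError:
            self = super().__new__(cls)
            self.name = name
            cls._registry[name] = self
            return self

    def __repr__(self):
        return self.name


ALL = Symbol("all")
FIRST = Symbol("first")


def find(which, items):
    """Hypothetical finder that dispatches on a symbol, not a string."""
    if which is ALL:
        return list(items)
    if which is FIRST:
        return list(items)[:1]
    raise ValueError(f"unknown selector: {which!r}")
```

With this in place, `find(ALL, records)` reads much like `find :all`, and a misspelled symbol name fails with a `NameError` instead of passing silently the way a misspelled string would.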

Interestingly enough, Python actually came very close to getting a DSL-usable block capability in PEP 340 -- authored by Guido himself!  At the last minute, however, it was rejected due to excess flexibility.  More specifically, Guido was convinced by an article about the problems with control-flow macros in C that having a statement whose execution semantics were runtime-defined was a bad idea.  (The original PEP 340 "block" statement would have allowed the block body to execute zero or more times, rebinding the variables in the "as" clause.)

In truth, for many DSLs, PEP 340 still wouldn't have been good enough, even if it had been accepted.  What you really want for a block is something that's basically a function definition that can share variables with its enclosing scope, and possibly rebind them as well.
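The closest approximation in present-day Python 3 is probably a nested function passed to a decorator, with `nonlocal` (added by PEP 3104) used to rebind names in the enclosing scope. The following is only a sketch of that idea; the `each` helper and the `total_of` function are hypothetical names invented for this example.

```python
def each(items):
    """Hypothetical helper: call the decorated function once per item."""
    def run(block):
        for item in items:
            block(item)
        return block
    return run


def total_of(numbers):
    total = 0

    @each(numbers)
    def _(n):
        nonlocal total  # rebind a variable in the enclosing scope
        total += n

    return total
```

Here the "block" is an ordinary `def`, so it still needs a throwaway name and an explicit `nonlocal` declaration, which is exactly the kind of ceremony a real block syntax would remove.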

All in all, the best solution for Python DSLs is probably a macro or language extension toolkit that translates syntax-sugared code to pure Python while preserving debug info (e.g. keeping line number tables correct, so you can step through your sugared code).  I've taken a few steps in that direction with my (unreleased) SCALE library and BytecodeAssembler, but it's going to be a while before I get around to doing much more with them.
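As a rough illustration of a debug-info-preserving translation, the sketch below uses the standard library's `ast` module rather than SCALE or BytecodeAssembler. It rewrites a made-up piece of sugar, `_.name`, into the plain call `Symbol('name')`, and copies the original source position onto the generated nodes. The sugar itself, `SymbolSugar`, `compile_sugared`, and the stand-in `Symbol` function are all assumptions for the example.

```python
import ast


class SymbolSugar(ast.NodeTransformer):
    """Rewrite the made-up sugar `_.name` into `Symbol('name')`."""

    def visit_Attribute(self, node):
        self.generic_visit(node)
        if (
            isinstance(node.value, ast.Name)
            and node.value.id == "_"
            and isinstance(node.ctx, ast.Load)
        ):
            call = ast.Call(
                func=ast.Name(id="Symbol", ctx=ast.Load()),
                args=[ast.Constant(value=node.attr)],
                keywords=[],
            )
            # Copy the original position so tracebacks and debuggers
            # still point at the sugared source line.
            return ast.fix_missing_locations(ast.copy_location(call, node))
        return node


def compile_sugared(source, filename="<sugared>"):
    """Parse, translate, and compile; returns (tree, code object)."""
    tree = SymbolSugar().visit(ast.parse(source, filename))
    return tree, compile(tree, filename, "exec")


def Symbol(name):
    # Stand-in for a real symbol type; returns a tagged tuple.
    return ("symbol", name)


source = "x = 1\ny = _.all\n"
tree, code = compile_sugared(source)
namespace = {"Symbol": Symbol}
exec(code, namespace)
```

After running this, `namespace["y"]` holds the symbol for `all`, and the generated call node still reports line 2 of the sugared source, so stepping through the code lands on the line the programmer actually wrote.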

The "big idea" behind SCALE is simply that Python's lexical syntax and indentation rules can be used to implement a variety of domain-specific languages or extended variants of Python.  (Right now, there's only a parsing/unparsing library implemented, though.)  And, if coupled with an appropriate import mechanism, it could allow unlimited extension to Python in the style of Logix, but without compromising on syntax.  (Logix's base language isn't precisely Python in syntax or semantics; I would want any Python extension languages to be truly Python in their roots.)
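To show how far Python's lexical rules alone can go, here is a small sketch that uses the standard `tokenize` module (not SCALE, which is unreleased) to turn indented text that is not valid Python into a nested structure. The `parse_blocks` function and the `rule`/`say` mini-language are hypothetical and exist only for this example.

```python
import io
import tokenize

SKIP = {tokenize.NL, tokenize.COMMENT, tokenize.ENDMARKER}


def parse_blocks(source):
    """Group logical lines into [tokens, children] pairs by indentation."""
    root = []
    stack = [root]  # stack[-1] is the block currently being filled
    line = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NEWLINE:
            if line:
                stack[-1].append([line, []])
                line = []
        elif tok.type == tokenize.INDENT:
            # The indented suite belongs to the most recent statement.
            stack.append(stack[-1][-1][1])
        elif tok.type == tokenize.DEDENT:
            stack.pop()
        elif tok.type not in SKIP:
            line.append(tok.string)
    return root


dsl = (
    "rule greet:\n"
    "    say hello\n"
    "    say bye\n"
    "rule other:\n"
    "    say hi\n"
)
tree = parse_blocks(dsl)
```

The tokenizer handles indentation, line continuation, strings, and comments without knowing anything about the grammar, so a DSL layered on top only has to interpret each statement's token list and its nested children.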

Luckily, Python provides us with enough hooks that language extensibility is possible.  It's just not practical to do it without writing parsing code at the moment.  But maybe that will change some day.