Sunday, June 05, 2005

CPAN Goodies for all

Well, less than two weeks after I speculated on how a dirt-simple CPAN clone might be constructed, it's ready to use for everything but scripts. Installing a Python package can now be as simple as typing (say) "easy_install SQLObject", and boom, you're done. Cleanly uninstalling or upgrading packages is almost as easy.

EasyInstall is at version 0.4a1 right now, so there are still a number of features on my to-do list. Most important are script handling and automatic dependency installation. The docs could use a "Making your Package Work Well with EasyInstall" section, and I'd like to do a little refactoring so that it uses the logging module and has support for pluggable downloaders, so that GUI or Twisted-based applications can control the actual downloading process.

In the process of adding all these new features, I've also been plowing through the Python Eggs to-do list, and especially updating the Eggs developer documentation. I'd still like to write a section on creating application plug-in architectures using Eggs, and expand the API reference some more, but it should actually be pretty usable already.

The trickiest part of all this to figure out has been bootstrapping the installation process. Once EasyInstall is able to handle scripts, it'll be able to upgrade itself, so that's not a big deal, and initial installation can be done the old-fashioned "setup.py install" way. (Actually, if you have Python 2.4 and use "python -m easy_install" to run it, EasyInstall can upgrade itself now.)

But what about packages that depend on using a newer (or older) version of setuptools to do their own installation? That one has been making my head spin. With the package still changing rapidly, it's tough to just tell people, "make sure you have setuptools version X installed". On the other hand, if you just 'import pkg_resources' and 'require("setuptools>=0.4a1")' in your setup script, then it'll exit with an ImportError, DistributionNotFound, or VersionConflict as appropriate. Or you can trap those errors and exit with a message asking the user to install (or upgrade) setuptools. I guess that's probably the sanest, safest way to handle it, and I should probably add some sample code to the doc pages to show how to do that.
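Here's roughly what that error-trapping might look like near the top of a setup script. This is only a sketch: the helper names (check_setuptools, require_or_exit) and the wording of the messages are made up for illustration, and only 'pkg_resources.require' and the three exception types come from the discussion above.

```python
import sys


def check_setuptools(requirement="setuptools>=0.4a1"):
    """Return None if the requirement is satisfied, else a message for the user.

    The function name and message text are illustrative assumptions.
    """
    try:
        import pkg_resources
    except ImportError:
        # No egg runtime at all, so setuptools has never been installed.
        return "This package needs %s; please install setuptools first." % requirement

    try:
        pkg_resources.require(requirement)
    except pkg_resources.DistributionNotFound:
        # The runtime is importable, but no matching distribution is registered.
        return "This package needs %s, which could not be found." % requirement
    except pkg_resources.VersionConflict:
        # Some version is installed, but it does not satisfy the requirement.
        return "This package needs %s; please upgrade setuptools." % requirement
    return None


def require_or_exit(requirement="setuptools>=0.4a1"):
    """Exit with a readable message instead of a traceback (hypothetical helper)."""
    problem = check_setuptools(requirement)
    if problem:
        sys.exit(problem)
```

A setup script would call require_or_exit() before importing anything else from setuptools, so the user sees a one-line request to install or upgrade rather than a stack trace.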

In the meantime, I took a break from writing this to see if I could come up with a way to make setuptools/EasyInstall install itself as an egg, even when you first install it using the distutils. Well, it turns out that the distutils themselves have a way to do this, using the two-argument form of 'extra_path'. The CVS version of setuptools now installs using a 'setuptools.pth' file, and is installed to a directory called 'setuptools-0.4a1.egg'. That creates a sort of "poor man's egg" with no metadata, but with enough information in the directory name for the egg runtime system to treat it as if it were a real egg. The 'setuptools.pth' file then ensures that the package is always available to be imported, but it has lower precedence than 'easy-install.pth', so if you use EasyInstall to upgrade setuptools, the newer version will get used, and the 'setuptools.pth' will be a relatively harmless leftover.
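To make the 'extra_path' trick a bit more concrete, here's a minimal sketch of how the two-part value could be built from a version number. The helper name (egg_extra_path) is hypothetical and this is not necessarily how setuptools' own setup script spells it. The first element names the .pth file, and the second is the directory the package gets installed into.

```python
def egg_extra_path(name, version):
    """Build a two-part extra_path value: (.pth file base name, install directory).

    Hypothetical helper for illustration; the directory name follows the
    'setuptools-0.4a1.egg' pattern described above.
    """
    return (name, "%s-%s.egg" % (name, version))


# Assumed usage inside a distutils-style setup script (not executed here):
#
#     setup(
#         name="setuptools",
#         version="0.4a1",
#         extra_path=egg_extra_path("setuptools", "0.4a1"),
#         ...
#     )
#
# which would write 'setuptools.pth' pointing at 'setuptools-0.4a1.egg'.
```

Because the directory name is derived from the version, bumping the version automatically yields a new, separate install directory.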

Whenever setuptools' version changes, so too will the extra_path setting, so each installed version will actually be separately available. The only catch is that you can't actually change versions at runtime, because of the usual problems with reloading modules in non-trivial Python applications. Switching between versions of the egg runtime will require an interpreter restart, after changing the easy-install.pth file to point to the new version of setuptools. But, applications using EasyInstall internally to "update plugins" would probably do such a restart before activating any other library upgrades, so that's probably not a big deal; setuptools will really be no different than the application's other libraries in that respect.

Whew. Sometimes it really amazes me what you can do with just what's in the Python standard library. Setuptools and EasyInstall do zip and tarfile processing, HTTP downloads, HTML screen scraping, Subversion checkouts, and a whole host of other things, and I've only spent two weekends working on EasyInstall so far.

Now, somebody tell me why I didn't write this thing sooner? :)