Friday, July 01, 2005

It's log, log, log

What rolls down stairs
Alone or in pairs;
Rolls over your neighbor's dog?
What's great for a snack
And fits on your back?
It's log, log, log!
-- Ren & Stimpy
It's good to see that other people are experimenting with alternative logging systems for Python. I've never been entirely happy with PEP 282 myself. The interface is okay, but not great; it drops some of the syslog levels and ends up with a set of levels that are either too verbose or not verbose enough. For all practical purposes you can't run with less than WARN level on output, and most of what you log will be either INFO or DEBUG, which basically translates into verbose or insanely verbose.

PEAK adds back in a NOTICE level between INFO and WARN, which allows you to use that as your default filtering level for typical logs and thereby avoid the incessant yammering of overenthusiastic components. But that's not really the big problem with PEP 282, which does after all allow you to add extra levels if you want (just without the benefit of a shortcut method unless you monkeypatch the Logger class).

The big problem with PEP 282 logging is the configuration. Or rather, the lack thereof. If you don't configure it, you get annoying spew from your loggers about how you didn't configure anything for that particular logger, going right to sys.stderr. This then encourages people to manipulate log handlers in library code in order to avoid this, which is exactly the wrong thing to do. Library code should never ever ever ever mess with logging configuration, that's not its job and it will break applications. On the other hand, it's stupid to make an application overly responsible for log configuration in the simple case; most logging should be directable via environment variables or configuration files in standard locations. If I'm writing a simple script, it ought to be easy to get it to do the right thing.

Unfortunately, what library authors want to do with their log output may well be in conflict with what the application author wants. An issue that the library considers critical may well be of trivial consequence to the application. To push the burden of configuring appropriate log levels and filters from 50 library authors onto the users is just plain silly.

In essence, the simple level-based logging concept works only in situations where there is an end-to-end notion of levels , and this ignores the essential concept of modular information hiding. Let's say I have a module (like distutils.dir_util) that manages directories, and logs various things about what I'm doing to them. Depending on the application -- or even different operations in the same application! -- I may want different verbosity to apply.

I ran across this issue recently working on EasyInstall, because the code in different distutils modules has widely varying ideas of what constitutes INFO versus DEBUG level, and for certain operations I wanted it to treat them differently. Similarly, EasyInstall runs packages' setup.py scripts with a lower verbosity level than it employs itself; this is because for the EasyInstall user, the point is mostly about what EasyInstall is doing. The setup script runs in a sandbox and is prevented from actually installing anything directly, so the user probably doesn't really care about the details.

No logging system of which I'm currently aware -- except for an early prototype of PEAK's logging facility, and an earlier logging framework that Ty developed -- actually offer this kind of modularity, where log messages bubble up through a dynamic, component context or caller/callee context, rather than a fixed logger hiearchy or a keyword system. I suspect there are ways to abuse the PEP 282 logging system and the new "py" logging facilities so that they sort of do the same thing, but there's an even more fundamental flaw going on here.

The fundamental flaw of logging systems is that they all basically assume they're creating text. Oh sure, most logging systems, including PEP 282 and the ones Ty and I developed, have some notion of a log record, with information like level, timestamp, and all that sort of thing. But the payload of these logs is basically text, rather than a programmatically-useful description of an event. You're forced to use crude tools like regular expression matching or simple level cutoffs in order to do any kind of content filtering, and god forbid you want to do something like analyze how often event B occurs within X time of event A.

When I first started writing this little rant a couple of days ago, I didn't really know what to do about that. I just thought I'd be writing a wistful little rant about the futility of using logging systems and that would be the end of it. But a funny thing happened midway through the text; I realized that there's not only a good solution to this problem, but in fact it already pretty much exists.

Here's the thing that I realized. Anything you want to "log" is really a record of some event that occurred. An "event" is really an operation, like a function or method call. In the broad sense, what you want is to be able to define flexible, contextual mappings between events and log output, timing analysis, profiling, or other operations upon the events in some form. Ideally, you would also want to define textual messages for recording the operations in a log file or showing to a human viewer, but these definitions' only purpose would be translating events to output, if your filtering rules chose to output them.

The obvious way to link these message translations is with decorators, which coincidentally will encourage you to structure operations into coherent chunks. Sometimes, you'll want to log something that doesn't really have a method of its own, in which case you might create an empty method that doesn't do anything; it's just a stub you call for purposes of offering a progress report to interested callers.

For this to work, we'd need some sort of flexible interception capability that can efficiently wrap an object's methods on an instance-by-instance basis, so that a higher-level component using that object can translate its events into events that make sense for the higher-level component. It can then aggregate those events, log them, ignore them, whatever. Of course, that component is itself really just propagating events in the same way, so only the highest-level component in a given assembly really controls the ultimate output.

Of course, I already have a nice efficient rule processing system in generic functions, and they already have nice flexible wrapping. To make them work with existing objects, I'd need only add some type of interception/injection facility to turn existing functions into generic functions. The rest, while not easy, is at least doable. Mostly it amounts to developing simple APIs that provide shortcuts for commonly-used features, although ideally I'd also like to make the system able to "wire" or "stack" rules together so that any "bubbling up" occurs in a single generic function invocation, rather than resulting in any kind of recursive delegation to higher component levels.

But let's consider a simple case. I have a method, let's say it's copyFile(). I decorate it with a message like "Copying %(source)s to %(destination)s" and a default level like INFO. This wraps the method with a generic function interceptor that will call another generic function, something like logMessage(sender,level,message,args), which then can be configured with rules as to where the message goes. However, this is basically the default message and level for this method. I can define the rules so that it never actually gets logged at all, e.g. by having a rule that if "self is someObject", do nothing. Or I can make it call some event function on my own class. Or do something completely different.

All this rule definition would get exceedingly tedious if you had to hardcode it in terms of specific objects and manually create rules. However, the rules themselves can be expressed as concepts like "the foo attribute of objects of class Bar should have these methods intercepted with such and such rules". This can then be used to create a virtual component hierarchy among objects, even if they aren't "placeful" components like those in PEAK or Zope 3. These "higher order" rules can potentially also do things like, "while this method is executing, intercept this non-method function in such and such a way" -- a useful technique for wrapping utility functions for progress reporting, timing, or logging, or to wrap "legacy" use of the PEP 282 logging framework.

If you think this sounds a lot like AOP, you're right. I pretty much decided about a year ago that AOP = Generic Functions + Transparent Interception, but I've been treating the interception part as a YAGNI (You ain't gonna need it), and therefore haven't implemented it before. Now, however, logging and the like seem like an excellent use case. I'll probably draw some on the accumulated wisdom of the Object Teams approach to AOP in order to develop this, as their conceptual model is heavily based on this kind of "wiring" of components.

I don't expect, however, that this is something I'll be able to actually implement soon. For one thing, my current generic function implementation is unlikely to perform well if it has to add and remove a lot of methods as objects come and go; it's optimized for functions that don't change much. For another thing, I have yet to actually make a release of the generic function package, and I plan to split it off into a separate RuleDispatch project, separate from PyProtocols, although still depending on it. That project in turn is waiting on completion of some features in EasyInstall and setuptools, to make it possible for packages to install their dependencies even when run using "setup.py install".