Thursday, February 04, 2010

Don't use CGIHandler on Google App Engine

It seems that an early documentation example recommended that people use wsgiref.handlers.CGIHandler to run WSGI apps on Google App Engine, instead of the correctly-functioning google.appengine.ext.webapp.util.run_wsgi_app() function.

If you are doing this in your application or your web framework, you have a potentially-exploitable security hole and you should fix it at once.

The specific problem is that one of CGIHandler's base classes caches a copy of os.environ, for non-CGI use cases, and this makes it possible for certain CGI variables to "leak" from the request that started the process, into every subsequent request.

Of course, CGIHandler was never intended to be capable of handling long-running processes like GAE, because CGI is not a long-running process.  The idea is that if you have a new kind of long-running process, you subclass BaseCGIHandler for your specific use case.

See, in a "traditional" long-running web app protocol (like FastCGI), process startup is distinct from request handling.  Even if a FastCGI app is started because there's a request ready for processing, there is still a separation between application initialization and the actual request processing.  (And wsgiref tries to cache the "startup" os.environ, separate from the "request" os.environ.)

App Engine, however, jams these two phases together, such that the "main" script is being re-run for each request, so there's no distinction between "startup" and "request".  This makes things convenient for people used to a CGI environment, but brings up problems for the CGIHandler, which expects that it will only be used once per process invocation, and so inherits a cached version of os.environ that also contains request content.
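To make the leak concrete, here's a minimal sketch in plain Python.  It doesn't touch wsgiref at all: the function and variable names are made up for illustration, and the dicts merely stand in for os.environ at different moments in the life of the process.

```python
def build_request_environ(cached_startup_environ, current_environ):
    """Start from the cached snapshot, then layer the current request on top."""
    env = dict(cached_startup_environ)
    env.update(current_environ)
    return env


# Hypothetical CGI variables for the request that happened to start the process.
first_request = {"PATH_INFO": "/account", "HTTP_COOKIE": "session=alice"}

# The snapshot is taken once, at import time, so it captures request data too.
cached_snapshot = dict(first_request)

# A later request handled by the same process, sent without any cookie.
second_request = {"PATH_INFO": "/public"}

leaky_environ = build_request_environ(cached_snapshot, second_request)
clean_environ = build_request_environ({}, second_request)

print(leaky_environ.get("HTTP_COOKIE"))  # the first visitor's cookie survives
print(clean_environ.get("HTTP_COOKIE"))  # nothing carried over
```

Note that variables present in both requests (like PATH_INFO here) get overwritten correctly.  Only the ones the later request doesn't set are the ones that hang around.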

The fix is straightforward: switch from using wsgiref.handlers.CGIHandler to google.appengine.ext.webapp.util.run_wsgi_app().

However, if for some reason you can't do that, a quick monkeypatch fix is to add this line:

CGIHandler.os_environ = {}

somewhere in your code before the first use of CGIHandler.
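For context, here's roughly where that line would sit in a CGI-style main script.  This is only a sketch: the `application` function is a placeholder for your real WSGI app, and `main()` is a hypothetical entry point that your script would call once per request.  On an interpreter whose CGIHandler already defines an empty os_environ, the assignment is simply a no-op.

```python
from wsgiref.handlers import CGIHandler

# Replace the cached environment snapshot before any handler is created.
CGIHandler.os_environ = {}


def application(environ, start_response):
    """Hypothetical stand-in for your real WSGI app."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, world!\n"]


def main():
    # A fresh handler per request, as in a CGI-style main script.
    CGIHandler().run(application)
```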

It is possible that Google has already implemented the patch I provided them to fix this, but if so, the bug opened for App Engine is still open, more than two months later, and some of the documentation is still recommending CGIHandler.  Don't know whether that means it's fixed and the docs are okay, or that it's unfixed and they're still recommending people use it.

Either way, though, recommending CGIHandler for use in the GAE environment was never a good idea, since GAE is not really CGI.  If it ain't CGI, don't use CGIHandler.  Subclass BaseCGIHandler instead, and make a GAEHandler or AWSHandler or whatever, and take advantage of the branding opportunity provided thereby.  ;-)
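If you want to go the subclassing route, something along these lines might be a starting point.  It's a sketch written against Python 3's wsgiref, not anything Google ships: the GAEHandler name, the `show_cookie` app, and the `handle()` helper are all invented for the example, and in-memory streams stand in for the real stdin/stdout so the thing can be run anywhere.

```python
import io
from wsgiref.handlers import BaseCGIHandler


class GAEHandler(BaseCGIHandler):
    """Illustrative handler for a main script that is re-run per request."""

    # Nothing is inherited from whatever environment the process started with.
    os_environ = {}

    def __init__(self, stdin, stdout, stderr, environ):
        BaseCGIHandler.__init__(
            self, stdin, stdout, stderr, dict(environ),
            multithread=False, multiprocess=True,
        )


def show_cookie(environ, start_response):
    """Toy WSGI app that echoes back the cookie it was given, if any."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ.get("HTTP_COOKIE", "no cookie").encode("utf-8")]


def handle(request_environ):
    """Run one request through a fresh handler and return the raw output."""
    out = io.BytesIO()
    GAEHandler(io.BytesIO(), out, io.StringIO(), request_environ).run(show_cookie)
    return out.getvalue()


first = handle({"REQUEST_METHOD": "GET", "HTTP_COOKIE": "session=alice"})
second = handle({"REQUEST_METHOD": "GET"})
```

The second response should see no cookie at all, because each handler builds its environment only from the request it was handed.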

Tuesday, October 13, 2009

A Clarification Or Two

It seems that one aspect of the response to my announcement yesterday was one I didn't anticipate: people assuming that my statement about uninstalling Distribute was some sort of snark or an attempt at competition, power plays, etc.

But since the announcement was made to the Distutils-SIG mailing list, I assumed anyone reading it would already know that having both Distribute and setuptools on sys.path would in most cases cause you to still be using Distribute, and not setuptools.

And just as obviously, it would mean you were testing Distribute, rather than setuptools, thereby invalidating the usefulness of any test results.
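If you're not sure which copy of a module your interpreter is actually picking up, a quick check along these lines can tell you.  This is a sketch for a current Python 3 (the `locate` helper is just a name I made up); it reports the file that a top-level import would load, or None if nothing by that name is on sys.path.

```python
import importlib.util


def locate(module_name):
    """Return the file a top-level import of module_name would load, or None."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


# Shows where "import setuptools" would come from in this environment.
print(locate("setuptools"))
```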

As for the comment about bugs being fixed differently, I wanted to make the point that testing Distribute does not equal testing 0.6c10; people should not assume that 0.6c10 has already seen as wide usage as the Distribute code has.  I just wrote it this weekend, for heaven's sake.   (Conversely, if someone is concerned about some of the bugs that were on the setuptools tracker, they deserve to know that not all the patches on the tracker -- and used in Distribute -- were correct or complete, in my estimation.)

Was there a teeny bit of annoyance in there as well?  Might the tone of that paragraph have been a little off?  Perhaps.  I edited it several times, trying to minimize any show-through of annoyance, and keep it 100% neutral/factual, but I can certainly believe that a bit of it came through anyway.  I am annoyed, after all.

I'm annoyed that I had to prepare this release.  I'm annoyed that people rant about me not doing anything, and then the same people turn right back around and rant when I do do something.  I'm annoyed that setuptools has been widely blamed for yet another problem that it didn't actually create.  (Two, actually: the 2.6.3 problem itself, and then the subsequent brouhaha on Python-Dev.)

Sure, I'm annoyed about lots of things.

That doesn't mean I want anybody to uninstall Distribute, for any other reason than that they'd rather use setuptools.

After all, I'm making a new release of setuptools primarily so people have the option of not being forced to use Distribute -- and so that Python-Dev isn't forced to make a new Python release just for the benefit of setuptools users.

So it's certainly not my intention to force anyone to use setuptools.  I'm not even trying to persuade anyone to use it in place of Distribute, for heaven's sake.  (Hell, if you're interested in Python 3 support, Distribute is the only game in town right now. Oh, and I even suggested that Guido put Distribute in the stdlib...  and despite the smiley, I wasn't kidding.)

In short, not being forced to do one thing, is not the same as being forced or persuaded to do the opposite.  Capisce?

Okay, that's all, you can now return to your regularly scheduled blaming and flaming.  Pay no attention to the man in the corner, trying to do something useful.

P.S.  There are people I'd trust with maintainership (or at least committership) of setuptools who are working on Distribute.  It's the public nastiness of certain parties that torpedoed the negotiations on that topic back in July, despite the oft-repeated claims by some that I wouldn't turn over the reins to anyone.  See, for example, this attempt to open discussions on that line, or the last paragraph of this, where I expressed excitement at the idea of having 0.6 get cleaned up by someone else.  (OTOH, if I'd realized it would only take me a weekend rather than a couple of weeks to clean up the backlog, I'd have done it six months ago!  So, that bit of delay is my fault.)

Also, for anyone who thinks that my announcement of this release was completely out of the blue, please see the "new setuptools release" thread on Python-Dev, in particular, the post where I said I was planning to make a new release this week, by this Monday, in order to address the outstanding issue with 2.6.3.

P.P.S. Let's keep it constructive in the comments, shall we?  Comments that show no sign of their author's having read and grokked this entire post (and the items linked to above) will be summarily deleted, no matter how otherwise thoughtful the comment might be.

Frankly, anyone who can read all the links I just gave, and still think I'm deliberately trying to put one over on anybody, spring unannounced surprises, or hijack Distribute in some way, or that I'm unwilling to hand over significant chunks of setuptools responsibility, well...  let's just say that hypothetical person is not being very charitable, and leave it at that.  Here's hoping you're better than that.

Monday, October 12, 2009

That wasn't so difficult, was it?

So, a few weeks ago, Python 2.6.3 was released with a change to the distutils that broke backward build compatibility for a few libraries, like pywin32, setuptools, and every other package using setuptools to build C extensions in a package.

After wading through huge piles of Python-Dev and Distutils-SIG emails in my inbox every day for a week or so, it became clear to me that it'd be a lot more efficient to just make a new release of setuptools than to try to correct all the myths and misapprehensions, or even just the deliberate mischaracterizations, misappropriations, and outright bald-faced lies.  (As the old saying goes, a lie goes halfway 'round the world while the truth is still putting its shoes on!)

Indeed, spending the weekend bringing setuptools up to date was downright fun by comparison, and the results are now in SVN, ready for you to try them out.  Not only did I implement fixes for virtually all the outstanding bugs in the setuptools tracker, but I also tackled a few that aren't in there but personally annoyed me, like Vista's UAC warnings when you try to run easy_install.

Anyway, I'll be cutting an official release (0.6c10) in a few days, so please be ready if your project infrastructure relies on fetching the latest setuptools version from PyPI.  I'm actually pretty excited about this release, because it represents the finish-out of all the 0.6 maintenance cruft that I wanted to do before working on new features in 0.7a1, and starting on the improved package manager that I've been babbling about for years.

On the other hand, I'm less than excited to have people believing some of the rubbish that they do, and it's even less pleasant to imagine that it's spreading.

I do understand that some people are angry about the long wait between setuptools releases, and that's their right.

It's also their right to do something about it, and for the brief time when they were taking the high road -- i.e., 1) being courteous, and 2) not engaging in a public FUD campaign -- I tried to help and participate, and even offered to release the fork as an official version.  Heck, I was even excited about the possibility.

But apparently, my positive attitude was too "suspicious" for some people to accept, and others took my search for a qualified maintainer very personally.  And now, various people are claiming the fork is "official" or "blessed", when it never was...  that it's backward-compatible with setuptools, when it's not...  and that it fixes various bugs...  that in fact it doesn't.

And it seems that the collective anger in some quarters has reached such a fever pitch that anything positive I do is still considered "out of the blue" and suspicious.

It would be nice if I could say I was 100% above the no-good-deed-goes-unpunished philosophy of the programming herd.  Certainly, this sort of rampant negativity -- and the corresponding negativity in me that it tends to trigger -- is a big reason I wanted out of professional software development in the first place.  While there are a lot of cool people in the business, a large number of others are...  shall we say, rather unpleasant?

I had hoped to just move forward and pick things up where I left them last year, when I started taking time off to do some serious work on non-programming projects -- and I had no idea when I started it'd be a whole year.

But it seems that I've underestimated just how upset some formerly-(relatively)-happy users of setuptools are with my (relative) disappearance.  And so it's understandable that showing back up with nothing more than a, "Hey, long time no see, here's an update I'd like you to test" might be...  less than satisfying.

So, for all of you long-suffering setuptools users, I'm sorry.  I should've communicated better about my status and plans, and I would have if I'd had any idea what they were myself.

It didn't help that I fell prey to a bit of an entitlement attitude, after so many distutils-sig flamewars over all sorts of minutiae that tended to make me think of anyone proposing changes to setuptools as being an idiot by default.  Among other things, I forgot that plenty of other people being wrong doesn't necessarily make me right.

But you do need to understand: I'm not working on setuptools to get back into anyone's good graces but my own.

You see, over the last few weeks there've been a lot of emails on the distutils-sig and Python-Dev from people who are upset at having to leave setuptools.

And I want them to be able to have an actual choice about whether they start using another package or not.  Some have said that it's not fair game for the very people promoting the use of another package, to force them to do so by changing Python.

And I agree, which is why I've done what I've done.

Now, I realize that this isn't going to do much for the people who're already mad as hell and don't want to take it any more.  But when it comes right down to it, being mad is your choice, not mine.

In the same way, I could choose to be forever angry about what seem like numerous unfair, unjust, inexcusable slights dealt out to me.  And I could keep lashing out in anger at the people I think are responsible for my pain.

But that won't make them stop, and it won't make anything better.  At some point, you have to decide what it is that you want instead, and move towards that, instead of trying to "solve" an existing "problem".

So, that's what I'm starting now.

See, for the last several months I've actually been walking on eggshells, trying to not say anything that anyone might be offended by, or announce any plans that anyone might accuse of being "suspicious".

Now, though, I realize that my silence and bending over backwards was actually adding to the problem.  It gave some people the idea I agreed with things I didn't agree with.  It didn't communicate anything to the people who needed to know what was going on with me.  And it even discouraged me from actually doing anything, because I wasn't sharing the excitement and ideas that motivate me.

I kept thinking that there'd be a "right" time, that if I just lay low long enough, I could go accomplish some things and come back and say, "ta da!" and make everything better.

Well, you can see how well that worked.

So, new plan starts now.

Stay tuned.

Saturday, September 06, 2008

Python Gets Out...

Python seems to keep turning up in the most unusual places.  Today I went to the library and borrowed a couple of books on graphic design to assist in making some layout decisions for the book I'm working on.  One was a book I'd read before, Editing By Design, which I'd used to help with the design of my earlier book, "You, Version 2.0".  The other was a book called (appropriately enough) "The Layout Book".  I was skimming through it, when I came across a page with this near the bottom (I've elided a few items from the middle):

"Simple is better than complicated.
Quiet is better than loud.
Unobtrusive is better than exciting.
Small is better than large.
...
The obvious is better than that which must be sought.
Few elements are better than many.
A system is better than single elements."

The block of text was a quote, attributed to one Dieter Rams.  "Wow," I thought, "I wonder if Tim Peters's Zen of Python was a play off of this..."

Then I turn the page.

At the very top of a collection of "methodologies", I see:

"Python philosophy

Derived from computer programming, the main points of the Python approach were presented by developer Tim Peters in The Zen of Python.   Key points include: beautiful is better than ugly, explicit is better than implicit..."

Small world, eh?

--PJ

P.S. I'm still amused by the mentions of Python in Charles Stross's science fiction novels, especially the one where the future hero is described as doing his game programming work in Python 3000, almost as if it were some highly-futuristic language.  ;-)

P.P.S. In case you hadn't guessed, the reason I'm not doing more programming (or blogging about programming) right now is because I'm working on the book...  in which, incidentally, I'm attempting to take a truly algorithmic approach -- not to mention a highly test-driven one -- to such diverse matters as motivation, belief, creativity, time management, and even optimism.

Saturday, February 23, 2008

reddit - now with PJE inside

Until I saw this neat traceback from the innards of reddit, the success of WSGI and eggs was a lot more abstract.  But now, it appears they've got some street cred (along with Pylons, Paste, Beaker, and flup, it looks like).

OTOH, since reddit was first built with Lisp and then rebuilt with web.py, (making this version 3), and since now they've gone all mainstream and "sold out" to Conde Nast, maybe that means my work is "corporate" and "enterprisey", now, instead.  ;-)

Sunday, February 17, 2008

The Library Paradox

Every so often, I see a new package listed on the Python Package Index that claims -- in its tag line, no less -- that it offers some feature "without external dependencies".

The authors of such packages must think it's a truly valuable feature, to list it in the tag line ("description" field of their setup script).

But I'm a little confused by why they're uploading it to the Cheeseshop.  After all, anybody who believes that a lack of dependencies is good...  isn't going to be able to use it!  ;-)

And anyone who doesn't care that much about dependencies (perhaps because they use setuptools?) was probably already using whatever package the new package is intended to replace.

Sure, if the package also offers some other features, a better API, or something like that, it gives people more choices, encourages competition and all that other good stuff.  And I certainly don't want to discourage anyone from uploading useful things to PyPI.

I'm just puzzled by the idea of advertising that your package has no external dependencies, when, as far as anybody else is concerned, it is an external dependency.  Seems like that space could be better used to promote whatever other benefits the library brings.

Just one of those "things that make you go hmmm" I suppose.

Thursday, January 24, 2008

Rumors of Chandler's Death Are Greatly Exaggerated

I don't usually like to blog about my client work, but in this case I'll make an exception, since so many other people are blogging about it who don't have any idea what they're talking about.

A lot of these people, it seems, think that the Chandler project is dead, dying, or a "failure" of some kind.  And that's simply not true.

Could the project have delivered more, in less time?  Sure, absolutely.

Does it have anything to do with the tools used?  Absolutely not.

Is the project dead?  Not at all.

Did it achieve its original goals?  No.

Is it likely to achieve its original goals?  I'm inclined to say no.

Does that matter?  Hardly!

Why?  Because in addition to creating a nice desktop application that:

  • does online/offline calendars with overlays, recurrence and timezone support, date parsing, etc.
  • has an extensible sharing framework that allows peer-to-peer or server sync with a variety of protocols, including CalDAV and gData
  • allows plugins to extend the data model and include that data in peer and server sharing
  • has an Eclipse-like plugin model allowing extenders and extendees with sophisticated UI extensibility

and a nice CalDAV server with an AJAX UI, the Chandler project has also funded (through paid developer time) a lot of open source libraries and tools, especially for Python.  Here's a partial list of other projects that are or were funded by Chandler, in whole or in part:

I'm sure I've missed some, even among the things that I personally worked on.

Anyway, the point is that between the client, server, and libraries, the Chandler project has produced quite a lot of working code, all of which would remain useful and beneficial to the community, even if the organization had already been disbanded.

But it hasn't disbanded.  On Tuesday I'll be in San Francisco, meeting with the other members of the team, as we hash out our strategy for moving forward.  It is certainly possible that we'll decide to just wrap up the outstanding work on bug fixes, quality improvement, and so on for the existing packages, and not try to continue past the end of this year.  It's also possible that we'll decide to push forward with new feature work, or focus on integrating our cool calendar/plugin platform with other open source applications.  Really, there are quite a few possibilities there.

What a lot of people outside OSAF don't get about this is that the re-org is a good thing.  Don't get me wrong - it's not so good for the people who got laid off.  But for the project, it was a godsend.

See, out of all the junk that people have been writing for years and years about the project, almost nobody has actually seen what the real problem is, why Chandler didn't get anywhere near the original, highly-ambitious goals.

Sure, people have pointed fingers at lots of things, and the idea that there was plenty of time and resources with no hard deadlines is often brought up as a culprit.  But that's not quite right, either.

Having worked on, in, and around the project for about three years now, I can tell you quite simply what the problem was, and why the re-org fixes that problem.

There was no objective basis for decision-making.

It's that simple, really.  Without an objective basis, there was no way to argue from anything except opinion, with nobody's opinion being more important than anyone else's.  There was no benevolent dictator but Mitch, and Mitch had already stopped being available day-to-day before I even started working for them.  (And I've heard that even when he was the benevolent dictator, he was perhaps sometimes a bit too benevolent -- i.e., inclined to just let people choose their own direction/vision for what the project was going to be.)

Thus, there was no unified design, architecture, vision, nothing.  We had fiefdoms, not because anybody wanted to shut anybody else out, but because the natural response of a good developer faced with chaos is to find a way to organize the part that he or she can deal with.  It was easier for each person to just go and focus on the things he or she cared about, than to try to build consensus in the absence of an objective idea of what "success" was supposed to be.

Now, you can point to inadequate specification, lack of constraints, and all sorts of other contributing factors as to why there were no objective criteria for success.  But to me, those aren't really central.  You could have every single one of those things correct, for example, and still find some other way to create a culture that lacks objective criteria!

And it's the lack of these common, objective criteria that does you in, regardless of why the criteria are lacking.  Without them, you can't really have productive discussions or planning, whenever the necessary action crosses organizational boundaries.  (Since different sub-groups will have their own views and criteria, with no common criteria to sync against.)

Anyway, next week, we're actually going to sit down and work on defining some objective criteria for the Chandler project as a whole, going forward.  Those criteria may not be what Mitch originally had in mind, and they may be considerably less ambitious.  But, my sincere hope is that they will be sufficiently objective, to allow us a chance at achieving them this year.

So, if and when the project is really dead, we'll certainly say so.  In the meantime, IMO, the obituaries are more than a little premature.  We're only pinin' for some criteria, you see.  ;-)

Saturday, December 22, 2007

The End of an Era

CompUSA closing?  It's a bit hard to believe it.  I remember going there back in the 80's, for goodness sakes.  Bought lots of books and software there, not to mention keyboards, mice, memory, cases, motherboards, you name it.  They've been around for about as long as I've been a programmer.  Now where am I going to buy motherboards when the old one dies on a Saturday morning?

True, I've been buying most of my peripherals from Circuit City these days, but they don't carry very many motherboards, and none of them in stores.  It seems that computer shows are a dying breed these days, too.

I guess there are too many places to buy assembled computers these days, and the gamers and other people who like to customize buy their parts online.  Ah well.  A cup of auld lang syne, CompUSA, we knew thee well.

Monday, December 10, 2007

The Not-So-Secret Truth About SQL

The other day I was helping my wife debug a report she was trying to set up in her store operations software, and there was something screwy about the totals.

She was trying to set up an inventory change report that would reflect the changes of inventory levels at the store, for items that are also listed on her website, so she could update the website when items sold out or came back in.

For some reason, though, only the inventory arrivals were showing, not the sales.  I went in and changed the selection criteria to only select the sales, and that worked.  I put back in the other part of the query, something like "(arrived>0 or sales>0)", and then only the arrivals showed.

Frowning, I took out the "arrived>0" part again.  The sales showed up.  Then, just for the heck of it, I changed the query to "(sales>0 or arrived>0)", and only the sales showed up.

"This is weird," I said.  "You know, I'll bet NULLs are probably the problem somehow."

And she said, "What makes you say that?"

I laughed.  "Because with SQL, NULLs are nearly always the problem."

(Postscript: I'll leave it as an exercise for the reader to figure out how they were the problem.  It'll be good for your SQL chops.)

Thursday, July 05, 2007

Printing Postage with Python and Endicia.com

As my self-help business has grown, I've been needing to ship dozens of newsletters and CDs to my members each month, not counting incidental orders for other products.  Last year, I did all my shipping manually through the USPS website, but as soon as I started offering subscriptions, I switched to using Endicia.com, as previously recommended by Joel on Software.

Endicia offers a Windows client that lets you copy and paste a slew of addresses from the clipboard, and print shipping labels to a Zebra label printer (or pretty much any other kind of printer).  And since the May 2007 rate changes, I can now print customs forms right on the shipping label for shipping to every one of the countries my subscribers are in.  I don't even have to hand-sign and date the customs forms any more!

However, one of the most annoying things about using the client manually is that it has to be carefully configured before printing each kind of label, and it cannot print more than one non-US label at a time.  Almost exactly half of my subscribers are outside the US, so it literally takes me hours to copy-and-paste their addresses one at a time, double-check all rate options, and print the labels.

So this month I got fed up with that process and wrote a Python library to interface with Endicia via XML: PyDicia.

PyDicia is an industrial-strength (well, light industrial, anyway!) interface to Endicia's DAZzle client for Windows.  It can not only send arbitrary packages to anywhere in the world with any shipping option or label type supported by Endicia and the USPS, it can retrieve address corrections, delivery confirmation numbers, customs IDs, etc.  If you have application objects like a "Customer" or "Invoice", you can register callbacks to turn them into address data that PyDicia understands, and to receive the address corrections, delivery confirmation numbers, etc.  (E.g., so you can mark an order "shipped" in your database.)

In my current actual use so far, I've only printed out labels from a script, and haven't done any application integration yet.  That's because my "customer database" currently consists of an Excel spreadsheet and a plain text file of addresses.  I'll probably replace all that with an Access database soon and implement business rules to figure out what should go into each person's package(s).  (Eventually, the Access database would probably move to some sort of server-side database, but it's a YAGNI for now.)

However, I've already saved a couple of hours of data entry...  at the cost of two or three days programming time.  :-(  Oh well, so it'll take a few more mailings (or a handful more members!) before the time savings really starts showing.  But on the flip side, I had fun writing it, and now I don't dread the process of doing the international shipments as much.  Even more important, I no longer dread the idea of getting a lot of new members, which has been causing me to avoid promoting the group as effectively as I should!

A few tips for using PyDicia, by the way: be sure to set up your DAZzle layouts first, including saving the printer setup with each layout and enabling "stealth" postage.  This will prevent constant prompting to enable stealth postage for international shipments, and avoid device errors or badly printed labels due to incorrect printer information.  That way, you can disable the prompts and just spew labels out at lightning speed.  Yay!