Children of a Lesser Python

Over the years, there have been many, many failed attempts to create alternative VMs for Python, in the hopes of increasing program performance. Even if we ignore the many half-finished Python-to-Parrot translator projects still lurching erratically onward like a half-decayed zombie army, the road to better VM performance is lined on both sides by the gravestones of colorfully-named projects like Mamba, Rattlesnake, and Vyper, all lying untended and forgotten.

Meanwhile, newer projects like pycore and ShedSkin are announced all the time, with a hopeful optimism all too similar to that of their predecessors. (Announced a little over a year ago with much fanfare in the blogosphere, pycore is already missing in action, without a single release yet.)

Making Python run fast, it seems, is a lot harder than it looks. You don’t have to be a compiler or VM design expert to look at CPython’s implementation and say, “Doing X is wasteful. I’ll bet you could make that faster by doing Y.” The problem is that at least 9 times out of 10, somebody already tried doing Y, and got maybe 80% of Python to work with their design before they hit a wall.

That wall basically amounts to this: 80% of Python is not Python, because everybody uses some part of that remaining 20%. The only reason ShedSkin isn’t already in the land of the dead with all the other projects is that it neatly sidesteps this issue by not pretending to be anything but a “Python-like” language. However, that’s sort of like the Black Knight in Monty Python and The Holy Grail, insisting that his lack of arms and legs is “only a flesh wound”. True, in other words, but not very useful.

On the other side we see the alternative VMs that actually implement the Python language, but don’t (for the most part) try to outdo CPython for speed. Jython and IronPython actually implement reasonably complete forms of the Python language, but from a practical perspective they are different platforms. Trying to target an application to work across CPython, Jython, and IronPython would be rather pointless, so only pure-Python libraries are portable across the implementations in any case.

But it’s the impure libraries that give (C/J/Iron)Python most of its current value! Be it database access, number crunching, interfaces to GUI toolkits, or any of a thousand other uses, it’s the C, Java, or CLR libraries that make Python useful. CPython is basically a glue language for assembling programs from C libraries, and to the extent that Jython and IronPython are successful, it’s because they’re glue languages for assembling Java or CLR components. What’s more, since their value equation lies elsewhere, Jython and IronPython don’t have to fully implement CPython’s semantics, although they do try to come fairly close.

And IronPython actually manages to improve on some Python performance microbenchmarks, although I’d say the jury is still out on whether IronPython programs perform better in general. Of course, it’s difficult to measure this well because IronPython is a different platform. A heavy number-crunching program using NumPy isn’t going to run on IronPython, for example, so how would you compare them?

And that leads us to the very heart of the issue with CPython. If the value of CPython comes from all the things that work with it today, then CPython is very close to being at a dead-end for further performance improvement. Most proposed performance enhancements these days get rejected because they change the Python C API in backwards-incompatible ways. If a change requires that everybody rewrite their C code, the language might as well not be Python any more. In short, CPython isn’t just a language implementation, it’s a platform API, not unlike the Java VM and libraries.

It used to be that we held out a hope for Python 3000 – Guido’s bold vision of a Python rethought from the ground up, unburdened by the need for backward compatibility. Here we could break with the C API of the past, and explore new territory – or so we thought.

But more recently, Guido has pulled back from the original plan, citing the ongoing vaporware status of Perl 6, and Joel Spolsky’s arguments against rewriting your flagship product. Python 3000 has become Python 3.0, instead. Not a complete rewrite, but a still somewhat vague plan for tuning up the existing language, and tossing out a few things Guido considers mistakes in retrospect. Backwards incompatibility will be allowed, but Guido has pronounced that there will be no from-scratch rewrite of the CPython implementation. It’s not yet clear whether that means we can refactor in ways that would require third-party extensions to be rewritten. Perhaps this will be decided on a case-by-case basis.

But arguably the single biggest mistake in the CPython platform as it exists today is the lack of a foreign function interface, defined by the language and expressible in Python code. Instead, CPython has always relied on a fixed C API to express foreign interfaces. For its original intended purpose – an embedded scripting language for the Amoeba OS – that was probably okay. But the lack of a C FFI has meant that tools like SWIG, Pyrex, ctypes, Boost::Python, etc. had to spring up to fill the gap. None of them are “standard” to Python, so a given CPython extension could be written in any of them, or none of the above. Thus, today’s backward-compatibility ball-and-chain: the Python/C API.

What’s more, few of these tools are designed to be independent of the existing CPython implementation. All but ctypes tend to have quirks that are a function of their intended code-generation target. But a Python language-defined FFI would have allowed the CPython API to be a mere implementation detail, able to be changed with little consequence. Indeed, such an FFI could conceivably have been usable even with Jython and IronPython, allowing even greater portability.
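To make the idea of a foreign interface “expressible in Python code” concrete, here is a minimal sketch using ctypes, one of the tools named above. It is not the language-defined FFI this post wishes for, only an illustration of what declaring a C function from Python looks like. It assumes a POSIX system, where `ctypes.CDLL(None)` exposes the symbols already loaded into the running process; the wrapper name `c_strlen` is made up for the example.

```python
import ctypes

# Assumption: a POSIX system, where CDLL(None) gives access to the symbols
# already loaded into this process, including those of the C library.
libc = ctypes.CDLL(None)

# The foreign signature is declared in Python, not in a C wrapper module.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Return the length of a byte string as computed by C's strlen()."""
    return libc.strlen(data)


print(c_strlen(b"python"))
```

Nothing in the declaration refers to CPython’s internal object layout, which is the property that would let the C API underneath become an implementation detail.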

But, it’s too late to fix all that now. Or is it?

Enter PyPy. Two months ago, PyPy 0.7 was released. A major milestone, PyPy 0.7 is the first self-hosting Python implementation. That is, an implementation of Python, written in Python, that can interpret itself. What’s more, part of PyPy is a translation system that allows Python code to be translated to other languages, and it includes a kind of foreign function interface, although not a standardized one blessed by Guido. The PyPy developers have now done the work of rewriting all but a minimum of platform-specific C code as high-level Python code. In short, PyPy has already taken the most important step for us to escape from the CPython “gravity well” of needing a backward-compatible C API.

It’s hard to overstress how important this is. The current CPython implementation is locked into a host of design decisions that PyPy is not. As a simple example, PyPy can generate threads-supporting and non-threads-supporting versions of itself, refcounting and garbage collection versions of itself, and so on. Essentially, PyPy is completely virtual with respect to the underlying VM, even though it uses CPython bytecode. So, in the next few years it will be possible to experiment with radical redesigns of the VM, without getting bogged down in the “last 20%” issues experienced by projects of the past. Heck, it should be possible to use custom-tuned VMs on an application-by-application basis!

Further, because PyPy is implemented in Python, hacking on it to change the actual Python language or its semantics will be easier than hacking CPython. In short, we are almost on the doorstep of a renaissance in the development of the Python language, and on the way out of the alternative-implementations graveyard.

But what about speed? PyPy is currently described as 200-300 times slower than CPython, depending on what you’re doing, and what VM you translate it to. This sounds ludicrously bad, until you look at the fact that the untranslated PyPy, running on top of CPython, runs 2000 times slower. Which means – if you’re paying attention – that PyPy’s translator is already able to turn Python code into C that runs 10 times faster!
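The arithmetic behind that claim is easy to check. The sketch below uses only the slowdown figures quoted above; `speedup` is just an illustrative helper name. Note that the “10 times” figure corresponds to the optimistic end of the 200-300 range, and the other end works out to somewhat less.

```python
# Slowdown factors relative to CPython, as quoted in the post.
UNTRANSLATED = 2000        # PyPy interpreted on top of CPython
TRANSLATED = (200, 300)    # PyPy after translation (quoted range)


def speedup(before, after):
    """Return how many times faster `after` is than `before`."""
    return before / after


for factor in TRANSLATED:
    gain = speedup(UNTRANSLATED, factor)
    print(f"{factor}x slower than CPython: translation gain of {gain:.1f}x")
```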

That is one heck of an improvement, folks. Granted, the code in question is technically “RPython” – a restricted subset of Python that eschews the use of certain more-dynamic features. But it doesn’t need type declarations in order to get speed, like Pyrex does. And this technology could be available for practical use soon, if Stackless guru Christian Tismer has his way, by creating an RPython-to-CPython extension module translator.
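As a rough illustration of what “eschewing more-dynamic features” can mean, consider the two functions below. The rule they illustrate (each variable keeps a single inferable type) is an assumption made for this sketch, not a statement of PyPy’s actual RPython definition, and the function names are invented. Both are ordinary Python and run fine on CPython; the difference is in how much a translator could deduce without type declarations.

```python
def total(values):
    # `result` is an int on every path, so a type inferencer could
    # deduce its type without any declarations.
    result = 0
    for v in values:
        result += v
    return result


def describe(n):
    # `label` is an int on one path and a str on the other; a static
    # translator would have to reject this or fall back to generic objects.
    if n > 0:
        label = n
    else:
        label = "non-positive"
    return label


print(total([1, 2, 3]))
print(describe(-1))
```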

So, if it’s possible to create efficient C from a subset of Python, does that now mean that PyPy is finished? Can’t we just take that translation process and go on our way? Unfortunately, no. Although we could certainly take those fast modules back to the CPython platform, the translation process is still quite slow, and needs some accelerating of its own. Also, it still doesn’t really make CPython any faster – it just means that we can compile some individual modules and make them faster.

To reach the promised land, then, PyPy has to first get close to CPython speed. As it gets closer and closer to this goal, more and more people with an idea or two about speeding things up will say to themselves, “I wonder if I can get PyPy to do Y instead of X?” And, unlike the situation with CPython now, they won’t need to be both a Python guru and a CPython VM expert to have a prayer of implementing it.

So, instead of entirely new VMs springing up and dying incomplete, it may be that we will soon see the opposite trend: existing VMs fading away, consolidated and replaced by an ever-more flexible PyPy. With any luck, we may yet see PyPy become the One Python to Rule Them All, replacing CPython, Jython, and IronPython with C, Java, and C# translator backends respectively.

Update: Just after I posted this, I found a message that appears to be saying that as of September, PyPy is now only 20 times slower than CPython. If that’s the case, things are moving quickly indeed. 2000, 200, 20… How much longer till 2, and 0.2 (five times faster than CPython)? Unfortunately, each new order of magnitude from this point on will probably be more difficult than the last. Too bad they can’t just feed the output back to the input and make it ten times faster as many times as they want. 🙂

29 comments

Anonymous says:

October 15, 2005 at 6:47 pm

From Psyco site:

“[…] The future of Psyco now lies in the PyPy project, which according to plan will provide a good base for a Python interpreter with better and well-integrated Psyco-like techniques as soon as 2006. […]”

Seo Sanghyeon says:

October 16, 2005 at 12:43 am

“Just after I posted this, I found a message that appears to be saying that as of September, PyPy is now only 20 times slower than CPython.”

It’s around 10x now. Here’s a series of benchmarks from first self-hosting PyPy to current status.

http://codespeak.net/svn/pypy/dist/pypy/translator/goal/bench-windows.py

Anonymous says:

October 16, 2005 at 5:54 am

Just FYI, pyvm is already there.

Maybe people had a flashback from the future that “in the future there will be a product from europe with four letters that starts with ‘py’, is implemented mostly in python, is horribly fast and easy to hack”, and thus invested on pypy:)

Facts:
– mostly in py: the compiler is pyc which is done in python. Many modules are reimplemented in python. Includes PyOpenGL, Pygame APIs in python. Improved batteries included.
– fast. It is already 2x faster than CPython on *average* in many benchmarks, not just a few benchmarks that count static typing and integer arithmetic. pyc/pyvm compiling the standard library is 2.5 times faster than pyc/python on the same task.
– No need for JIT because it has a new technique (pyrex-approach) that can give C-speed for algorithms with static typing.

However there is a lot of hype and noise these days about what “shall” be done. pyvm/pyc is there today. (evidence: PJE’s blog)

Anonymous says:

October 16, 2005 at 7:29 am

> pyvm/pyc is there today.

of course, this depends a bit on what you mean with “is there”: there is no pyvm source code released yet, is there? And I can’t even find the binary anymore.

“I have written my own Python virtual machine and it’s 17 times faster than CPython. here are the results of the benchmark, but I won’t show you the source”

Carl Friedrich Bolz

Matt says:

October 16, 2005 at 8:46 am

Phillip,
Awesome post. I think a lot of pythoneers don’t know the significance of PyPy, hopefully that will soon change.

PJ Eby says:

October 16, 2005 at 10:28 am

“””Just FYI, pyvm is already there.”””

You didn’t really read the article, did you? 80% of Python doesn’t count, especially since pyvm is complete vaporware. Release the source already, and then we’ll see what’s “there”.

It’s fairly obvious from your posts here and elsewhere that you don’t have anything real, and are just trying to boost your ego by bragging in public. If you really accomplished something, you’d either tell it or sell it.

Let me guess – you’re waiting to release pyvm until PyPy is equally as fast, so you can make a binary version and claim that you had it all along? 🙂

(By the way, there are at least 3 projects out there called “pyvm”, and at least one of them was using the name before you. And unlike yours, that project has actually released some usable code.)

Anonymous says:

October 16, 2005 at 4:30 pm

Some people doubt that a fast implementation of Python will ever emerge.

PJ Eby says:

October 17, 2005 at 12:08 am

“””It’s around 10x now. Here’s a series of benchmarks from first self-hosting PyPy to current status.”””

Wow. If they keep accelerating it at that rate, it’ll be twice as fast as CPython by the end of this year! 🙂

“””Some people doubt that a fast implementation of Python will ever emerge.”””

I think Psyco is an existence proof that it’s possible to JIT-compile Python code to higher speeds than what we have now.

Of all the “Python acceleration”-focused projects, Psyco has the *fewest* limitations on dynamicity, and even those seem to be a function of its required compatibility with CPython.

I suspect that the few limits will be able to be removed in PyPy, where the whole runtime is subject to the PyPy gurus’ will. 🙂

Side note: I’m not one of those people who thinks Python “needs” to be much faster, although there are certainly times when it would be nice. But the real benefits of PyPy will be in allowing things like transactional code execution, massively parallel tasks, creating GIL-free multiprocessor speedups, etc. etc. PyPy is going to make it possible to run applications on custom-designed VMs, using the high-level expressiveness of Python.

One thing I’m particularly looking forward to is that as soon as PyPy reaches the right level of stability, I’ll take a look at changing my generic function implementation so that instead of a dispatch tree, it’ll generate bytecode for a dispatch function.

PyPy’s translation system basically allows you to do whatever you want during module imports, including code generation, so it should be able to then turn generic functions into functions that perform just as well as hand-optimized dispatching functions. Yummy!

However, for it to work really well, there needs to be a way to do an efficient equivalent to a “switch” statement inside bytecode, or else the optimizer needs to be able to recognize a suitable if-then array and turn it into a switch-equivalent construct. (That’s why I didn’t make it work that way in the first place: CPython doesn’t have anything like a switch opcode.)

Florian says:

October 17, 2005 at 2:28 am

I feel that pypy is very important for us, in many respects. I couldn’t quite put it as eloquently as you did.

I don’t know if this had proper mention, but I think an important thing is also that in pypy the generation of a new VM for a new environment (.Net, Java, C, Symbian etc.) becomes a matter of writing the right backend rather than rewriting the frontend.

This has the potential to lead python consistently to pretty much everywhere and beyond 😉 (it’s already in many places, but boldly to go where no pythoneer has gone before)

Anonymous says:

October 17, 2005 at 6:48 am

“The only reason ShedSkin isn’t already in the land of the dead with all the other projects is that it neatly sidesteps this issue by not pretending to be anything but a “Python-like” language.”

Perhaps that’s the way to go. I’m no Boo advocate – it seems to me that too much of the nature of Python has been removed from that language – but there are good reasons to at least reconsider some of Python’s more bizarre aspects when reinventing Python.

“However, that’s sort of like the Black Knight in Monty Python and The Holy Grail, insisting that his lack of arms and legs is “only a flesh wound”. True, in other words, but not very useful.”

Well, ShedSkin has given more bang per buck than PyPy if you consider metrics like “time from announcement to first usable deliverables” and “citations of relevant prior research”. It certainly deserves a lot more than a casual brush-off.

Anonymous says:

October 17, 2005 at 9:39 am

Very interesting post indeed!

It looks like the most talented young Python VM hackers (Christian and Armin) have joined forces to achieve this “greater good”.

I’m no VM expert myself, but the path taken by the PyPy developers is “something completely different” but is no “pie in the sky”. Apparently, it grew out of actual experience in the field and is getting forward at a fast pace (on the time scale of software development ;-).

PJ Eby says:

October 17, 2005 at 11:49 am

“””ShedSkin has given more bang per buck than PyPy if you consider metrics like “time from announcement to first usable deliverables” and “citations of relevant prior research”.”””

I don’t see why those metrics would be relevant. As I understand it, ShedSkin’s deliverables don’t let you use existing Python programs, which makes it effectively another language/platform. The metrics I’m interested in have more to do with the percentage of CPython’s test suite it can currently run, and its likelihood of getting to 100%. How does ShedSkin do on *those* metrics?

The issue here isn’t about inventing Python-like languages. There are lots of those, go take your pick and enjoy. But the only ones that will have a chance to survive as professional development languages are those that have a *platform* to go with them, like Jython, IronPython, and Boo. My post here is about upgrading the CPython platform, not supporting another platform or creating one from scratch.

Anonymous says:

October 17, 2005 at 1:05 pm

“I don’t see why those metrics [rapid delivery, prior work] would be relevant.”

I’d say that setting one’s work in the context of prior research is particularly important in computer science. And by using widely-understood/explained terminology and delivering working code with few dependencies in a timely fashion, it’s easier for interested parties to collaborate on such work.

“The issue here isn’t about inventing Python-like languages.”

Strange, then, that you started your article by referring to “failed attempts to create alternative VMs for Python” referring to such attempts as the “predecessors” of pycore and ShedSkin, which in the latter case isn’t entirely accurate anyway.

Sure, I accept that ShedSkin isn’t necessarily attempting to offer all of Python’s essential semantics, and perhaps it won’t offer an experience deemed “Pythonic” enough by many developers, but if ShedSkin and what it offers isn’t “relevant” (and I think such a view is very narrow-minded) then why bother to label it as “not very useful”, only then to talk up RPython – an apparently much more limited variant on the same theme?

PJ Eby says:

October 17, 2005 at 2:15 pm

“””why bother to label it as “not very useful”, only then to talk up RPython – an apparently much more limited variant on the same theme?”””

I’m not talking up RPython – I introduced it only as a strawman to dismiss. Read the Friendly Article: “Also, it still doesn’t really make CPython any faster – it just means that we can compile some individual modules and make them faster.”

So yeah, the goal I’m interested in is a compelling successor to the CPython platform. Not static translation of a Python subset to C (or C++, Java, C#, etc.).

By the way, if I understand correctly, RPython has a feature that as far as I know is unique to any Python accelerator other than Psyco: it allows you to use Python’s full dynamic nature at module initialization, and captures the state of an initialized program, rather than operating as a source translator. That allows PyPy to do things like generate its opcode tables dynamically, but then the generated tables get compiled to produce C.

As for the relevance of ShedSkin, I only mentioned it at all because it has stirred up some recent interest. However, like so many past efforts, it’s primarily *academic* – a word that’s typically used in business to mean that something is irrelevant, as in “that’s academic”. 🙂

I’m sure ShedSkin is important – for the guy that wrote it, and if it makes a scholarly contribution, that’s cool too.

But it doesn’t help the platform situation any, and is unlikely to significantly influence the mainstream of Python development. It’s not a platform, and there’s no credible path by which it could become a platform, unless it’s going to maybe try to be “C++Python”, in which case I’m guessing that better integration with C++ libraries would be important. (However, my understanding of ShedSkin is shallow, so I could be off-base here.)

For practical software development — that is to say, development that is economically viable and software that economically benefits its users — a language is scarcely the tip of the iceberg. It’s the libraries that count. The thing that’s in common between all the Python implementations I mentioned is that none but the big three (Jython, IronPython, and CPython) have a critical mass of libraries, and only PyPy seems to have a credible chance of getting there.

Anyway, I certainly could be wrong about any or all of these things, and I certainly hope that none of the authors of any Python-like language or translator take any of this article as denigration of their efforts. As I pointed out, it’s a *hard* problem, and those who have attempted it (ShedSkin included) and shared their results with us, have helped to pave that metaphorical “road to better VM performance”.

Anonymous says:

October 17, 2005 at 4:34 pm

> I’m not talking up RPython – I introduced it only as a strawman to dismiss. Read the Friendly Article: “Also, it still doesn’t really make CPython any faster – it just means that we can compile some individual modules and make them faster.”

That is the correct way to look at it: you should not have to rewrite your programs in RPython to get speedup. In a way, RPython is an implementation detail of the PyPy python-interpreter (of course it might still be useful to use the translation toolchain to do some special things)

cheers,

Carl Friedrich Bolz

Anonymous says:

October 18, 2005 at 5:59 am

“””
of course, this depends a bit on what you mean with “is there”: there is no pyvm source code released yet, is there? And I can’t even find the binary anymore.

“I have written my own Python virtual machine and it’s 17 times faster than CPython. here are the results of the benchmark, but I won’t show you the source”
“””

pyvm was announced in May with links to binary that gave a 1.9 speedup. Nobody complained back then that the link was unavailable or that it didn’t do what it claimed. So let’s not call each other a liar.

Demo period is over now.

However, because we lack the funds to advertise/advocate pyvm (and not only in the world of alternative python vms, but generally dynamic language runtimes), its release is stalled until it’s 100% ready for the user.

Anonymous says:

October 19, 2005 at 4:44 pm

“And, unlike the situation with CPython now, they won’t need to be both a Python guru and a CPython VM expert to have a prayer of implementing it.”

Actually, you still do have to be a guru – the PyPy architecture is certainly very flexible, but also very complex.

Anonymous says:

October 20, 2005 at 8:28 am

PyPy is what you are holding out hope for. Good luck with that. I think the problem is people treating Python like it is a religion instead of a tool.

“80% of Python doesn’t count” -> See, that’s exactly the problem.

Python hasn’t significantly changed at all since it was first released. It’s funny how many people still hold out hope that Python is going to significantly change sometime in the future when it never has before. Some magic VM is going to come out that translates my Python into any other language and into the most optimized code possible.

PyPy/Jython/IronPython have one significant impediment, and that is CPython itself. Python is always going to be slow, it is not beginner friendly as marketed, the “standard” library is anything but, and it is never going to be the language zealots are always hoping it will be, even though the zealots are the ones running the Python show now.

Anonymous says:

October 22, 2005 at 12:57 pm

“I’m sure ShedSkin is important – for the guy that wrote it, and if it makes a scholarly contribution, that’s cool too.”…

I beg to differ. ShedSkin is the product of many months of hard work and it is an excellent piece of software.
Of course it doesn’t support 100% of python, because it is a static compiler, completely independent of any virtual machine or interpreter.
And that’s what makes it special and sets it apart from all the other projects.

Its purpose is to translate to c++ and then compile python code that adheres to a few restrictions on its coding style. But it doesn’t mean that it is yet another “python-like” language. It’s pure and simple python, with some common sense to make it translatable to c++.

Its current state is version 0.04 and pretty usable right now. However, if its author gets no help from the community, it will die, because in order to work and support more features, many modules and libraries should be rewritten in restricted python in order to be compilable by ShedSkin.

PJ Eby says:

October 22, 2005 at 1:33 pm

“””However, if its author gets no help from the community, it will die, because in order to work and support more features, many modules and libraries should be rewritten in restricted python in order to be compilable by ShedSkin.”””

You mean, like all the modules that PyPy already has in RPython? If the author had worked with the PyPy team instead, his work could perhaps have been used to improve PyPy’s type annotator, or maybe to provide a C++ backend, either of which would have had better chances of being supported by the “community”, and would have significantly multiplied his *practical* contribution.

Instead, ShedSkin is a silo because its author chose to work alone. That’s certainly not the fault of the “community”.

Anonymous says:

October 22, 2005 at 3:33 pm

what’s the largest example that shedskin can digest?

I tried it (0.0.4) with some of the benchmarks that are used for the PyPy toolchain and it exploded in funny ways even after fixing/changing some idiomatic stuff that it more obviously did not like.

Anonymous says:

October 22, 2005 at 3:36 pm

“If the author had worked with the PyPy team instead, his work could perhaps have been used to improve PyPy’s type annotator…”

Again, ShedSkin was created with a different goal in mind: producing highly efficient compilable code, independent from any virtual machine or interpreter.

PJ Eby says:

October 23, 2005 at 9:28 pm

Um, sure. I was just answering the guy above who was whining about the community not supporting it. As I pointed out earlier, if you want to make a new “C++ython” platform, great, but you’re on your own as far as a community. You really can’t have it both ways – CPython users are not going to move to ShedSkin unless there’s a plausible way to move their extensions and their code. That makes ShedSkin a Python-like language, rather than anything to do with the Python platform per se.

On the other hand, if research into practical type annotation of a dynamic language was the point (and I actually thought it was the main point of the ShedSkin project), then that research could *also* have helped PyPy.

Anonymous says:

August 10, 2006 at 4:55 am

Way after the fact comment:

“You really can’t have it both ways – CPython users are not going to move to ShedSkin unless there’s a plausible way to move their extensions and their code. That makes ShedSkin a Python-like language, rather than anything to do with the Python platform per se.”

True, but if anything this just brings wider community dynamics into play. There may be a lot of people who say, “I would never use anything which doesn’t let me use 100% of Python 2.5’s shiny features,” such as yourself, but have you considered that the majority of Python users might think differently? And the Python mailing list is hardly the place to seek a representative, collective opinion about this (let alone python-dev) – there are numerous projects using Python that are completely under the radar of the Python “elite” where a Python variant which compiles to C++ and links normally with native libraries, for example, might be rather attractive.

“On the other hand, if research into practical type annotation of a dynamic language was the point (and I actually thought it was the main point of the ShedSkin project), then that research could *also* have helped PyPy.”

The PyPy people made their minds up pretty quickly about what they wanted to do. Now, the type inference strategy they’ve chosen isn’t necessarily the best or the worst available: they’ve simply chosen a strategy which is compatible with their objective of writing a Python virtual machine in something like Python (your “lesser Python”, in fact). But this doesn’t make them open to using or implementing anything else which might work on something closer to your “greater Python”. In fact, discussions with various PyPy-related people suggest that they’re far too interested in either questioning the character of the developer of ShedSkin or just disregarding his work based on a cursory examination of announcements, and a selective aggregation of rumours and opinions.

So, rewinding…

“Instead, ShedSkin is a silo because its author chose to work alone. That’s certainly not the fault of the “community”.”

Perhaps the author doesn’t promote his work as much as he could – without the generous funding and widespread publicity, he’s not likely to get the attention level of PyPy, exactly – but I’d recommend taking a hard look at part of that “community” you’re talking about. With all those Euros floating around in certain circles, you wouldn’t expect that various people would feel the need to trash some guy for work he’s probably doing out of his own pocket. But then, perhaps that’s what the Python “community” is all about these days – I’ve certainly been left with that impression.

PJ Eby says:

August 10, 2006 at 11:30 am

“””There may be a lot of people who say, “I would never use anything which doesn’t let me use 100% of Python 2.5’s shiny features,””””

I’m not sure where you got that idea; my point is that it’s the *extensions* that make CPython a platform, and that until PyPy has a plausible route to allow extension development, it’s not really a contender either. The recent additions of both ctypes and a way to create CPython extensions mean that PyPy now has a way to bridge the platforms, however.

Meanwhile, ShedSkin has improved its practical utility quite a bit as well, but remains isolated as a kind of C++ython.

“””there are numerous projects using Python that are completely under the radar of the Python “elite” where a Python variant which compiles to C++ and links normally with native libraries, for example, might be rather attractive.”””

Certainly! Game development could be an example. The main difficulty would be in allowing extensible scripting, since ShedSkin can’t interpret anything and giving your users a C++ compiler to create new levels is a bit difficult. But I’m sure that there are niches where a “C++ython” is just the right thing.

Regarding the rest of your comments, I’m not going to get myself in the middle of such personal arguments, since I don’t have any knowledge of the history or issues there. I myself am curious how ShedSkin manages to do inference with so little code, relatively speaking, and whether it would in fact be at all compatible with the goals of the PyPy toolchain.

For my own purposes, PyPy’s toolchain goals make a lot of sense, in that they operate on an initialized program, which means that the ability to do metaprogramming in Python is maintained. This is important for things like decorators, generic/overloaded functions, macros, and other language extension possibilities that interest me.
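To make "operating on an initialized program" concrete, here is a minimal sketch. It is not PyPy's toolchain or API; the `HANDLERS` registry and the `handles` decorator are hypothetical names invented for this example. The point is only that the decorators run once at import time, so anything that analyzes the program after initialization sees a fully populated table rather than the metaprogramming that built it.

```python
# Hypothetical registry, filled in by a decorator while the module is imported.
HANDLERS = {}

def handles(type_name):
    """Return a decorator that registers a function under the given type name."""
    def register(func):
        HANDLERS[type_name] = func
        return func  # the function itself is left unchanged
    return register

@handles("int")
def describe_int(value):
    return "integer %d" % value

@handles("str")
def describe_str(value):
    return "string %r" % value

def describe(value):
    """Dispatch on the run-time type name using the already-built registry."""
    return HANDLERS[type(value).__name__](value)

def registered_types():
    """What a tool inspecting the initialized program would find in the registry."""
    return sorted(HANDLERS)
```

By the time `describe` is first called, the decorators have already finished their work, so a tool that starts from the live module objects only has to deal with plain functions and a plain dictionary.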

Likewise, the ability to plug in alternate object spaces and a variety of output languages interests me as well. I’m not sure whether or how ShedSkin’s approach fits any of those goals, but I suspect that implementing a C++ translator for PyPy based on ShedSkin would be a good way to test that.

Specifically, it would be an opportunity to demonstrate that ShedSkin’s approach can scale up to the full feature set that PyPy supports, and an opportunity to compare performance and completeness of inference against PyPy’s own toolchain.

In other words, if you can’t beat them, join them. This is solid marketing/leadership strategy as well as common sense. If ShedSkin does well in that scenario, it gets to leverage PyPy’s publicity. And, if it does better than the parts of PyPy it replaces, it also exerts market pressure on the PyPy developers.

It does, however, require taking the approach that a complete implementation is more important than any other goal. The entire point of this article has been that alternative Python implementations survive in direct proportion to their completeness, and that the burden of proof of completeness is on the implementers of such new alternatives.

Anonymous says:

August 18, 2006 at 3:38 am

Good to see that you’re still reading comments, pje, although the timestamps aren’t very helpful in seeing when they were written. 🙂

“There may be a lot of people who say, “I would never use anything which doesn’t let me use 100% of Python 2.5’s shiny features,””

“I’m not sure where you got that idea; my point is that it’s the *extensions* that make CPython a platform”

In fact, the ultimate argument is that it’s everything that makes CPython what it is. While this sort of explains the relative unpopularity of Jython, the stream of new features in Python plus discussions about things like ctypes should both cause us to consider issues like the essence of what Python is and how it works with everything else. For example, do we really need a “with” statement? And if people start converting all the extensions to use ctypes, what’s the difference between that and adopting a CPython-incompatible virtual machine technology?
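For readers who have not used ctypes, the following is a minimal sketch of what calling native code through it looks like. It assumes a POSIX-like system where the C library can be located by `ctypes.util.find_library("c")`; the wrapper name `c_strlen` is made up for this example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX-like platform where the standard C library can be found.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def c_strlen(data):
    """Call the C library's strlen() on a bytes object, with no compiled extension module."""
    return libc.strlen(data)
```

No C code written against the CPython API is involved here, which is why a binding written this way can, at least in principle, be carried to any implementation that provides a ctypes module.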

“I myself am curious how ShedSkin manages to do inference with so little code, relatively speaking, and whether it would in fact be at all compatible with the goals of the PyPy toolchain.”

There’s no doubt that ShedSkin is very clever and concise where the inference system is concerned. It could be that the author manages to leverage certain semantic similarities between Python and C++ in a very effective way. Some might argue that this makes ShedSkin fundamentally incompatible with PyPy and thus irrelevant, but I think such attitudes really are sweeping good ideas under a carpet of ignorance.

PJ Eby says:

August 18, 2006 at 12:23 pm

“””It could be that the author manages to leverage certain semantic similarities between Python and C++ in a very effective way.”””

Actually, after the last comment, I finally got around to reading the ShedSkin master’s thesis, and I’ve concluded that ShedSkin is definitely not Python, nor do I see much chance that it ever will be.

Among other things, it cannot support such basic dynamic concepts as getattr() or the idea of bound methods. These ideas are so fundamental to Python’s dynamic nature that essentially ShedSkin is a C++ frontend with Python syntax and implicit static typing. That’s a useful thing to have, to be sure, but is that really Python?
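As a minimal illustration of the two features named above, here is ordinary Python that depends on run-time attribute lookup and on bound methods. The `Greeter` class and the `call_by_name` helper are made up for this example and do not come from the thesis.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def hello(self):
        return "hello, " + self.name

    def goodbye(self):
        return "goodbye, " + self.name

def call_by_name(obj, method_name):
    """Look up a method from a string chosen at run time, then call it."""
    method = getattr(obj, method_name)  # the result is a bound method
    return method()

def make_callback(obj):
    """Return a bound method: a callable that remembers the instance it came from."""
    return obj.hello
```

In `call_by_name`, the attribute being fetched is known only as a string at run time, and `make_callback` hands back an object that carries its instance along with it. Both are routine in Python code, and both are hard to give a fixed static type ahead of time.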

I think that it would be better to call it a statically typed Python-like language; I don’t think that you can really call it Python when it diverges so far from the Python reference manual on so many points.

Anonymous says:

January 24, 2007 at 2:39 pm

“””It’s around 10x now. Here’s a series of benchmarks from first self-hosting PyPy to current status.”””

Wow. If they keep accelerating it at that rate, it’ll be twice as fast as CPython by the end of this year! 🙂

It appears to be in the vicinity of 5x slower currently, assuming I am reading the benchmarks at http://tuatara.cs.uni-duesseldorf.de/benchmark.html correctly.

Giles says:

July 7, 2008 at 8:49 pm

Hey PJE – I guess I’ve missed the party here, but just in case you’re still reading comments…

At Resolver Systems we’ve produced a product in IronPython (written in and scriptable in), and certainly have noticed that a goodish percentage of people can’t use it because C extensions don’t work in IP. Not enough for it to be a serious commercial problem for us in the short term – our product is a spreadsheet and a lot of financial types just want to use IP in it to script .NET – but enough that we realised that getting C extensions to work in IP would benefit us a lot in the longer term. Especially if we can get it all working so that you can use .NET or Python C extensions in the same Python-scriptable spreadsheet.

So we’re working on an interface layer that talks .NET to IronPython and “unmanaged” (i.e. normal) C to the C extensions, with the aim of getting NumPy working as soon as we can. We’ve called it Ironclad, and we’re putting it out under an MIT license.

So far, we’re getting somewhere (you can load and use the zlib extension) but there’s still a fair way to go.

Of course, this doesn’t alter or work against your argument – but perhaps it’s an interesting data point for you.

Cheers,

Giles


