Jython 2.5.2 Beta 2 is Released

September 13, 2010 at 11:47 PM | categories: Jython, 2.5.x

On behalf of the Jython development team, I would like to announce the second beta release of the 2.5.2 version of Jython. Our current plan is that this will be the last beta of 2.5.2, but this will depend on bug reports.

Download the installer JAR from SourceForge. Here are the checksums:

  • MD5, 560b43678059fd41a374a9487517235c
  • SHA1, 0c41db0e5d275bff80a2c4f9bc3de2e48969d0a6

The release was compiled on Mac OS X with JDK 5 and requires JDK 5 to run. Please try it out and report any bugs at http://bugs.jython.org.

This release fixes bugs related to resource leaks, Java integration, and a number of other issues. See the NEWS for more details. One caveat: we did not completely fix the bug "Classloaders cannot GC, which exhausts permgen". Jython uses instances of ThreadState to manage its execution state, including frames, exceptions, and the global namespace. The ThreadState also indirectly refers to the ClassLoader objects used by Jython. Such usage can cause resource leaks when a Jython application is restarted under certain app containers, because the ThreadState often may not be removed by the app server's thread pool. This is because ThreadState itself is managed by Java's ThreadLocal.

Fixing this problem without a backwards breaking API change appears to be difficult. Therefore we recommend exploring workarounds, such as the one published in this blog post, which also goes into these issues in more depth.
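To make the leak described above concrete, here is an analogy in plain Python rather than Jython's actual code. It assumes a current CPython 3; FakeClassLoader, handle_request, and clear_state are hypothetical names, and threading.local stands in for Java's ThreadLocal. It is not the workaround from the linked post, only a sketch of why per-thread state outlives a request on a pooled thread.

```python
import gc
import threading
import weakref
from concurrent.futures import ThreadPoolExecutor


class FakeClassLoader(object):
    """Hypothetical stand-in for a heavyweight, ClassLoader-like resource."""


# Per-thread state, playing the role that ThreadLocal plays for ThreadState.
_state = threading.local()


def handle_request():
    # Lazily attach a loader to whichever pooled thread runs this task.
    if not hasattr(_state, "loader"):
        _state.loader = FakeClassLoader()
    return weakref.ref(_state.loader)


def clear_state():
    # Explicit cleanup; it must run on the thread that owns the state.
    if hasattr(_state, "loader"):
        del _state.loader


# One worker, so both tasks run on the same pooled thread.
pool = ThreadPoolExecutor(max_workers=1)

loader_ref = pool.submit(handle_request).result()
gc.collect()
# The request is finished, but the idle pooled thread still holds the loader.
leaked_while_pooled = loader_ref() is not None

pool.submit(clear_state).result()
gc.collect()
released_after_clear = loader_ref() is None

pool.shutdown()
```

The point of the sketch is that nothing in the application refers to the loader after the request returns, yet it stays reachable for as long as the pool keeps the thread alive.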

Jython 2.6 will introduce limited backwards breaking API changes, so it will be possible to fully resolve this bug, and related issues, in that version instead. In a future blog post, I will address what we can do with respect to ThreadState in our 2.6 work.

Let's turn to more on what has been fixed or extended in 2.5.2. In particular, I would like to highlight the following:

  • JSR 223 (javax.script) support was introduced in 2.5.1, but it's now fully usable and is supported by such solutions as RESTx, which runs on Mule ESB. In particular, we bundle our own JSR 223-compliant engine to interface with Jython. Many thanks go to Jim White, who kept pushing us -- and other implementations -- on this and provided us with the starting code; Nicholas Riley, who did much of the work on the Jython development team; and the many beta testers who provided valuable advice and bug reports. JSR 223 is an important, cross-language integration point.

  • Python functions can be directly passed to Java methods that take a single-method interface (such as Callable or Runnable). This means you can now pass a callback function, usually a closure, instead of wrapping it in a class implementing that interface. Tobias Ivarsson implemented this feature.

  • The collections.defaultdict type is now fully threadsafe. This change continues a trend begun with our 2.5.0 release to provide strong support for concurrent Python code. Previously any default values could be overwritten by competing threads. CPython is able to implicitly provide the same guarantees, but only on built-in type factories, because code is serialized through its Global Interpreter Lock (GIL). Jython now uses Google Guava's support for collections. In particular, we leverage atomically computed function maps.

  • Jython's console now supports completing user input upon pressing the <TAB> key. Using JLine, which we have bundled since 2.5.0, means completion works on both Windows and Unix-derived platforms (including Linux and OS X). Use it just like in CPython:

    import readline
    import rlcompleter
    readline.parse_and_bind("tab: complete")
    

    Usually you would do this in a setup script or a Python shell like IPython. For now, you will also need to change a property setting. See the tracking issue for the specifics, but we hope to have this and IPython support complete, so to speak, by 2.5.2 final.

    Such completion is particularly useful in navigating Java APIs, most of which tend to be complex.

  • You can now call a Java constructor using keyword arguments. Geoffrey French contributed the patch for this nice feature. It will also be the last new feature implemented in the 2.5.x versions!
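The defaultdict guarantee in the list above can be exercised with a small stress test. This is a minimal sketch, assuming a current GIL-based CPython 3 and a built-in factory (list), which is the case the text says CPython also covers; the constants and the worker function are hypothetical. Run under Jython 2.5.2, the same idea would need Python 2 syntax and a substitute for threading.Barrier.

```python
import threading
from collections import defaultdict

THREADS = 8
APPENDS_PER_THREAD = 1000

# A built-in factory, so creating the default value cannot be interleaved
# with Python-level code from another thread.
buckets = defaultdict(list)
start = threading.Barrier(THREADS)


def worker(worker_id):
    start.wait()  # line the threads up so they race on the missing key
    for i in range(APPENDS_PER_THREAD):
        buckets["shared"].append((worker_id, i))


threads = [threading.Thread(target=worker, args=(n,)) for n in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# If a competing thread had replaced the default list, appends would be lost.
total = len(buckets["shared"])
```

If any thread's default value were overwritten by another thread, total would come out smaller than THREADS * APPENDS_PER_THREAD.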

There are many other features and bug fixes, some small, some large. We will look at these in future posts, as well as some outstanding bugs we should be able to fix before the final release.

And -- last but not least -- please help spread the word:

Organizations using Jython 2.2.1, or earlier, should test their code against 2.5.2 beta 2 now so that bug fixes and/or workarounds may be identified. In particular, please note the following:

  • No additional work is anticipated on Jython 2.2.
  • Jython 2.5.2 is the last release in the Jython 2.5.x series that will address non-severe issues. Further enhancements in Java integration, for instance, will be seen in 2.6.
  • Jython 2.6 development will begin immediately following the 2.5.2 release. Jython 2.6 will require the use of JDK 6. We are hoping for some significant performance gains by being able to use invokedynamic support, either directly or through a back port of that support to JDK 6. More on that, too, in a future post.

Trac on Jython - Genshi Support

July 11, 2009 at 06:18 PM | categories: Jython, PBCVM, AST

(I was going to add this as a comment to NixDev Open Source Blog but the commenting system for that blog is currently broken. So here's my response, and maybe it will get me to start blogging here again too.)

There has been less interest in Genshi from Jython development as of late, probably because Mako is a very good alternative for TurboGears 2 and Pylons. But you need Genshi for Trac, so here's what you might do:

  • expat. Jython 2.5 has an implementation of expat (Lib/xml/parsers/expat.py) that wraps SAX sufficiently that all of the unit tests for ElementTree pass. The only problem is that it's somewhat slow, since the wrapper is in Python. I would see that as a starting point for selectively rewriting it in Java, which is much easier to do in Jython than in CPython. Perhaps with just a little more emulation work it will also work with Genshi?
  • AST. Jython 2.5 implements the standard _ast and ast (the latter actually part of 2.6, but we needed it!) modules; we do not have any support for the older compiler module. I don't know the status of changeset 31 to support AST, but if it has been incorporated, or can be, we can work with this part then.
  • CPython bytecode. Lastly, Jython 2.5 implements a CPython bytecode VM. This has been tested by compiling the entire regression test suite into CPython bytecode (using CPython; we don't yet have a compiler for this path!), and except for some cases around code introspection and minor differences in floating point representation (which is an artifact of using CPython for the compilation process), it passes. So you should be able to generate bytecode and just have it run. Look at Lib/test/test_pbcvm.py for some details here.
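As a rough illustration of the AST and bytecode paths in the list above, here is a minimal sketch using only the standard ast module and the compile built-in. It assumes a current CPython 3 (node names differ somewhat from the 2.5-era tree), and it says nothing about how Genshi or Jython's bytecode VM use these pieces internally.

```python
import ast

source = "answer = 6 * 7"

# Parse the source into a standard AST, the same kind of tree the ast module
# exposes for inspection and rewriting.
tree = ast.parse(source)
node_types = [type(node).__name__ for node in ast.walk(tree)]

# Compile the tree to a code object (which wraps the bytecode) and run it.
code = compile(tree, "<example>", "exec")
namespace = {}
exec(code, namespace)
```

A template engine that generates an AST can hand it to compile in this way, so no source text needs to be produced in between.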

Good luck! Feel free to ask any questions on the jython-dev mailing list or #jython on IRC.

Flipping the 2.5 Bit for Jython

June 24, 2008 at 10:59 AM | categories: Jython, 2.5.x

Something worth pointing out: as of 8 AM this morning (MDT), in rev 4748, Frank Wierzbicki flipped the bits and pronounced this about the ASM branch:

jbaker:~/jythondev/asm jbaker$ dist/bin/jython
Jython 2.5a0+ (asm:4750, Jun 24 2008, 10:56:16)
[Java HotSpot(TM) Client VM ("Apple Computer, Inc.")] on java1.5.0_13
Type "help", "copyright", "credits" or "license" for more information.
>>>

Yesterday easily saw the most commits we have seen in the Jython project. The real threshold was reached when we incorporated the UTF-16 and new-style exception branches into this branch, fixed the grammar to support most incremental parses, and repointed the standard library to CPythonLib 2.5. Along with a flurry of other fixes!

There's a lot more to go, but this should be an encouraging sign for everyone interested in Jython!

Adopting UTF-16

June 24, 2008 at 10:13 AM | categories: Jython, Unicode

Jython 2.5 standardizes on Java 5 as the base version for its implementation. Jython has always mapped both unicode and str types to java.lang.String, but the semantics of String changed as of Java 5. Instead of encoding characters as UCS-2, that is, just the basic multilingual plane of 65536 code points, Java - like .NET - adopted the UTF-16 encoding. UTF-16 can represent all 1114112 Unicode code points (U+0 to U+10FFFF), except for isolated surrogates (U+D800 to U+DFFF). These surrogates act as escape characters in the UTF-16 encoding.

This makes things somewhat more complicated, to put it mildly. And this is without even considering combining characters!

Instead of the simple uniform encoding that we see in the narrow (UCS-2) or wide (UCS-4) builds of CPython, we get a variable-length encoding. And unlike UTF-8, it's usually not too efficient. In addition, we lose the ability to represent the isolated surrogates. Finally, because UTF-16 is so very close to UCS-2, it's prone to bugs.
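The variable-length rule is short enough to write out. The following is a minimal sketch of the standard UTF-16 arithmetic, assuming a current CPython 3; to_utf16_code_units and codec_code_units are hypothetical helper names, and the second one only cross-checks the arithmetic against Python's own utf-16-be codec.

```python
import struct


def to_utf16_code_units(code_point):
    """Return the UTF-16 code units (as ints) for one Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("isolated surrogates cannot be encoded in UTF-16")
    if code_point < 0x10000:
        return (code_point,)  # basic plane: one code unit
    offset = code_point - 0x10000  # 20 bits, split across two code units
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return (high, low)


def codec_code_units(code_point):
    """Cross-check: ask Python's codec for the same code units."""
    data = chr(code_point).encode("utf-16-be")
    return struct.unpack(">%dH" % (len(data) // 2), data)


basic = to_utf16_code_units(0x00E9)    # LATIN SMALL LETTER E WITH ACUTE
astral = to_utf16_code_units(0x1D11E)  # MUSICAL SYMBOL G CLEF
```

Here basic is a single code unit while astral is a surrogate pair, which is why the number of code units and the number of code points can disagree.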

Here's the implementation strategy we adopted. In supporting the unicode type with PyUnicode, we first determine if it's in the basic plane or not:

private enum Plane {
    UNKNOWN, BASIC, ASTRAL
}

private volatile Plane plane = Plane.UNKNOWN;

public boolean isBasicPlane() {
    if (plane == Plane.BASIC) {
        return true;
    } else if (plane == Plane.UNKNOWN) {
        plane = (string.length() == getCodePointCount()) ? Plane.BASIC : Plane.ASTRAL;
    }
    return plane == Plane.BASIC;
}

getCodePointCount is in turn implemented using String#codePointCount. Like other code point methods, it decodes any surrogate pairs.

String immutability means we can cache the result in the volatile field plane; idempotence of this operation ensures consistency. When the string is in the basic plane, this allows us to equate code units (char) to code points (int), and use the implementations provided by PyString. As it turns out, this was always done before; the only difference between str and unicode was in the encoding rules.
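For readers more at home in Python than Java, the same test can be sketched as follows. This assumes a current CPython 3, where len() counts code points and so stands in for codePointCount; utf16_length and the Text class are hypothetical names, not part of Jython.

```python
def utf16_length(text):
    """Number of UTF-16 code units, i.e. what Java's String.length() reports."""
    return len(text.encode("utf-16-le")) // 2


class Text(object):
    """Hypothetical wrapper that caches the basic-plane test lazily."""

    def __init__(self, value):
        self.value = value
        self._basic = None  # None plays the role of Plane.UNKNOWN

    def is_basic_plane(self):
        if self._basic is None:
            # Recomputing always gives the same answer for an immutable
            # string, so a repeated computation is harmless.
            self._basic = utf16_length(self.value) == len(self.value)
        return self._basic


cafe = Text("caf\u00e9")
clef = Text("clef \U0001D11E")
```

As in the Java version, nothing is computed at construction time; the first call to is_basic_plane pays for the scan and later calls reuse the cached answer.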

In the rather rare case that a string isn't in the basic plane, we read with String#codePointAt and write with StringBuilder#appendCodePoint using iterators. A seemingly good alternative would be to use String#offsetByCodePoints. Too bad it doesn't reliably work. So instead we have our iterator implementations, lots and lots of them. And sometimes crazy stuff like this, seen in the implementation of PyUnicode#unicode_strip:

return new PyUnicode(new ReversedIterator(
    new StripIterator(sep,
        new ReversedIterator(
            new StripIterator(sep,
                newSubsequenceIterator())))));

If the strip method were used extensively on strings that weren't in the basic plane, it might make sense to rewrite this to decode to an int[] buffer. But that's not likely to be the case.
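The nesting above reads more easily once you see that it is "strip the front, reverse, strip the front again, reverse back". Here is a minimal sketch of that composition over code points, assuming a current CPython 3; every function name is hypothetical, and the reversal here simply buffers the sequence, which may differ from how Jython's ReversedIterator works.

```python
from itertools import dropwhile


def code_points(text):
    """Iterate over a string as integer code points."""
    return (ord(ch) for ch in text)


def strip_leading(sep, points):
    """Drop code points from the front while they belong to sep."""
    return dropwhile(lambda cp: cp in sep, points)


def reverse(points):
    # This sketch buffers the whole sequence in order to reverse it.
    return reversed(list(points))


def strip(sep, points):
    # Same shape as the Java expression: strip, reverse, strip, reverse.
    return reverse(strip_leading(sep, reverse(strip_leading(sep, points))))


def to_text(points):
    return "".join(chr(cp) for cp in points)


padded = "  \U0001D11E a \U0001D11E  "
stripped = to_text(strip({ord(" ")}, code_points(padded)))
```

Working on code points rather than code units means a surrogate pair at either end of the string is never split by the stripping.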

That's also the reason we avoid making the basic plane test unless we have to. There are many situations where Unicode can pass in and out of Jython - specifically to/from Java - without us caring about what planes its characters are drawn from. We assume some overhead from boxing with PyUnicode (although HotSpot mitigates the indirection cost), but we don't have to overdo it by computing this test on construction.

When comparing this with CPython, we do lose the ability to include isolated surrogate code points in Unicode strings. There are even some unit tests for this case. But ultimately this seemed like an implementation detail like testing ref counting, one certainly not worth time spent supporting.

It's worth mentioning that one alternative is to create our own representation, much like JRuby. Ruby's strings are mutable, unlike Python's. This forced the issue for the JRuby developers, because Ruby, like Python, needs good string performance. So JRuby uses byte arrays for strings, although they do use UTF-16 encoded, interned java.lang.String instances to uniquely represent symbols (:xyz). Given that symbols are not strings, this works well. Ruby doesn't say anything about the encoding of such strings (ouch!), but JRuby does assume they're UTF-8 encoded when crossing the boundary with Java.

Supporting widened Unicode means having support for this in regular expressions. The first step was to just widen the SRE engine used by Jython to represent characters with int instead of short. So we always unpack to int in this case; see strip above. This engine is a direct translation of the CPython equivalent: it's a mini-VM, much like the pickle VM, and regexes are compiled to SRE bytecode. In the future, we may consider using JRuby's implementation (Joni, a port of Oniguruma to Java), but the devil is in supporting some specifics of Python. As was seen in the CPython case, it was quite straightforward to just do the widening.

At this point, the biggest outstanding issue is backporting the changes to SRE to support wide character classes (aka big character sets), a pickle problem, as well as various bug fixes. A total of 4 test cases are currently failing in test_re.

And then that's it, at least until we start doing performance profiling.

Realizing Jython 2.5

June 20, 2008 at 09:43 PM | categories: Jython, 2.5.x

Jython 2.5 is really, finally, unbelievably coming together. This is the next release of Jython, after last summer's 2.2. In a nutshell, we have completed all new language features using an Antlr parser, except for absolute imports. All bytecode generation work, now using an ASM backend, is done. Of course, there are many outstanding bugs. And Python is not just a core language; we need to support fully the fact that "batteries are included". But let's look at where we are. Through the prism of what's new in 2.3, 2.4, and 2.5, here's what's working:

  • 2.3: sets (PEP 218), generators (255), source code encoding (263), universal newline (278), enumerate (279), logging (282), Boolean (285), distutils (301), new import hooks (302), pickle enhancements (307), extended slices, datetimes, optparse. Still to go: csv, removing a dictionary in builtin that ensures that interned strings don't get GC'ed (pre-2.3 behavior! It helps to read what's new). Also various string, Unicode, and regex changes are mostly done in a separate utf16 branch that I'm currently in the midst of merging against trunk.
  • 2.4: unifying long integers (237), generator expressions (289), string.Template (292, but also needs new utf16 work), decorators (318), reverse iteration (322), subprocess module (324), multi-line imports (328), removal of OverflowWarning, min & max with keyword support, sorted. But we still need partial import with sys.modules, and I'm sure some more stuff I forgot. Decimal and -m support are working in student branches; we just need to incorporate them.
  • 2.5: conditional expressions (308), partial function application (309, but we're cheating with a pure-Python version), distutils metadata (314), unified try/except/finally (341), coroutines and other generator functionality (342), with-statement, including contextlib (343), any, all. But we haven't done the exceptions remapping to new-style classes, absolute and relative imports, or all of the context manager support, such as in file. ctypes was a proposed Google Summer of Code project, but apparently PyPy has some work that's 95% of the way there; we will talk with them at EuroPython. We need to look into what is necessary to make ElementTree work. sqlite3 depends on ctypes. As I was writing this, I tried out wsgiref; it works and I just committed it to the asm branch. (At some point, we will repoint everything like this to CPythonLib, but for now we are mixing it up as we go. Bear with us!)
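For a quick feel of the 2.5 items in the list above, here is a short sketch that touches conditional expressions (PEP 308), functools.partial (309), unified try/finally inside a generator (341), generator send (342), the with-statement and contextlib (343), and any/all. The names are hypothetical. It is written to run on a current CPython 3; on 2.5 itself the with-statement additionally needs from __future__ import with_statement at the top of the file.

```python
from contextlib import contextmanager
from functools import partial

events = []


@contextmanager
def logged(name):
    # PEP 343: a context manager written as a generator.
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)


def running_total():
    # PEP 342: values sent in with send() come back out of the yield.
    total = 0
    while True:
        value = yield total
        total += value


def describe(n):
    # PEP 308: conditional expression.
    return "even" if n % 2 == 0 else "odd"


def add(a, b):
    return a + b


add_ten = partial(add, 10)  # PEP 309: partial function application

with logged("block"):
    acc = running_total()
    acc.send(None)  # prime the coroutine up to its first yield
    totals = [acc.send(n) for n in (1, 2, 3)]

has_big = any(t > 5 for t in totals)
all_positive = all(t > 0 for t in totals)
```

After this runs, events records one enter and one exit, and totals holds the running sums of the values that were sent in.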

Even quit() and exit() now work; I don't know when these oh-so-major features were added. We even now support large string constants. And of course, who can forget our support for the GIL (global interpreter lock) in Jython, something that Tobias Ivarsson, my Google Summer of Code student who is now working on an advanced compiler, added to __future__ as an Easter egg:

>>> from __future__ import GIL
Traceback (most recent call last):
  (no code object) at line 0
  File "<stdin>", line 0
SyntaxError: Never going to happen!

I would imagine that's definitive: we go against Java's native threads and compile to Java bytecode. It would be hard to have a GIL, even if we wanted one.

However, we are just turning the corner. The Antlr parser in the asm branch currently does not support partial parses, and this breaks not only interactive sessions but doctests. Until this is solved - and Frank Wierzbicki is working like mad on this - we can't merge this branch onto trunk. But that should happen very soon.

With few exceptions, we simply go against the standard Python unit tests. Whether they are straightforward, cunning, or devious, we have labored against these unit tests. In other cases, we have used Python as our foil: we support the same 2.5 AST parse tree, and we know this by comparing our parses with CPython's for all of the standard library - including those unit tests.

There's a lot more going on. I can't say enough about the work done by Charlie Groves, Philip Jenvey, Alan Kennedy, Nicholas Riley, and others to make this happen. Leo Soto, my other GSoC student, is making amazing progress on supporting Django on Jython, while finding and fixing bugs in Jython itself. Supporting Django forces us to find those gaps in compatibility. Similar efforts are going on with Pylons, TurboGears 2 (Ariane Paola, GSoC), and Zope (Georgy Berdyshev, GSoC). I'm also working on greenlet/Stackless support and involved in a collaboration with Jeremy Siek and Joe Angell at the University of Colorado to add gradual typing (yes, types! but only when you want to) to Jython. We have a T2000 contributed by Sun to let us see how much concurrency - in this case 32 hardware threads, 64 GB of memory - Jython can take advantage of. And so on.

Back to work!

Updates - 2008-06-24: we have support for new-style exceptions, the parser is now usable (but there are a couple of bugs left there), and Unicode support has been updated to UTF-16. See this posting, Flipping the 2.5 Bit for Jython.
