Jython 2.5.2 Beta 2 is Released

September 13, 2010 at 11:47 PM | categories: Jython, 2.5.x

On behalf of the Jython development team, I would like to announce the second beta release of the 2.5.2 version of Jython. Our current plan is that this will be the last beta of 2.5.2, but this will depend on bug reports.

Download the installer JAR from SourceForge. Here are the checksums:

  • MD5, 560b43678059fd41a374a9487517235c
  • SHA1, 0c41db0e5d275bff80a2c4f9bc3de2e48969d0a6

The release was compiled on Mac OS X with JDK 5 and requires JDK 5 to run. Please try it out and report any bugs at http://bugs.jython.org.

This release fixes bugs related to resource leaks, Java integration, and a number of other issues. See the NEWS for more details. One notable exception: we did not completely fix the bug "Classloaders cannot GC, which exhausts permgen". Jython uses instances of ThreadState to manage its execution state, including frames, exceptions, and the global namespace. The ThreadState also indirectly refers to the ClassLoader objects used by Jython. Such usage can cause resource leaks when a Jython application is restarted under certain app containers, because the ThreadState often may not be removed by the app server's thread pool. This is because ThreadState itself is managed by Java's ThreadLocal.
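To make the leak mechanism concrete, here is a minimal sketch in plain Java. It is not Jython code: FakeThreadState and ThreadLocalLeakDemo are hypothetical names, and a single-thread executor stands in for an app server's thread pool. It shows that a value stored in a ThreadLocal stays attached to a pooled thread after the task that set it has finished, unless something explicitly removes it.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLeakDemo {
    // Hypothetical stand-in for per-thread interpreter state such as ThreadState.
    static final class FakeThreadState {
        final String owner;
        FakeThreadState(String owner) { this.owner = owner; }
    }

    private static final ThreadLocal<FakeThreadState> STATE = new ThreadLocal<>();

    /**
     * Runs two tasks on the same pooled thread. The first stores state in the
     * ThreadLocal (and optionally removes it); the second reports what it finds.
     */
    public static String stateSeenByNextTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // First "deployment": sets per-thread state on the pooled thread.
            pool.submit(() -> {
                STATE.set(new FakeThreadState("deployment-1"));
                if (cleanUp) {
                    STATE.remove();
                }
            }).get();
            // Later task on the same thread: is the old state still reachable?
            return pool.submit(() -> {
                FakeThreadState leftover = STATE.get();
                return leftover == null ? "none" : leftover.owner;
            }).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(stateSeenByNextTask(false)); // old state is still attached
        System.out.println(stateSeenByNextTask(true));  // explicit remove() clears it
    }
}
```

In a real container the leftover object would keep its ClassLoader reachable as well, which is what keeps the old application's classes from being collected.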

Fixing this problem without a backwards breaking API change appears to be difficult. Therefore we recommend exploring workarounds, such as the one published in this blog post, which also goes into these issues in more depth.

Jython 2.6 will introduce limited backwards breaking API changes, so it will be possible to fully resolve this bug, and related issues, in that version instead. In a future blog post, I will address what we can do with respect to ThreadState in our 2.6 work.

Let's turn to more on what has been fixed or extended in 2.5.2. In particular, I would like to highlight the following:

  • JSR 223 (javax.script) support was introduced in 2.5.1, but it's now fully usable and is supported by such solutions as RESTx, which runs on Mule ESB. In particular, we bundle our own JSR 223-compliant engine to interface with Jython. Many thanks go to Jim White, who kept pushing us -- and other implementations -- on this and provided us with the starting code; Nicholas Riley, who did much of the work on the Jython development team; and the many beta testers who provided valuable advice and bug reports. JSR 223 is an important, cross-language integration point.

  • Python functions can be directly passed to Java methods that take a single-method interface (such as Callable or Runnable). This means you can now pass a callback function, usually a closure, instead of wrapping it in a class implementing that interface. Tobias Ivarsson implemented this feature.

  • The collections.defaultdict type is now fully threadsafe. This change continues a trend with our 2.5.0 release to provide strong support for concurrent Python code. Previously any default values could be overwritten by competing threads. CPython is able to implicitly provide the same guarantees, but only on built-in type factories, by the fact that code is serialized through the use of its Global Interpreter Lock (GIL). Jython now uses Google Guava's support for collections. In particular, we leverage atomically computed function maps.

  • Jython's console now supports completing user input upon pressing the <TAB> key. Using JLine, which we have bundled since 2.5.0, means completion works on both Windows and Unix-derived platforms (including Linux and OS X). Use it just like in CPython:

    import readline
    import rlcompleter
    readline.parse_and_bind("tab: complete")
    

    Usually you would do this in a setup script or a Python shell like IPython. For now, you will also need to change a property setting. See the tracking issue for the specifics, but we hope to have this and IPython support complete, so to speak, by 2.5.2 final.

    Such completion is particularly useful in navigating Java APIs, most of which tend to be complex.

  • You can now call a Java constructor using keyword arguments. Geoffrey French contributed the patch for this nice feature. It will also be the last new feature implemented in the 2.5.x versions!
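The defaultdict guarantee described in the list above comes down to computing a missing value atomically, so competing threads cannot overwrite each other's default. Here is a rough sketch of that idea in plain Java. It is only an analogy: Jython's implementation uses Google Guava, whereas this sketch assumes Java 8 or later and uses the JDK's ConcurrentHashMap.computeIfAbsent. The class name AtomicDefaultMap and the key "missing" are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class AtomicDefaultMap {
    /**
     * Starts several threads that all ask for the same missing key at once and
     * returns how many times the default-value factory actually ran.
     */
    public static int factoryCallsUnderContention(int threads) {
        ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();
        AtomicInteger factoryCalls = new AtomicInteger();
        // Plays the role of defaultdict's default_factory.
        Function<String, Object> factory = key -> {
            factoryCalls.incrementAndGet();
            return new Object();
        };
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // line every thread up before the lookup
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // Atomic: the factory is applied at most once per key.
                map.computeIfAbsent("missing", factory);
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return factoryCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(factoryCallsUnderContention(8));
    }
}
```

With a plain check-then-put on an ordinary map, two threads could each create a default and one would silently replace the other, which is the overwrite problem mentioned above.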

There are many other features and bug fixes, some small, some large. We will look at these in future posts, as well as some outstanding bugs we should be able to fix before the final release.

And -- last but not least -- please help spread the word:

Organizations using Jython 2.2.1, or earlier, should test their code against 2.5.2 beta 2 now so that bug fixes and/or workarounds may be identified. In particular, please note the following:

  • No additional work is anticipated on Jython 2.2.
  • Jython 2.5.2 is the last release in the Jython 2.5.x series that will address non-severe issues. Further enhancements in Java integration, for instance, will be seen in 2.6.
  • Jython 2.6 development will begin immediately following the 2.5.2 release. Jython 2.6 will require the use of JDK 6. We are hoping for some significant performance gains by being able to use invokedynamic support, either directly or through a back port of that support to JDK 6. More on that, too, in a future post.

Back to Blogging!

September 12, 2010 at 07:23 PM | categories: Meta

I'm blogging again. And trying out Blogofile to make that happen. This blog is going to start rather sparse looking, and I need to backfill in my old blog posts. But I will get there.

A couple of points on the tool chain I'm trying out here.

I have chosen to use reStructuredText as the markup language. Since I co-authored The Definitive Guide to Jython (available on Amazon.com too!) in rst, this should come as no surprise. Using rst doesn't get in the way, I can easily include code fragments, and I can add the styling outside of the document. I can then readily track changes with Mercurial. I would like to describe some more of this tool chain for writing books at some point; it was actually quite nice.

I had previously used blogger.com, going back to using it before the Google acquisition. Originally this particular blog, Front Range Pythoneering, was just used for announcements of the user group I was until recently leading, which is called the Front Range Pythoneers, not too surprisingly. As I became involved in Jython development, I started to use blogger.com as well for some blog posts. It was handy. But blogger.com was not. Working around it for posting code with a toolchain that included rst2html was not so much fun. When they stopped supporting external web sites, it finally forced me to switch. Unfortunately it took me some time to make the switch, which further stymied my blogging, beyond the usual lack of time.

Naturally I really don't like it when the tools get in the way!

For this, Blogofile seems like a good choice. Setup so far has been reasonably easy, and I certainly don't have to be committed to it long-term. I'll try to update here on my thoughts about it as I get more experience with it.


Trac on Jython - Genshi Support

July 11, 2009 at 06:18 PM | categories: Jython, PBCVM, AST

(I was going to add this as a comment to NixDev Open Source Blog but the commenting system for that blog is currently broken. So here's my response, and maybe it will get me to start blogging here again too.)

There has been less interest in Genshi from Jython development as of late, probably because Mako is a very good alternative for TurboGears 2 and Pylons. But you need Genshi for Trac, so here's what you might do:

  • expat. Jython 2.5 has an implementation of expat (Lib/xml/parsers/expat.py) that wraps SAX sufficiently that all of the unit tests for ElementTree pass. The only problem is that it's somewhat slow, since the wrapper is in Python. I would see that as a starting point for selectively rewriting it in Java, which is much easier to do in Jython than in CPython. Perhaps with just a little more emulation work it will also work with Genshi?
  • AST. Jython 2.5 implements the standard _ast and ast (the latter actually part of 2.6, but we needed it!) modules; we do not have any support for the older compiler module. I don't know the status of changeset 31 to support AST, but if it has been incorporated, or can be, we can work with this part then.
  • CPython bytecode. Lastly, Jython 2.5 implements a CPython bytecode VM. This has been tested by compiling the entire regression test suite into CPython bytecode (using CPython itself; we don't yet have a compiler for this path!), and except for some cases around code introspection and minor differences in floating point representation (which is an artifact of using CPython for the compilation process), it passes. So you should be able to generate bytecode and just have it run. Look at Lib/test/test_pbcvm.py for some details here.

Good luck! Feel free to ask any questions on the jython-dev mailing list or #jython on IRC.


Flipping the 2.5 Bit for Jython

June 24, 2008 at 10:59 AM | categories: Jython, 2.5.x

Something worth pointing out: as of 8 AM this morning (MDT) in rev 4748, Frank Wierzbicki flipped the bits and pronounced this about the ASM branch:

jbaker:~/jythondev/asm jbaker$ dist/bin/jython
Jython 2.5a0+ (asm:4750, Jun 24 2008, 10:56:16)
[Java HotSpot(TM) Client VM ("Apple Computer, Inc.")] on java1.5.0_13
Type "help", "copyright", "credits" or "license" for more information.
>>>

Yesterday there were easily the most commits we have seen in the Jython project. The real threshold was reached when we incorporated the UTF-16 and new-style exception branches into this branch, fixed the grammar to support most incremental parses, while repointing the standard library to CPythonLib 2.5. Along with a flurry of other fixes!

There's a lot more to go, but this should be an encouraging sign for everyone interested in Jython!


Adopting UTF-16

June 24, 2008 at 10:13 AM | categories: Jython, Unicode

Jython 2.5 standardizes on Java 5 as the base version for its implementation. Jython has always mapped both unicode and str types to java.lang.String, but the semantics of String changed as of Java 5. Instead of encoding characters as UCS-2, that is just the basic multilingual plane of 65536 code points, Java - like .NET - adopted the UTF-16 encoding. UTF-16 can represent all 1114112 Unicode code points (U+0 to U+10FFFF), except for isolated surrogates (U+D800 to U+DFFF). These surrogates act as escape characters in the UTF-16 encoding.
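As a quick illustration of the difference between code units and code points, here is a small stand-alone sketch. Utf16Demo and its helper names are made up for this example, and U+1D11E (MUSICAL SYMBOL G CLEF) is just a convenient character from outside the basic plane.

```java
public class Utf16Demo {
    /** Number of UTF-16 code units (Java chars) in the string. */
    public static int codeUnits(String s) {
        return s.length();
    }

    /** Number of Unicode code points; a surrogate pair counts once. */
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String basic = "caf\u00e9";                              // all in the basic plane
        String astral = new String(Character.toChars(0x1D11E)); // one code point, two chars

        System.out.println(codeUnits(basic) + " units, " + codePoints(basic) + " points");
        System.out.println(codeUnits(astral) + " units, " + codePoints(astral) + " points");

        // The two chars of the astral string are a high and a low surrogate.
        System.out.println(Character.isHighSurrogate(astral.charAt(0)));
        System.out.println(Character.isLowSurrogate(astral.charAt(1)));
    }
}
```

For the basic-plane string the two counts agree; for the astral one they differ, and that mismatch is what any char-indexed string operation has to watch out for.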

This makes things somewhat more complicated, to put it mildly. And this is without even considering combining characters!

Instead of a simple uniform encoding that we see in the narrow (UCS-2) or wide (UCS-4) builds of CPython, we get a variable-length encoding. And unlike UTF-8, it's usually not too efficient. In addition, we lose the ability to represent the isolated surrogates. Finally, because UTF-16 is so very close to UCS-2, it's prone to bugs.

Here's the implementation strategy we adopted. In supporting the unicode type with PyUnicode, we first determine if it's in the basic plane or not:

private enum Plane {
    UNKNOWN, BASIC, ASTRAL
}

private volatile Plane plane = Plane.UNKNOWN;

public boolean isBasicPlane() {
    if (plane == Plane.BASIC) {
        return true;
    } else if (plane == Plane.UNKNOWN) {
        plane = (string.length() == getCodePointCount()) ? Plane.BASIC : Plane.ASTRAL;
    }
    return plane == Plane.BASIC;
}

getCodePointCount is in turn implemented using String#codePointCount. Like other code point methods, it decodes any surrogate pairs.

String immutability means we can cache the result in the volatile field plane; idempotence of this operation ensures consistency. For a string in the basic plane, this allows us to equate code units (char) to code points (int), and use the implementations provided by PyString. As it turns out, this was always done before; the only difference between str and unicode was in the encoding rules.
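To try the cached check outside of Jython, here is a minimal stand-alone sketch. PlaneCache is a hypothetical wrapper class, not PyUnicode itself; it keeps the same lazily computed volatile field and implements getCodePointCount with String#codePointCount as described above.

```java
public class PlaneCache {
    private enum Plane { UNKNOWN, BASIC, ASTRAL }

    private final String string;
    // Computed lazily. Racing threads may each compute it, but since the
    // string is immutable they always arrive at the same answer.
    private volatile Plane plane = Plane.UNKNOWN;

    public PlaneCache(String string) {
        this.string = string;
    }

    public int getCodePointCount() {
        return string.codePointCount(0, string.length());
    }

    public boolean isBasicPlane() {
        if (plane == Plane.BASIC) {
            return true;
        } else if (plane == Plane.UNKNOWN) {
            // Equal counts mean no surrogate pairs, so every char is a code point.
            plane = (string.length() == getCodePointCount()) ? Plane.BASIC : Plane.ASTRAL;
        }
        return plane == Plane.BASIC;
    }

    public static void main(String[] args) {
        String astral = new String(Character.toChars(0x1F600));
        System.out.println(new PlaneCache("jython").isBasicPlane());
        System.out.println(new PlaneCache("x" + astral).isBasicPlane());
    }
}
```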

In the rather rare case that a string isn't in the basic plane, we read with String#codePointAt and write with StringBuilder#appendCodePoint using iterators. A seemingly good alternative would be to use String#offsetByCodePoints. Too bad it doesn't reliably work. So instead we have our iterator implementations, lots and lots of them. And sometimes crazy stuff like this, seen in the implementation of PyUnicode#unicode_strip:

return new PyUnicode(new ReversedIterator(
    new StripIterator(sep,
        new ReversedIterator(
            new StripIterator(sep,
                newSubsequenceIterator())))));

If the strip method were used extensively on strings that weren't in the basic plane, it might make sense to rewrite this to decode to an int[] buffer. But that's not likely to be the case.
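For comparison, here is a rough sketch of what the int[] buffer alternative could look like. It is not Jython's implementation: CodePointStrip and its methods are hypothetical, and it only illustrates reading with String#codePointAt, working on whole code points, and writing back with StringBuilder#appendCodePoint.

```java
public class CodePointStrip {
    /** Decodes a UTF-16 string into an array of code points. */
    public static int[] toCodePoints(String s) {
        int[] out = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);     // decodes a surrogate pair if present
            out[n++] = cp;
            i += Character.charCount(cp);  // advance by 1 or 2 chars
        }
        return out;
    }

    private static boolean contains(int[] set, int cp) {
        for (int c : set) {
            if (c == cp) {
                return true;
            }
        }
        return false;
    }

    /** Strips any code points found in sep from both ends of s. */
    public static String strip(String s, String sep) {
        int[] cps = toCodePoints(s);
        int[] sepCps = toCodePoints(sep);
        int start = 0;
        int end = cps.length;
        while (start < end && contains(sepCps, cps[start])) {
            start++;
        }
        while (end > start && contains(sepCps, cps[end - 1])) {
            end--;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < end; i++) {
            sb.appendCodePoint(cps[i]);    // re-encodes as one or two chars
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(strip(clef + "ab" + clef, clef));
    }
}
```

The trade-off is an extra array allocation per call in exchange for simple index arithmetic, which is why it would only pay off if astral strings were common.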

That's also the reason we avoid making the basic plane test unless we have to. There are many situations where Unicode can pass in and out of Jython - specifically to/from Java - without us caring about what planes its characters are drawn from. We assume some overhead from boxing with PyUnicode (although HotSpot mitigates the indirection cost), but we don't have to overdo it by computing this test on construction.

When comparing this with CPython, we do lose the ability to include isolated surrogate code points in Unicode strings. There are even some unit tests for this case. But ultimately this seemed like an implementation detail, much like testing ref counting, and certainly not one worth the time spent supporting it.

It's worth mentioning that one alternative is to create our own representation, much like JRuby. Ruby's strings are mutable, unlike Python's. This forced the issue for the JRuby developers, because Ruby, like Python, needs good string performance. So JRuby uses byte arrays for strings, although they do use UTF-16 encoded, interned java.lang.String's to uniquely represent symbols (:xyz). Given that symbols are not strings, this works well. Ruby doesn't say anything about the encoding of such strings (ouch!), but JRuby does assume they're UTF-8 encoded when crossing the boundary with Java.

Supporting widened Unicode means having support for this in regular expressions. The first step was to just widen the SRE engine used by Jython to represent characters with int instead of short. So we always unpack to int in this case; see strip above. This engine is a direct translation of the CPython equivalent: it's a mini-VM, much like the pickle VM, and regexes are compiled to SRE bytecode. In the future, we may consider using JRuby's implementation (Joni, a port of Oniguruma to Java), but the devil is in supporting some specifics of Python. As was seen in the CPython case, it was quite straightforward to just do the widening.

At this point, the biggest outstanding issue is backporting the changes to SRE to support wide character classes (aka big character sets), a pickle problem, as well as various bug fixes. A total of 4 test cases are currently failing in test_re.

And then that's it, at least until we start doing performance profiling.

