Thursday, July 31, 2008

Jython, It ain't no JRuby

All of my recent excursions into Python convinced me it was time to try out Jython. I have had a great experiences with JRuby. It is definitely faster than CRuby for long running processes (obviously slower for short scripts, because of the JVM startup overhead.) In my experiences, which range from doing mathematical algorithms to web applications using Rails, there is extremely good interoperability. I have had no code blow up in JRuby that ran fine in CRuby. Even IDE support has been on par. So my expectations were high for Jython.

Maybe that was the first problem, unrealistic expectations. Let's not jump into the shortcomings, just yet. First off, Jython installation is nice. Well nice as in "there is a gui." The installer only seemed to copy files into the installation directory, nothing more. That is fine, but makes me wonder I bothered with an installer at all. I at least expected into put the jython executable on my path, but it did not. No big deal.

Running a script is painless. It was odd to see a lot of activity the first time I ran a script, but the messaging (or should I say logging) was good enough to give me a good idea about what was going on and it only happens once. It also seemed like Jython was going out of its way to help performance, and JRuby had caused me to expect a nice performance boost from Jython.

So I tried out a script I wrote to solve a Project Euler problem, in particular Problem #43. The performance on Python is not that great, as it takes about 58 seconds to solve on my Macbook. Here is the code. It has been slightly optimized by Gilly.

def combinations(items, n):
if n == 0:
yield []
else:
i = 0
l = len(items)
while i < l:
for combo in combinations(items[:i] + items[i+1:], n-1):
yield [items[i]] + combo
i += 1

def permutations(items):
return combinations(items, len(items))

def to_int(seq):
return reduce(lambda x,y: 10*x + y, seq, 0)

def main():
primes = [2, 3, 5, 7, 11, 13, 17]
digits = range(0, 10)
sum = 0
for p in permutations(digits):
if p[5] == 5 and p[0] != 0:
j = 0
trait = True
while trait:
if j == 7:
y = to_int(p)
sum += y
print y
break
x = 100*p[j+1] + 10*p[j+2] + p[j+3]
trait = not x % primes[j]
j += 1
print "sum = " + str(sum)

if __name__ == '__main__':
main()

Lots of brute force, with a little bit of clever use of Python features. I was ready to crank this up with Jython, but when I tried to run it, I got this error message:

Traceback (innermost last):
(no code object) at line 0
File "euler43.py", line 5
yield []
^
SyntaxError: invalid syntax

Ouch. This actually made me feel stupid. I should have noticed that the current version of Jython is numbered 2.2.1 and that this obviously corresponds to Python 2.2.1 (that is obvious, right?) I had used a Python feature that did not exist in 2.2.1. Luckily there is an alpha version of Jython that is numbered 2.5. What about 2.3 or 2.4, you say? Uhh...

Anyways, with the alpha version of Jython 2.5 used instead everything worked. However, the performance was not what I expected. The same script ran in 171 seconds! It took three times longer than CPython. Wow.

I wrote a similar algorithm in Ruby and it was horribly slow. Perhaps this is why JRuby is faster than CRuby, CRuby is just so slow. Perhaps not. The nice thing about this is that it forced me to optimize the code more. Here is the optimized Ruby code.

def calc(seq)
seq.inject(0) {|x,y| x = 10*x + y}
end

def test(seq)
primes = [2,3,5,7,11,13,17]
j = 1
trait = true
while j < 8 && trait
num = calc(seq[j,3])
trait = (num % primes[j-1] == 0)
j += 1
end
trait
end

sum = 0
digits = (0..4).to_a + (6..9).to_a
for p in digits.permutation
if p[0] != 0
x = p[0,5] + [5] + p[5,4]
trait = test(x)
if trait
y = calc(x)
sum += y
puts "found one " + y.to_s
end
end
end
print "sum= " + sum.to_s

A couple of things to note. This uses Array#permutation, a new feature of Ruby 1.8.7. This is written in C, so you would think it would be super fast. You would be wrong. The latest JRuby is 1.1.3 and does not implement all of the 1.8.7 features, including Array#permutation. So this code will not run in JRuby. It winds up being much faster than the Python code, but only because the algorithm is so much better. It only deals with 9! numbers instead of 10!. Without the modification, Ruby was so slow that I did not have the patience to let it finsh. We're talking 10+ minutes for something that only took 1 minute in Python. With the improved algorithm it took about 40s to solve the problem in Ruby. When I ported the change over to Python, it dropped Python down to around 9s. When I tried the ported code in Jython, it would not run at all... I haven't tracked down that problem yet.

1 comment:

Pace said...

Thanks for this post. I am a java programmer who really likes the Python language, but I really want to be able to run both types of code at once in an efficient mannor. Jpype was the closest idea I saw to something that would make this happen, but I think it is not a well developed programme. I feel discouraged by your post, but at least more confirmed in not wasting my time with this anymore.