Programming and politics: jruby


Wednesday, January 20, 2010

JVMOne

This morning, my co-worker Jason Swartz had the great idea of a conference focussing on JVM languages. This seemed like a particularly good idea, given the uncertainty surrounding JavaOne. Personally, I think the JavaOne powers-that-be have done a good job showcasing other languages, especially the last two years. Anyways, I joked that we could probably host it at our north campus, since it has proper facilities and regularly hosts medium-sized conferences, and that we just needed the support of folks from the Groovy, JRuby, Scala, and Clojure communities. A lot of folks seemed to like the idea, and had some great feedback and questions. So let me phrase some of these questions in the context of "a JavaOne-ish replacement, focussing on alternative languages on the JVM, but ultimately driven by the community". OK, here are some of the questions.

1.) What about Java?
2.) What about languages other than Groovy, JRuby, Scala, and Clojure?
3.) What about first class technologies with their roots in Java, like Hadoop, Cassandra, etc.?

Certainly I have opinions on some of these, but obviously any kind of effort like this requires huge participation from the developer community. So what do you think? What other questions need to be asked? If there is enough interest, I will definitely try to organize it. So please leave comments below!

Monday, August 03, 2009

The Strange Loop

In October, I am speaking at the inaugural Strange Loop conference in St. Louis. This is not your run-of-the-mill conference. It is organized by Alex Miller, who you might have seen speak the last couple of years at JavaOne. The speaker list is sweet: Alex Payne and Bob Lee are doing the keynotes, with sessions by Charles Nutter, Dean Wampler, Stefan Schmidt, Guillaume Laforge, Jeff Brown, and Alex Buckley. I am doing a talk on iPhone/Android development. It will probably be pretty boring compared to the other sessions. I may try to spice it up with some shameless plugging of Scala+Android balanced by some Fake Steve Jobs quotes about Android.

Friday, June 05, 2009

JavaOne Talk: Performance Comparisons of Dynamic Languages on the Java Virtual Machine

Below are the slides. Major thanks to Charlie Nutter for great feedback and advice on the Ruby code and tuning JRuby performance. Thanks to Chouser and Timothy Pratley for help with the Clojure code. And major thanks to Brian Frank for help with the Fan code.

[Embedded slides: Performance Comparisons of Dynamic Languages on the Java Virtual Machine, by michael.galpin]

Tuesday, May 05, 2009

JavaOne Talk: Ruby Prime Sieve

See this post for an explanation of why this code is being shown and what kind of responses would be useful. See this post for the Java algorithm to compare against.


class RubyPrimes
  attr_accessor :cnt, :primes
  def initialize(n)
    @primes = Array.new(n,0)
    @cnt = 0
    i = 2
    max = calcSize
    nums = Array.new(max,true)
    while (i < max && @cnt < @primes.length)
      p = i
      if (nums[i])
        @primes[@cnt] = p
        @cnt += 1
        (p .. (max/p)).each{ |j| nums[p*j] = false }
      end
      i += 1
    end
  end
  def calcSize
    max = 2
    max = max * 2 while ( max/Math.log(max) < @primes.length)
    max
  end
  def last
    @primes[@cnt -1]
  end
end

Sunday, August 31, 2008

Recent Other Writings

Last week, IBM published an article I wrote on using JRuby on Rails with Apache Derby. It concentrates on rapid prototyping/development. I didn't get too heavily into the IDE side of things, but when you add RadRails into the equation it really is nirvana-ish development. Very fun.

I've also been writing a lot on InformIT about Java Concurrency in Practice. I did some fun stuff over there too, like try to turn some Project Euler code into parallel code. I guess technically that succeeded just fine, but it is a good example of a case where parallel code is not any faster. In this case, the algorithm was CPU-bound anyway, and even having two cores didn't really help much. Oh well. I treated it like the strength exercises I did back when I took piano lessons.

Thursday, July 31, 2008

Jython, It ain't no JRuby

All of my recent excursions into Python convinced me it was time to try out Jython. I have had great experiences with JRuby. It is definitely faster than CRuby for long-running processes (and obviously slower for short scripts, because of the JVM startup overhead). In my experience, which ranges from mathematical algorithms to web applications using Rails, there is extremely good interoperability. I have had no code blow up in JRuby that ran fine in CRuby. Even IDE support has been on par. So my expectations were high for Jython.

Maybe that was the first problem: unrealistic expectations. Let's not jump into the shortcomings just yet. First off, Jython installation is nice. Well, nice as in "there is a GUI." The installer only seemed to copy files into the installation directory, nothing more. That is fine, but it makes me wonder why I bothered with an installer at all. I at least expected it to put the jython executable on my path, but it did not. No big deal.

Running a script is painless. It was odd to see a lot of activity the first time I ran a script, but the messaging (or should I say logging) was good enough to give me a good idea about what was going on and it only happens once. It also seemed like Jython was going out of its way to help performance, and JRuby had caused me to expect a nice performance boost from Jython.

So I tried out a script I wrote to solve a Project Euler problem, in particular Problem #43. The performance on Python is not that great, as it takes about 58 seconds to solve on my MacBook. Here is the code. It has been slightly optimized by Gilly.


def combinations(items, n):
    if n == 0:
        yield []
    else:
        i = 0
        l = len(items)
        while i < l:
            for combo in combinations(items[:i] + items[i+1:], n-1):
                yield [items[i]] + combo
            i += 1
 
def permutations(items):
    return combinations(items, len(items))

def to_int(seq):
    return reduce(lambda x,y: 10*x + y, seq, 0)
 
def main():
    primes = [2, 3, 5, 7, 11, 13, 17] 
    digits = range(0, 10)
    sum = 0
    for p in permutations(digits):
        if p[5] == 5 and p[0] != 0:
            j = 0
            trait = True
            while trait:
                if j == 7:
                    y = to_int(p)
                    sum += y 
                    print y
                    break
                x = 100*p[j+1] + 10*p[j+2] + p[j+3]
                trait = not x % primes[j]
                j += 1
    print "sum = " + str(sum)
 
if __name__ == '__main__':
    main()

Lots of brute force, with a little bit of clever use of Python features. I was ready to crank this up with Jython, but when I tried to run it, I got this error message:

Traceback (innermost last):
(no code object) at line 0
File "euler43.py", line 5
yield []
^
SyntaxError: invalid syntax

Ouch. This actually made me feel stupid. I should have noticed that the current version of Jython is numbered 2.2.1 and that this obviously corresponds to Python 2.2.1 (that is obvious, right?) I had used a Python feature that did not exist in 2.2.1. Luckily there is an alpha version of Jython that is numbered 2.5. What about 2.3 or 2.4, you say? Uhh...
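For anyone stuck on an interpreter without `yield`, here is a rough sketch of a generator-free variant of the two helpers. It is not the code that was timed in this post, and the names `combinations_list` and `permutations_list` are made up for illustration. For what it is worth, CPython 2.2 itself only enabled `yield` after `from __future__ import generators` (PEP 255); whether Jython 2.2.1 honors that import was not checked here.

```python
def combinations_list(items, n):
    """Return every ordered selection of n items as a list of lists.

    Same recursion as the generator version, but results are collected
    eagerly, so no `yield` is needed.
    """
    if n == 0:
        return [[]]
    result = []
    for i in range(len(items)):
        # Everything except the item chosen for the current position.
        rest = items[:i] + items[i + 1:]
        for combo in combinations_list(rest, n - 1):
            result.append([items[i]] + combo)
    return result


def permutations_list(items):
    """All orderings of items, built on combinations_list."""
    return combinations_list(items, len(items))
```

The trade-off is memory: for ten digits this holds all 10! = 3,628,800 permutations in a list at once, where the generator version produces them one at a time.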

Anyways, with the alpha version of Jython 2.5 used instead, everything worked. However, the performance was not what I expected. The same script ran in 171 seconds! It took about three times as long as CPython. Wow.

I wrote a similar algorithm in Ruby and it was horribly slow. Perhaps this is why JRuby is faster than CRuby: CRuby is just so slow. Perhaps not. The nice thing about this is that it forced me to optimize the code more. Here is the optimized Ruby code.


def calc(seq)
  seq.inject(0) {|x,y| x = 10*x + y}
end
 
def test(seq)
  primes = [2,3,5,7,11,13,17]
  j = 1
  trait = true
  while j < 8 && trait
    num = calc(seq[j,3])
    trait = (num % primes[j-1] == 0)
    j += 1
  end
  trait
end

sum = 0
digits = (0..4).to_a + (6..9).to_a
for p in digits.permutation
  if p[0] != 0
    x = p[0,5] + [5] + p[5,4]
    trait = test(x)
    if trait
      y = calc(x)
      sum += y
      puts "found one " + y.to_s
    end
  end
end
print "sum= " + sum.to_s

A couple of things to note. This uses Array#permutation, a new feature of Ruby 1.8.7. This is written in C, so you would think it would be super fast. You would be wrong. The latest JRuby is 1.1.3 and does not implement all of the 1.8.7 features, including Array#permutation, so this code will not run in JRuby. The Ruby code winds up being much faster than the Python code above, but only because the algorithm is so much better: it fixes the sixth digit at 5 and permutes the other nine, so it only deals with 9! numbers instead of 10!. Without the modification, Ruby was so slow that I did not have the patience to let it finish. We're talking 10+ minutes for something that only took 1 minute in Python. With the improved algorithm it took about 40s to solve the problem in Ruby. When I ported the change over to Python, it dropped Python down to around 9s. When I tried the ported code in Jython, it would not run at all... I haven't tracked down that problem yet.
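To make the 9! trick concrete, here is a minimal Python 3 sketch of the same idea. It is an illustration only, not the port that was timed above: the function names are hypothetical, and it leans on `itertools.permutations` rather than the hand-rolled generator.

```python
from itertools import permutations

PRIMES = (2, 3, 5, 7, 11, 13, 17)


def to_int(digits):
    """Fold a sequence of decimal digits into a single integer."""
    value = 0
    for d in digits:
        value = 10 * value + d
    return value


def has_property(digits):
    """True if each 3-digit window starting at index 1..7 is divisible
    by the matching prime (2, 3, 5, 7, 11, 13, 17)."""
    for j, prime in enumerate(PRIMES, start=1):
        window = 100 * digits[j] + 10 * digits[j + 1] + digits[j + 2]
        if window % prime != 0:
            return False
    return True


def euler43_sum():
    """Sum the qualifying pandigitals, permuting only nine digits."""
    others = [0, 1, 2, 3, 4, 6, 7, 8, 9]  # every digit except the fixed 5
    total = 0
    for p in permutations(others):
        if p[0] == 0:
            continue  # skip leading zeros, as the Ruby code does
        # Splice the fixed 5 into the sixth position (index 5).
        candidate = p[:5] + (5,) + p[5:]
        if has_property(candidate):
            total += to_int(candidate)
    return total
```

Pinning the sixth digit is safe because the window ending there must be divisible by 5, and a 0 in that spot would force the next two digits to be equal for the divisible-by-11 window. For a quick sanity check, `has_property((1, 4, 0, 6, 3, 5, 7, 2, 8, 9))` should hold, since 1406357289 is the example given in the problem statement.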

Friday, July 25, 2008

JRuby 1.1.3 Performance

It's been a while since I did a post with pretty graphs and meaningless micro-benchmarks. I am using the latest version of JRuby, 1.1.3, on a new project. I knew from some correspondence with Charlie Nutter that the performance bug that I had encountered previously was believed to be solved. I say believed because he had not had a chance to test on the IBM J9 JVM that (along with some other JVMs) exhibited the performance problem. Actually this was fixed in 1.1.2, and I should have tested things then, but didn't get around to it until now. I decided to throw in the latest beta of JDK 1.6 update 10, the magical JDK for The Rest of Us. Here is the purrty chart:

The good news is that even with the J9 JVM, JRuby outperforms native Ruby, at least on this mathematical algorithm. This was not the case previously, so the JRuby fixes have really helped. In fact, if I look at the last data point (x=100), the code executed in less than half the time it took previously, with the only change being JRuby 1.1.3 instead of 1.1. I could describe the source of the bug, but you are much better off reading Charlie's description.

Tuesday, May 06, 2008

Dynamic Language Performance

A couple of days ago, I read Charlie's post explaining the performance boost seen in Groovy 1.6. Reading stuff like this always leaves me with a great feeling. Not only do you learn something, but it makes other things make more sense. It brings order to chaos, or something like that. Around the same time I read that, I was working on a new article about Grails, so the Groovy angle was particularly interesting. I love benchmarks, so it was time to have some fun.

I wrote a Groovy version of the same Ruby code I had used to benchmark JRuby. This was an extremely straightforward port. I was amazed at just how similar Groovy's syntax is to Ruby. Here is the code:

def expo(n,p){
  def r = n % p
  def exp = 0
  def div = p
  while (r == 0){
      exp += 1
      div *= p
      r = n % div
  }
  return exp
}

def factor(n){
  def factors = new java.util.HashMap<Integer,Integer>()
  def s = n * 0.5
  def p = (2..s).toArray()
  p.each{
      if (it) {
          def r = expo(n,it)
          if (r){
              factors[it] = r
          }
          def val = it*2
          while (val <= s){
              p[val -2] = null
              val += it
          }
      }  
  }
  return factors
}

def numDivisors(n){
  def total = 1
  factor(n).values().each{
      total *= (it+1)
  }
  return total
}

def n = 2
def num = 1
def max = Integer.parseInt(this.args[0])
def Integer triangle = 0
while (num <= max){
  triangle = n*(n+1) * 0.5
  num = numDivisors(triangle)
  n += 1
}
println(triangle)

Anyways, here is the chart.

There is definitely a performance boost for long running processes where JIT'ing can happen more easily in 1.6. It was not as dramatic as I thought it might be, but it is there. Of course this is just one silly benchmark that is heavy in integer math, so take that for what it's worth.

I also compared Groovy and JRuby. This was also surprising:

Pretty close! Groovy seems to start up a little slower, but pulls ahead slightly on bigger tasks. Perhaps the apprentice has overtaken the master.

Also, just for kicks, I tried out Scala. Here is the code:

import scala.collection.mutable._

object Euler12{
   def expo(n:int, p:int):int = {
       var r = n % p
       var exp = 0
       var div = p
       while (r == 0){
           exp = exp + 1
           r = n % div
           div = div * p
       }
       if (exp == 0 ) 0 else (exp-1)
   }
  
   def factor(n:int):Map[int,int] = {
       var factors = new HashMap[int,int]()
       var s:int = (n/2) + 1
       val p = (2 until s).toArray
       p.foreach( (num) => {
           if (num > 1){
               val r = expo(n, num)
               if (r > 0){
                   factors.put(num, r)
               }
               var i = num*2
               while ((i-2) < p.length){
                   p(i - 2) = 0
                   i = i + num
               }
           }
       })
       return factors
   }
  
   def numDivisors(n:int):int = {
       // Multiply (exponent + 1) over every prime factor; factor(n) runs only once.
       factor(n).values.foldLeft(1)((p,m) => {
           p * (m+1)
       })
   }
  
   def main(args:Array[String]) : Unit = {
       val t = new java.util.Date()
       var n = 1
       var num = 1
       val max = Integer.parseInt(args(0))
       var triangle = 3
       while (num <= max){
           triangle = n*(n+1)/2
           num = numDivisors(triangle)
           n = n + 1
       }
       println(triangle)
   }
}

This turned out to not be fair. Scala's performance is exactly on par with Java and thus blows away JRuby and Groovy.

I guess that is what happens when you have a language written by a guy who once wrote javac... Actually I would guess this is mostly a function of the static typing in Scala. It certainly bodes well for initiatives to bring features of Scala, like (BGGA-style) closures and type inference, to Java. It seems possible to implement all of this with no impact on performance, even on a JVM that has not been made to support such features.

Wednesday, March 05, 2008

More JRuby Performance

In the aftermath of my post on JRuby's performance, I exchanged some info with Mr. JRuby himself, Charles Nutter. Per his request, I opened a bug on the matter. It looks like it is a JVM issue, i.e. JRuby ran slowly on IBM's J9 JVM. I did some micro-benching on Ruby vs. JRuby on a variety of platforms and JVMs. It was only on the J9/2.3 (IBM's JDK 5.0 JVM) that JRuby was slower than the latest "native" Ruby implementation on that platform. Everybody loves charts, so here are some fun ones.

This was on my MacBook, with both the standard HotSpot Java 5.0 and Java 6.0 preview versions. I also compared using the -J-server (just becomes -server for the JVM) option, since HotSpot on the Mac runs in client mode by default. Thus the -server made a big difference.

This was on my home desktop system, a 32-bit Windows Vista system. I only did native Ruby vs. JRuby with and without the -server option. Again JRuby with the -server option crushed native Ruby.

Finally, the environment that caused all the problems: my workstation. As you can see, JRuby on J9 did quite poorly compared to native Ruby. However, JRuby with either the 5.0 or 6.0 HotSpot JVM was much faster. I was actually hoping to see a better performance advantage on the 6.0 VM vs. the 5.0 one... I don't have IBM's 6.0 VM, so I could not include it. The HotSpot VMs were both 64-bit, whereas the IBM J9 one was a 32-bit VM.

Tuesday, March 04, 2008

JRuby Performance

During my lunch today, I solved Problem 12 from Project Euler. As usual, I wrote the solution in Ruby and was surprised by just how long it took to calculate. It made me decide to try JRuby.

First, a note about the problem and my solution. The problem was to find the first triangle number (where the nth triangle number is 1+2+3+...+n) that has more than 500 divisors. My solution was pretty brute force. I took each triangle number and computed its prime factorization. For example, 28 = 2^2 * 7^1. Thus the number of factors is (2+1)*(1+1) = 6. Generally, if N = A^a * B^b * ... where A, B, ... are distinct primes, then the number of factors is (a+1)*(b+1)*...
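The divisor-count formula is easy to sketch in a few lines. What follows is a minimal Python 3 illustration, not the Ruby code that was benchmarked: it factors by plain trial division, the function names are made up for the example, and the search stops at the first triangle number with more than `limit` divisors, mirroring the `num <= max` loop condition in the Groovy and Scala versions elsewhere on this page.

```python
def prime_factorization(n):
    """Return {prime: exponent} for n >= 1 using plain trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever is left is one prime factor larger than every p tried.
        factors[n] = factors.get(n, 0) + 1
    return factors


def num_divisors(n):
    """Multiply (exponent + 1) over every prime in the factorization."""
    total = 1
    for exponent in prime_factorization(n).values():
        total *= exponent + 1
    return total


def first_triangle_with_more_than(limit):
    """Return the first triangle number with more than `limit` divisors."""
    n = 1
    triangle = 1
    while num_divisors(triangle) <= limit:
        n += 1
        triangle = n * (n + 1) // 2
    return triangle
```

With this, `prime_factorization(28)` gives `{2: 2, 7: 1}` and `num_divisors(28)` gives 6, matching the worked example above.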

With all of that in mind, why did I think JRuby would be faster than Ruby on a problem like this? This kind of calculation is well suited for JVM optimizations: unwinding of loops, JIT'ing of the code, etc. Thus I thought this might be the kind of problem where the JVM could make JRuby run a lot faster than Ruby. Boy was I wrong!

In general, I found JRuby to take twice as long as plain ol' Ruby (or C Ruby, as the JRuby folks like to call it). This was true on Windows, where Ruby is considered by many to have a poor implementation, and on OSX.

This made me think that my conjecture was wrong to begin with. Maybe this was not the kind of code that the JVM could do much with. I re-wrote the algorithm in Java and re-ran it. It was dramatically faster in Java than in Ruby or JRuby. Indeed, the JVM was able to optimize the runtime execution of the code and make it fly.

Is this what I should have expected? Is JRuby generally much slower than Ruby? I really thought that part of the idea behind JRuby was to leverage the JVM to make Ruby faster.