Saturday, November 21, 2009

Passing on the Duby

Last week I had a fun conversation with Charles Nutter. It was about "" and in particular, Charlie's requirements for a true replacement for Java. He stated his requirements slightly before that conversation, so let me recap:

1.) Pluggable compiler
2.) No runtime library
3.) Statically typed, but with inference
4.) Clean syntax
5.) Closures

I love this list. It is a short list, but just imagine if Java 7 were even close to this. Anyway, while I like the list as a whole, I strongly objected to #2 -- No runtime library. Now, I can certainly understand Charlie's objection to a runtime library. It is a lot of baggage to drag around. All of the JVM languages have this issue. JRuby, for example, has an 8 MB jar. Groovy has a 4 MB jar. Clojure has two jars totaling 4.5 MB. Scala is also around 4 MB. These become a real issue if you want to use any of these languages for something like Android development. I've written about this issue as well.

However, there is a major problem with this requirement. The building blocks of any programming language include primitives and built-in types. Those built-in types are part of the runtime library of the language. So if your JVM-based language cannot have a runtime library, then you will have to make do with primitives and Java's built-in types. Why is this so bad? Java's types (mostly) make sense in the context of Java, its design principles and syntax. I don't think they would make sense in the hypothetical language described above. The easiest example of this is collection classes. Don't you want a list that can make use of closures (requirement #5 above) for things like map, reduce, filter, etc.? Similarly, if you have static typing, you probably have generics, and wouldn't you like some of your collections to be covariant? You can go even further and start talking about immutability and persistent data structures, and all of the benefits these things bring to concurrent programming. This is just collections (though obviously collections are quite fundamental to any language), but similar arguments apply to things like IO, threading, XML processing, even graphics (I'd like my buttons to work with closures for event handling, thank you very much).
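As a rough illustration, a closure-friendly list might look something like the Java sketch below. Everything in it is hypothetical: `FunList`, `Fn`, and `Fn2` are names made up for this example, not part of the JDK or of Duby. The point is only that a type like this has to live somewhere, and that somewhere is a runtime library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: an immutable list wrapper with closure-style operations.
public class FunList<T> {
    // Hypothetical function types, standing in for closures.
    public interface Fn<A, B> { B apply(A a); }
    public interface Fn2<A, B, C> { C apply(A a, B b); }

    private final List<T> items;

    private FunList(List<T> items) {
        // Defensive copy, wrapped so callers cannot mutate the contents.
        this.items = Collections.unmodifiableList(new ArrayList<T>(items));
    }

    @SafeVarargs
    public static <T> FunList<T> of(T... values) {
        return new FunList<T>(Arrays.asList(values));
    }

    // Applies f to every element and returns a new list of the results.
    public <R> FunList<R> map(Fn<? super T, ? extends R> f) {
        List<R> out = new ArrayList<R>();
        for (T item : items) out.add(f.apply(item));
        return new FunList<R>(out);
    }

    // Keeps only the elements for which the predicate returns true.
    public FunList<T> filter(Fn<? super T, Boolean> keep) {
        List<T> out = new ArrayList<T>();
        for (T item : items) if (keep.apply(item)) out.add(item);
        return new FunList<T>(out);
    }

    // Folds the list from left to right, starting from seed.
    public <R> R reduce(R seed, Fn2<R, ? super T, R> f) {
        R acc = seed;
        for (T item : items) acc = f.apply(acc, item);
        return acc;
    }

    // Read-only view for handing the data back to plain Java code.
    public List<T> toJavaList() { return items; }
}
```

With some closure syntax on top, a call chain such as `nums.filter(...).map(...).reduce(...)` reads naturally, which is exactly what `java.util.List` as it stands does not offer.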

One argument against this is that you can just include the runtime library of your choice. Pick your own list, hash map, thread pool, and file handler implementations. This is what I'd like to call the C++ solution -- only worse. At least in C++ you can generally count on the STL being available. The thing that is so bad about this is that it really limits the higher-order libraries that can be built. Java, Ruby, and Python all have a plethora of higher-order libraries that have been built and widely used. These libraries make extensive use of the built-in types of those languages. Imagine building ORMs like Hibernate or ActiveRecord if you did not have built-in types (especially collection classes) that were consistent with the language. You could not do it. If all you could rely on was the Java built-in types, then at best your libraries would have a very Java-ish feel to them, and doesn't that defeat the purpose?

Charlie gave an alternative to this -- leverage the compiler plugins. Now it is certainly true that with a pluggable compiler, you could do things like add a map method to the java.util.List interface, and to all of the JDK classes that implement this interface. It can be done. However, if you are going to build higher-order libraries on top of this language, then you need to be able to count on these enhancements being present. In other words, the compiler plugin needs to be a required part of the compiler. Fine. Now what is it that we were trying to avoid? Extra runtime baggage, right? Well, if you have a compiler that is going to enhance a large number of the classes in the JDK, hello baggage. Maybe it won't be as much baggage as a 4 MB runtime library, but it might not be a lot less either. Also, it raises the potential problem of interoperability with existing Java libraries. If I import a Hibernate jar that gives me a java.util.List, that list is not going to be enhanced by my compiler -- because it wasn't compiled with my compiler, it was compiled with javac.old. What's going to happen here?
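One way a plugin could sidestep the interoperability problem is to leave the list classes alone and rewrite call sites instead. The Java sketch below shows what that rewritten code might look like. This is purely an assumption for illustration: I don't know how Duby's plugins actually work, and `ListExtensions`, `Fn`, and `listFromLibrary` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of what a compiler plugin might emit for `names.map(f)`.
public class ListExtensions {
    // Hypothetical function type standing in for a closure.
    public interface Fn<A, B> { B apply(A a); }

    // Works on any java.util.List, including one returned by a library
    // that was compiled with plain javac.
    public static <T, R> List<R> map(List<? extends T> source, Fn<? super T, ? extends R> f) {
        List<R> out = new ArrayList<R>(source.size());
        for (T item : source) out.add(f.apply(item));
        return out;
    }

    // Stand-in for a third-party call that hands back a plain java.util.List.
    static List<String> listFromLibrary() {
        return Arrays.asList("hibernate", "duby");
    }

    public static void main(String[] args) {
        // The source might read: listFromLibrary().map(s -> s.length())
        // A plugin could rewrite that call site into the static form below,
        // with the closure expanded to an anonymous inner class.
        List<Integer> lengths = map(listFromLibrary(), new Fn<String, Integer>() {
            public Integer apply(String s) { return s.length(); }
        });
        System.out.println(lengths); // prints [9, 4]
    }
}
```

Even in this form, the helper class has to ship with the compiled program or be inlined at every call site, so the baggage question does not go away. It just moves.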

Now there is an obvious happy way to deal with the runtime library question: rt.jar. If this new language were Java itself, and the super revamped collections, IO, threading, etc. classes were part of the JDK, then there would be no need for a runtime library, since it would already be included with the JDK. However, given the incredibly uncertain state of Java 7, not to mention the long-term prospects of Java, does this seem remotely possible to anyone? I don't think so. I think the heir apparent to Java cannot be Java, and I think because of that, it's gotta have a runtime library of its own.


Jorge Ortiz said...

There's another problem with using compiler plugins instead of runtime libraries. I don't know what Charlie has in mind in terms of compiler plugin design for Duby, but generally compiler plugins don't compose well. They have global effects on your entire program, and the possible interactions of several compiler plugins are often unknowable. Given even a small set of plugins, there's a combinatorial explosion in the number of possible setups of including or not including a given plugin. It'll be impossible to try every combination to make sure they work well together, and very hard to predict whether they will work well together without trying each combination.

Charles Oliver Nutter said...

Somehow I missed this article when you originally wrote it. I have a few small rebuttals.

* It would be useful to mention that you're a Scala fan, since that may mean Duby really isn't for you. I'm not trying to build a Scala, I'm trying to build a lightweight replacement for Java that doesn't require learning an entirely new type system and set of standard libraries. I want something incrementally better than Java.

* I have never said that there might not be a standard set of libraries built in and made available for people to use with Duby. My goal is to avoid the requirement that "Hello, world" immediately shackles you with a runtime library. It should be up to the developer to choose their runtime library dependencies.

* Java has survived for years with only the JDK and third-party libraries written almost exclusively in Java. I would like to build off that base rather than creating an entirely new platform people need to learn, as is the case in Groovy or Scala. Duby can have the literals people have wanted (regex, lists and hashes, various string and numeric forms), closures and internal enumerations (largely represented as syntactic sugar over anonymous inner classes), various plugins to "virtually" extend core JDK classes, and a whole lot more without ever introducing a single runtime dependency. How can that *not* be worth doing?

I suppose my rebuttal boils down to a Scala versus "a better Java" argument. I don't feel like Scala is the way the platform should be going, since instead of directly improving the platform it's bending over backwards to ignore it (and its compiler and runtime have had to grow ever more complicated to compensate). Duby does the exact opposite, trying to marry the majority of "small language features" that people have been missing with a lightweight, pluggable, easy-to-understand compiler and no runtime dependencies except what developers bring to the table.

If Scala is the language for you, by all means use it and enjoy it. But there's a large community of Java developers that just want a slightly better Java, and I'm hoping to bring it to them.