Tuesday, September 02, 2008

Firefox 3.1: Bring on the JIT

Web developers everywhere are excited about Firefox 3.1. Part of that is the CSS improvements, but the big reason is TraceMonkey. This is a JavaScript engine with a JIT that uses trace trees, a pretty clever technique for turning interpreted JavaScript (read: slow) into compiled native code (read: fast). JIT compilation is a big part of why VMs like the Java VM and the CLR are very fast, in general much faster than VMs that do not JIT, like those of Python, Ruby, or (until now) JavaScript. It is why JRuby is faster than Ruby. Thus the prospect of making JavaScript much faster is very exciting.

Recently I did some micro-benchmarking of JavaScript performance vs. ActionScript/Flash performance, concentrating on XML parsing only. Now, the ActionScript VM is a JIT VM. In fact, Adobe donated it to Mozilla, and it is known as Tamarin. It has been Mozilla's intention to use this for JavaScript in Firefox for a while, as JavaScript is essentially a subset of ActionScript. TraceMonkey builds on code from Tamarin, but it adds the trace tree algorithm for picking what to JIT. The trace tree approach allows smaller chunks of code to be JIT'd. For example, if you had a large function, say a single script that runs when the page loads, then with a traditional JIT you either JIT the whole function or none of it. Now what if that function has a loop that runs dozens of times, maybe populating a data table? With a trace JIT you can JIT just that one critical loop and leave the rest of the giant function alone. So it should be an improvement over Tamarin and thus ActionScript. Of course there is only one way to tell...
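To make that concrete, here is a minimal sketch of the kind of code I mean: one big page-load function with a small hot loop inside it. The function name onPageLoad and the record fields are made up for illustration, and the comments only mark which region a tracing JIT would be expected to go after. Nothing here talks to the engine itself.

```javascript
// Hypothetical page-load script: one big function, one small hot loop.
function onPageLoad(records) {
  // One-time setup: runs once, so compiling it would buy very little.
  var summary = { count: records.length, total: 0 };
  var rows = [];

  // Hot loop: executes once per record. A tracing JIT would be expected to
  // record this loop body once it has run a few times and compile only it.
  for (var i = 0; i < records.length; i++) {
    var r = records[i];
    var lineTotal = r.price * r.qty;
    summary.total += lineTotal;
    rows.push(r.name + ": " + lineTotal);
  }

  // One-time wrap-up: cold code again, fine to leave to the interpreter.
  summary.rows = rows;
  return summary;
}
```

A whole-function JIT has to decide about onPageLoad as a unit, while a trace JIT only has to care about the for loop in the middle.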

So I repeated the same XML parsing tests that I did for Firefox 3.0 and Safari 4 (beta). First, I had to enable the JIT in Firefox. One of the links above describes how to do this: open about:config in FF 3.1, look for the jit.content option, and set it to true. I restarted FF 3.1 just to make sure this took effect, then ran the tests. The results? Not much difference between FF 3.0 and 3.1b+JIT. FF 3.1b+JIT was about 4% faster, which is probably within the measurement noise. It was still 6x slower than ActionScript and almost 3x slower than Safari 4.
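If you would rather not click through about:config, the same setting can go in a user.js file in the Firefox profile directory. This is only a sketch, and it assumes the full preference name is javascript.options.jit.content (I only gave the jit.content suffix above), so check the name that about:config actually shows before relying on it.

```javascript
// user.js in the Firefox profile directory (read at startup).
// Assumption: the full pref name is javascript.options.jit.content.
user_pref("javascript.options.jit.content", true);
```

Either way, restart the browser afterwards so the setting is picked up.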

So what went wrong? Not sure. Here is the code that gets executed in my test:

function load(){
    var parser = new DOMParser();
    var xml = {};
    var start = 0;
    var end = 0;
    var msg = "";
    var results = document.getElementById("result");
    var li = document.createElement("li");
    // initReq() and req are defined elsewhere on the page (not shown here)
    initReq();
    req.open("GET", "LargeDataSet?size=50", true);
    req.setRequestHeader("Connection", "close");
    // use a closure for the response handler
    req.onreadystatechange = function(){
        if (req.readyState == 4 && req.status == 200){
            msg = "XML Size=" + req.responseText.length;
            // time only the parse call, not the network request
            start = (new Date()).getTime();
            xml = parser.parseFromString(req.responseText, "text/xml");
            end = (new Date()).getTime();
            msg += " Parsing took: " + (end-start) + " ms";
            // append the timing as a new list item on the page
            li.appendChild(document.createTextNode(msg));
            results.appendChild(li);
        }
    };
    req.send(null);
}

Pretty simple code. I manually execute it 20 times. It would sure seem like it could be JIT'd. What gets timed is just the parser.parseFromString(...) call, where parser is a DOMParser. Maybe that object cannot be JIT'd? DOMParser is implemented natively by the browser rather than in JavaScript, so there may simply be no JavaScript loop here for the tracer to compile. Or maybe there is a bug with the JIT that will be resolved in the future? Either way, it does seem to suggest that TraceMonkey may not always be the slam dunk everyone expects.
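A fairer test of the JIT would time code that is written entirely in JavaScript and would automate the 20 runs instead of clicking by hand. Here is a rough sketch of how that could look. The helper names timeRuns, median, and sumOfSquares are made up for this example, and the workload is just an arithmetic loop standing in for "JavaScript-heavy" code. It is not what my test page runs.

```javascript
// Hypothetical helper: run fn() several times, return each duration in ms.
function timeRuns(fn, runs) {
  var durations = [];
  for (var i = 0; i < runs; i++) {
    var start = (new Date()).getTime();
    fn();
    durations.push((new Date()).getTime() - start);
  }
  return durations;
}

// The median is less sensitive than the mean to a single slow outlier run.
function median(values) {
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Stand-in workload written entirely in JavaScript: a tight arithmetic loop,
// the kind of code a tracing JIT can compile, unlike one native DOM call.
function sumOfSquares(n) {
  var total = 0;
  for (var i = 0; i < n; i++) {
    total += i * i;
  }
  return total;
}
```

Something like median(timeRuns(function () { sumOfSquares(1000000); }, 20)) could then be compared with the JIT on and off. If that number moves and the parsing number does not, the parser call is probably not where the JIT can help.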

I was surprised by the results. I thought that FF3.1 would be faster than FF3. I didn't think it would be faster than ActionScript in this case, but I thought that it might be close. In many other cases, I expect ActionScript to still be much faster than TraceMonkey. Why? Well, there is one other ingredient in VMs like the JVM and CLR that makes them fast: static typing. This allows the VM to make a lot of other optimizations that work in combination with JIT'ing. For example, knowing that a particular variable is a number or a string allows the VM to inline references to that variable. This can eliminate branches in logic (if-else statements where maybe the else is not possible). The JIT can then take place on the simplified, inlined code, and be about as fast as possible.
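Here is a small sketch of what eliminating branches means. The first function is a simplified model of what an untyped VM has to do for an addition (it is not the exact semantics of the JavaScript + operator), and the second is what is left once the compiler knows both operands are numbers. Both function names are made up for illustration.

```javascript
// Simplified model of addition without type information:
// the types must be checked on every single call.
function genericAdd(a, b) {
  if (typeof a === "number" && typeof b === "number") {
    return a + b;                 // numeric addition
  } else if (typeof a === "string" || typeof b === "string") {
    return String(a) + String(b); // string concatenation
  } else {
    return Number(a) + Number(b); // fall back to numeric coercion
  }
}

// What is left when both operands are known to be numbers:
// the branches are gone and only the numeric add remains.
function addNumbers(a, b) {
  return a + b;
}
```

With static typing the compiler can pick the second form before the code ever runs, and that short form is what then gets JIT'd.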

If you read about the techniques used in TraceMonkey, you will see that it tries to do a lot of the above via type inference. So in some cases TraceMonkey and the AVM2 (ActionScript VM) may be able to do the same level of optimization. In fact, given its tracing approach, TraceMonkey may be able to do better. But I am guessing that there will be a lot of situations where AVM2 will be able to do more optimizations just because of the extra information it has at its disposal in the form of static typing.
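Without static types, the specialized code has to be protected by a check on the types it assumed. The sketch below shows that idea in plain JavaScript: a fast path that assumes numbers, with a guard that falls back to a generic path when the assumption breaks. This is my own illustration with made-up function names, not TraceMonkey's actual machinery. In particular, a real tracer would exit to the interpreter at the guard rather than restart the loop the way this does.

```javascript
// Generic path: coerces every element, whatever type it has.
function sumGeneric(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += Number(values[i]);
  }
  return total;
}

// Specialized path: assumes every element is a number.
function sumSpecialized(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    var v = values[i];
    // Guard: if the assumed type does not hold, give up on the fast
    // path and redo the work on the generic path.
    if (typeof v !== "number") {
      return sumGeneric(values);
    }
    total += v; // fast path: plain numeric add, no coercion
  }
  return total;
}
```

The guard is the price of inferring types at run time. With declared types the compiler could drop the check entirely, which is the kind of extra information I expect AVM2 to benefit from.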
