Wednesday, September 03, 2008

More JavaScript Benchmarking

My old boss sent me this link about Google Chrome performance. It's a good read, and it links to an interesting JavaScript micro-benchmark with findings on Chrome vs. Firefox 3, Safari 3.1, and the new IE 8 beta 2. I was curious about some other browsers, namely Firefox 3.1 beta with and without JIT, Safari 4 beta, and Opera 9.5. Of course I made a nice picture of my results.


Interesting results. First off, FF 3.1 with JIT did not crash. It crashed so many times on me yesterday that I was sure it would crash on this. Even though it did not crash, it was barely faster than FF 3.1 without JIT or FF 3.0.1. In fact, it was really only faster at error handling and the same on everything else. Apparently errors are easy to JIT for TraceMonkey!

Next, Safari 4 beta is fast. If you look at the link above, Safari 3.1 was already the fastest thing out there, so I guess this should not be a surprise. It crushed everything, and it did it on the kind of tasks that real developers do a lot: array and string manipulation, regular expressions, and DOM manipulation (technically not part of your JS engine, but practically the most important test.) I am not used to seeing Opera lose when it comes to any kind of benchmark. If you throw out the array manipulation, Opera and Safari are pretty close.

I will have to boot up Parallels and try out Chrome vs. Safari 4 beta vs. FF 3.1 beta on Windows.

Tuesday, September 02, 2008

Firefox 3.1: Bring on the JIT

Web developers everywhere are excited about Firefox 3.1. Part of that is because of CSS improvements, but the big reason is TraceMonkey. This is a JavaScript engine with a JIT that uses trace trees, a pretty clever technique to turn interpreted JavaScript (read: slow) into compiled native code (read: fast.) JIT compilation is a big part of why VMs like the Java VM and the CLR are very fast, in general much faster than VMs that do not JIT, like those in Python, Ruby, or (until now) JavaScript. It is why JRuby is faster than Ruby. Thus the prospect of making JavaScript much faster is very exciting.

Recently I had done some micro-benchmarking of JavaScript performance vs. ActionScript/Flash performance. This concentrated on XML parsing only. Now the ActionScript VM is a JIT VM. In fact, Adobe donated it to Mozilla, and it is known as Tamarin. It has been Mozilla's intention for a while to use this for JavaScript in Firefox, as JavaScript is essentially a subset of ActionScript. TraceMonkey is based on Tamarin, but it adds the trace tree algorithm for picking what to JIT. The trace tree approach allows smaller chunks of code to be JIT'd. For example, if you have a large function, say a single script that runs when the page loads, then with a traditional JIT you either compile the whole function or not at all. Now what if that function has a loop that runs dozens of times, maybe populating a data table? With a trace JIT you can compile just that one critical loop, not the whole giant function. So it should be an improvement over Tamarin and thus ActionScript. Of course there is only one way to tell...
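To make the trace tree idea a little more concrete, here is a made-up sketch (the function name, data shape, and element id are mine, not from any real benchmark): the one-time setup around the loop is cold code, while the loop body repeats enough times to be worth recording as a trace and compiling.

// Illustrative only: the kind of code where a tracing JIT shines.
function onPageLoad(rows) {
    // ... imagine lots of one-time setup here (menus, event wiring, etc.) ...
    // A whole-function JIT has to compile (or skip) all of onPageLoad at once.

    // Hot loop: runs once per row. A tracing JIT can record just this path
    // and compile it to native code, leaving the cold setup code interpreted.
    var table = document.getElementById("dataTable");
    for (var i = 0; i < rows.length; i++) {
        var tr = document.createElement("tr");
        var td = document.createElement("td");
        td.appendChild(document.createTextNode(rows[i]));
        tr.appendChild(td);
        table.appendChild(tr);
    }
}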

So I repeated the same XML parsing tests that I did for Firefox 3.0 and Safari 4 (beta). First, I had to enable JIT in Firefox. One of the links above describes how to do this (open about:config in FF 3.1, look for the jit.content option and set it to true.) I restarted FF 3.1 just to make sure this took effect. I then ran the tests. The results? Not much difference between FF 3.0 and 3.1b+JIT. FF 3.1b+JIT was about 4% faster, which is probably statistically negligible. It was still 6x slower than ActionScript and almost 3x slower than Safari 4.

So what went wrong? Not sure. Here is the code that gets executed in my test:

function load(){
    var parser = new DOMParser();
    var xml = {};
    var start = 0;
    var end = 0;
    var msg = "";
    var results = document.getElementById("result");
    var li = document.createElement("li");
    initReq();
    req.open("GET", "LargeDataSet?size=50", true);
    req.setRequestHeader("Connection", "close");
    // use a closure for the response handler
    req.onreadystatechange = function(){
        if (req.readyState == 4 && req.status == 200){
            msg = "XML Size=" + req.responseText.length;
            start = (new Date()).getTime();
            xml = parser.parseFromString(req.responseText, "text/xml");
            end = (new Date()).getTime();
            msg += " Parsing took: " + (end-start) + " ms";
            li.appendChild(document.createTextNode(msg));
            results.appendChild(li);
        }
    };
    req.send(null);
}

Pretty simple code. I manually execute it 20 times. It would sure seem like it could be JIT'd. What gets timed is just the parser.parseFromString(...) call, where parser is a DOMParser. Maybe that object cannot be JIT'd? Maybe there is a bug with the JIT that will be resolved in the future? It does seem to suggest that TraceMonkey may not always be the slam dunk everyone expects.

I was surprised by the results. I thought that FF3.1 would be faster than FF3. I didn't think it would be faster than ActionScript in this case, but I thought that it might be close. In many other cases, I expect ActionScript to still be much faster than TraceMonkey. Why? Well, there is one other ingredient in VMs like the JVM and CLR that makes them fast: static typing. This allows the VM to make a lot of other optimizations that work in combination with JIT'ing. For example, knowing that a particular variable is a number or a string allows the VM to inline references to that variable. This can eliminate branches in logic (if-else statements where the else may not even be possible.) The JIT can then take place on the simplified, inlined code and be about as fast as possible.
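Here is a toy example of the kind of branch I mean (my own made-up function, not anything pulled from a real VM's internals):

// With dynamic types, the VM has to keep both paths, or at least guard on the type:
function describe(x) {
    if (typeof x == "number") {
        return "number: " + x.toFixed(2);
    } else {
        return "string: " + x.toUpperCase();
    }
}
// If the compiler can prove x is always a number at a call site, the way a
// statically typed VM can, the else branch and the type check go away, and
// what is left is small enough to inline and JIT aggressively.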

If you read about some of the techniques used in TraceMonkey, it tries to do a lot of the above via type inference. So in some cases TraceMonkey and the AVM2 (ActionScript VM) may be able to do the same level of optimizations. In fact, given its tracing approach, TraceMonkey may be able to do better. But I am guessing that there will be a lot of situations where AVM2 will be able to do more optimizations just because of the extra information it has at its disposal in the form of static typing.

Sunday, August 31, 2008

Recent Other Writings

Last week, IBM published an article I wrote on using JRuby on Rails with Apache Derby. It concentrates on rapid prototyping/development. I didn't get too heavily into the IDE side of things, but when you add RadRails into the equation it really is nirvana-ish development. Very fun.

I've also been writing a lot on InformIT about Java Concurrency in Practice. I did some fun stuff over there too, like trying to turn some Project Euler code into parallel code. I guess technically that succeeded just fine, but it is a good example of when parallel code is not any faster. In this case, the algorithm was CPU bound anyway, and even having two cores didn't really help much. Oh well. I treated it like a strength exercise, the kind I did back when I took piano lessons.

Thursday, August 28, 2008

Search Twitter from Flash

I have updated the Twitter ActionScript API. I added support for search. You are probably aware that search is provided by Summize, which was acquired by Twitter. It is pretty obvious that the APIs have not yet been merged!

Twitter's API is all based on Rails ActiveResource ... which is awesome. It turns any resource (often a database table) into a RESTful service. REST is often associated with XML, but Rails (and thus Twitter) supports JSON too (Twitter supports ATOM and RSS as well). For ActionScript, XML is great. Or I should say POX is great, and that is what Rails serves up.

The Twitter Search API is different. It supports two formats: ATOM and JSON. No POX. I went with the ATOM format. For JSON, I would have used Adobe's corelib. It works well, but I didn't want to add the weight. Plus, JSON tends to parse much slower in AS3 than XML. That is because AS3 implements E4X. To get E4X to work with ATOM, you have to be mindful of namespaces. For example, here is the code I used to iterate over the entries: for each (entryXml in xml.atom::entry). Here the xml variable is the parsed result from search.twitter.com and atom is a Namespace object. Not as pretty as just xml.entry, but oh well.
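Spelled out a little more, the pattern looks roughly like this (the variable names are mine; the E4X syntax shown is shared by AS3 and Firefox's JavaScript of that era, and searchResponseText stands in for the raw ATOM response):

var atom = new Namespace("http://www.w3.org/2005/Atom");
var xml = new XML(searchResponseText); // the ATOM feed from search.twitter.com

var titles = [];
for each (var entryXml in xml.atom::entry) {
    // without the atom:: qualifier, xml.entry would match nothing
    titles.push(String(entryXml.atom::title));
}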

Sunday, August 24, 2008

Parsing XML on the Client: JavaScript vs. ActionScript

Developer: Wow the JavaScript interpreter on Firefox 3 is awesome, but the one on the new Safari is even better. It is a great time to be a JavaScript developer!

Me: It is still a lot slower than ActionScript.

Developer: Oh don't quote me old numbers, the new browsers are so much faster.

Me: Still slower than ActionScript.

Developer: Maybe slower at doing useless things like calculating prime numbers. Who would do that in a browser anyways? The new browsers are fast at doing realistic things.

Me: But still slower than ActionScript.

Developer: Show me some proof on something realistic, like parsing XML coming back from an Ajax call.

Me: [Whips together a servlet for producing huge chunks of XML and some JS and a SWF for calling it and doing a DOM parse.] Alright let's see the results of these tests...

Everything is O(N), as you would expect and can verify by doing a linear regression of XML document size vs. parse time. Safari 4 is much faster than Firefox 3: the ratio of their slopes is (FF3/S4) = 2.95. But they both lose badly to ActionScript 3 (running Flash Player 10, RC2): (FF3/AS3) = 6.36 and (S4/AS3) = 2.16. Maybe IE can do better. Should we give it a try?

Developer: Now you are being a jerk.

Thursday, August 21, 2008

Automounting a Drive in OSX

One of my colleagues had an interesting question for me. We needed to auto-mount a Windows drive from a Mac. The Mac was being used to automatically create screenshots of web pages on various Mac browsers. It needed to then upload the screenshots to a shared drive. Thus mounting the drive and just doing a copy seemed like the easiest way to go.

Mounting a Windows drive is very easy with a Mac. Just go to Finder -> Go -> Connect to Server and then enter smb://some-windows-machine. Automator seemed like the way to go here. I had actually never used it, but it proved quite easy. Here is what it looked like for me:
As you can see from the screenshot, I used Actions -> Files & Folders. I first selected the Get Specified Servers action and added the same URL that I normally used to manually mount the drive. I then added a Connect to Servers action. You will want to test it once, so that you can submit your credentials while making sure to add them to your Keychain. Next, you'll want to do Save As and change the format to Application. Now to get the application to execute automatically at startup, go to System Preferences -> Accounts -> Login Items and then browse to wherever you saved the application. Re-boot and that's it!

Tuesday, August 19, 2008

Cache Discussions

My post about using MySQL for caching got picked up by reddit and viewed a few thousand times. It sparked some discussion, but unfortunately that discussion was spread out across a few different sites. So I decided to aggregate it here.

"Here's a big reason to use MemCached: expiry!
Let's say you only want to do a complicated query once every fifteen minutes. Do it once, put it in a cache by key with an expiry of 15 minutes. Let memcached worry about when to take it out for you."

Yes, this is a good reason to use memcached. I used this pattern for the aggroGator, with Google's version of memcached. Which reminds me that Part 3 of the series I wrote on GAE is out... Anyways, in that app, RSS feed results are cached in memcached with a five-minute poll (initiated client-side, so it only polls for logged-in users.)
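For what it's worth, here is the pattern from the quote spelled out as a toy sketch. This is an in-process object standing in for memcached, just to show the TTL check; runComplicatedQuery is a hypothetical stand-in for whatever expensive work you are caching, and the fifteen minutes comes from the comment above.

// Toy illustration of "compute once, cache with an expiry."
var cache = {};

function getWithExpiry(key, ttlMillis, computeFn) {
    var entry = cache[key];
    var now = (new Date()).getTime();
    if (entry && entry.expiresAt > now) {
        return entry.value; // still fresh, skip the expensive work
    }
    var value = computeFn(); // e.g. the complicated query
    cache[key] = { value: value, expiresAt: now + ttlMillis };
    return value;
}

// usage: recompute at most once every fifteen minutes
// var report = getWithExpiry("daily-report", 15 * 60 * 1000, runComplicatedQuery);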

Expiry is the cache eviction policy for memcached, whereas any database cache is going to be more of an LRU policy. There will be cases where expiry is more useful, but I would actually guess that LRU is appropriate for the majority of use cases...

"MySQL memory table is not as fast as memcached. Depending on your data, memcached is 3X times or more as fast for get/set (select/insert)."

Really? I would love to see some objective results for this. Of course it would have to be an apples-to-apples comparison. The data would need to be retrieved from a cache node on a separate physical machine, and for the MySQL cache, it would need to be a select by primary key. Now I wouldn't be terribly surprised if memcached was slightly faster, but 3X? I would be even more surprised if a put/insert was faster at all.

"This has been said many times. MySQL and memcached serve different purposes. Memcached is used to store processed data, while MySQL generally contains raw, normailized data, which needs lots of complex queries and other processing."

I actually mentioned this at the end of my post... So obviously I agree. But I have a feeling that people use memcached to cache a lot of data that is not very processed at all. Also, the last line is very misleading. You do not need to do much normalizing of your data. I can tell you that anybody doing federated database systems has to do a lot of de-normalizing of their data. And complex queries and other processing? That is just silly.

"Facebook needs memcache for the obvious reason that it's pages are highly complex and include many pictures."

Eh? Don't see how pictures would matter... But if Facebook is using memcache for HTML fragments, then I would agree that this is the right kind of cache. I don't know if this is the case or not. Other things like my list of friends or my contact info would be a poor choice for memcached. Something like the Facebook feed... That is a lot tougher. There are limits to what you can cache, since the feed changes a lot and you might have a low tolerance for stale data. You might be able to create HTML fragments for the stories and cache those?

" Also, fewer of Facebooks pages are time-critical when compared to eBay. On eBay you basically can't cache a page rendering (memcache) if it has less than a minute
of auction time left"

Item listings are certainly time-critical, i.e. you expect the price to be accurate when you are looking at a listing and considering bidding on it. This is true regardless of the time remaining; being under a minute doesn't matter too much. However, that is just one page; many other pages are not so sensitive, but they are very dynamic.

When it comes to picking between MySQL and memcached, I would first ask: are you using an ORM but need caching? If the data is being accessed through an ORM, then your cache layer should be a database, not memcached. Again, the only exception I could see to this would be a graph, i.e. data that is hard to describe relationally (requires self-referential foreign keys, etc.)

Thursday, August 14, 2008

Cache Money

Scalability is a hard question, but a lot of people think that scalability is all about caching. In particular, memcached is the answer for caching. I think we can blame Facebook for this. Everybody knows that Facebook makes heavy use of memcached. Terry says that social graphs are a scalability problem for databases that is solved by memcached, so he is clearly drinking the Kool-Aid. The benefits of caching are obvious, but is memcached really the best/only way?

Earlier this year, eBay won an award from MySQL. This was for an application we built that we originally called Gem Cache. It is a caching tier that is built on top of MySQL. When the caching tier was designed, memcached was given a lot of consideration, but there were some very good advantages we got out of using MySQL instead.

First off, can MySQL be as fast as memcached? Absolutely. MySQL is aggressive about keeping things in memory, and if everything is in memory, it will be as fast as memcached. You can use MySQL's MEMORY engine to accomplish this, or you can stick with MyISAM and let MySQL's caching put things in memory for you. Obviously you need to split your database, but we already knew how to do that efficiently. With that in mind, here are the advantages that MySQL offers.

1.) SQL Semantics. You are not limited to just simple "put" and "get." You can do selects and joins, aggregates, etc.
2.) Uniform Data Access. Do you use some kind of ORM? You can leverage this with a MySQL based cache.
3.) Write-through Caching. In a typical memcached setup, updates are still done to the database, and this invalidates one or more objects in memcached. With a MySQL-based cache, a row in the cache corresponds to a row in the "real" database. So we can write to the cache and then asynchronously update the system of record.
4.) Read-through Caching. Similarly, you can always attempt to read from the cache database. If there is a cache miss, you can invisibly read from the real database and add to the cache at the same time (see the sketch after this list).
5.) Replication. MySQL allows for replication of data, so it is easy to add redundancy and fail-over to your cache. Replication can also be useful when you have multiple data centers.
6.) Management. There are lots of great management tools for DBA, operations folks, etc. to use with MySQL.
7.) Cold starts. When your cache is a copy of database rows, it is easy to bootstrap it from your source, since the source and the cache are so similar.
8.) Eviction. Memcached gives you basic expiration, but otherwise you are handling eviction yourself. The caching in MySQL is a more useful LRU policy.
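To make points 3 and 4 a little more concrete, here is a rough sketch of the read-through flow. The cacheDb and primaryDb objects and their queryByKey/insert helpers are made-up stand-ins for two database connections, not a real client API:

function readThrough(cacheDb, primaryDb, id) {
    // try the cache tier first: a select by primary key on the cache database
    var row = cacheDb.queryByKey("items", id);
    if (row) {
        return row; // cache hit
    }
    // cache miss: read from the system of record and populate the cache
    row = primaryDb.queryByKey("items", id);
    if (row) {
        cacheDb.insert("items", row);
    }
    return row;
}

Write-through is the mirror image: write the row to the cache database first, and let something asynchronous push it to the system of record.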

So there, just a few obvious advantages to using MySQL as a cache instead of memcached. Now I know that a lot of folks use MySQL as their "real" database, so it may seem weird to use it as a cache as well. But they are probably (hopefully) using InnoDB for the "real" DB, and that really is a different beast than MySQL with MyISAM or MEMORY tables. And it's not like you have to pay for extra licenses or anything... What are the advantages of memcached over MySQL? The only obvious one to me is if you want to cache things that don't fit in the database, like deep object graphs or HTML fragments, etc.

Tuesday, August 12, 2008

Netgear WNR2000

My trusty old D-Link DI624 started having problems recently. Actually it only started having problems immediately after a Comcast technician switched out our cable modem. Coincidence? Probably, but whatever...

I knew my Macbook supported "Draft 2" of the 802.11n standard, so when I saw a reasonably priced 802.11n router, I went for it. "It" was a Netgear WNR2000. It installed very easily. I was able to re-use my old SSID and security settings, so that I did not have to change any of my devices (two laptops and a Wii) or devices of friends and family who had previously used my network. Very nice. All of my devices accessed the new network with no problem. Happy happy joy joy! Not so fast.

Everything worked great except for my Macbook. It had no problems with the network, but its Internet connection was horrible. It was like being on dial-up, and it was only this way for my Macbook. My wife's laptop was blazing along, as was our desktop computer (with a wired connection to the WNR2000.) It was only my Macbook that was bandwidth impaired.

I started tweaking the WNR2000's settings, well actually just one (wireless) setting: maximum network speed. This was set to 300 Mbps, or half the theoretical maximum for 802.11n and nearly six times as fast as my old DI-624's 802.11g network. I started tweaking it down, but to no effect. Until I set it to 54 Mbps, i.e. the same speed as you get with 802.11g. Then my Internet connection on my Macbook was as fast as it was for every other device on the network. Order had been restored, but it is not a satisfying solution.

My only guess is that I fell victim to some kind of "mixed" network issue, but that is mostly a guess. I thought the 802.11n would come in handy when copying files between my desktop computer and my Macbook. I do this a lot for music, photos, and videos. Now I basically have the same wireless network speed as before, but I could have gotten that with a cheaper router.

Friday, August 08, 2008

Blogging Tools for Programmers

What kind of tools do you use for blogging? I am writing this, and most of my posts, in Blogger's web interface. I have tried a few other tools, but none of them are very good. There are some basic things that I want out of a blogging tool:

0.) Obviously it has to work with Blogger.
1.) Rich formatting. If nothing else I need to be able to easily create links. Full WYSIWYG editing would be great though. Don't make me do manual HTML formatting, but please don't prevent me either.
2.) Image hosting integration. I would like to be able to take either images on my local computer or off the web to include in the blog.
3.) Blogger tags/labels. I have a small blog, with about 100-150 unique viewers daily (as measured by MyBlogLog, which seems reliable.) There may be more folks who use blog reading tools, too, who knows. A decent number of the views come from people doing Google searches for "blah". These searches often lead to blog posts that I have tagged as "blah." No tagging means less visibility, so forget that.
4.) Offline mode would be nice. Really the first three things are handled pretty well by Blogger's native web interface. But it would be nice to be able to compose a post while offline.
5.) OSX. I blog on my MacBook almost exclusively and I do not want to boot Parallels just to blog. Integration with OSX spellchecker is pretty much implied too.

Does that seem like so much? I don't think so, but yet I have not found an acceptable solution for this. Seems like most desktop apps are designed for WordPress, TypePad, and Movable Type. I have tried things that are supposed to be great, like Mars Edit, and was underwhelmed. So even though nothing quite satisfies the above simple requirements, what I would really like is something that also supported:

6.) Code. I like to include code in my posts. It would be great to have something that made it easy for me to copy-n-paste code. It should escape special characters for me (like greater than and less than signs), provide code highlighting based on the language, and provide scrolling. Right now I use a PHP highlighter from Gilly. I had to hack in the CSS for this in one of my sidebar widgets. It works ok, but is a bit manual and the code often overflows.

Is there a tool like this out there and I just don't know it? Is there a tool out there that does all of this but only with one of the other blogging platforms? If that were true, I would have to figure out how much pain would be involved with migrating... Maybe I should just build this using AIR?

The Wrong Color of Green

So apparently the folks in Green Bay didn't listen to me about the best way to resolve the Brett Favre situation. It's not like they are going to get a first or second round pick now, for the simple reason that the Jets stink. Whatever. I'm glad we've still got Lilly. Anyways...

I am not happy about Favre playing for the Jets. Of course, I think I know Green Bay's strategy here. They know that Favre will have to play against New England twice a year for as long as he stays unretired. They know that Belichick is an evil genius who likes nothing better than to cause his opponent's mind to implode.

Last year we learned about another little fetish of Belichick's: football video. Think of all of the video of Favre that the Packers have accumulated. Think of all of the other goodies that might be lying around, like Rorschach test results, etc. Now imagine all of that in the hands of Belichick... Ted Thompson may just get the last laugh.

Update: Looks like trickle down economics works after all as Miami has signed Jets castoff Chad Pennington.

New developerWorks Articles: DWR and Google App Engine

This week, IBM developerWorks published two new articles that I wrote. The first article is the third of the three-part series I did on Ajax toolkits. This one is on Direct Web Remoting, or DWR as it is commonly known. I had always been skeptical of DWR because of its RPC nature. However, after doing the article, I have to admit it is a pleasure to use. In the article I used the Java Persistence API (JPA) in combination with DWR. I was very pleased with being able to simply annotate some very simple, vanilla code and create an Ajax service that talked to a database. Ruby on Rails often seems magical to people in its ability to very easily work with databases and simplify Ajax development. DWR shows that you can do some pretty magical things in Java, too. DWR also makes Google Web Toolkit’s RPC code look ridiculous.

Speaking of Google (that is called a segue, if you are keeping score at home) the second article is the first of a three part series on the Google App Engine. I had mentioned some of the work on this here, and now you can read all about it on developerWorks. Of course you can also check out the application developed in the series, i.e. the aggroGator and you can get its source code here.

Thursday, July 31, 2008

Jython, It ain't no JRuby

All of my recent excursions into Python convinced me it was time to try out Jython. I have had great experiences with JRuby. It is definitely faster than CRuby for long running processes (obviously slower for short scripts, because of the JVM startup overhead.) In my experience, which ranges from mathematical algorithms to web applications using Rails, the interoperability is extremely good. I have had no code blow up in JRuby that ran fine in CRuby. Even IDE support has been on par. So my expectations were high for Jython.

Maybe that was the first problem: unrealistic expectations. Let's not jump into the shortcomings just yet. First off, Jython installation is nice. Well, nice as in "there is a GUI." The installer only seemed to copy files into the installation directory, nothing more. That is fine, but it makes me wonder why I bothered with an installer at all. I at least expected it to put the jython executable on my path, but it did not. No big deal.

Running a script is painless. It was odd to see a lot of activity the first time I ran a script, but the messaging (or should I say logging) was good enough to give me a good idea about what was going on and it only happens once. It also seemed like Jython was going out of its way to help performance, and JRuby had caused me to expect a nice performance boost from Jython.

So I tried out a script I wrote to solve a Project Euler problem, in particular Problem #43. The performance on Python is not that great, as it takes about 58 seconds to solve on my Macbook. Here is the code. It has been slightly optimized by Gilly.

def combinations(items, n):
    if n == 0:
        yield []
    else:
        i = 0
        l = len(items)
        while i < l:
            for combo in combinations(items[:i] + items[i+1:], n-1):
                yield [items[i]] + combo
            i += 1

def permutations(items):
    return combinations(items, len(items))

def to_int(seq):
    return reduce(lambda x,y: 10*x + y, seq, 0)

def main():
    primes = [2, 3, 5, 7, 11, 13, 17]
    digits = range(0, 10)
    sum = 0
    for p in permutations(digits):
        if p[5] == 5 and p[0] != 0:
            j = 0
            trait = True
            while trait:
                if j == 7:
                    y = to_int(p)
                    sum += y
                    print y
                    break
                x = 100*p[j+1] + 10*p[j+2] + p[j+3]
                trait = not x % primes[j]
                j += 1
    print "sum = " + str(sum)

if __name__ == '__main__':
    main()

Lots of brute force, with a little bit of clever use of Python features. I was ready to crank this up with Jython, but when I tried to run it, I got this error message:

Traceback (innermost last):
(no code object) at line 0
File "euler43.py", line 5
yield []
^
SyntaxError: invalid syntax

Ouch. This actually made me feel stupid. I should have noticed that the current version of Jython is numbered 2.2.1 and that this obviously corresponds to Python 2.2.1 (that is obvious, right?) I had used a Python feature that did not exist in 2.2.1. Luckily there is an alpha version of Jython that is numbered 2.5. What about 2.3 or 2.4, you say? Uhh...

Anyways, with the alpha version of Jython 2.5 used instead everything worked. However, the performance was not what I expected. The same script ran in 171 seconds! It took three times longer than CPython. Wow.

I wrote a similar algorithm in Ruby and it was horribly slow. Perhaps this is why JRuby is faster than CRuby: CRuby is just so slow. Perhaps not. The nice thing about this is that it forced me to optimize the code more. Here is the optimized Ruby code.

def calc(seq)
  seq.inject(0) {|x,y| x = 10*x + y}
end

def test(seq)
  primes = [2,3,5,7,11,13,17]
  j = 1
  trait = true
  while j < 8 && trait
    num = calc(seq[j,3])
    trait = (num % primes[j-1] == 0)
    j += 1
  end
  trait
end

sum = 0
digits = (0..4).to_a + (6..9).to_a
for p in digits.permutation
  if p[0] != 0
    x = p[0,5] + [5] + p[5,4]
    trait = test(x)
    if trait
      y = calc(x)
      sum += y
      puts "found one " + y.to_s
    end
  end
end
print "sum= " + sum.to_s

A couple of things to note. This uses Array#permutation, a new feature of Ruby 1.8.7. This is written in C, so you would think it would be super fast. You would be wrong. The latest JRuby is 1.1.3 and does not implement all of the 1.8.7 features, including Array#permutation. So this code will not run in JRuby. It winds up being much faster than the Python code, but only because the algorithm is so much better. It only deals with 9! numbers instead of 10!. Without the modification, Ruby was so slow that I did not have the patience to let it finish. We're talking 10+ minutes for something that only took 1 minute in Python. With the improved algorithm it took about 40s to solve the problem in Ruby. When I ported the change over to Python, it dropped Python down to around 9s. When I tried the ported code in Jython, it would not run at all... I haven't tracked down that problem yet.

Wednesday, July 30, 2008

Python Threads

I finished Core Python. My last post on it hits on a few random things towards the end of the book. The most interesting is threading in Python, which raises the question: is Python more or less multi-threaded than it claims?

Sunday, July 27, 2008

Unreal Guitar Hero

I love playing Guitar Hero. Whenever I feel like I am pretty good at the game, I just play with my wife Crystal and she puts me in my place. If that was not enough, I could always watch this video of Chris Chike, the greatest Guitar Hero player. Ever.

Friday, July 25, 2008

JRuby 1.1.3 Performance

It's been a while since I did a post with pretty graphs and meaningless micro-benchmarks. I am using the latest version of JRuby, 1.1.3, on a new project. I knew from some correspondence with Charlie Nutter that the performance bug I had encountered previously was believed to be solved. I said believed because he had not had a chance to test on the IBM J9 JVM that (along with some other JVMs) exhibited the performance problem. Actually this was fixed in 1.1.2, and I should have tested things then, but didn't get around to it until now. I decided to throw in the latest beta of JDK 1.6 update 10, the magical JDK for The Rest of Us. Here is the purrty chart:


The good news is that even with the J9 JVM, JRuby outperforms native Ruby, at least on this mathematical algorithm. This was not the case prior, so the JRuby fixes have really helped. In fact, if I look at the last data point (x=100), the code executed in less than half the time it took previously with the only change being the JRuby 1.1.3 instead of 1.1. I could describe the source of the bug, but you are much better off reading Charlie's description.

Thursday, July 24, 2008

The City of San Jose

The Man has been keeping me down lately. Time for a laundry list of complaints about the city I live and work in.

Licensure -- Last year I got a letter from The City telling me that I owed them money, and lots of it. They said that I had a business in San Jose, but that I had not been paying my Business License Fee. Not only did I owe money for the License, but I owed it for the past couple of years plus penalties for not paying it. This was because of money I make for writing for IBM. Of course I had to also pay this License fee again this year...

Crime -- As I mentioned a few months ago, my wife's mini-van got broken into and the DVD player was stolen. I filed a report with The City. The City's response? That would be no response. That is what I should expect, right?

Police Harassment -- Last month, we were parked in downtown San Jose, taking the kids to get ice cream at Ben & Jerry's. The spot was a metered spot. We went past the time limit by 10 minutes and of course we got a parking ticket for $28.

Trespassing -- This is the worst. On Monday, I was walking to my car about to go to work. I notice a piece of paper on my wife's mini-van. I look, and it is a ticket. The $35 ticket was for blocking the sidewalk. Our van was parked in our driveway, but the rear of the van did indeed protrude on to said sidewalk. It did not prevent it from being used, and there was more driveway past the sidewalk, so there was plenty of room to walk by my house without having to use the street or walk in our yard/driveway. Whatever. It was very aggravating to me that I could have our car parked at our house and a policeman could come on to my property to give me a ticket.

All in all I feel that The City of San Jose is doing a lot to take money from me, but doing very little in return. I guess this can be said of any government. A lot of my fellow libertarian-thinkers like to favor local (state, county, city) over federal, but this is just a gentle reminder that even the smaller governments will screw you over as much as possible.

Wednesday, July 23, 2008

I ♥ Slots

The title makes me want to go to Vegas ... No gambling, just a discussion on slotted classes in Python and in particular how to pickle and sort them.

Tuesday, July 22, 2008

Get Lifted

Yeah, I'm a John Legend fan, but that's not the reference here. The reference is to Lift, the web application framework for Scala. Today IBM published an article I wrote about Lift. This was an article that I pitched to IBM, and I was very excited about writing it. Lift itself takes a different approach to web development than the typical MVC approaches. It is also in Scala, and takes great advantage of that language's features, especially its native support for XML.

One of the most surprising features of Lift was the ORM that it includes. Lift's creator, David Pollak, commented that he would use JPA for complex schemas and not Lift's ORM. I must admit, it was the part of Lift that took me the longest to get my head around. I consider myself quite the veteran of ORMs, but Lift's is definitely unique. I think it needs some tooling around it, as it felt like some boilerplate-ish code was present (like creating a model class and singleton factory for the model class.) However, I really liked its use of generics.

I didn't have time in the article to get into Comet with Lift and Scala's Actors. That is an awesome feature. Hmm, perhaps that should be another IBM article..

Finally, in other Lift and Scala related news... check out Graceless Failure by some of the developers at Twitter. It seems to imply that Scala, and maybe Lift, are at least "in the mix" at Twitter. I wonder if it is replacing Ruby and/or Rails in some places. Certainly Actors would seem like an obvious way to handle new updates on Twitter.

Monday, July 21, 2008

Movies

I see one or maybe two movies a year. By seeing a movie, I mean going to a traditional movie theater to see it. This year I had to miss the eBay summer movie. That's where eBay takes over a local cineplex and we all watch a movie. This year it was the new Indiana Jones movie. Last year, it was The Simpsons.

So by missing the eBay movie, it seemed likely that I would see no movies this summer. Instead, I have seen three. Amazing. Being such an obvious connoisseur, I thought I would share my opinions.

The Hulk -- I was a little skeptical of Ed Norton, and very skeptical of Liv Tyler ... but it was pretty good. Norton was good. Tyler was not convincing as a scientist, sorry. Even worse, she wasn't really that attractive in the movie either. Was her acting good? I'm no good at making such determinations. Still, pretty good.

Iron Man -- I love comic book movies, obviously. The Hulk was good, but Iron Man was much better. Its faithfulness to the mythology was excellent. Obviously Robert Downey, Jr. was the perfect person to play Tony Stark. I really hope that future installments tap into the Stark-the-drunk plus Rhodey-as-Iron-Man storyline, as the movie seemed to hint at.

Wall E -- I took my kids to see this. I was a little nervous about my youngest son, Raymond. He is 2 1/2 years old, and very lively, so I wasn't sure if he would make it through the movie. I will never know if my worries were warranted or not, as he fell asleep about 20 minutes into the movie. My 4 year old, Michael, Jr., absolutely loved it, and so did I. It is easily the best Pixar movie, and that is really saying a lot.