Thursday, October 08, 2009

San Francisco Giants 2009

The Giants had a pretty good season. They were in contention for a playoff spot until the final couple of weeks. But they didn't make it. Folks around here are busy talking about what the Giants need to do to take the next step. The answer is obvious all at once: they need better hitting. However the problem is that the Giants don't actually know how to evaluate hitting talent. So they really don't have any chance at getting good hitting without paying a huge price for it. Let's take a look at some numbers to understand this.

One easy way to evaluate the Giants is by looking at the players who have come up through their farm system. Of their homegrown players that had at least 200 plate appearances, on average they saw 3.62 pitches per plate appearance and walked in 7.2% of their plate appearances. To be fair, Fred Lewis has very good plate discipline (and his plate appearances went way down this year.) If you take him out, then the numbers are 3.52 pitches per plate appearance and a 6.8% walk rate. For comparison's sake, if you look at the top seven teams (in terms of runs scored), they average a walk in 9.8% of their plate appearances.

So the Giants farm system sucks. Maybe they can sign good hitters? Nope. If you look at their top hitters that they signed from other teams, they see 3.58 pitches per plate appearance and a 5.9% walk rate. That's right, they see a few more pitches, but they walk even less. The Giants even made a "big" trade at midseason, for Freddy Sanchez. Everyone is concerned that Freddy might have become injury prone suddenly. What they should really be worrying about is that Freddy sucks. He sees 3.81 pitches per plate appearance, which is not too bad. However, he only walks 4.5% of the time. Further, his 0.417 career slugging percentage is just awful, even for a second baseman (which is generally a strong offensive position in modern baseball.) So he doesn't swing at everything (he also has a low strikeout rate) but yet he still does not try to get a good pitch to hit and winds up playing for a single. This is your blockbuster trade material? This is what you get in exchange for the second best pitcher in your farm system?

Yeah, so clearly the Giants front office has no idea what makes a good hitter. People like to say that the Giants home park is a great pitchers park, as it is a tough place to hit home runs. That does not make it a tough place to take a bad pitch or take a walk. In fact, you would think that in such a park, they would an even higher premium on hitters who can get on base any way they can. It's funny, you often hear that the reason that the A's are no longer a good offensive team is because other teams figured out what they were doing. If you look at the Yankees and Red Sox, or for that matter the Rays and the Rockies, you can see evidence of this. However, clearly there is one team that has not figured things out and that is the Giants.

Tuesday, September 15, 2009

Scala Word Clouds

Just before I went to bed last night, I got an interesting tweet from Dave Briccetti. You can see from the link in tweet what the problem was, and if you look at the bottom of the page, you can see the clever solution by Jorge Ortiz. Right after I read the tweet from Dave, I asked my wife how long until the end of the TV that she was watching. She said seven minutes. I decided that wasn't enough time, so I went to sleep.

I didn't look at the problem again until today during lunch. By then, I saw Jorge's solution and I saw no way to improve on it. However, this inspired a little experiment for me. Actually it mostly reminded me of a programming problem that I was given a few years back by a well known company that will remain anonymous as they are very touchy about people talking about their interviewing process. The problem was essentially the same as Dave's problem -- take a bunch of sentences (instead of tweets), figure out the most frequent words. There was a twist though -- I was not allowed to use any Java collection. Actually I could not use anything outside of the java.lang package, so I was pretty much stuck with arrays...

At the time, I figured that I would have to either implement my own hash table and a sort, but then I came up with a way to solve the problem with no hash table. So when I saw Dave's problem, I thought "wow my old solution will be so much more awesome in Scala":

def createCloud(strings:Seq[String])={
  val x:List[String] = Nil
  var s = ("",0)
  val cloud = new scala.collection.mutable.ArrayBuffer[Tuple2[String,Int]]
  strings.map(_.split("\\s").toList.sort(_ < _)).foldLeft(x){mergeSorted(_, _)}.foreach( (a) => {
    if (a == s._1) 
      cloud(cloud.length - 1) = (a, s._2 + 1)
    else
      cloud += (a,1)
    s = cloud(cloud.length - 1)
  })
  cloud.toList.sort((a,b) => a._2 > b._2)
}
Ok, so I cheated this time and used an ArrayBuffer instead of array. In my solution years ago, I made an array whose size was the total number of words (worst case scenario.) I didn't have tuples, so I had to use two arrays. Anyways, the idea here is to split each sentence into a list of words, and then sort that list of words. Now merge together each sorted list, and keep things sorted using a merge sort. Of course you could turn everything into a big list first using flatMap, similar to Jorge's solution, and then sort. Anyways, you then roll up the sorted list, taking advantage of the fact that it is sorted. You never have to look up a word to see if it has been encountered before, because of the sorting. You just keep track of the last word encountered, and just compare the next word to the last word to determine if the word has been previously encountered and thus needs to be incremented. Of course you must be wondering about the aforementioned merge sort. Here is a Scala implementation of it:
def mergeSorted[A](comparator: (A,A) => Boolean)(x:List[A], y:List[A]):List[A] = {
  if (x.isEmpty) y
  else if (y.isEmpty) x
  else if (comparator(x.head, y.head)) x.head :: mergeSorted(comparator)(x.tail, y)
  else y.head :: mergeSorted(comparator)(x, y.tail)
}
  
def mergeSorted[A <% Ordered[A]](x:List[A], y:List[A]):List[A] = mergeSorted((a:A,b:A) => a < b)(x,y)
While writing this, I decided that I should touch up my knowledge of currying and partially applied functions in Scala, so I read about this on Daniel Spiewak's blog. The first version of mergeSorted is very generic and takes a comparator. This is what lead me into the world of currying. Actually, my first version of the above looked like this:
class MergeSorter[A](comparator: (A,A) => Boolean){
  def apply(x:List[A], y:List[A]):List[A] = {
    if (x.isEmpty) y
    else if (y.isEmpty) x
    else if (comparator(x.head, y.head)) x.head :: apply(x.tail, y)
    else y.head :: apply(x, y.tail)    
  }
}
I created a class to encapsulate the sort. In some ways I like this syntax a little better, but it made the second version of the sort a little uglier. I wanted a "default" sort that could be used for anything implementing Scala's Ordered trait. It is this second version that I used in the createCloud function above, as strings can be implicitly converted to RichString which implements Ordered.

It is still easy to use mergeSorted with objects that do not implement Ordered, or to provide a different sort function for those that do:
// provide your own sort
val sList = 
  mergeSorted((s1:String, s2:String) => s1.toLowerCase < s2.toLowerCase)(List("aa","aB","Cb"), List("Ab", "ca"))
// create a reusable sort
val listSort = mergeSorted( (a:List[Any], b:List[Any]) => a.length < b.length) _
val z = listSort(List(List("a","b"), List(1,2,3)), List(List(true,true,true), List("x","y","z","w")))
In the first example, we create a new sort by providing different sorting function that is case insensitive. It is then applied to a list of strings. In the second example, we create the sorting function and assign to a val. Notice the mysterious underscore _ at the end of the statement. This creates a partially applied function. If you forget the underscore, the Scala compiler actually provides a pretty useful error message about this. With this, we can then apply the function however we want.

Sunday, September 13, 2009

Some Tasty Scala Sugar

I have finally been getting around to working some more on Scala+Android. A lot needs to be done, especially since I am talking about it next month at the Silicon Valley Code Camp. Scala advocates (and I guess that includes me) often say that is very easy for application developers to use Scala, i.e. that "most" developers are never exposed to the more complex aspects of the language. The folks who "suffer" are APIs developers. I've mostly fallen into the first camp (app developers), but now that I'm trying to create a Scala API on top of the Android SDK, I get exposed to more aspects of Scala. Here are a couple of things that I found interesting.

Tuesday, September 01, 2009

Metaprogramming in Groovy and Scala

This morning I read an excellent pair of articles by Scott Davis on metaprogramming in Groovy. It reminded me of some of the different ways to approach this kind of problem. The second article in particular detailed a new feature in Groovy 1.6, the delegate annotation. Here is an example of using this feature:

public class Monkey {
def eatBananas(){
println("gobble gulp")
}
}

public class RoboMonkey{
@Delegate final Monkey monkey
public RoboMonkey(Monkey m){
this.monkey = m
}
public RoboMonkey(){
this.monkey = new Monkey()
}
def crushCars(){
println("smash")
}
}

This allows for the following usage:

def monkey = new RoboMonkey()
monkey.eatBananas()
monkey.crushCars()

What is interesting here is that you cannot just create a Monkey and call crushCars on it. RoboMonkey is meant to be a way to add functionality to the Monkey class (in this case the crushCars method), while still being able to treat the RoboMonkey as if it was a Monkey. I point this out because of how this would be done in Scala:

class Monkey{
def eatBanaas = println("gobble gulp")
}

class RoboMonkey(val monkey:Monkey){
def this() = this(new Monkey)
def crushCars = println("smash")
}

object Monkey{
implicit def makeRobo(monkey:Monkey):RoboMonkey = new RoboMonkey(monkey)
implicit def makeNormal(robo:RoboMonkey):Monkey = robo.monkey
}

Now arguably this is more complicated, because you have to create the Monkey object in addition to the Monkey class. What Scala requires is that you get those implicit functions (makeRobo, makeNormal) in scope. This is just one way to do that. The added benefit is the following usage:

val monkey = new Monkey
monkey.eatBanaas
monkey.crushCars

I guess the Groovy way to do this is to use the meta class:

Monkey.metaClass.crushCars = {-> println("smash")}
def monkey = new Monkey()
monkey.eatBananas()
monkey.crushCars()

In both languages you can accomplish similar things, though with very different ways to implement it. Meta programming in Groovy is very powerful, and there are things you can do in Groovy that you cannot do in Scala -- see methodMissing and propertyMissing. Of course you lose some of the benefits of static typing, but software is all about tradeoffs.

Sunday, August 30, 2009

To Suggest or Not To Suggest

Ajax has been one of the most hyped technologies in recent memory. However, much of the hype is deserved. It really has changed the way we (web developers) build applications and the expectation of users. One of the archetypes of Ajax is the auto-suggest text box. I don't know who first came up with, but I first remember seeing it on Google. I think it was once a "special" version of the Google home page (Google Suggest?), but not it is standard:

It makes for a natural option on search text boxes. Here it is on some of the other search engines out there. Yahoo:

And on the search hotness, Bing:

Of course pure search engines are hardly the only sites that have search and thus have auto-suggest text boxes. It's pretty useful for ecommerce sites too...

As you can see, some sites have gotten creative with their suggestions. Here is another great example of that from Apple:

All of these examples are for search text boxes. If what you want is not suggested, you can still type it and the application will perform a search on it. A little more interesting example is on Facebook:

Here there is a "closed list" behind the suggest box: your friends. However, it is still a search box. If you type something that is not in the list, it will still perform a search that will return results. Of course, really all of the suggest boxes have a closed list behind them as well, but that list is probably much bigger than your list of friends on Facebook (unless you are Robert Scoble.) So the theme continues: there is a finite set of predetermined suggestions, but if you type in something not in that set, the application can still process your input.
Recently, I saw a different use for auto-suggest boxes: as a drop-in replacement for select/option boxes (a.k.a. drop-down box or combo box). This is fundamentally different than any of the examples above. It would be like the Facebook example, but with the limitation that your friends were the only valid input into the search box. In fact, Facebook has a scenario that is similar to this: tagging people in photos/videos:

However, even in this case, you can type the name of somebody who is not one of your friends. This is valid input. After all, maybe not everybody that you take pictures of has an account on Facebook. My kids are growing up fast, but my five year old son is not yet on Facebook...
I imagine that this pattern -- using auto-suggest box to replace a select/option box -- is used on websites out there. It seems like a reasonable thing to do to avoid a drop-down with a large number of choices, or a drop-down that is expensive to calculate and is often not used. However, it seems awkward, too. What do you do when a user types in something that is not in the box? I like to type, and I'm both reasonably fast at it and make enough spelling mistakes that this could be a common scenario from me.
Now I am no user experience expert. In fact, as a web developer, I would say that I am perhaps the least qualified person when it comes to judging user experience on the web. My knowledge makes me very forgiving of things that might be confusing to others, while it also makes me critical of things that others would not notice. So I am curious about other's people's experiences and opinions. Are there sites out there that use this pattern? Is it useful or awkward?

Saturday, August 15, 2009

A Tipping Point for Scala

This past week's BASE meeting was all about IDE support for Scala. You can read my notes, posted to the Scala tools mailing list. I was very surprised by this meeting. Not by the findings, if you will, as I have used all three IDEs at various times in the last few months. What I was surprised by was the feedback from the group, and the logical conclusion of this discussion: Scala is near a tipping point, but IDE support is holding it back.

First off, there was a large turnout for the BASE meeting. I would say it was the second largest BASE meeting, only bested by the June meeting where Martin Odersky spoke. It is funny, because I think our esteemed organizer, Dick Wall, had been intending this topic to be like "well if we have nothing else to talk about it, we'll talk about IDEs." If there had been an alternative topic brought up, I don't think people would have objected. After all, developers and their attitude towards IDEs are contradictory. Most developers I know would tell you that IDE support for a language is very important, but they would also act indifferent about IDEs when it came to them personally. It's like "all of those other developers really need IDEs, but I would be ok without them." We all know our APIs so well, that we don't need code completion, right? And we don't write bugs, so a debugger is of limited use, right? However, I am sure that if the meeting had not been about IDEs, then there would have been less people in attendance.

So why so much interest? Like it or not, but Scala's primary audience right now are Java developers. Yes, I know Scala appeals to some dynamic language folks, and to some functional programming folks, and that its .NET implementation is being updated, but you could sum up all of the Scala developers from those disciplines and it would be dwarfed by the Java contingency. Scala has a lot of appeal on its own merits, but it is always going to be framed against Java. Scala's most (only?) likely path to mass appeal is as "the long term replacement for java."

So when you talk about developers choosing to use Scala, you are really talking about Java developers choosing to use Scala instead of Java. This is not the only use case, but not only is it the most common use case, it is arguably the only use case that matters. Without this use case, Scala will at most be a marginal language, a la Haskell, or OCaml, or Groovy for that matter.

Back to my point... Java developers need great IDEs. This is not because they "need" help from their IDE because of some lack of skill. No, it's because they have had great IDEs for a long time now, and thus it has become a requirement. I remember when I joined Ludi Labs (it was still called Sharefare at the time) several years ago, we had a Java programming quiz. Candidates were given a clean install of Eclipse to use for writing their programs. We could have given them Vi or Emacs and a command line, but that would have been asinine and foolish. IDEs are an integral part of Java development.

I knew all of the above before the BASE meeting, but what I did not know was how many development organizations were at a critical juncture when it comes to Scala. For many folks, Scala, the language, has won the arguments. Whatever perceived extra complexity that it has, has been judged as worth it. Whatever challenges there may be in hiring people to develop in Scala can be mitigated. Legacy code is not even a factor, as integration with existing Java code is trivial. Maybe it's bleak future of Java, or maybe it's the high profile use of Scala at Twitter. Who knows, but Scala is poised to take a big piece of the Java pie.

Thus the missing piece is IDE support. Development orgs can't switch to Scala without IDE support, and the support is not there yet. That's the bad news. The good news is that Scala is ready to explode once the IDE support is there. There are a lot of folks out there ready to adopt Scala simply as a "better Java." They just need an IDE that is on par with Java IDEs. That is the standard.

All of that being said, there is a lot of concern around the IDEs. Many people expressed to me that they are worried that IDE progress is being coupled to the release of Scala 2.8. That seems reasonable at first, but what happens if 2.8 is not released until 2010 sometime? Will Scala lose its momentum and window of opportunity?

Monday, August 03, 2009

The Strange Loop

In October, I am speaking at the inaugural Strange Loop conference in St. Louis. This is not your run of the mill conference. It is organized by Alex Miller, who you might have seen speak the last couple of years at JavaOne. The speaker list is sweet: Alex Payne and Bob Lee are doing the keynotes, with sessions by Charles Nutter, Dean Wampler, Stefan Schmidt, Guillaume Laforge, Jeff Brown, and Alex Buckley. I am doing a talk on iPhone/Android development. It will probably be pretty boring compared to the other sessions. I may try to spice it up with some shameless plugging of Scala+Android balanced by some Fake Steve Jobs quotes about Android.

Friday, July 31, 2009

War and Decision

Recently I started reading Doug Feith's War and Decision.

I say "started reading" because I have not finished it yet. The book is about how the Bush Administration and in particular the Defense Department reacted to 9/11 and managed the wars in Afghanistan and Iraq. I have read most of the material on 9/11 and Afghanistan, but have not gone into the deeper material on Iraq.

I would highly recommend the book. Feith takes a very analytical approach that is full of insight and is though provoking. However, the book certainly has its faults. If you take the book as an extended position paper, then I would view these faults as fatal. If you take the book as a memoir, then these faults are perhaps the most valuable aspects of the narrative.

First off, Feith devotes a lot of the book to calling out statements by journalists and political opponents, and completely disproving these statements. From a literary perspective, this is easily the greatest weakness of the book. It is really beneath Feith to so painstakingly thumb his nose at detractors. How much joy can we really take in disproving pundits? This just in, journalists exaggerate and twist the truth in the quest for sensationalism...

I guess you can forgive Feith for all of this, as it must be frustrating to have to hear so many false claims in the press on a daily basis. Fine. However, the other mistake that Feith makes is not as annoying, but is in many was much worse. The funny part is that it is a mistake that, if we buy into Feith's characterizations, that his boss, Donald Rumsfeld, would have immediately pointed out. Feith claims to build up logical arguments for the decisions made by the Bush Administration, but he fails to point out the assumptions behind these decisions and their severe consequences. According to Feith, Rumsfeld was notorious for demanding to know the assumptions made by his analysts and advisors, hence the irony of Feith's prose.

The most basic and essential example of this is the reaction to the 9/11 attacks. Feith & co. immediately started thinking about how to broaden the response beyond just the perpetrators. This assumption, that the response needed to be broad, is hardly questioned. The only hand-waving given for it is that such a response would be more like a police action (punish the criminals) and thus ineffective. Justifications are only developed after the initial idea is pursued. For example, one justification is that a broad military action is the most likely way to prevent further attacks without negatively affecting the American way of life. Maybe this is true, the book is certainly thought provoking, but this justification is only developed ex post facto.

So more examples... Feith states that Iraq having WMDs was a given that everyone just accepted. He says that in late 2001 nobody would have argued that Iraq did not have WMDs. Feith also states that another given was that another 9/11-ish attack was imminent. It is not hard to see that if you start out assuming that the U.S. needed to make a broad military reaction, that Iraq had WMDs, and that devastating attacks were imminent, then you have to conclude the U.S. had to invade Iraq.

Anyways, I will reiterate that the book is good read, especially if you are an opponent of the war, as I am. Feith does admit early on that the Administration did not do a good job of communicating to the public. He underplays this (at least in the first third of the book or so, perhaps this changes.) Poor communication and an act of war have no place together in a democracy, at least in my opinion.

Sunday, July 19, 2009

My HTML in Scala Wishlist

One of my favorite features in Scala is its native support for XML(link warning, shameless plug.) I love the "projections DSL" (for lack of a better term) that is the scala.xml package. It is great for slicing up XML, and is almost like having native XPath and XQuery built into the language. Using XML in Scala for HTML generation is not as universally appealing to people. Some people think it encourages mixing presentation and business logic and is not "clean." I have even heard folks say that this is a critical flaw in Lift. I completely disagree. Of course I also think MVC is an anti-pattern, but anyways... HTML support in Scala is great in my opinion, but could be even better. Here is my wishlist.
  • Better understanding HTML semantics. For example, it is great that the compiler won't let me do something like this <div>Hello world</di>. However, I would not want it to even let me do <di>Hello world</di>. Similarly I should not be able to stick a thead outside of a table. If I have an anchor tag with an href attribute, then I'd like the compiler to do some validation on the value. Am I asking for too much? Well, that's I'm supposed to do, right? Is some/all of this possible with the help of XML schema support? Maybe so.
  • CSS. HTML without CSS ceased to exist a long time ago. With all of the DSL possibilities in Scala, is there some way to allow CSS literals and create some type safety for me. For example, I if I fat finger the name of a style on a class attribute, the compiler should catch this. Also I can say from experience that introducing an object-oriented model is really useful in CSS. Finally, the XPath/XQuery-like support in Scala XML makes me think that something similar could be done for CSS selectors.
  • JavaScript. Honestly I don't know what I would want here. I like JavaScript, but I know the idea of being able to write your web application in a single language is very appealing to a lot of people. If nothing else, it would once again be nice if the compiler would prevent me from doing something like <button onclick="foo()" ...> if there as no function called "foo" in scope.
  • Tooling. Most modern IDEs support things like syntax highlighting, code completion, error checking, code formatting for HTML and XML+schema. I expect at least this much for XML/HTML literals in Scala. I'd also like to be able to step through and debug literals. Refactoring on subtrees of XML (extract method for example) would be sweet. Oh, and large XML literals should not overwhelm my IDE :-)

Wednesday, July 15, 2009

Functional Paradigms on Android using Scala

I am slowly plugging away on my little side project. One of the fun things about "pimping" an API, is you can tweak the design of the API. For example, I was modifying TextView, a class used for creating text input boxes of various sorts in Android. One common use case in Android is to create an auto-complete text box. To do this, you need to subclass TextView. Then you can override various lifecycle methods like onEditorAction, onKeyUp, etc. However, I thought that it might be nice to do something like this:

val textView = new TextView
textView.keyUp = (code:Int, event:KeyEvent) => {
// do stuff to your TextView
}

As Debasish pointed out to me, this is a lot like you would do in JavaScript. Is it better? It's a little easier in certain cases. If you just need to change the behavior of a single TextView in a very specific way, then this would seem like the way to go. Subclassing TextView just to override a single method and then using that subclass exactly once definitely seems like overkill. Now if you really wanted to create a new kind of TextView that you could use in multiple places, then subclassing makes sense.

One thing worth mentioning is how this kind of thing is handled in Cocoa on the iPhone. The Cocoa way would be to create an untyped delegate. In the delegate you could override the lifecycle methods that you were interested in. It's untyped, so you have to know the right names. You still wind up defining a new class, even though you usually only want to customize the behavior of one particular instance. Also, it's a little clunky as the TextView instance would have to be passed in to each of the delegate methods, so that your delegate could actually do something to the TextView. I find this a little better than having to subclass as you do in Android, but once again I think using Scala can yield some nice results.

There was another interesting case that popped up while sugaring TextView. I wanted to sugar the setCompoundDrawablesWithIntrinsicBounds methods. This method is overloaded, taking either four integers or four Drawables. Here is what I first thought of doing:

def compoundDrawablesWithIntrinsicBounds_=(bounds:Tuple4[Int,Int,Int,Int]) {...}
def compoundDrawablesWithIntrinsicBounds_=(bounds:Tuple4[Drawable,Drawable,Drawable,Drawable]) {...}
// this allows
myTextView.compoundDrawablesWithIntrinsicBounds = (1,2,3,4)
// or
myTextView.compoundDrawablesWithIntrinsicBounds = (d1,d2,d3,d4)

Of course this won't work. The type parameters of Tuple4 are going to be erased by the compiler, and the two methods are indistinguishable. However, The Amazing Mr. Spiewak gave me a solution:

def compoundDrawablesWithIntrinsicBounds_=[T <% Either[Int, Drawable]](bounds:Tuple4[T,T,T,T]){
bounds._1 match {
case Left => baseTextView.setCompoundDrawablesWithIntrinsicBounds(bounds._1.left.get, bounds._2.left.get,
bounds._3.left.get, bounds._4.left.get)
case Right => baseTextView.setCompoundDrawablesWithIntrinsicBounds(bounds._1.right.get, bounds._2.right.get,
bounds._3.right.get, bounds._4.right.get)
}
}

In addition to this, you need implicits to convert an Int to a Left and a Drawable to a Right. The API looks a lot different, but it allows exactly the same usage that I wanted. Another case of how you need some non-trivial Scala (though I wouldn't be surprised if there is a more elegant way than the above) to create an API that is super simple to use.

Wednesday, July 08, 2009

WebOS'es, Big Oil, and The Nutty Professor

I'm not going to talk about it. However, it reminds me of some things from my past...

First, let me take you back to 2001. I was working for a start-up in the East Bay called RMX. We claimed to be the "Yahoo of convenience stores." In other words, we were a portal for owner/operators of convenience store/gasoline stations. We were backed by Chevron. All of their non-company stores used RMX to get data from Chevron, like when their next delivery of gas would be and how much it was going to cost them. They would also get promotional stuff from Chevron and other folks who sold lots of goods at convenience stores. Anyways, noticed I said non-company stores. Most gas stations selling Chevron gas are not owned by Chevron, they simply buy their gas from Chevron. Those stores were all required (by Chevron) to use RMX. Ch-ching. However, Chevron's so called "company owned, company operated" stores, or CoCos, did not use RMX :-( The reason? They did not have computers in these stores for fear that employees would waste time surfing the web and playing solitaire.

Being a savvy start-up whose income was tied to the number of stores using our service, we came up with a clever idea. We built the first WebOS. Ok, so not really, but close. We built a Windows program that would take over the Windows shell. So no explorer.exe for you. The new shell, would embed an IE control with the page automatically opened to the RMX portal. There was no navigation bar, or bookmarks, etc. If you tried to leave the RMX portal, we'd catch it and send you back. If you tried to close the shell, it would reboot the computer. There was no escape! WebOS FTW!

Anyways, many years later I would join another start-up called Sharefare (later renamed Ludi Labs.) Our goal was to build a WebOS! Our vision of a WebOS was a platform that handled all of your common tasks: email, bookmarks, photo/video/music management, blogging, etc. It was all through the web with all of your data stored in ... no not the cloud. We did not use that buzzword (maybe we should have.) Instead we had what we called a "cell architecture" and your data was stored in your cell. This was the brainchild of our founder/CEO, The Nutty Professor (Alan Bush, who is a great guy and brilliant to boot.) Our architecture was to have a UI layer on top of the WebOS. We called this the WebTop and that was my job. I wrote a programming language for the WebTop and implemented the language using a flammable combination of Java and JavaScript. It was a lot of fun, and probably the most academic exercise I will ever get to experience in my professional career. We never got to put it in the hands of customers, but at least I got a patent out of it.

So there, I've done the WebOS thing. Twice. Do people want this? I have no idea. In one case, we created a super-simplistic "OS", forced it on people, and did not care about their feedback (ok technically Chevron did the second two things.) In another case, we went for it all, but we never got the thing out the door, partially because it was a huge task.

Midseason Baseball Thoughts

It's the middle of the MLB season, and the All-Star Game is next week. This season has had its share of surprises so far. Since I live in the Bay Area, I must recognize the surprising season of the San Francisco Giants. They currently have the second best record in the National League. They are far back of the Dodgers, but would still be in the playoffs as the NL wildcard, if the season was over today. I have to admit that I keep wondering: are the Giants for real? The answer: Yes!

Based on the Giants runs scored vs. runs allowed, the Giants expected wins at this point is 46.9. Their actual wins are 46. So they have not been "lucky" in terms of wins and losses. Their 303 runs allowed is the best in baseball. The pitching has been awesome. The offense has been middle of the pack, but hey that is not as bad as many (including me) expected. So the Giants are for real because they have the best pitching in baseball. Will this continue to be true in the second half of the season?

It is not a slam dunk. Lincecum is even more dominating this year than he was last year when he won the Cy Young. Some people worry that he is pitching too many innings given his age, but I won't jump on that bandwagon. He is one of the Giants' two All-Stars. The other is Matt Cain. Now I love Matt Cain. He is from Dothan, Alabama. That is just two hours north of my hometown in Florida, and is where my brother lives. However, Cain has been a little lucky this year. On balls put in play, Cain's batting average against is a little. Plus he's been lucky with stranding base runners. If this luck does not continue, then you can expect his hits allowed and runs allowed to both jump up a bit. Cain's a flyball pitcher too, so a jump in home runs allowed is always possible. Randy Johnson has been ok, but more importantly healthy. He just suffered his first injury of the season, which does not look like it will be too bad. However, it's hard to expect him to stay healthy the rest of the season. Barry Zito has been better than last year (which is not saying much) but could actually do even better in the second half. The fifth spot in the rotation has been in flux, so things could really only get better for that spot.

Enough about the Giants... As for the rest of the NL. My favorite team, the Atlanta Braves, are mediocre and there's no reason to expect a lot different from them. The Phillies are better than their record, while the Marlins are a lot worse than their record. The Cardinals are also not quite as good as their record, but the rest of the NL Central is very mediocre. The Rockies have been a huge surprise, and they look a legitimate contender for the wildcard spot. They should be the strongest challenger to the Giants, and of course the two teams will get plenty of chances to play each other as they are both in the NL West.

As for the AL, arguably the four best teams are all in the AL East. The Yankees have a nice lead in the wildcard over Tampa Bay. Still, Tampa Bay has proven that last year was not a fluke. The Yankees are definitely getting a nice ROI on all the money they spent on pitching. They are only a game back of Boston. Both teams look to improve in the second half. The Yankees will get three months of Alex Rodriguez, and the Red Sox's David Ortiz seems to have finally regained his form. The AL Central is a mess, with three pretty good teams within 2 games of each other. It's easy to count out Minnesota, but statistically they have been the best of the three teams. Finally, the AL West is a two horse race, and the two teams, LA and Texas, are statistically very even.

Thursday, July 02, 2009

Hacking My iPhone 3GS

If you are a long time reader of this blog ... nevermind there are no long time readers of this blog. Anyways as it turns out, two of my most popular blog posts of all time have been about hacking my phone. First there was Hacking my A950. That was about enabling MP3 playback and Bluetooth dial-up-networking on my Samsung A950. A year later came Hacking my 8830. This was mostly about useful tips for the Blackberry, along with a failed attempt to get GPS working (thanks for nothing Verizon.)

Two weeks ago I deactivated my Blackberry after two years of excellent service, and bought the new iPhone 3GS. I have been working on iPhone apps (one app in particular) so this made sense. Oh and the iPhone is the best phone ever. The Exchange support is so good on it that it is an easy move for a Blackberry user, but I digress. Back to the point. As a rite of passage, it was time for me to write about hacking the iPhone.

There are two things that I will point you would-be hackers to. First up is ringtones. Just like with the Blackberry, there is no need to spend any money on ringtones for the iPhone. There are a few ways to create ringtones on the iPhone. This is the guide that got me going. If you don't want to read it (you should) the idea is simple. A ringtone for the iPhone is an AAC encoded, .m4r file. So clip a song down to less 40 seconds, convert it to AAC, and rename the file from .m4a to .m4r. It is a painless process. I created a half dozen or so ringtones in a few minutes, mostly Girl Talk songs...

The other tip is also pretty well known. The iPhone OS 3. 0 has support for tethering and MMS, but neither is currently supported by AT&T. However you can hack this at the software level, and pwn AT&T. Here is the best guide I have found for this. What is great is that it only involves going to a web page in Mobile Safari, and it is easy to undo. For me it works great for tethering, but MMS did not get enabled.

Thursday, June 25, 2009

iPhoto Faces Awesomeness

I've been using iLife '09 since it came out. I've made heavy use of its integration with Facebook and Flickr. One of the features that I've wanted to experiment with is Faces. This is face recognition technology. Using it is simple. You have to train it by taking a few pictures and telling iPhoto who are in those pictures. This is very easy. Look at a picture, click on the "Name" button, and iPhoto immediately recognizes the faces in the picture. There is a little text box next to each face, and you just type in the name. It remembers previously typed names or names from your Address Book, so there is usually very little typing to do. Obviously the more training you do, the better the results that iPhoto will have.

Despite this being so easy, I've been too lazy to use this. Until now. I was importing a handful of pictures taken with my iPhone. These were mostly pictures of my boys playing and being goofy. Here is a picture of Michael, Jr. (age 5):

And here is a picture of Raymond (age 3):

Now the reason that I show the pictures and mention the ages will become obvious momentarily. Back to the narrative. After tagging my kids in these pictures, iPhoto was ready to show me other pictures with the same faces. The very first one that it showed me nearly made me cry:

Yes that is my oldest son. He was just six months old in that picture. The pics I used to train iPhoto were all very recent. It was really cool that it recognized his face, even though he certainly looks a lot different today than he did four and a half years ago. Similarly, iPhoto served up this gem for Raymond:

Monday, June 22, 2009

Buyer Beware: ProFlowers

Last month I was in Israel for three weeks. These three weeks included the date of my anniversary. I still wanted to do something nice for my wife on our anniversary, so I decided to send her flowers. I used a company that I had seen an ad for ProFlowers. While I was checking out, they gave me an offer for $15 off my order. All I had to do was fill out a survey, which I did. As part of this process I agreed to sign up for a service called EasySaver. Joining was oh-so easy as ProFlowers shared my credit card information with EasySaver. Of course EasySaver then starts charging me $15 per month for membership.

Now I was clearly stupid for falling for something like this. I was quite embarassed when my wife noticed the charges on our credit card. Luckily we only got nailed for May and June. Anyways, clearly my fault. However, I really hate shady operations like this. I will never use ProFlowers again. As far as I can tell, EasySaver is completely worthless anyways. Hopefully other folks will be smarter than me and not fall for scams like this.

Thursday, June 18, 2009

Scala and Android

Recently I was working on an article for developerWorks about using Scala on Android. Stay tuned to developerWorks to see that article. Since then I have started a little side project to create a set of Scala libraries for working on Android. Now you may ask, why bother with this? Is this just a classic case of mental masturbation? Well, no, at least I hope not.

I think Scala is well suited as a programming language for mobile devices. I am not an expert on all mobile platforms, my experience is mainly around iPhone and Android development. For the iPhone you are developing in Objective-C, and for Android in Java. Now in general, I think the Android development environment is superior in almost every way to the iPhone. IDE is mostly a matter of taste, but Eclipse+ADT certainly has a lot more features than XCode. Debugging is definitely better on the Android. It is much easier to debug on a device, and the emulator gives you more ways to vary the environment (cpu/memory/network resources) for more realistic emulation. The only thing advantage I would give to iPhone is that XCode and the iPhone emulator both startup much faster than Eclipse and the Android emulator.

Programming in Java gives you a lot more options than Objective-C. There are just a lot more open source Java libraries out there to help you avoid reinventing wheels. In addition, I think the platform specific APIs added by Google are much better than the ones on the iPhone. A simple example is XML processing. Java and Objective-C both already have XML parsing APIs. However, Android adds extra APIs as they recognize this as a common task for mobile applications that leverage web services. They added convenience APIs for SAX based parsing. They also leveraged a StAX-style parser by leveraging an Apache project. There is nothing like this in iPhone, despite the fact that the iPhone has been around longer and should be more mature. It may be an oversimplification, but I think this comes back to the corporate cultures that produced these platforms. Apple has always been a company that uses software to sell hardware, and this continues with the iPhone. Google is much more of a software company, with strong ties to open source software. It should be no surprise that Android is a much more development friendly platform. And don't even get me started when it comes to garbage collection (Android has it, iPhone does not, despite the fact that Objective-C 2.0 has it.)

So Android is great, why do we need Scala? First there is verbosity. Actually both Objective-C and Java are quite verbose -- Objective-C is even more verbose than Java. Scala is a much more concise language than either. Objective-C can actually be more dynamically typed than Java/Scala, but this rarely makes the code any more concise or readable, and often makes it harder to debug. Next, neither Objective-C or Java have much support for functional programming. This is a bigger deal than you might think at first. UI frameworks are full of events that you must write code to handle. Writing a function to handle those events is so much easier than writing a class. On the iPhone you can often set any type of object as an event handler, and you just have to know the magic methods to implement. On Android, you usually have an interface to implement, and that interface has a single method. This is actually better than what you do on the iPhone, but the interface and its implementation class are both classic cases of boilerplate.

Thus it is my goal to make it possible to write Android applications in a much more concise way by using Scala. In general, you can replace Java code with Scala that is much easier to read and understand. I think this will be especially true on Android, and the contrast will be even more true.

The last reason that I think Scala is a great language for mobile development is threading. Both iPhone and Android support multi-threading, as any UI framework must. Any good UI will perform any non-trivial operations on a background thread, to keep the main UI thread responsive to the user. However, threads are tricky, especially this kind of multi-threading. Updating your UI from a background thread is a recipe for disaster. Android enforces this by using a Handler on the main thread that your background threads send messages to. If you are familiar with Scala then message passing as a way to implement multi-threading should ring a bell. This is exactly how actors work in Scala. I haven't tried using Actors on Android yet, but it sure seems like a good fit.

So there is some of the motivation. I think Scala on Android could be really compelling. My side project is pretty small so far: a handful of wrapper classes and implicit conversions to bring some syntactic sugar and functional programming into the mix on some common Android classes. The project also has a special build of the Scala library that I found necessary to get the library to work on Android and its compiler. I am hoping to add more sugar and to add the actors ideas listed above. I am also hoping to come up with a nice way to integrate something like Proguard to optimize the bytecodes that get compiled for Android.

Tuesday, June 16, 2009

HTML 5: Don't Believe The Hype

In case you haven't heard, HTML 5 is The New Hotness. I have been watching this tide rising for awhile. There have been many skirmishes between The Unite Emirates of JavaScript and The Kingdom of Flash over how plugin technologies were going to be made obsolete by HTML 5. To hear some folks talk about the future of HTML 5, it is a veritable morality play. HTML 5 is all about the nobleness of open source and open standards. It is democracy and freedom. Everything else is closed and tyrannical.

Things really took off a few weeks ago at Google's IO conference. GOOG officially put its chips in the HTML 5 pot. This should not have been a surprise, as Google has invested in improving browser technology first through Gears and then through its own browser, Chrome. Of course Chrome actually uses the now ubiquitous WebKit for its rendering engine. This is at the heart of not only Apple's Safari browser, but also the mobile browsers on the iPhone, Android, and Palm Pre. Google used the "standardization" of WebKit on mobile devices to deliver mobile versions of GMail and Calendar thus punting on native applications.

So HTML5 is the future, right? Not so fast. I know it seems this way to a lot of people, especially the emphatic fans of open standards. However, those guys live in a different world than I. In the world I live in, browser fragmentation is at an all-time high. In a world of fragmentation, standards are an amusing concept. If history has taught us anything, it is that standards are set by whoever has the most users.

This has long been a source of pain for Google and others. In all fairness, IE8 implements some of the HTML 5 features, but not all of them. Also, the IE family of browsers has been losing share for several years. However, it is going to be awhile before web developers can forget about IE6, not to mention IE7. Given the update cycle of IE in the past, we can expect the older versions to stick around a long time before IE8 finally becomes the king. Again it does not even implement all of the highly touted HTML 5 features. Maybe more will be supported in IE9? How long before that becomes the dominant browser?

My point is that if you want to hitch yourself to HTML 5, you are not hitching yourself to Firefox, Safari, and Chrome. Well ok, you can hitch yourself to those browsers (or IE+Gears), but you are saying go away to 60-70% of the world. So assuming you don't want to do that, then hitching your app to HTML5 means hitching your app to future IE adoption. That just seems like a scary proposition to me.

But what about mobile devices? Here things are clearly better. Again the standards are set by the dominant player. WebKit is the dominant technology in high-end mobile browsers. Still there is fragmentation. The exact same APIs do not work on Mobile Safari as on the Android browser (don't know about the Pre.) Thus you may have to do some abstractions to smooth over the differences. Oh wait, that is starting to sound a lot like the situation with desktop browsers... Anyways, many HTML 5 features are present on the iPhone and Android browsers, so you have much more latitude to use them. However, there is another, more important question to answer when considering going down the path of GMail/Calendar on the iPhone/Android. Can a browser based be as good as a native app? Probably not, but maybe it's "good enough." However, I have to wonder, how many GMail users use the GMail web app on their iPhone instead of using the native Mail application? The GMail web app gives you all of the features you are used to, like conversations, labels, stars, and awesome search. The native Mail app provides none (or at best much weaker implementations) of this, but yet I would guess that most people still prefer it over the web app.

Sunday, June 07, 2009

Scala Concurrency

I got to see Scala creator Martin Odersy speak twice last week. First time was at the June BASE meeting, and the second was at the Scala LiftOff. At both meetings, he stated that he wants Scala to be the best language for concurrent programming. He also mentioned two types of concurrency. One is the type you hear about most often, where processes work in parallel and need to simultaneously affect some shared data or global state. This is the kind of problem that becomes hard to deal with using primitives common in Java and C++, but can be a little easier using Actors in Scala. The other kind is where a big process needs to be split up into smaller ones that can be run in parallel. It wasn't clear to me if Actors were a good fit for this or not, or maybe improvement is needed. So I decided to experiment with this.

I took one of the algorithms I wrote for my JavaOne talk on language performance, and I used Actors to do some processing parallel. In my original "driver" code, I took a list of files and sequentially passed each file to the sort class. In the new code, I created an Actor to pass the file to, and the Actor then fanned out the processing to an internal list of Actors that could sort the words in the file. When sorter Actors finish, they send back the results to the "master" Actor. Here is the code:

// messages
case class FileToSort(fileName:String)
case class SortResults(fileName:String, words:List[String])

class WordSortActor(master:Actor) extends Actor{
start
def sortedWords(fileName:String) = {
val dataFile = new File(fileName)
var words:List[String] = Nil
Source.fromFile(dataFile).getLines.foreach(
_.split(" ").foreach(words ::= _))
words.sort(_.toLowerCase < _.toLowerCase)
}
def act {
loop {
react {
case FileToSort(fileName:String) => {
println("Sorting file = " + fileName +
" on thread " + Thread.currentThread.getName)
val words = sortedWords(fileName)
master ! SortResults(fileName,words)
// why doesn't sender work instead?
//sender ! SortResults(fileName, words)
}
}
}
}
}

class SortMaster(docRoot:String, numActors:Int) extends Actor{
private
var sorted : List[List[String]] = Nil
var workers:List[ScalaWordSort] = Nil
var latch:CountDownLatch = null
start

def beginSorting {
workers = (for (i <- 0 until numActors)
yield new WordSortActor(this)).toList
val dir = new File(docRoot)
var cnt = 0
dir.list.foreach(file =>{
workers(cnt % numActors) ! FileToSort(docRoot + file)
cnt += 1
})
latch = new CountDownLatch(cnt)
}

def waitUntilDone = latch.await

def act {
loop {
react {
case SortResults(fileName:String, words:List[String]) => {
sorted ::= words
latch.countDown
println("Received results for file=" + fileName)
}
}
}
}
}


So I don't know if this is a good use of Actors or not. Seems like you might need to tune the number of threads in the thread pool sitting behind the pool of Actors. It seems like the pattern might be useful. It takes advantage of Actors for combining the results done in parallel. The results of each sort are sent back to the master, who could then do useful things. In this example, it only dumps the results into a list, but you could imagine some kind of coordination, like adding to a map, that would require locking in a typical Java implementation.

One other strange thing, in the WordSortActor, I tried to send the results back to the sender, but this did not seem to work. I had to keep an explicit reference to the master Actor, and then it worked fine. One other annoying thing that came out of this little exercise ... I discovered how hard it is to debug Actors. I was using IntelliJ, and it was completely useless to set a break point in an Actor. This is not the case with Java threads. I did not try Eclipse or NetBeans.

Friday, June 05, 2009

JavaOne Talk: Performance Comparisons of Dynamic Languages on the Java Virtual Machine

Below are the sldies. Major thanks to Charlie Nutter for great feedback and advice on the Ruby code and tuning JRuby performance. Thanks to Chouser and Timothy Pratley for help with the Clojure code. And major thanks to Brian Frank for help with the Fan code.

JavaOne Talk: RIAs Done Right

Thursday, June 04, 2009

Liveblogging of Scala Actors Talk from JavaOne

Today I was attending a talk on Scala Actor by Phillipp Haller and Frank Sommers. Tyler asked me to live blog it. I thought that was a good idea, and that Twitter would be the way to do it. This did not work out, as Twitter shut me down for too many tweets in a given hour. So now I am posting the full live blog, both the tweets that went through and the ones that Twitter rejected. However, pulling out the text of all of the good tweets is not totally straightforward. I figured the easiest was to do this was to use some Scala with the Twitter API via the REPL:


scala> import java.net._
import java.net._

scala> import scala.xml._
import scala.xml._

scala> val url = new URL("http://twitter.com/statuses/user_timeline.xml?since_id=2035817713&max_id=2036208882&screen_name=michaelg&url: java.net.URL = http://twitter.com/statuses/user_timeline.xml?since_id=2035817713&max_id=2036208882&screen_name=michaelg&count=75

scala> val conn = url.openConnection conn: java.net.URLConnection = sun.net.www.protocol.http.HttpURLConnection:http://twitter.com/statuses/user_timeline.xml?since_id=2035817713&max_id=2036208882&screen_name=michaelg&count=75

scala> val tweets = XML.load(conn.getInputStream) tweets: scala.xml.Elem =


scala> (tweets\\"text").foreach{ node => println(node.text) }


The result of the last call produces the following:

This allows for many more actors than worker threads #javaone
Work stealing makes this even more efficient. Let worker threads steal work from other worker threads #javaone
So decouple from threads using thread pools from java.util.concurrent. #javaone
A naive impl of thread-per-actor has overhead from thread creation, context-switching and memory use. #javaone
How event-based actors decouples threads and actors. How the execution enviro is setup. How the DSL works #javaone
Now time to go under the hood of Scala Actors #javaone
Only use receive when you must return a value from an Actor #javaone
Use react whenever possible and only use receive when necessary #javaone
When to use receive vs. react? #javaone
Replace while(true) { ... } with loop { ... } #javaone
Go from 5,000 actors/JVM to 1,000,000/JVM using event based Actors #javaone
Replace receive with react, receiveWithin with reactWithin, and while(condition) with loopWhile (condition) #javaone
To scale to a large number of Actors, go for event based Actors #javaone
The sender of a message is still sent across the network #javaone
Everything else works exactly the same as it would for local actors #javaone
Use the select function to get a proxy to the remote Actor #javaone
To make an actor remote is very easy. Call the alive function and the register function #javaone
So far everything has been within a single VM, but Scala Actors scales using remote actors #javaone
To specify timeouts, use receiveWithin. This takes a timeout and sends a TIMEOUT message #javaone
All of this makes it easy to keep track of subscribers and send them messages, like new chat messages #javaone
You can also wait for futures using receive function #javaone
When you try to access the future, it will then block if the reply has not been received #javaone
For async with assurance that the message was received using double-bang !! Returns a future representing the reply #javaone
Synchronous messages are allowed too using !? This blocks until reply received #javaone
Bang is asyc, returns immediately and passes a reference to the sender #javaone
To subscribe a user to the chat room, just send a message using the "bang" operator (method): ! #javaone
There is another API shortcut for creating an actor inline using the "actor" function #javaone
Creating a subscription is just a matter of sending message stating as much. Lets recipient capture the state #javaone
Common pattern of "child actors", benefits from asych communication of actors #javaone
Make each user an Actor as well to maintain state, such as who sent a message to the chat room #javaone
Now going back to the chat application to demonstrate Actor subscriptions #javaone
Patterns are tried in order, and the first match wins. No fall-through. Looks like a factory method call. #javaone
This is all based on Scala's pattern matching, think Java's switch statement but on steroids #javaone
If no message in mailbox, the actor will suspend until there is a message. Syntax for matching messages is super simple #javaone
Use the receive function to process messages within the act method of your Actor. Receive gets a message from the Actor's mailbox #javaone
Defining messages is made easy by using Scala's case classes. These are pure data classes typically. #javaone
Creating an actor is very similar to creating a thread in Java. #javaone
Creating an Actor is easy, just do 3 things. Extend Actor. Implement act method. Start the actor. #javaone
Frank showing a chat application architecture that uses Actors "the quintessential Actor system" #javaone
Actor implemented entirely as a DSL, but part of Scala std. library. Used in big systems such as Twitter #javaone
Actors can be local or remote, no change to programming model. #javaone
Actors are event based in Scala, not tied to Java threads. So much more lightweight. A single JVM can have millions of actors #javaone
Actors use several Scala language features: pattern matching, closures, traits/multiple inheritance, and Java interop. #javaone
Actors in Scala are probably the closest implementation of Erlang model on the JVM #javaone
OTP: library of Erlang modules for building reliable distributed systems #javaone
Erlang has support for Actors built-in. Erlang VM is process-oriented, makes Actor model very scalable. #javaone
Best known example of Actors though is from Erlang, "a pure message passing language" #javaone
Actors are an established pattern. Research dates to 70s at MIT in Smalltalk and Simula #javaone
Mutable messages are also possible. Actors can also be local or remote -- same programing model #javaone
Actors share nothing, private state. Thus no synchronization. Actors send messages, but the message are immutable #javaone
Actors are active objects and has a mailbox. Actors consume messages, perform actions but the code is sequential within an actor #javaone
There is an existing concurrency model that fits with the described solutions: Actors. #javaone
The solution: DSL + sequential programs. Shared nothing. Message passing. Asynchronous communication #javaone
Why learn about Scala actors? Concurrency presents opportunity and problems for developers, especially large teams of developers #javaone
Frank and Phillip are writing a book on Scala actors #javaone
Phillip Haller and Frank Sommers talking about Scala actors now #javaone

So that's the good tweets. Here are the ones that did not go through...


The actor method takes a closure creates an Actor
The react method saves pattern matching as a closure to be executed later
Uses exception for control flow, any code after react will not execute
Actors usually execute on different threads, but you can control what threads
This is useful for threadLocals and for Swing
A demo of SwingActor
Example of using link and trapExit, but not much detail of this shown
Scala Actors provide a simple programming model to write concurrent code on the JVM
Many projects already use actors: Twitter, Lift, Scala OTP (@jboner), partest
Future of Scala Actors in 2.8
Pluggable schedulers -- create actors with daemon-like behavior for example
Integrating exceptions and Erlang-style error handling. Get the best of both Java and Erlang error handling
Support for continuations. Currently requires optional compiler-plugin. Allows for more flexible control structures
Actor migrations (from machine to machine) could also be done with continuations
More static checking of messages, in particular an @immutable annotation
Actor isolution -- no accidentally sharing mutable state
This will allow for better passing mutable messages for performance benefits
No problem passing large methods, as they are passed by reference

Wednesday, May 20, 2009

JavaOne Talk: Scala Reversible Numbers

See this post about why this code is being shown. See this post for the Java version to compare against.

class ScalaReversible(max:Int){
import java.lang.Character._
lazy val numReversible = (11 to max).filter(reversible(_)).size
private
def reverse(n:Int)=n + parseInt(n.toString.reverse)
def allOdd(n:Int) = n.toString.map(digit(_,10)).forall(_ % 2 == 1)
def reversible(n:Int) = allOdd(reverse(n))
}

JavaOne Talk: Python Reversible Numbers

See this post about why this code is being shown. See this post for the Java version to compare against.

class PyReversible(object):
__slots__ = ['max','numReversible']

def __init__(self, max):
self.max = max
self.numReversible = self.countReversible()

def countReversible(self):
return len([i for i in xrange(11,self.max+1) if self.reversible(i)])

def allOdd(self,n):
for ch in str(n):
if int(ch) % 2 != 1:
return False
return True

def reverse(self,n):
return n + int(str(n)[::-1])

def reversible(self,n):
return self.allOdd(self.reverse(n))

JavaOne Talk: Ruby Reversible Numbers

See this post about why this code is being shown. See this post for the Java version to compare against.

class RubyReversible
def initialize(max)
@max = max
@numReversible = nil
end

def allOdd(n)
digits = n.to_s.split(//)
digits.length == digits.map{|c| c.to_i}.select{|i|
i % 2 == 1}.length
end

def reverse(n)
n + n.to_s.reverse.to_i
end

def reversible(n)
allOdd(reverse(n))
end

def countReversible()
@numReversible ||= (11..@max).select{|i|
reversible(i)}.length
@numReversible
end
end

JavaOne Talk: Groovy Reversible Numbers

See this post about why this code is being shown. See this post for the Java version to compare against.

public class GroovyReversible {
final Integer max
final Integer numReversible = 0
public GroovyReversible(max){
this.max = max
this.numReversible = this.countReversible()
}

def countReversible(){
numReversible ?: (11..max).findAll{reversible(it)}.size()
}

def reversible(n){
allOdd(reverse(n))
}

def allOdd(n){
n.toString().toList().collect {it.toInteger()}.every{it % 2 == 1}
}

def reverse(n){
n + n.toString().toList().reverse().join().toInteger()
}
}

JavaOne Talk: Reversible Numbers

See this post about why this code is being shown. This is a different algorithm, it is a brute force solution to Project Euler problem #145. Here is the Java "reference" implementation.

public class Reversible {
final int max;
final int numReversible;

public Reversible(int max){
this.max = max;
this.numReversible = this.countReversible();
}

private int countReversible(){
if (numReversible > 0){
return numReversible;
}
int cnt = 0;
for (int i=11;i<=max;i++){
if (reversible(i)) {
cnt++;
}
}
return cnt;
}

private boolean reversible(int n){
return allOdd(reverse(n));
}

private boolean allOdd(int n){
while (n > 0){
int remainder = n % 2;
if (remainder == 0) return false;
n = n / 10;
}
return true;
}

private int reverse(Integer n){
char[] digits = n.toString().toCharArray();
char[] rev = new char[digits.length];
for (int i=digits.length-1;i>=0;i--){
rev[i] = digits[digits.length -i-1];
}
return n + Integer.parseInt(String.valueOf(rev));
}
}

JavaOne Talk: Scala Word Sort

See this post about why this code is being shown. See this post for the Java version to compare against.

class ScalaWordSort(fileName:String){
lazy val sortedWords = {
Source.fromFile(dataFile).getLines.foreach(_.split(" ").foreach(words ::= _))
words.sort(_.toLowerCase < _.toLowerCase)
}
private
val dataFile = new File(fileName)
var words:List[String] = Nil
}

JavaOne Talk: Python Word Sort

See this post about why this code is being shown. See this post for the Java version to compare against.

class PyWordSort(object):
def __init__(self, fileName):
print "init " + fileName
self.dataFile = open(fileName)
self.words = []
def sortedWords(self):
if len(self.words) == 0:
for line in self.dataFile:
for word in line.split():
self.words.append(word)
self.words.sort(lambda w,w1: cmp(w.lower(), w1.lower()))
return self.words

JavaOne Talk: Ruby Word Sort

See this post about why this code is being shown. See this post for the Java version to compare against.

class RubyWordSort
def initialize(fileName)
@dataFile = File.new(fileName)
@words = Array.new
end
def sortedWorts
if (@words.length == 0)
@dataFile.each_line{ |line| line.split(' ').each{ |word| @words << word}}
@words = @words.sort{ |w,w1| w.upcase <=> w1.upcase }
end
@words
end
end

JavaOne Talk: Groovy Word Sort

See this post about why this code is being shown. See this post for the Java version to compare against.

public class GroovyWordSort {
File dataFile
List words = []
public GroovyWordSort(fileName){
dataFile = new File(fileName)
}

def sortedWords(){
if (!words){
dataFile.eachLine{it.tokenize().each {words.add(it) }}
words = words.sort{w,w1 -> w.toLowerCase() <=> w1.toLowerCase()}
}
words
}
}

JavaOne Talk: Word Sort

See this post about why this code is being shown. This is a different algorithm than the previous examples, so once again I will start with the Java version. The program takes a file name, and reads the file into memory. The file contains a list of randomly generated words, separated by spaces. The program does a case-insensitive sort of all of the words. Here is the Java "reference" implementation:

import java.io.*;
import java.util.*;

public class WordSort {
private final File dataFile;
private final List<String> words;
public WordSort(String fileName){
dataFile = new File(fileName);
words = new ArrayList<String>();
}

public List<String> getSortedWords(){
if (words.size() > 0){
return words;
}
try {
BufferedReader reader = new BufferedReader(new FileReader(dataFile));
try{
String line;
while ( (line = reader.readLine()) != null){
for (String string : line.split(" ")){
words.add(string);
}
}
} finally {
reader.close();
}
Collections.sort(words, new Comparator<String>(){
public int compare(String s, String s1) {
return s.toLowerCase().compareTo(s1.toLowerCase());
}
});
return words;
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}

Monday, May 18, 2009

JavaOne Talk: Fan Prime Sieve

See this post for an explanation on why this code is being shown and what kind of responses would be useful. See this post for the Java algorithm to compare against.

Note: This solution is courtesy of Mr. Fan himself, Brian Frank. Thanks Brian!

class Primes {
Int[] primes := Int[,]
Int size

new make(Int n) {
size = n
primes.capacity = n
init
}

private Void init() {
i := 2
max := calcSize
nums := Bool[,]
nums.fill(true, max)
while (i < max && primes.size < size) {
p := i
if (nums[i])
{
primes.add(p)
j := 2*p;
while (j < max - p) {
nums[j] = false
j += p;
}
}
i += 1;
}
}

Int last() { primes.last }

private Int calcSize() {
max := 2
while ((max.toFloat/max.toFloat.log).toInt < size && max < 0x7fff_ffff && max > 0) {
max *=2; // memory inefficient, but fast
}
return max;
}
}

Saturday, May 09, 2009

JavaOne Talk: Scala Prime Sieve

See this post for an explanation on why this code is being shown and what kind of responses would be useful. See this post for the Java algorithm to compare against.

import java.lang.Integer._
import java.lang.Math._
import scala.collection.mutable.ArrayBuffer

class ScalaPrimes(val size:Int){
val primes = new ArrayBuffer[Int]
var cnt = 0
init
def last = primes(primes.size - 1)

private
def init {
var i = 2
val max = calcSize
val nums = new ArrayBuffer[Boolean]
for (i<-0 until max) nums += true
while (i < max && cnt < size){
val p = i
if (nums(i)){
primes += p
cnt += 1
(p until max/p).foreach((j) => nums.update(p*j,false))
}
i += 1
}
}
def calcSize = {
var max = 2
while ( (max/log(max)) < size && max < MAX_VALUE && max > 0) {
max *= 2
}
max
}
}

JavaOne Talk: Python Prime Sieve

See this post for an explanation on why this code is being shown and what kind of responses would be useful. See this post for the Java algorithm to compare against.

from math import log
import sys

class PyPrimes(object):
__slots__ = ['cnt','primes','size']

def __init__(self,n):
self.primes = n*[0]
self.cnt = 0
self.size = n
i = 2
max = self.calcSize()
nums = max*[True]
while i < max and self.cnt < self.size:
p = i
if nums[i]:
self.primes[self.cnt] = p
self.cnt += 1
for j in xrange(p, max/p):
nums[p*j] = False
i+= 1

def calcSize(self):
max = 2
while max/log(max) < self.size:
max = max *2 # this makes the sieve too big, but fast
return max

def last(self):
return self.primes[self.cnt - 1]

Tuesday, May 05, 2009

JavaOne Talk: Ruby Prime Sieve

See this post for an explanation on why this code is being shown and what kind of responses would be useful. See this post for the Java algorithm to compare against.

class RubyPrimes
attr_accessor :cnt, :primes
def initialize(n)
@primes = Array.new(n,0)
@cnt = 0
i = 2
max = calcSize
nums = Array.new(max,true)
while (i < max && @cnt < @primes.length)
p = i
if (nums[i])
@primes[@cnt] = p
@cnt += 1
(p .. (max/p)).each{ |j| nums[p*j] = false }
end
i += 1
end
end
def calcSize
max = 2
max = max * 2 while ( max/Math.log(max) < @primes.length)
max
end
def last
@primes[@cnt -1]
end
end

JavaOne Talk: Groovy Prime Sieve

See this post for an explanation on why this code is being shown and what kind of responses would be useful. See this post for the Java algorithm to compare against.

public class GroovyPrimes {
int cnt = 0
final def primes = []
final int size

public GroovyPrimes(int n){
size = n
init()
}

int last() {
primes[cnt-1]
}

private void init(){
int i = 2
int max = calcSize()
def nums = [true]*max

while (i < max && primes.size() < size) {
int p = i
if (nums[i]) {
primes << p
(p..(max/p)).each{ j-> if (p*j < max) nums[p*j] = false}
}
i += 1
}
}

private int calcSize(){
int max = 2
while ((max / Math.log(max)) < size &&
max < Integer.MAX_VALUE && max > 0) {
max *= 2
}
max
}
}

JavaOne Talk: Being Less Efficient

See previous post if you have not already. I was getting ready to post some of implementations of the prime sieve in other languages, when I realized I had probably unfairly over-optimized the Java solution by using arrays. This kind of optimization is available in some languages, but not others. So it would be hard to do a fair comparison if this optimization was used at all. So here is the Java prime sieve, but using resizable lists instead of primitive arrays:

import java.util.List;
import java.util.ArrayList;

public class Primes {
final private List<Integer> primes;
private final int size;

public Primes(int n){
size = n;
primes = new ArrayList<Integer>(n);
init();
}

private void init(){
int i=2;
int max = calcSize();
List<Boolean> nums =
new ArrayList<Boolean>(max);
for (int j=0;j<max;j++){
nums.add(true);
}
while (i < max && primes.size() < size){
int p = i;
if (nums.get(i)){
primes.add(p);
int j = 2*p;
while (j < max - p){
nums.set(j,false);
j += p;
}
}
i += 1;
}
}

public int last(){
return primes.get(primes.size()-1);
}

private int calcSize(){
int max = 2;
while ((max/Math.log(max)) < size &&
max < Integer.MAX_VALUE && max > 0){
max *=2;
}
return max;
}
}

Sunday, May 03, 2009

JavaOne Talk: Prime Sieve

Next month I am speaking at JavaOne on the performance of various programming languages running on the JVM. For the talk, I am taking several algorithm/programming problems solved in each of the various languages, and comparing the relative performance and tuning tips. For each problem, I am using "native Java" as a reference point. This is the first of many blog posts that show these various programming problems solved in the various languages. I am posting these programs as a way for other folks to let me know where I've screwed up, i.e. done something that will negatively affect performance, etc.

I am not too interested in significantly different ways to solve these problems. For example, the first algorithm is a prime sieve. So I would not be interested in somebody letting me know that there are other ways to generate primes besides a sieve. However, I am interested in finding mistakes in my implementations. I will not claim to be an expert in any of these languages -- in fact I have been learning two of the languages, Fan and Clojure, specifically for this talk -- so there is a good chance that I did screw something up. Even in the languages that I'd like to think I am pretty good at (including Java), I figure there is always a good chance of me doing something screwy there too. So without further ado, here is my prime sieve in Java.


import java.util.Arrays;

// calculates the first n primes
public class Primes {
final private int[] primes;
private int cnt = 0;

public Primes(int n){
primes = new int[n];
init();
}

private void init(){
int i=2;
int max = calcSize();
boolean[] nums = new boolean[max];
Arrays.fill(nums, true);
while (i < max && cnt < primes.length){
int p = i;
if (nums[i]){
// add it to the list of primes
primes[cnt++] = p;
int j = 2*p;
// remove all the multiples of the prime
while (j < max - p){
nums[j] = false;
j += p;
}
}
i += 1;
}
}

public int last(){
return primes[cnt-1];
}

// calculates number of integers needed to find n primes
private int calcSize(){
int max = 2;
while ((max/Math.log(max)) < primes.length && max < Integer.MAX_VALUE && max > 0){
max *=2; // memory inefficient, but fast
}
return max;
}
}