Showing posts with label scala. Show all posts
Showing posts with label scala. Show all posts

Saturday, June 04, 2011

The Concurrency Myth

For nearly a decade now technology pundits have been talking about the end of Moore's Law. Just this week, The Economist ran an article about how programmers are starting to learn functional programming languages to make use of the multi-core processors that have become the norm. Indeed inventors of some of these newer languages like Rich Hickey (Clojure) and Martin Odersky (Scala) love to talk about how their languages give developers a much better chance of dealing with the complexity of concurrent programming that is needed to take advantage of multi-core CPUs. Earlier this week I was at the Scala Days conference and got to hear Odersky's keynote. Pretty much the second half of his keynote was on this topic. The message is being repeated over and over to developers: you have to write concurrent code, and you don't know how to do it very well. Is this really true, or is it just propaganda?

There is no doubt that the computer that we buy are now multi-core. Clock speeds on these computers have stopped going up. I am writing this blog post on a MacBook Air with a dual-core CPU running at 2.13 GHz. Five years ago I had a laptop with a 2.4 GHz processor. I'm not disputing that multi-core CPUs are the norm now, and I'm not going to hold my breath for a 4 GHz CPU. But what about this claim that it is imperative for developers to learn concurrent programming because of this shift in processors? First let's talk about which developers. I am only going to talk about application developers. What I mean are developers who are writing software that is directly used by people. Well maybe I'll talk about other types of developers later, but I will at least start off with application developers. Why? I think most developers fall into this category, and I think these are the developers that are often the target of the "concurrency now!" arguments. It also allows me to take a top-down approach to this subject.

What kind of software do you use? Since you are reading this blog, I'm going to guess that you use a lot of web software. Indeed a lot of application developers can be more precisely categorized as web developers. Let's start with these guys. Do they need to learn concurrent programming? I think the answer is "no, not really." If you are building a web application, you are not going to do a lot of concurrent programming. It's hard to imagine a web application where one HTTP request comes in and a dozen threads (or processes, whatever) are spawned. Now I do think that event-driven programming like you see in node.js will become more and more common. It certainly breaks the assumption of a 1-1 mapping between request and thread, but it most certainly does not ask/require/suggest that the application developer deal with any kind of concurrency.

The advancements in multi-core processors has definitely helped web applications. Commodity app servers can handle more and more simultaneous requests. When it comes to scaling up on a web application, Moore's Law has not missed a beat. However it has not required all of those PHP, Java, Python, Ruby web developers to learn anything about concurrency. Now I will admit that such apps will occasionally do something that requires a background thread, etc. However this has always been the case, and it is the exception to the type of programming usually needed by app developers. You may have one little section of code that does something concurrent, and it will be tricky. But this has nothing to do with multi-core CPUs.

Modern web applications are not just server apps though. They have a lot of client-side code as well, and that means JavaScript. The only formal concurrency model in JavaScript are Web Workers. This is a standard that has not yet been implemented by all browsers, so it has not seen much traction yet. It's hard to say if it will become a critical tool for JS development. Of course one of the most essential APIs in JS is XMLHttpRequest. This does indeed involve multiple threads, but again this is not exposed to the application developer.

Now one can argue that in the case of both server side and client side web technologies, there is a lot of concurrency going on but it is managed by infrastructure (web servers and browsers). This is true, but again this has always been the case. It has nothing to do with multi-core CPUs, and the most widely used web servers and browsers are written in languages like C++ and Java.

So is it fair to conclude that if you are building web applications, then you can safely ignore multi-core CPU rants? Can you ignore the Rich Hickeys and Martin Oderskys of the world? Can you just stick to your PHP and JavaScript? Yeah, I think so.

Now web applications are certainly not the only kind of applications out there. There are desktop applications and mobile applications. This kind of client development has always involved concurrency. Client app developers are constantly having to manage multiple threads in order to keep the user interface responsive. Again this is nothing new. This has nothing to do with multi-core CPUs. It wasn't like app developers used to do everything in a single thread, but now that multi-core CPUs have arrived, you need to start figuring out how to manage multiple threads (or actors or agents or whatever.) Now perhaps functional programming can be used by these kind of application developers. I think there are a lot of interesting possibilities here. However, I don't think the Hickeys and Oderskys of the world have really been going after developers writing desktop and mobile applications.

So if you are a desktop or mobile application developer, should you be thinking about multi-core CPUs and functional programming? I think you should be thinking about it at least a little. Chances are you already deal with this stuff pretty effectively, but that doesn't mean there's room for improvement. This is especially true if language/runtime designers started thinking more about your use cases.

I said I was only going to talk about application developers, but I lied. There is another type of computing that is becoming more and more common, and that is distributed computing. Or is it called cloud computing? I can't keep up. The point is that there are a lot of software systems that run across a cluster of computers. Clearly this is concurrent programming, so bust out the functional programming or your head will explode, right? Well maybe not. Distributed computing does not involve the kind of shared mutable state that functional programming can protect you from. Distributed map/reduce systems like Hadoop manage shared state complexity despite being written in Java. That is not to say that distributed systems cannot benefit from languages like Scala, it's just that the benefit is not necessarily the concurrent problems/functional programming that are often the selling points of these languages. I will say that Erlang/OTP and Scala/Akka do have a lot to offer distributed systems, but these frameworks address different problems than the multi-core concurrency.

It might sound like I am a imperative program loving curmudgeon, but I actually really like Scala and Clojure, as well as other functional languages like Haskell. It's just that I'm not sure that the sales pitch being used for these languages is accurate/honest. I do think the concurrency/functional programming angle could have payoffs in the land of mobile computing (desktop too, but there's not much future there.) After all, tablets have already gone multi-core and there are already a handful of multi-core smartphones. But these languages have a lot of work to do there, since there are already framework features and common patterns for dealing with concurrency in mobile. Event driven programming for web development (or the server in client/server in general) is the other interesting place, but functional languages have more to offer framework writers than application developers in that arena.  My friend David Pollak recently wrote about how the current crop of functional languages can hope for no more than to be niche languages like Eiffel. I think that he might be right, but not just because functional programming has a steep learning curve. If all they can offer is to solve the concurrency problem, then that might not be enough of a problem for these languages to matter.

Wednesday, January 20, 2010

JVMOne

This morning, my co-worker Jason Swartz had the great idea of a conference focussing on JVM languages. This seemed like a particularly good idea, given the uncertainty surrounding JavaOne. Personally, I think the JavaOne powers-that-be have done a good job showcasing other languages, especially the last two years. Anyways, I joked that we could probably host it at our north campus, since it has proper facilities and regularly hosts medium sized conferences,  and that we just needed the support of folks from the Groovy, JRuby, Scala, and Clojure communities. A lot of folks seemed to like the idea, and had some great feedback/questions. So let me phrase some of these questions in the context of "a JavaOne-ish replacement, focussing on alternative languages on the JVM, but ultimately driven by the community". Ok here are some of the questions.

1.) What about Java?
2.) What about languages other than Groovy, JRuby, Scala, and Clojure?
3.) What about first class technologies with their roots in Java, like Hadoop, Cassandra, etc.?

Certainly I have opinions on some of these, but obviously any kind of effort like this requires huge participation from the developer community. So what do you think? What other questions need to be asked? If there is enough interest, I will definitely try to organize it. So please leave comments below!

Tuesday, November 03, 2009

Persistent Data Structures

A few weeks ago I blogged about concurrency patterns and their relative tradeoffs. There were some really poor comments from Clojure fans -- the kind of abusive language and trolling that reminded me of Slashdot in its heyday. In fact, for the first time in a very long time, I had to actually delete comments... A lot of people get confused by the fact that I used Clojure as a representative for software transactional memory, and thought that I was talking about concurrency in Clojure. Anyways, I was flattered that Stuart Halloway commented on my blog. He said I needed to watch Rich Hickey's talk on persistent data structures. So I did. I wanted to embed it on my blog, but it didn't look like InfoQ supports that. So I stole his slides instead:
A little after all of this, I learned that persistent data structures were coming to Scala. So a lot of folks seem to think that these are pretty important. So why is that exactly? It was time to do some homework...

You can read some basic info on Wikipedia. You can go to the original source, Phil Bagwell's paper. The latter is very important. What's amazing is that these are relatively new ideas, i.e. the science behind this is less than ten years old. Anyways, I would say that the important thing about persistent data structures is that they are data structures designed for versioning. You never delete or for that matter change anything in place, you simply create new versions. Thus it is always possible to "rollback" to a previous version. Now you can see how this is a key ingredient for software transactional memory.

Going back to the original ideas... Bagwell's VLists are the foundation for persistent arrays and thus persistent hashtables in Clojure. These form the foundation of Clojure's Multi-Version Concurrency Control, which in turn is the basis of its STM.

These are not just persistent data structures, they are a very efficient implementation of the persistent data structure concept. Each incremental version of a list shares common data with the previous version. Of course you still pay a price for getting the built-in versioning, but this is minimized to some degree.

Scala fans should also be interested in persistent data structures. They are coming to Scala, which is supposed to be available in beta by the end of this month. The aforementioned Bagwell did his work at EPFL -- the same EPFL that is the epicenter of Scala. Indeed, Bagwell himself has contributed to improving the implementations of VLists in Scala. With VLists in tow, Scala STM is just around the corner.

Thursday, October 22, 2009

Concurrency Patterns: Java, Scala, and Clojure

Today I was at The Strange Loop conference. The first talk I attended was by Dean Wampler on functional programming in Ruby. Dean brought up the Actor concurrency model and how it could be done in Ruby. Of course I am quite familiar with this model in Scala, though it is copied from Erlang (which copied it from some other source I'm sure.) The next talk I went to was on software transactional memory and how it is implemented in Clojure. I had read about Clojure's STM in Stuart Halloway's book, but I must admit that it didn't completely sink in at the time. I only understood the basic idea (it's like a database transaction, but with retries instead of rollbacks!) and the syntax. As I've become more comfortable with Clojure in general, the idea has made more and more sense to me.

Now of course, no one size fits all. There are advantages and drawbacks to any concurrency model. A picture of these pros and cons kind of formed in my head today during the STM talk. I turned them into pictures. Here is the first one:

This is meant to be a measure of how easy it is to write correct concurrent code in Java, Scala, and Clojure. It is also meant to measure how easy it is to undersand somebody else's code. I think these two things are highly correlated. I chose to use these various languages, though obviously this is somewhat unfair. It wold be more accurate to say "locking/semaphores" instead of Java, the Actor model of Scala, and software transactional memory instead of Clojure -- but you get the point. So what does this graph mean?
Well obviously, I think Java is the most difficult language to write correct code. What may be surprising to some people is that I think Clojure is only a little simpler. To write correct code in Clojure, you have to figure out what things need to be protected by a dosync macro, and make sure those things are declared as refs. I think that would be an easy thing to screw up. It's still easier than Java, where you have to basically figure out the same things, but you must also worry about multiple lock objects, lock sequencing, etc. In Clojure you have to figure out what has to be protected, but you don't have to figure out how to protect it -- the language features take care of that.
So Clojure and Java are similar in difficulty, but what about Scala and the Actor model? I think this is much easier to understand. There are no locks/transactions. The only hard part is making sure that you don't send the same mutable object to different actors. This is somewhat similar to figuring what to protect in Clojure, but it's simpler. You usually use immutable case classes for the messages sent between actors, but these are used all over the place in Scala. It's not some special language feature that is only used for concurrency. Ok, enough about easy to write/understand code, there are other important factors, such as efficiency:

Perhaps this should really be described as memory efficiency. In this case Java and the locking model is the most efficient. There is only copy of anything in such a system, as that master copy is always appropriately protected by locks. Scala, on the other hand, is far less efficient. If you send around messages between actors, they need to be immutable, which means a lot of copies of data. Clojure does some clever things around copies of data, making it more efficient than Scala. Some of this is lost by the overhead of STM, but it still has a definite advantage over Scala. Like in many systems, there is a tradeoff between memory and speed:

Scala is the clear king of speed. Actors are more lightweight than threads, and a shared nothing approach means no locking, so concurrency can be maximized. The Java vs. Clojure speed is not as clear. Under high write contention, Clojure is definitely slower than Java. The more concurrency there is, the more retries that are going on. However, there is no locking and this really makes a big deal if there are a lot more reads than writes, which is a common characteristic of concurrent systems. So I could definitely imagine scenarios where the higher concurrency of Clojure makes it faster than Java. Finally, let's look at reusability.

By reusability, I mean is how reusable (composable) is a piece of concurrent code that you write with each of these languages/paradigms? In the case of Java, it is almost never reusable unless it is completely encapsulated. In other words, if your component/system has any state that it will share with another component, then it will not be reusable. You will have to manually reorder locks, extend synchronization blocks, etc. Clojure is the clear winner in this arena. The absence of locks and automatic optimistic locking really shine here. Scala is a mixed bag. On one hand, the Actor model is very reusable. Any new component can just send the Actor a message. Of course you don't know if it will respond to the message or not, and that is a problem. The bigger problem is the lack of atomicity. If one Actor needs to send messages to two other Actors, there are no easy ways to guarantee correctness.
Back in June, I heard Martin Odersky says that he wants to add STM to Scala. I really think that this will be interesting. While I don't think STM is always the right solution, I think the bigger obstacle for adoption (on the Java platform) is the Clojure language itself. It's a big leap to ask people to give up on objects, and that is exactly what Clojure requires you to do. I think a Scala STM could be very attractive...

Sunday, October 18, 2009

The IntelliJ IDEA Bomb

In case you missed the announcement from JetBrains yesterday, the popular IntelliJ IDEA has gone free and open source. Sort of. There is now a community edition, that is FOSS, and an enterprise edition that you must pay for. Why is this a big deal? Read on.

IntelliJ really revolutionized Java development. It wasn't the first IDE to allow for code completion, or even the first Java IDE to do this. However, it was definitely a pioneer in code refactoring. And this was huge. It took many of the code improvement ideas catalogued by Martin Fowler, and turned them into simple commands that any programmer could use. This also allowed for things like code navigation, where you could go from the usage of a class or a method to the implementation (or at least the declaration, if the usage only referred to the interface.) This was only the beginning. IntelliJ was a pioneer of bringing in the Java ecosystem. I remember how much easier IntelliJ made it to use Struts, for example.

This reminds me of my own personal use of IntelliJ over the years. When I first started working in enterprise Java development, I was working for a startup that built EJB applications targeted for WebLogic and WebSphere. For WebSphere, we used Visual Age for Java. For WebLogic, we just used Kawa (a bare bones editor.) I hated Visual Age with a passion. It scarred me badly. For years afterwards, I stuck to simple text editors.

Several year later (2003), I joined a fresh startup, KeepMedia (later renamed to MyWire). Prior to KeepMedia, I had a short stint with a .NET startup, Iteration Software (later renamed to Istante, and sold to Oracle.) There I used Visual Studio quite a bit, and appreciated the productivity gains it provided. So at KeepMedia, I took a look at various Java IDEs: JBuilder, JDeveloper, and IntelliJ. I was one of three programmers at KeepMedia, and one of my colleagues was really in love with Struts. He worked on the front of KeepMedia, but I worked mostly on our back end system that integrated magazine content from publishers. I didn't have to use Struts for that obviously, but I did occasionally help with the front end (we had three programmers after all.) I hated Struts, but IntelliJ made it more tolerable. The only negative about it was that it was flaky on Linux, but pretty much everything was flaky on Linux back then.

After I worked at KeepMedia, I worked at a consulting company, LavaStorm Engineering (2004). I had been using IntelliJ for a long time, and was convinced that everything else was crap. One of my colleagues introduced me to Eclipse. I was horrified. Most of Eclipse's codebase came from an all-Java rewrite of Visual Age (the version I had used was written in Smalltalk.) So even though it shared no code with the beast that I had hated, it shared a lot of the look-and-feel. For example, take a look at the outline view in Eclipse (any version, even the most recent.) This was completely taken from Visual Age. Even the icons are exactly the same. There was no way I was going to give up my IntelliJ for that thing...

A couple of years later, I was at another startup, Sharefare (which later changed its name to Ludi Labs.) Being a startup, there was no standardization arounds tools. However, everyone used Eclipse. I had warmed up to Eclipse (though I still preferred IntelliJ), and I had started writing Eclipse related articles for IBM. So I went with Eclipse there, as there were advantages to everyone using the same IDE (being able to share .project/.classpath, plugins, etc.)

After Ludi Labs went down in early 2007, I joined eBay. To say we're a major Eclipse shop, would be putting it mildly...
We're not the only major Java shop that has invested heavily in Eclipse plugins. If you're doing GWT, App Engine, or Android development, then you're probably using Eclipse too. I know I am. I didn't really start using IntelliJ again, until the past year. The reason was simple: Scala. It is the best IDE for doing Scala development currently. Actually IntelliJ's Scala offering vs. the competition, is similar to anything else. It's not that it necessarily has a lot more features, but it has similar and most importantly, it is of higher quality. Here's a great visualization of this key difference:

This is why the open sourcing of IntelliJ is important. It's not that we didn't already have great open source IDEs for Java. No, it's more about what does this mean for IntelliJ. Will the attention to detail go down hill? Will the stable plugin ecosystem be disrupted?

Of course, for most current and would-be users of IntelliJ, the availability of a free IntelliJ is really not big news. The free version is really quite limited. You're not going to build web apps, or apps that connect to databases, use web services, etc. Not with the community version, unless in-the-wild plugins became available that enable this. IntelliJ has awesome support for all of these things, but those parts of IntelliJ remain behind a price wall.

There is one notable exception here: Scala. Right now, IntelliJ has the best Scala support. It has more features and less bugs than NetBeans, and it is much more stable than Eclipse. Scala support is one of the features supported in the community edition. That means a lot of newbie Scala developers can just use IntelliJ. This is great news. I would not give NetBeans or Eclipse to Scala newbies, for the simple reason that both of them report syntax errors inconsistently. In other words, both are guilty of either not identifying an error, or identifying a false error (or both.) That's anathema for somebody learning a language. I'm able to put up with it, simply because I know just enough Scala to say "you're wrong IDE" at times. I've rarely seen this happen with IntelliJ.

Tuesday, October 13, 2009

Scala Manifests FTW

I came across a piece of code written by a colleague. It was a flexible XML/JSON parser. It would turn an XML or JSON structure into a map. The keys were strings. The values were either strings, lists, or maps. The lists could be lists of strings, lists, or maps. The maps had strings as keys and value as (wait for it) strings, lists, or maps. We had run across a bug recently. Usually a particular web service returned data that looked something like:
{ "details" : { "a" : "x", "b" : "y" } }
So we had code that looked like :
val response = // code that called the parser
val foo = response("details").asInstanceOf[Map[String,String]]("a")
However, one day we got some bad data:
{ "details" : "" }
So of course the earlier code blew up. I wanted to have something like this:
trait SafeMapTrait {
    def getString(key:String):String
    def getList(key:String):List[AnyRef]
    def getMap(key:String):Map[String, AnyRef]
}
Now this can be accomplished pretty easily:
class EasySafeMap(val map:Map[String, AnyRef]){
  def getString(key:String):String = {
    if (map.contains(key)){
      if (map(key).isInstanceOf[String]) map(key).asInstanceOf[String] else null
    } else null
  }
  // etc.    
}
There would be similar methods for lists and maps. I didn't like this, and thought I should be able to do better. Looking at the final solution, I'm not sure that I did. But I did learn some things about Scala Manifests... Before we get there, let's look at me first naive attempt to do better:
class NotSoSafeMap(val map:Map[String,AnyRef]){

  def getString(key:String):String = getType(key)
  def getList(key:String):List[AnyRef] = getType(key)
  def getMap(key:String):Map[String,AnyRef] = getType(key)

  private def getType[T](key:String):T  = {
    val value = map.getOrElse(key, null)
    if (value != null && value.isInstanceOf[T]) value.asInstanceOf[T] else null
  }

}
That would have been, huh? I really wanted to use a parameterized method for the extraction, comparison, casting. The problem with this is that there is no way to know the type T. You could explicitly add the parameter, i.e. getType[String](key) but it doesn't help because of erasure. I tried this instead:
class NotSoSafeMap(val map:Map[String,AnyRef]){

  def getString(key:String):String = getType(key,null)
  def getList(key:String):List[AnyRef] = getType(key,null)
  def getMap(key:String):Map[String,AnyRef] = getType(key,null)

  private def getType[T](key:String, default:T):T  = {
    val value = map.getOrElse(key, default)
    if (value.isInstanceOf[T]) value.asInstanceOf[T] else default
  }

}
I thought that this might be better because of the type information being given in the default value. This didn't work. Using the null default seemed dumb, but even adding defaults like the empty string, an empty list, etc. did not help. Erasure was once again kicking my ass. So it was time to learn about Manifests.
I had heard Jorge Ortiz talk about manifests previously. He has also written an excellent blog post about them. He told me that these were still "experimental" (i.e. undocumented) in Scala 2.7.x, but were officially part of the upcoming 2.8 release. Sounded good to me. Here is the solution I came up with:
class SafeMap(val map:Map[String,AnyRef]){
  import scala.reflect.Manifest

  def getString(key:String):String = getType[String](key) match {
    case Some(s:String) => s
    case _ => null
  }

  def getMap(key:String):Map[String, AnyRef] = getType[Map[String,AnyRef]](key) match {
    case Some(m:Map[String, AnyRef]) => m
    case _ => null
  }

  def getList(key:String):List[AnyRef] = getType[List[AnyRef]](key) match {
    case Some(list:List[AnyRef]) => list
    case _ => null
  }

  private def getType[T](key:String)(implicit m:Manifest[T]):Option[T] = {
    map.getOrElse(key, null) match {
      case a:AnyRef => if (m >:>  Manifest.classType(a.getClass)) Some(a.asInstanceOf[T]) else None
      case null => None
    }
  }
}
Ok, a few things to note here. First the local import of scala.reflect.Manifest. Again it's not a documented class, but it's in there. Now my getType method. Notice that it uses the function_name (param:type) (param:type) syntax. Also notice the implicit Manifest parameter. The callers don't add this, the compiler adds it for you. Next notice that it returns an Option class. I wanted it to just return T. However, I could not have a case where it returned null if T was the declared return type of the method. So I went with Option. Finally, notice the Manifest magic. That's the m >:> Manifest.classType(a.getClass). The right hand side of the call uses a factory method in the Manifest singleton object, to create a Manifest for the (class of the) value coming back from the map. The >:> operator checks to see if the right hand side represents a subclass of the left hand side. This is important. For the getMap method, the manifest will represent the Map trait (actually a Java interface in this case.) The call to a.getClass gives you the runtime class of a. Of course this runtime class implements the Map trait, but you can't do equality comparison. Hence the >:> operator. One last thing, notice that the getString method uses the explicit getType[String]. You would think that the compiler could infer this since the left hand is explicitly declared as a String. It doesn't. When I tried it without the explicit type parameter, my manifest would always Manifest[Nothing].

Tuesday, September 15, 2009

Scala Word Clouds

Just before I went to bed last night, I got an interesting tweet from Dave Briccetti. You can see from the link in tweet what the problem was, and if you look at the bottom of the page, you can see the clever solution by Jorge Ortiz. Right after I read the tweet from Dave, I asked my wife how long until the end of the TV that she was watching. She said seven minutes. I decided that wasn't enough time, so I went to sleep.

I didn't look at the problem again until today during lunch. By then, I saw Jorge's solution and I saw no way to improve on it. However, this inspired a little experiment for me. Actually it mostly reminded me of a programming problem that I was given a few years back by a well known company that will remain anonymous as they are very touchy about people talking about their interviewing process. The problem was essentially the same as Dave's problem -- take a bunch of sentences (instead of tweets), figure out the most frequent words. There was a twist though -- I was not allowed to use any Java collection. Actually I could not use anything outside of the java.lang package, so I was pretty much stuck with arrays...

At the time, I figured that I would have to either implement my own hash table and a sort, but then I came up with a way to solve the problem with no hash table. So when I saw Dave's problem, I thought "wow my old solution will be so much more awesome in Scala":

def createCloud(strings:Seq[String])={
  val x:List[String] = Nil
  var s = ("",0)
  val cloud = new scala.collection.mutable.ArrayBuffer[Tuple2[String,Int]]
  strings.map(_.split("\\s").toList.sort(_ < _)).foldLeft(x){mergeSorted(_, _)}.foreach( (a) => {
    if (a == s._1) 
      cloud(cloud.length - 1) = (a, s._2 + 1)
    else
      cloud += (a,1)
    s = cloud(cloud.length - 1)
  })
  cloud.toList.sort((a,b) => a._2 > b._2)
}
Ok, so I cheated this time and used an ArrayBuffer instead of array. In my solution years ago, I made an array whose size was the total number of words (worst case scenario.) I didn't have tuples, so I had to use two arrays. Anyways, the idea here is to split each sentence into a list of words, and then sort that list of words. Now merge together each sorted list, and keep things sorted using a merge sort. Of course you could turn everything into a big list first using flatMap, similar to Jorge's solution, and then sort. Anyways, you then roll up the sorted list, taking advantage of the fact that it is sorted. You never have to look up a word to see if it has been encountered before, because of the sorting. You just keep track of the last word encountered, and just compare the next word to the last word to determine if the word has been previously encountered and thus needs to be incremented. Of course you must be wondering about the aforementioned merge sort. Here is a Scala implementation of it:
def mergeSorted[A](comparator: (A,A) => Boolean)(x:List[A], y:List[A]):List[A] = {
  if (x.isEmpty) y
  else if (y.isEmpty) x
  else if (comparator(x.head, y.head)) x.head :: mergeSorted(comparator)(x.tail, y)
  else y.head :: mergeSorted(comparator)(x, y.tail)
}
  
def mergeSorted[A <% Ordered[A]](x:List[A], y:List[A]):List[A] = mergeSorted((a:A,b:A) => a < b)(x,y)
While writing this, I decided that I should touch up my knowledge of currying and partially applied functions in Scala, so I read about this on Daniel Spiewak's blog. The first version of mergeSorted is very generic and takes a comparator. This is what lead me into the world of currying. Actually, my first version of the above looked like this:
class MergeSorter[A](comparator: (A,A) => Boolean){
  def apply(x:List[A], y:List[A]):List[A] = {
    if (x.isEmpty) y
    else if (y.isEmpty) x
    else if (comparator(x.head, y.head)) x.head :: apply(x.tail, y)
    else y.head :: apply(x, y.tail)    
  }
}
I created a class to encapsulate the sort. In some ways I like this syntax a little better, but it made the second version of the sort a little uglier. I wanted a "default" sort that could be used for anything implementing Scala's Ordered trait. It is this second version that I used in the createCloud function above, as strings can be implicitly converted to RichString which implements Ordered.

It is still easy to use mergeSorted with objects that do not implement Ordered, or to provide a different sort function for those that do:
// provide your own sort
val sList = 
  mergeSorted((s1:String, s2:String) => s1.toLowerCase < s2.toLowerCase)(List("aa","aB","Cb"), List("Ab", "ca"))
// create a reusable sort
val listSort = mergeSorted( (a:List[Any], b:List[Any]) => a.length < b.length) _
val z = listSort(List(List("a","b"), List(1,2,3)), List(List(true,true,true), List("x","y","z","w")))
In the first example, we create a new sort by providing different sorting function that is case insensitive. It is then applied to a list of strings. In the second example, we create the sorting function and assign to a val. Notice the mysterious underscore _ at the end of the statement. This creates a partially applied function. If you forget the underscore, the Scala compiler actually provides a pretty useful error message about this. With this, we can then apply the function however we want.

Sunday, September 13, 2009

Some Tasty Scala Sugar

I have finally been getting around to working some more on Scala+Android. A lot needs to be done, especially since I am talking about it next month at the Silicon Valley Code Camp. Scala advocates (and I guess that includes me) often say that is very easy for application developers to use Scala, i.e. that "most" developers are never exposed to the more complex aspects of the language. The folks who "suffer" are APIs developers. I've mostly fallen into the first camp (app developers), but now that I'm trying to create a Scala API on top of the Android SDK, I get exposed to more aspects of Scala. Here are a couple of things that I found interesting.

Tuesday, September 01, 2009

Metaprogramming in Groovy and Scala

This morning I read an excellent pair of articles by Scott Davis on metaprogramming in Groovy. It reminded me of some of the different ways to approach this kind of problem. The second article in particular detailed a new feature in Groovy 1.6, the delegate annotation. Here is an example of using this feature:

public class Monkey {
def eatBananas(){
println("gobble gulp")
}
}

public class RoboMonkey{
@Delegate final Monkey monkey
public RoboMonkey(Monkey m){
this.monkey = m
}
public RoboMonkey(){
this.monkey = new Monkey()
}
def crushCars(){
println("smash")
}
}

This allows for the following usage:

def monkey = new RoboMonkey()
monkey.eatBananas()
monkey.crushCars()

What is interesting here is that you cannot just create a Monkey and call crushCars on it. RoboMonkey is meant to be a way to add functionality to the Monkey class (in this case the crushCars method), while still being able to treat the RoboMonkey as if it was a Monkey. I point this out because of how this would be done in Scala:

class Monkey{
def eatBanaas = println("gobble gulp")
}

class RoboMonkey(val monkey:Monkey){
def this() = this(new Monkey)
def crushCars = println("smash")
}

object Monkey{
implicit def makeRobo(monkey:Monkey):RoboMonkey = new RoboMonkey(monkey)
implicit def makeNormal(robo:RoboMonkey):Monkey = robo.monkey
}

Now arguably this is more complicated, because you have to create the Monkey object in addition to the Monkey class. What Scala requires is that you get those implicit functions (makeRobo, makeNormal) in scope. This is just one way to do that. The added benefit is the following usage:

val monkey = new Monkey
monkey.eatBanaas
monkey.crushCars

I guess the Groovy way to do this is to use the meta class:

Monkey.metaClass.crushCars = {-> println("smash")}
def monkey = new Monkey()
monkey.eatBananas()
monkey.crushCars()

In both languages you can accomplish similar things, though with very different ways to implement it. Meta programming in Groovy is very powerful, and there are things you can do in Groovy that you cannot do in Scala -- see methodMissing and propertyMissing. Of course you lose some of the benefits of static typing, but software is all about tradeoffs.

Saturday, August 15, 2009

A Tipping Point for Scala

This past week's BASE meeting was all about IDE support for Scala. You can read my notes, posted to the Scala tools mailing list. I was very surprised by this meeting. Not by the findings, if you will, as I have used all three IDEs at various times in the last few months. What I was surprised by was the feedback from the group, and the logical conclusion of this discussion: Scala is near a tipping point, but IDE support is holding it back.

First off, there was a large turnout for the BASE meeting. I would say it was the second largest BASE meeting, only bested by the June meeting where Martin Odersky spoke. It is funny, because I think our esteemed organizer, Dick Wall, had been intending this topic to be like "well if we have nothing else to talk about it, we'll talk about IDEs." If there had been an alternative topic brought up, I don't think people would have objected. After all, developers and their attitude towards IDEs are contradictory. Most developers I know would tell you that IDE support for a language is very important, but they would also act indifferent about IDEs when it came to them personally. It's like "all of those other developers really need IDEs, but I would be ok without them." We all know our APIs so well, that we don't need code completion, right? And we don't write bugs, so a debugger is of limited use, right? However, I am sure that if the meeting had not been about IDEs, then there would have been less people in attendance.

So why so much interest? Like it or not, but Scala's primary audience right now are Java developers. Yes, I know Scala appeals to some dynamic language folks, and to some functional programming folks, and that its .NET implementation is being updated, but you could sum up all of the Scala developers from those disciplines and it would be dwarfed by the Java contingency. Scala has a lot of appeal on its own merits, but it is always going to be framed against Java. Scala's most (only?) likely path to mass appeal is as "the long term replacement for java."

So when you talk about developers choosing to use Scala, you are really talking about Java developers choosing to use Scala instead of Java. This is not the only use case, but not only is it the most common use case, it is arguably the only use case that matters. Without this use case, Scala will at most be a marginal language, a la Haskell, or OCaml, or Groovy for that matter.

Back to my point... Java developers need great IDEs. This is not because they "need" help from their IDE because of some lack of skill. No, it's because they have had great IDEs for a long time now, and thus it has become a requirement. I remember when I joined Ludi Labs (it was still called Sharefare at the time) several years ago, we had a Java programming quiz. Candidates were given a clean install of Eclipse to use for writing their programs. We could have given them Vi or Emacs and a command line, but that would have been asinine and foolish. IDEs are an integral part of Java development.

I knew all of the above before the BASE meeting, but what I did not know was how many development organizations were at a critical juncture when it comes to Scala. For many folks, Scala, the language, has won the arguments. Whatever perceived extra complexity that it has, has been judged as worth it. Whatever challenges there may be in hiring people to develop in Scala can be mitigated. Legacy code is not even a factor, as integration with existing Java code is trivial. Maybe it's bleak future of Java, or maybe it's the high profile use of Scala at Twitter. Who knows, but Scala is poised to take a big piece of the Java pie.

Thus the missing piece is IDE support. Development orgs can't switch to Scala without IDE support, and the support is not there yet. That's the bad news. The good news is that Scala is ready to explode once the IDE support is there. There are a lot of folks out there ready to adopt Scala simply as a "better Java." They just need an IDE that is on par with Java IDEs. That is the standard.

All of that being said, there is a lot of concern around the IDEs. Many people expressed to me that they are worried that IDE progress is being coupled to the release of Scala 2.8. That seems reasonable at first, but what happens if 2.8 is not released until 2010 sometime? Will Scala lose its momentum and window of opportunity?

Monday, August 03, 2009

The Strange Loop

In October, I am speaking at the inaugural Strange Loop conference in St. Louis. This is not your run of the mill conference. It is organized by Alex Miller, who you might have seen speak the last couple of years at JavaOne. The speaker list is sweet: Alex Payne and Bob Lee are doing the keynotes, with sessions by Charles Nutter, Dean Wampler, Stefan Schmidt, Guillaume Laforge, Jeff Brown, and Alex Buckley. I am doing a talk on iPhone/Android development. It will probably be pretty boring compared to the other sessions. I may try to spice it up with some shameless plugging of Scala+Android balanced by some Fake Steve Jobs quotes about Android.

Sunday, July 19, 2009

My HTML in Scala Wishlist

One of my favorite features in Scala is its native support for XML(link warning, shameless plug.) I love the "projections DSL" (for lack of a better term) that is the scala.xml package. It is great for slicing up XML, and is almost like having native XPath and XQuery built into the language. Using XML in Scala for HTML generation is not as universally appealing to people. Some people think it encourages mixing presentation and business logic and is not "clean." I have even heard folks say that this is a critical flaw in Lift. I completely disagree. Of course I also think MVC is an anti-pattern, but anyways... HTML support in Scala is great in my opinion, but could be even better. Here is my wishlist.
  • Better understanding HTML semantics. For example, it is great that the compiler won't let me do something like this <div>Hello world</di>. However, I would not want it to even let me do <di>Hello world</di>. Similarly I should not be able to stick a thead outside of a table. If I have an anchor tag with an href attribute, then I'd like the compiler to do some validation on the value. Am I asking for too much? Well, that's I'm supposed to do, right? Is some/all of this possible with the help of XML schema support? Maybe so.
  • CSS. HTML without CSS ceased to exist a long time ago. With all of the DSL possibilities in Scala, is there some way to allow CSS literals and create some type safety for me. For example, I if I fat finger the name of a style on a class attribute, the compiler should catch this. Also I can say from experience that introducing an object-oriented model is really useful in CSS. Finally, the XPath/XQuery-like support in Scala XML makes me think that something similar could be done for CSS selectors.
  • JavaScript. Honestly I don't know what I would want here. I like JavaScript, but I know the idea of being able to write your web application in a single language is very appealing to a lot of people. If nothing else, it would once again be nice if the compiler would prevent me from doing something like <button onclick="foo()" ...> if there as no function called "foo" in scope.
  • Tooling. Most modern IDEs support things like syntax highlighting, code completion, error checking, code formatting for HTML and XML+schema. I expect at least this much for XML/HTML literals in Scala. I'd also like to be able to step through and debug literals. Refactoring on subtrees of XML (extract method for example) would be sweet. Oh, and large XML literals should not overwhelm my IDE :-)

Wednesday, July 15, 2009

Functional Paradigms on Android using Scala

I am slowly plugging away on my little side project. One of the fun things about "pimping" an API, is you can tweak the design of the API. For example, I was modifying TextView, a class used for creating text input boxes of various sorts in Android. One common use case in Android is to create an auto-complete text box. To do this, you need to subclass TextView. Then you can override various lifecycle methods like onEditorAction, onKeyUp, etc. However, I thought that it might be nice to do something like this:

val textView = new TextView
textView.keyUp = (code:Int, event:KeyEvent) => {
// do stuff to your TextView
}

As Debasish pointed out to me, this is a lot like you would do in JavaScript. Is it better? It's a little easier in certain cases. If you just need to change the behavior of a single TextView in a very specific way, then this would seem like the way to go. Subclassing TextView just to override a single method and then using that subclass exactly once definitely seems like overkill. Now if you really wanted to create a new kind of TextView that you could use in multiple places, then subclassing makes sense.

One thing worth mentioning is how this kind of thing is handled in Cocoa on the iPhone. The Cocoa way would be to create an untyped delegate. In the delegate you could override the lifecycle methods that you were interested in. It's untyped, so you have to know the right names. You still wind up defining a new class, even though you usually only want to customize the behavior of one particular instance. Also, it's a little clunky as the TextView instance would have to be passed in to each of the delegate methods, so that your delegate could actually do something to the TextView. I find this a little better than having to subclass as you do in Android, but once again I think using Scala can yield some nice results.

There was another interesting case that popped up while sugaring TextView. I wanted to sugar the setCompoundDrawablesWithIntrinsicBounds methods. This method is overloaded, taking either four integers or four Drawables. Here is what I first thought of doing:

def compoundDrawablesWithIntrinsicBounds_=(bounds:Tuple4[Int,Int,Int,Int]) {...}
def compoundDrawablesWithIntrinsicBounds_=(bounds:Tuple4[Drawable,Drawable,Drawable,Drawable]) {...}
// this allows
myTextView.compoundDrawablesWithIntrinsicBounds = (1,2,3,4)
// or
myTextView.compoundDrawablesWithIntrinsicBounds = (d1,d2,d3,d4)

Of course this won't work. The type parameters of Tuple4 are going to be erased by the compiler, and the two methods are indistinguishable. However, The Amazing Mr. Spiewak gave me a solution:

def compoundDrawablesWithIntrinsicBounds_=[T <% Either[Int, Drawable]](bounds:Tuple4[T,T,T,T]){
bounds._1 match {
case Left => baseTextView.setCompoundDrawablesWithIntrinsicBounds(bounds._1.left.get, bounds._2.left.get,
bounds._3.left.get, bounds._4.left.get)
case Right => baseTextView.setCompoundDrawablesWithIntrinsicBounds(bounds._1.right.get, bounds._2.right.get,
bounds._3.right.get, bounds._4.right.get)
}
}

In addition to this, you need implicits to convert an Int to a Left and a Drawable to a Right. The API looks a lot different, but it allows exactly the same usage that I wanted. Another case of how you need some non-trivial Scala (though I wouldn't be surprised if there is a more elegant way than the above) to create an API that is super simple to use.

Thursday, June 18, 2009

Scala and Android

Recently I was working on an article for developerWorks about using Scala on Android. Stay tuned to developerWorks to see that article. Since then I have started a little side project to create a set of Scala libraries for working on Android. Now you may ask, why bother with this? Is this just a classic case of mental masturbation? Well, no, at least I hope not.

I think Scala is well suited as a programming language for mobile devices. I am not an expert on all mobile platforms, my experience is mainly around iPhone and Android development. For the iPhone you are developing in Objective-C, and for Android in Java. Now in general, I think the Android development environment is superior in almost every way to the iPhone. IDE is mostly a matter of taste, but Eclipse+ADT certainly has a lot more features than XCode. Debugging is definitely better on the Android. It is much easier to debug on a device, and the emulator gives you more ways to vary the environment (cpu/memory/network resources) for more realistic emulation. The only thing advantage I would give to iPhone is that XCode and the iPhone emulator both startup much faster than Eclipse and the Android emulator.

Programming in Java gives you a lot more options than Objective-C. There are just a lot more open source Java libraries out there to help you avoid reinventing wheels. In addition, I think the platform specific APIs added by Google are much better than the ones on the iPhone. A simple example is XML processing. Java and Objective-C both already have XML parsing APIs. However, Android adds extra APIs as they recognize this as a common task for mobile applications that leverage web services. They added convenience APIs for SAX based parsing. They also leveraged a StAX-style parser by leveraging an Apache project. There is nothing like this in iPhone, despite the fact that the iPhone has been around longer and should be more mature. It may be an oversimplification, but I think this comes back to the corporate cultures that produced these platforms. Apple has always been a company that uses software to sell hardware, and this continues with the iPhone. Google is much more of a software company, with strong ties to open source software. It should be no surprise that Android is a much more development friendly platform. And don't even get me started when it comes to garbage collection (Android has it, iPhone does not, despite the fact that Objective-C 2.0 has it.)

So Android is great, why do we need Scala? First there is verbosity. Actually both Objective-C and Java are quite verbose -- Objective-C is even more verbose than Java. Scala is a much more concise language than either. Objective-C can actually be more dynamically typed than Java/Scala, but this rarely makes the code any more concise or readable, and often makes it harder to debug. Next, neither Objective-C or Java have much support for functional programming. This is a bigger deal than you might think at first. UI frameworks are full of events that you must write code to handle. Writing a function to handle those events is so much easier than writing a class. On the iPhone you can often set any type of object as an event handler, and you just have to know the magic methods to implement. On Android, you usually have an interface to implement, and that interface has a single method. This is actually better than what you do on the iPhone, but the interface and its implementation class are both classic cases of boilerplate.

Thus it is my goal to make it possible to write Android applications in a much more concise way by using Scala. In general, you can replace Java code with Scala that is much easier to read and understand. I think this will be especially true on Android, and the contrast will be even more true.

The last reason that I think Scala is a great language for mobile development is threading. Both iPhone and Android support multi-threading, as any UI framework must. Any good UI will perform any non-trivial operations on a background thread, to keep the main UI thread responsive to the user. However, threads are tricky, especially this kind of multi-threading. Updating your UI from a background thread is a recipe for disaster. Android enforces this by using a Handler on the main thread that your background threads send messages to. If you are familiar with Scala then message passing as a way to implement multi-threading should ring a bell. This is exactly how actors work in Scala. I haven't tried using Actors on Android yet, but it sure seems like a good fit.

So there is some of the motivation. I think Scala on Android could be really compelling. My side project is pretty small so far: a handful of wrapper classes and implicit conversions to bring some syntactic sugar and functional programming into the mix on some common Android classes. The project also has a special build of the Scala library that I found necessary to get the library to work on Android and its compiler. I am hoping to add more sugar and to add the actors ideas listed above. I am also hoping to come up with a nice way to integrate something like Proguard to optimize the bytecodes that get compiled for Android.

Sunday, June 07, 2009

Scala Concurrency

I got to see Scala creator Martin Odersy speak twice last week. First time was at the June BASE meeting, and the second was at the Scala LiftOff. At both meetings, he stated that he wants Scala to be the best language for concurrent programming. He also mentioned two types of concurrency. One is the type you hear about most often, where processes work in parallel and need to simultaneously affect some shared data or global state. This is the kind of problem that becomes hard to deal with using primitives common in Java and C++, but can be a little easier using Actors in Scala. The other kind is where a big process needs to be split up into smaller ones that can be run in parallel. It wasn't clear to me if Actors were a good fit for this or not, or maybe improvement is needed. So I decided to experiment with this.

I took one of the algorithms I wrote for my JavaOne talk on language performance, and I used Actors to do some processing parallel. In my original "driver" code, I took a list of files and sequentially passed each file to the sort class. In the new code, I created an Actor to pass the file to, and the Actor then fanned out the processing to an internal list of Actors that could sort the words in the file. When sorter Actors finish, they send back the results to the "master" Actor. Here is the code:

// messages
case class FileToSort(fileName:String)
case class SortResults(fileName:String, words:List[String])

class WordSortActor(master:Actor) extends Actor{
start
def sortedWords(fileName:String) = {
val dataFile = new File(fileName)
var words:List[String] = Nil
Source.fromFile(dataFile).getLines.foreach(
_.split(" ").foreach(words ::= _))
words.sort(_.toLowerCase < _.toLowerCase)
}
def act {
loop {
react {
case FileToSort(fileName:String) => {
println("Sorting file = " + fileName +
" on thread " + Thread.currentThread.getName)
val words = sortedWords(fileName)
master ! SortResults(fileName,words)
// why doesn't sender work instead?
//sender ! SortResults(fileName, words)
}
}
}
}
}

class SortMaster(docRoot:String, numActors:Int) extends Actor{
private
var sorted : List[List[String]] = Nil
var workers:List[ScalaWordSort] = Nil
var latch:CountDownLatch = null
start

def beginSorting {
workers = (for (i <- 0 until numActors)
yield new WordSortActor(this)).toList
val dir = new File(docRoot)
var cnt = 0
dir.list.foreach(file =>{
workers(cnt % numActors) ! FileToSort(docRoot + file)
cnt += 1
})
latch = new CountDownLatch(cnt)
}

def waitUntilDone = latch.await

def act {
loop {
react {
case SortResults(fileName:String, words:List[String]) => {
sorted ::= words
latch.countDown
println("Received results for file=" + fileName)
}
}
}
}
}


So I don't know if this is a good use of Actors or not. Seems like you might need to tune the number of threads in the thread pool sitting behind the pool of Actors. It seems like the pattern might be useful. It takes advantage of Actors for combining the results done in parallel. The results of each sort are sent back to the master, who could then do useful things. In this example, it only dumps the results into a list, but you could imagine some kind of coordination, like adding to a map, that would require locking in a typical Java implementation.

One other strange thing, in the WordSortActor, I tried to send the results back to the sender, but this did not seem to work. I had to keep an explicit reference to the master Actor, and then it worked fine. One other annoying thing that came out of this little exercise ... I discovered how hard it is to debug Actors. I was using IntelliJ, and it was completely useless to set a break point in an Actor. This is not the case with Java threads. I did not try Eclipse or NetBeans.

Friday, June 05, 2009

JavaOne Talk: Performance Comparisons of Dynamic Languages on the Java Virtual Machine

Below are the sldies. Major thanks to Charlie Nutter for great feedback and advice on the Ruby code and tuning JRuby performance. Thanks to Chouser and Timothy Pratley for help with the Clojure code. And major thanks to Brian Frank for help with the Fan code.

Thursday, June 04, 2009

Liveblogging of Scala Actors Talk from JavaOne

Today I was attending a talk on Scala Actor by Phillipp Haller and Frank Sommers. Tyler asked me to live blog it. I thought that was a good idea, and that Twitter would be the way to do it. This did not work out, as Twitter shut me down for too many tweets in a given hour. So now I am posting the full live blog, both the tweets that went through and the ones that Twitter rejected. However, pulling out the text of all of the good tweets is not totally straightforward. I figured the easiest was to do this was to use some Scala with the Twitter API via the REPL:


scala> import java.net._
import java.net._

scala> import scala.xml._
import scala.xml._

scala> val url = new URL("http://twitter.com/statuses/user_timeline.xml?since_id=2035817713&max_id=2036208882&screen_name=michaelg&url: java.net.URL = http://twitter.com/statuses/user_timeline.xml?since_id=2035817713&max_id=2036208882&screen_name=michaelg&count=75

scala> val conn = url.openConnection conn: java.net.URLConnection = sun.net.www.protocol.http.HttpURLConnection:http://twitter.com/statuses/user_timeline.xml?since_id=2035817713&max_id=2036208882&screen_name=michaelg&count=75

scala> val tweets = XML.load(conn.getInputStream) tweets: scala.xml.Elem =


scala> (tweets\\"text").foreach{ node => println(node.text) }


The result of the last call produces the following:

This allows for many more actors than worker threads #javaone
Work stealing makes this even more efficient. Let worker threads steal work from other worker threads #javaone
So decouple from threads using thread pools from java.util.concurrent. #javaone
A naive impl of thread-per-actor has overhead from thread creation, context-switching and memory use. #javaone
How event-based actors decouples threads and actors. How the execution enviro is setup. How the DSL works #javaone
Now time to go under the hood of Scala Actors #javaone
Only use receive when you must return a value from an Actor #javaone
Use react whenever possible and only use receive when necessary #javaone
When to use receive vs. react? #javaone
Replace while(true) { ... } with loop { ... } #javaone
Go from 5,000 actors/JVM to 1,000,000/JVM using event based Actors #javaone
Replace receive with react, receiveWithin with reactWithin, and while(condition) with loopWhile (condition) #javaone
To scale to a large number of Actors, go for event based Actors #javaone
The sender of a message is still sent across the network #javaone
Everything else works exactly the same as it would for local actors #javaone
Use the select function to get a proxy to the remote Actor #javaone
To make an actor remote is very easy. Call the alive function and the register function #javaone
So far everything has been within a single VM, but Scala Actors scales using remote actors #javaone
To specify timeouts, use receiveWithin. This takes a timeout and sends a TIMEOUT message #javaone
All of this makes it easy to keep track of subscribers and send them messages, like new chat messages #javaone
You can also wait for futures using receive function #javaone
When you try to access the future, it will then block if the reply has not been received #javaone
For async with assurance that the message was received using double-bang !! Returns a future representing the reply #javaone
Synchronous messages are allowed too using !? This blocks until reply received #javaone
Bang is asyc, returns immediately and passes a reference to the sender #javaone
To subscribe a user to the chat room, just send a message using the "bang" operator (method): ! #javaone
There is another API shortcut for creating an actor inline using the "actor" function #javaone
Creating a subscription is just a matter of sending message stating as much. Lets recipient capture the state #javaone
Common pattern of "child actors", benefits from asych communication of actors #javaone
Make each user an Actor as well to maintain state, such as who sent a message to the chat room #javaone
Now going back to the chat application to demonstrate Actor subscriptions #javaone
Patterns are tried in order, and the first match wins. No fall-through. Looks like a factory method call. #javaone
This is all based on Scala's pattern matching, think Java's switch statement but on steroids #javaone
If no message in mailbox, the actor will suspend until there is a message. Syntax for matching messages is super simple #javaone
Use the receive function to process messages within the act method of your Actor. Receive gets a message from the Actor's mailbox #javaone
Defining messages is made easy by using Scala's case classes. These are pure data classes typically. #javaone
Creating an actor is very similar to creating a thread in Java. #javaone
Creating an Actor is easy, just do 3 things. Extend Actor. Implement act method. Start the actor. #javaone
Frank showing a chat application architecture that uses Actors "the quintessential Actor system" #javaone
Actor implemented entirely as a DSL, but part of Scala std. library. Used in big systems such as Twitter #javaone
Actors can be local or remote, no change to programming model. #javaone
Actors are event based in Scala, not tied to Java threads. So much more lightweight. A single JVM can have millions of actors #javaone
Actors use several Scala language features: pattern matching, closures, traits/multiple inheritance, and Java interop. #javaone
Actors in Scala are probably the closest implementation of Erlang model on the JVM #javaone
OTP: library of Erlang modules for building reliable distributed systems #javaone
Erlang has support for Actors built-in. Erlang VM is process-oriented, makes Actor model very scalable. #javaone
Best known example of Actors though is from Erlang, "a pure message passing language" #javaone
Actors are an established pattern. Research dates to 70s at MIT in Smalltalk and Simula #javaone
Mutable messages are also possible. Actors can also be local or remote -- same programing model #javaone
Actors share nothing, private state. Thus no synchronization. Actors send messages, but the message are immutable #javaone
Actors are active objects and has a mailbox. Actors consume messages, perform actions but the code is sequential within an actor #javaone
There is an existing concurrency model that fits with the described solutions: Actors. #javaone
The solution: DSL + sequential programs. Shared nothing. Message passing. Asynchronous communication #javaone
Why learn about Scala actors? Concurrency presents opportunity and problems for developers, especially large teams of developers #javaone
Frank and Phillip are writing a book on Scala actors #javaone
Phillip Haller and Frank Sommers talking about Scala actors now #javaone

So that's the good tweets. Here are the ones that did not go through...


The actor method takes a closure creates an Actor
The react method saves pattern matching as a closure to be executed later
Uses exception for control flow, any code after react will not execute
Actors usually execute on different threads, but you can control what threads
This is useful for threadLocals and for Swing
A demo of SwingActor
Example of using link and trapExit, but not much detail of this shown
Scala Actors provide a simple programming model to write concurrent code on the JVM
Many projects already use actors: Twitter, Lift, Scala OTP (@jboner), partest
Future of Scala Actors in 2.8
Pluggable schedulers -- create actors with daemon-like behavior for example
Integrating exceptions and Erlang-style error handling. Get the best of both Java and Erlang error handling
Support for continuations. Currently requires optional compiler-plugin. Allows for more flexible control structures
Actor migrations (from machine to machine) could also be done with continuations
More static checking of messages, in particular an @immutable annotation
Actor isolution -- no accidentally sharing mutable state
This will allow for better passing mutable messages for performance benefits
No problem passing large methods, as they are passed by reference

Wednesday, May 20, 2009

JavaOne Talk: Scala Reversible Numbers

See this post about why this code is being shown. See this post for the Java version to compare against.

class ScalaReversible(max:Int){
import java.lang.Character._
lazy val numReversible = (11 to max).filter(reversible(_)).size
private
def reverse(n:Int)=n + parseInt(n.toString.reverse)
def allOdd(n:Int) = n.toString.map(digit(_,10)).forall(_ % 2 == 1)
def reversible(n:Int) = allOdd(reverse(n))
}

JavaOne Talk: Scala Word Sort

See this post about why this code is being shown. See this post for the Java version to compare against.

class ScalaWordSort(fileName:String){
lazy val sortedWords = {
Source.fromFile(dataFile).getLines.foreach(_.split(" ").foreach(words ::= _))
words.sort(_.toLowerCase < _.toLowerCase)
}
private
val dataFile = new File(fileName)
var words:List[String] = Nil
}

Saturday, May 09, 2009

JavaOne Talk: Scala Prime Sieve

See this post for an explanation on why this code is being shown and what kind of responses would be useful. See this post for the Java algorithm to compare against.

import java.lang.Integer._
import java.lang.Math._
import scala.collection.mutable.ArrayBuffer

class ScalaPrimes(val size:Int){
val primes = new ArrayBuffer[Int]
var cnt = 0
init
def last = primes(primes.size - 1)

private
def init {
var i = 2
val max = calcSize
val nums = new ArrayBuffer[Boolean]
for (i<-0 until max) nums += true
while (i < max && cnt < size){
val p = i
if (nums(i)){
primes += p
cnt += 1
(p until max/p).foreach((j) => nums.update(p*j,false))
}
i += 1
}
}
def calcSize = {
var max = 2
while ( (max/log(max)) < size && max < MAX_VALUE && max > 0) {
max *= 2
}
max
}
}