Thursday, October 22, 2009

Concurrency Patterns: Java, Scala, and Clojure

Today I was at The Strange Loop conference. The first talk I attended was by Dean Wampler on functional programming in Ruby. Dean brought up the Actor concurrency model and how it could be done in Ruby. Of course I am quite familiar with this model in Scala, though it is copied from Erlang (which copied it from some other source I'm sure.) The next talk I went to was on software transactional memory and how it is implemented in Clojure. I had read about Clojure's STM in Stuart Halloway's book, but I must admit that it didn't completely sink in at the time. I only understood the basic idea (it's like a database transaction, but with retries instead of rollbacks!) and the syntax. As I've become more comfortable with Clojure in general, the idea has made more and more sense to me.

Now of course, no one size fits all. There are advantages and drawbacks to any concurrency model. A picture of these pros and cons kind of formed in my head today during the STM talk. I turned them into pictures. Here is the first one:

This is meant to be a measure of how easy it is to write correct concurrent code in Java, Scala, and Clojure. It is also meant to measure how easy it is to understand somebody else's code. I think these two things are highly correlated. I chose to use these various languages, though obviously this is somewhat unfair. It would be more accurate to say "locking/semaphores" instead of Java, the Actor model instead of Scala, and software transactional memory instead of Clojure -- but you get the point. So what does this graph mean?
Well obviously, I think Java is the most difficult language in which to write correct code. What may be surprising to some people is that I think Clojure is only a little simpler. To write correct code in Clojure, you have to figure out what things need to be protected by a dosync macro, and make sure those things are declared as refs. I think that would be an easy thing to screw up. It's still easier than Java, where you basically have to figure out the same things, but you must also worry about multiple lock objects, lock ordering, etc. In Clojure you have to figure out what has to be protected, but you don't have to figure out how to protect it -- the language features take care of that.
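
To make the "multiple lock objects, lock sequencing" worry concrete, here is a minimal Java sketch. The Account class, its id field, and the transfer method are names I made up for illustration; the only point is that every caller has to agree on the order in which the two locks are taken, and nothing in the language enforces that.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockOrderingSketch {

    // Hypothetical account type, invented for this illustration.
    static final class Account {
        final int id;                                  // used only to decide lock order
        final ReentrantLock lock = new ReentrantLock();
        long balance;

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    // Moves money between two accounts. Both locks are always taken in
    // ascending id order; if two threads locked in opposite orders,
    // each could end up waiting on the other forever.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    private static void join(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Runs opposing transfers on two threads and returns the combined balance.
    static long demo() {
        final Account a = new Account(1, 1000);
        final Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10000; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10000; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        join(t1);
        join(t2);
        return a.balance + b.balance;   // money is conserved, so this is 2000
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a ref-based transaction you would still have to decide that both balances need protecting, but the ordering logic in transfer is the part you would not have to write.
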
So Clojure and Java are similar in difficulty, but what about Scala and the Actor model? I think this is much easier to understand. There are no locks/transactions. The only hard part is making sure that you don't send the same mutable object to different actors. This is somewhat similar to figuring out what to protect in Clojure, but it's simpler. You usually use immutable case classes for the messages sent between actors, but these are used all over the place in Scala. It's not some special language feature that is only used for concurrency. OK, enough about code that is easy to write and understand. There are other important factors, such as efficiency:
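
A quick aside on the actor idea before moving on. This is a rough Java approximation, not Scala's actor library: CounterActor, Deposit, and the mailbox are hypothetical names, and for simplicity the "actor" here is a plain thread draining a queue, which is heavier than a real actor. What it does show is the shape of the model: private state, immutable messages, and no locks in user code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ActorSketch {

    // Immutable message: final class, final fields, no setters.
    static final class Deposit {
        final long amount;
        final boolean stop;   // true marks the "please shut down" message

        Deposit(long amount, boolean stop) {
            this.amount = amount;
            this.stop = stop;
        }
    }

    // Hypothetical actor: its state is touched only by its own thread.
    static final class CounterActor implements Runnable {
        private final BlockingQueue<Deposit> mailbox = new LinkedBlockingQueue<>();
        private long total;   // never shared directly, so no user-level lock

        void send(Deposit message) {
            mailbox.add(message);
        }

        @Override
        public void run() {
            try {
                while (true) {
                    Deposit message = mailbox.take();   // blocks until mail arrives
                    if (message.stop) {
                        return;
                    }
                    total += message.amount;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    private static void join(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Two senders post 1000 deposits of 1 each; returns the actor's final total.
    static long demo() {
        final CounterActor actor = new CounterActor();
        Thread actorThread = new Thread(actor);
        actorThread.start();

        Runnable sender = () -> {
            for (int i = 0; i < 1000; i++) {
                actor.send(new Deposit(1, false));
            }
        };
        Thread s1 = new Thread(sender);
        Thread s2 = new Thread(sender);
        s1.start();
        s2.start();
        join(s1);
        join(s2);

        actor.send(new Deposit(0, true));   // queued after every deposit
        join(actorThread);
        return actor.total;                 // safe to read: the actor thread has finished
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If Deposit had a mutable field and the same instance were sent to two actors, all the usual shared-state problems would come straight back, which is the one thing you have to watch for in this model.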

Perhaps this should really be described as memory efficiency. In this case Java and the locking model are the most efficient. There is only one copy of anything in such a system, as that master copy is always appropriately protected by locks. Scala, on the other hand, is far less efficient. If you send around messages between actors, they need to be immutable, which means a lot of copies of data. Clojure does some clever things around copies of data (its persistent data structures share structure between versions), making it more efficient than Scala. Some of this is lost to the overhead of STM, but it still has a definite advantage over Scala. As in many systems, there is a tradeoff between memory and speed:
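
As an aside on those "clever things": the simplest illustration I can think of is an immutable linked list where a new version reuses the old one as its tail. ImmutableList below is a made-up class for this sketch; Clojure's real vectors and maps are tree-based and far more sophisticated, so treat this only as the general idea of structural sharing.

```java
class SharingSketch {

    // Hypothetical immutable singly linked list of ints.
    static final class ImmutableList {
        final int head;
        final ImmutableList tail;   // null marks the end of the list
        final int size;

        ImmutableList(int head, ImmutableList tail) {
            this.head = head;
            this.tail = tail;
            this.size = (tail == null) ? 1 : tail.size + 1;
        }

        // Returns a new list; the existing nodes are reused, not copied.
        ImmutableList prepend(int value) {
            return new ImmutableList(value, this);
        }
    }

    // Builds two "modified" versions of one base list and checks that
    // both share the base nodes instead of duplicating them.
    static boolean demo() {
        ImmutableList base = new ImmutableList(3, null).prepend(2).prepend(1);   // 1, 2, 3
        ImmutableList a = base.prepend(10);   // 10, 1, 2, 3
        ImmutableList b = base.prepend(20);   // 20, 1, 2, 3
        return a.tail == base && b.tail == base
                && base.size == 3 && a.size == 4 && b.size == 4;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each new version here costs one node rather than a full copy, and because nothing is ever mutated, any thread can hold on to base, a, or b without protection.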

Scala is the clear king of speed. Actors are more lightweight than threads, and a shared-nothing approach means no locking, so concurrency can be maximized. The Java vs. Clojure speed comparison is not as clear. Under high write contention, Clojure is definitely slower than Java. The more concurrency there is, the more retries there are. However, there is no locking, and this makes a big difference if there are a lot more reads than writes, which is a common characteristic of concurrent systems. So I could definitely imagine scenarios where the higher concurrency of Clojure makes it faster than Java. Finally, let's look at reusability.
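
Before getting to reusability, here is a small sketch of why write contention costs retries. This is not Clojure's STM; it is a plain compare-and-set loop on a java.util.concurrent.atomic.AtomicLong, which I am using as a stand-in for any optimistic scheme: read, compute, try to commit, and start over if somebody else got there first. The demo method and its parameters are made up for the illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

class RetrySketch {

    private static void join(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Runs `threads` writers that each increment a shared value `perThread`
    // times optimistically. Returns { final value, number of retries }.
    static long[] demo(int threads, int perThread) {
        final AtomicLong value = new AtomicLong();
        final AtomicLong retries = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    while (true) {
                        long seen = value.get();                  // read a snapshot
                        if (value.compareAndSet(seen, seen + 1)) {
                            break;                                // commit succeeded
                        }
                        retries.incrementAndGet();                // lost the race, redo the work
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            join(worker);
        }
        return new long[] { value.get(), retries.get() };
    }

    public static void main(String[] args) {
        long[] result = demo(4, 100000);
        System.out.println("value=" + result[0] + " retries=" + result[1]);
    }
}
```

The final value is always threads times perThread, but the retry count varies from run to run and tends to grow as more writers pile on. Readers, on the other hand, just call get() and never wait, which is the read-heavy case where the optimistic approach looks good.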

By reusability, I mean how reusable (composable) a piece of concurrent code is when you write it with each of these languages/paradigms. In the case of Java, it is almost never reusable unless it is completely encapsulated. In other words, if your component/system has any state that it will share with another component, then it will not be reusable. You will have to manually reorder locks, extend synchronization blocks, etc. Clojure is the clear winner in this arena. The absence of explicit locks and the automatic optimistic locking really shine here. Scala is a mixed bag. On one hand, the Actor model is very reusable. Any new component can just send the Actor a message. Of course you don't know if it will respond to the message or not, and that is a problem. The bigger problem is the lack of atomicity. If one Actor needs to send messages to two other Actors, there is no easy way to guarantee correctness.
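
Here is a tiny Java sketch of that composition problem. SafeBalance is a hypothetical class that is perfectly thread-safe on its own, yet an operation built from two of them is not atomic. To keep the example deterministic, the "other thread" is simulated by reading both balances between the two steps.

```java
import java.util.concurrent.atomic.AtomicLong;

class CompositionSketch {

    // Hypothetical component: every method is individually thread-safe.
    static final class SafeBalance {
        private final AtomicLong amount;

        SafeBalance(long initial) {
            this.amount = new AtomicLong(initial);
        }

        void add(long delta) {
            amount.addAndGet(delta);
        }

        long get() {
            return amount.get();
        }
    }

    // Transfers 100 from a to b and returns the combined total that an
    // observer would see if it looked between the two steps.
    static long totalSeenMidTransfer() {
        SafeBalance a = new SafeBalance(500);
        SafeBalance b = new SafeBalance(500);
        a.add(-100);                          // step 1: atomic on its own
        long observed = a.get() + b.get();    // what another thread could read right now
        b.add(100);                           // step 2: atomic on its own
        return observed;                      // 900, although the true total is always 1000
    }

    public static void main(String[] args) {
        System.out.println(totalSeenMidTransfer());
    }
}
```

Closing that window in Java means adding an outer lock that every caller of both components agrees on, which is exactly the "extend synchronization blocks" work mentioned above. An actor that has to message two other actors has the same gap between its two sends.
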
Back in June, I heard Martin Odersky say that he wants to add STM to Scala. I really think that this will be interesting. While I don't think STM is always the right solution, I think the bigger obstacle to adoption (on the Java platform) is the Clojure language itself. It's a big leap to ask people to give up on objects, and that is exactly what Clojure requires you to do. I think a Scala STM could be very attractive...


Zak said...

STM is one of three concurrency control mechanisms in Clojure. STM is for synchronous, coordinated state. Agents are a more actor-like mechanism for asynchronous and independent state. Atoms are somewhere between and are used for synchronous, independent state.

Anonymous said...

Clojure gives you more than one model of concurrency:

1) Refs (aka STM)
2) Agents (aka Actors)
3) Vars (aka thread-local state changes)
4) Atoms (similar to Agents but synchronous)

Stuart Halloway said...

I think Clojure and Scala will (fairly quickly) pick up useful concurrency idioms from each other. Certainly Clojure already has several more choices than STM.

Another important axis to consider is emphasis on/support for persistent data structures. Java does poorly here, and whether Clojure or Scala wins depends on whether you think supporting in-place mutation is a good idea.

Have you seen Rich's video on persistent data and managed references? I would be curious to know how/if this influences your assessment.

Michael Galpin said...

Just to quote myself: "It would be more accurate to say ... software transactional memory instead of Clojure"

So yeah, I know that Clojure has more than just STM :-)

Steve McLeod said...

Great article on a complicated theme. One line I question though is this:

"If you send around messages between actors, they need to be immutable, which means a lot of copies of data."

Perhaps I misunderstood, but I think that having immutable messages means you _don't_ have to copy them, but can instead reuse them safely, knowing they can't be changed.

Roman Roelofsen said...

"To write correct code in Clojure, you have to figure out what things need to be protected by a dosync macro, and make sure those things are declared as refs. I think that would be an easy thing to screw up."

IMHO in Clojure it is exactly the opposite, because it's easy to do the right thing and hard to screw up:

- If the data is immutable, there is no reason to protect it

- If the data is mutable, you _must_ put the data in a ref

- Once the data is in a ref, you are only able to modify it in a transaction

Therefore Clojure guarantees that data is only mutated in a transaction, and you are not able to step out of this!

Travis Whitton said...

For your speed comparison chart, you'd be better off comparing Clojure's agents to Scala's actors. Comparing on that level, I'd imagine Clojure would actually be the fastest because "Clojure's agents are reactive not autonomous. There's no imperative message loop and no blocking receive."

In regard to the difficulty of getting things right, Scala uses objects everywhere which are riddled with mutable state. If you forget to protect access via actors, you're going to run into problems. In contrast, Clojure's native types are all immutable, so the same danger doesn't exist.

This being said, I did enjoy the article, but I feel like a little more research should have been done ahead of time.

Guy Korland said...

For years Java has offered more than just locking: it comes with a very large concurrency package that covers most of the common cases.

As for STM, we (the DeuceSTM team) are working on adding STM to Java and have already released a pretty stable 1.1 version.
For more details see:

Anna Nachesa said...

I have to be honest, Clojure scares me a bit with the amount of brackets... :) If there were an "ease-of-use" gradation, I'd put it at the far end. The main objection is code readability.

I would be glad to believe that this will change if one spends half a year programming in it though. But my sympathies are with Scala for now :)

ouertani said...

I remember you are a Scala fan ;) Did you switch to Clojure?

Roman Roelofsen said...


Switched? No. But certainly another tool in the toolbox :-)

alepuzio said...

Hi Michael,
I appreciate this article, but I don't understand the difference you draw between "Efficiency" and "Speed": I think that a fast language is efficient too.
Scala requires copying data between objects in the actor model, which is a very inefficient operation. How is it possible that Scala is the fastest language?

Jeff said...

I'm curious about your speed comparison. Did you benchmark the three approaches and compare how they perform a similar operation, or are those placements on the axis based on your own assumptions and reasoning? Your line of reasoning makes sense to me, but I am skeptical about your conclusions in the absence of hard numbers. In particular I've read that Scala's actors are quite slow. If you *do* have any benchmarks, or know where we can find some, please let us know!
