Saturday, June 04, 2011

The Concurrency Myth

For nearly a decade now technology pundits have been talking about the end of Moore's Law. Just this week, The Economist ran an article about how programmers are starting to learn functional programming languages to make use of the multi-core processors that have become the norm. Indeed inventors of some of these newer languages like Rich Hickey (Clojure) and Martin Odersky (Scala) love to talk about how their languages give developers a much better chance of dealing with the complexity of concurrent programming that is needed to take advantage of multi-core CPUs. Earlier this week I was at the Scala Days conference and got to hear Odersky's keynote. Pretty much the second half of his keynote was on this topic. The message is being repeated over and over to developers: you have to write concurrent code, and you don't know how to do it very well. Is this really true, or is it just propaganda?

There is no doubt that the computers we buy are now multi-core. Clock speeds on these computers have stopped going up. I am writing this blog post on a MacBook Air with a dual-core CPU running at 2.13 GHz. Five years ago I had a laptop with a 2.4 GHz processor. I'm not disputing that multi-core CPUs are the norm now, and I'm not going to hold my breath for a 4 GHz CPU. But what about this claim that it is imperative for developers to learn concurrent programming because of this shift in processors? First, let's talk about which developers. I am only going to talk about application developers, by which I mean developers who are writing software that is directly used by people. Maybe I'll talk about other types of developers later, but I will at least start off with application developers. Why? I think most developers fall into this category, and I think these are the developers who are most often the target of the "concurrency now!" arguments. It also allows me to take a top-down approach to this subject.

What kind of software do you use? Since you are reading this blog, I'm going to guess that you use a lot of web software. Indeed a lot of application developers can be more precisely categorized as web developers. Let's start with these guys. Do they need to learn concurrent programming? I think the answer is "no, not really." If you are building a web application, you are not going to do a lot of concurrent programming. It's hard to imagine a web application where one HTTP request comes in and a dozen threads (or processes, whatever) are spawned. Now I do think that event-driven programming like you see in node.js will become more and more common. It certainly breaks the assumption of a 1-1 mapping between request and thread, but it most certainly does not ask/require/suggest that the application developer deal with any kind of concurrency.

The advancements in multi-core processors have definitely helped web applications. Commodity app servers can handle more and more simultaneous requests. When it comes to scaling up a web application, Moore's Law has not missed a beat. However, it has not required all of those PHP, Java, Python, and Ruby web developers to learn anything about concurrency. Now I will admit that such apps will occasionally do something that requires a background thread, etc. However, this has always been the case, and it is the exception to the type of programming usually needed by app developers. You may have one little section of code that does something concurrent, and it will be tricky. But this has nothing to do with multi-core CPUs.

Modern web applications are not just server apps though. They have a lot of client-side code as well, and that means JavaScript. The only formal concurrency model in JavaScript is Web Workers. This is a standard that has not yet been implemented by all browsers, so it has not seen much traction yet. It's hard to say if it will become a critical tool for JS development. Of course one of the most essential APIs in JS is XMLHttpRequest. This does indeed involve multiple threads, but again this is not exposed to the application developer.

Now one can argue that in the case of both server side and client side web technologies, there is a lot of concurrency going on but it is managed by infrastructure (web servers and browsers). This is true, but again this has always been the case. It has nothing to do with multi-core CPUs, and the most widely used web servers and browsers are written in languages like C++ and Java.

So is it fair to conclude that if you are building web applications, then you can safely ignore multi-core CPU rants? Can you ignore the Rich Hickeys and Martin Oderskys of the world? Can you just stick to your PHP and JavaScript? Yeah, I think so.

Now web applications are certainly not the only kind of applications out there. There are desktop applications and mobile applications. This kind of client development has always involved concurrency. Client app developers are constantly having to manage multiple threads in order to keep the user interface responsive. Again this is nothing new, and it has nothing to do with multi-core CPUs. It's not as if app developers used to do everything in a single thread and only now, with the arrival of multi-core CPUs, need to start figuring out how to manage multiple threads (or actors or agents or whatever). Now perhaps functional programming can be used by these kinds of application developers. I think there are a lot of interesting possibilities here. However, I don't think the Hickeys and Oderskys of the world have really been going after developers writing desktop and mobile applications.

So if you are a desktop or mobile application developer, should you be thinking about multi-core CPUs and functional programming? I think you should be thinking about it at least a little. Chances are you already deal with this stuff pretty effectively, but that doesn't mean there isn't room for improvement. This would be especially true if language/runtime designers started thinking more about your use cases.

I said I was only going to talk about application developers, but I lied. There is another type of computing that is becoming more and more common, and that is distributed computing. Or is it called cloud computing? I can't keep up. The point is that there are a lot of software systems that run across a cluster of computers. Clearly this is concurrent programming, so bust out the functional programming or your head will explode, right? Well, maybe not. Distributed computing does not involve the kind of shared mutable state that functional programming can protect you from. Distributed map/reduce systems like Hadoop manage shared state complexity despite being written in Java. That is not to say that distributed systems cannot benefit from languages like Scala; it's just that the benefit does not necessarily come from the concurrency and functional programming features that are usually the selling points of these languages. I will say that Erlang/OTP and Scala/Akka do have a lot to offer distributed systems, but these frameworks address different problems than multi-core concurrency.

It might sound like I am an imperative-programming-loving curmudgeon, but I actually really like Scala and Clojure, as well as other functional languages like Haskell. It's just that I'm not sure the sales pitch being used for these languages is accurate or honest. I do think the concurrency/functional programming angle could have payoffs in the land of mobile computing (desktop too, but there's not much future there). After all, tablets have already gone multi-core and there are already a handful of multi-core smartphones. But these languages have a lot of work to do there, since there are already framework features and common patterns for dealing with concurrency in mobile. Event-driven programming for web development (or the server in client/server in general) is the other interesting place, but functional languages have more to offer framework writers than application developers in that arena. My friend David Pollak recently wrote about how the current crop of functional languages can hope for no more than to be niche languages like Eiffel. I think that he might be right, but not just because functional programming has a steep learning curve. If all they can offer is to solve the concurrency problem, then that might not be enough of a problem for these languages to matter.

Tuesday, November 10, 2009

Clojure Primes Shootout

Back in May, I was working on my JavaOne talk on JVM language performance. The first comparison was a prime number sieve. If you look back at the slides, you might notice that I did not include a Clojure implementation. There were a couple of reasons for this. First, I was dumb. Not only dumb, but especially dumb when it came to Clojure. I had recently learned it -- mostly on an airplane to Israel earlier that month. Second, realizing that I was dumb, I was at least smart enough to reach out to the Clojure community for help. I got some great help on the other algorithms, but the prime sieve was unsatisfactory. The implementations I got were just so different from the algorithms used for the other languages that it did not seem like a good comparison. I have been meaning to go back and come up with a satisfactory implementation. And before you ask, I know about this blazingly fast one:
(defn lazy-primes []
  (letfn [(enqueue [sieve n step]
            (let [m (+ n step)]
              (if (sieve m)
                (recur sieve m step)
                (assoc sieve m step))))
          (next-sieve [sieve candidate]
            (if-let [step (sieve candidate)]
              (-> sieve
                (dissoc candidate)
                (enqueue candidate step))
              (enqueue sieve candidate (+ candidate candidate))))
          (next-primes [sieve candidate]
            (if (sieve candidate)
              (recur (next-sieve sieve candidate) (+ candidate 2))
              (cons candidate
                (lazy-seq (next-primes (next-sieve sieve candidate)
                            (+ candidate 2))))))]
    (cons 2 (lazy-seq (next-primes {} 3)))))
and this one that is very close to what I wanted:
(defn sieve [n]
  (let [n (int n)]
    "Returns a list of all primes from 2 to n"
    (let [root (int (Math/round (Math/floor (Math/sqrt n))))]
      (loop [i (int 3)
             a (int-array n)
             result (list 2)]
        (if (>= i n)
          (reverse result)
          (recur (+ i (int 2))
                 (if (< i root)
                   (loop [arr a
                          inc (+ i i)
                          j (* i i)]
                     (if (>= j n)
                       arr
                       (recur (do (aset arr j (int 1)) arr)
                              inc
                              (+ j inc))))
                   a)
                 (if (zero? (aget a i))
                   (conj result i)
                   result)))))))
and of course the one from contrib:
(defvar primes
  (concat
   [2 3 5 7]
   (lazy-seq
    (let [primes-from
          (fn primes-from [n [f & r]]
            (if (some #(zero? (rem n %))
                      (take-while #(<= (* % %) n) primes))
              (recur (+ n f) r)
              (lazy-seq (cons n (primes-from (+ n f) r)))))
          wheel (cycle [2 4 2 4 6 2 6 4 2 4 6 6 2 6  4  2
                        6 4 6 8 4 2 4 2 4 8 6 4 6 2  4  6
                        2 6 6 4 2 4 6 2 6 4 2 4 2 10 2 10])]
      (primes-from 11 wheel))))
  "Lazy sequence of all the prime numbers.")
Instead I came up with my own bad one...
(defn primes [max-num]
  (loop [numbers (range 2 max-num) primes [] p 2]
    (if (= (count numbers) 1)
      (conj primes (first numbers))
      (recur (filter #(not= (mod % p) 0) numbers)
             (conj primes p)
             (first (rest numbers))))))
Why is this so bad? Well, it is horribly inefficient. It creates a list of integers up to max-num, the only input to the function. It then loops, each time removing all multiples of the current prime -- just like a sieve is supposed to. However, most sieves take a prime p and remove 2p, 3p, 4p, etc. (often optimized to 3p, 5p, 7p, since all the even numbers are removed at the beginning). This does the same thing, but it uses a filter to do it. So it goes through every member of the list to check if it is divisible by p, which is way too many calculations. Next, it is constantly generating a new list of numbers to pass back into the loop. Maybe Clojure does some cleverness to make this not as memory inefficient as it sounds, but I don't think so.
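For contrast, a conventional array-based sieve only ever touches the multiples of each prime instead of testing every remaining number. Here is a rough sketch of that style, written in Scala rather than Clojure purely for illustration (it is not the code from the talk):
def arraySieve(max: Int): List[Int] = {
  // composite(i) == true means i has been struck out as a multiple of some prime
  val composite = new Array[Boolean](max + 1)
  var primes = List[Int]()
  var i = 2
  while (i <= max) {
    if (!composite(i)) {
      primes = i :: primes
      // strike out 2i, 3i, 4i, ...; no divisibility tests on the other numbers
      var j = 2 * i
      while (j <= max) {
        composite(j) = true
        j += i
      }
    }
    i += 1
  }
  primes.reverse
}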
So what is it that I did not like about the many other examples out there or suggested by the community? Most of them used a lazy sequence. That's great and very efficient. However, it's not a concept that is easy to implement in other, less functional languages (Go Clojure!) On the other hand, the Rich Hickey example is much closer to what I wanted to do. However, it uses Java arrays. There is no duplication of the list and the non-primes are eliminated efficiently, but using Java arrays just seems non-idiomatic, to say the least. The other examples I did for JavaOne all used variable-length lists instead of arrays.
Anyways, I am satisfied to at least have a seemingly equivalent, idiomatic Clojure implementation. I looked at optimizing it by using type hints. I actually had to use a lot of type hints (four) to get a 10% improvement in speed. Anyways, now that this is out here, others will improve it, as that shouldn't be too hard to do!

Thursday, October 22, 2009

Concurrency Patterns: Java, Scala, and Clojure

Today I was at The Strange Loop conference. The first talk I attended was by Dean Wampler on functional programming in Ruby. Dean brought up the Actor concurrency model and how it could be done in Ruby. Of course I am quite familiar with this model in Scala, though it is copied from Erlang (which copied it from some other source I'm sure.) The next talk I went to was on software transactional memory and how it is implemented in Clojure. I had read about Clojure's STM in Stuart Halloway's book, but I must admit that it didn't completely sink in at the time. I only understood the basic idea (it's like a database transaction, but with retries instead of rollbacks!) and the syntax. As I've become more comfortable with Clojure in general, the idea has made more and more sense to me.

Now of course, no one size fits all. There are advantages and drawbacks to any concurrency model. A picture of these pros and cons kind of formed in my head today during the STM talk. I turned them into pictures. Here is the first one:

This is meant to be a measure of how easy it is to write correct concurrent code in Java, Scala, and Clojure. It is also meant to measure how easy it is to understand somebody else's code. I think these two things are highly correlated. I chose to use these various languages, though obviously this is somewhat unfair. It would be more accurate to say "locking/semaphores" instead of Java, the Actor model instead of Scala, and software transactional memory instead of Clojure -- but you get the point. So what does this graph mean?
Well, obviously, I think Java is the most difficult language in which to write correct concurrent code. What may be surprising to some people is that I think Clojure is only a little simpler. To write correct code in Clojure, you have to figure out what things need to be protected by a dosync macro, and make sure those things are declared as refs. I think that would be an easy thing to screw up. It's still easier than Java, where you have to figure out basically the same things, but you must also worry about multiple lock objects, lock sequencing, etc. In Clojure you have to figure out what has to be protected, but you don't have to figure out how to protect it -- the language features take care of that.
So Clojure and Java are similar in difficulty, but what about Scala and the Actor model? I think this is much easier to understand. There are no locks or transactions. The only hard part is making sure that you don't send the same mutable object to different actors. This is somewhat similar to figuring out what to protect in Clojure, but it's simpler. You usually use immutable case classes for the messages sent between actors, and these are used all over the place in Scala. It's not some special language feature that is only used for concurrency. Ok, enough about easy-to-write/understand code; there are other important factors, such as efficiency:

Perhaps this should really be described as memory efficiency. In this case Java and the locking model is the most efficient. There is only one copy of anything in such a system, as that master copy is always appropriately protected by locks. Scala, on the other hand, is far less efficient. If you send around messages between actors, they need to be immutable, which means a lot of copies of data. Clojure does some clever things around copies of data (its persistent data structures share structure instead of being copied wholesale), making it more efficient than Scala. Some of this is lost to the overhead of STM, but it still has a definite advantage over Scala. Like in many systems, there is a tradeoff between memory and speed:

Scala is the clear king of speed. Actors are more lightweight than threads, and a shared-nothing approach means no locking, so concurrency can be maximized. The Java vs. Clojure comparison is not as clear. Under high write contention, Clojure is definitely slower than Java: the more contention there is, the more retries are going on. However, there is no locking, and this makes a big difference if there are a lot more reads than writes, which is a common characteristic of concurrent systems. So I could definitely imagine scenarios where the higher concurrency of Clojure makes it faster than Java. Finally, let's look at reusability.

By reusability, I mean how reusable (composable) a piece of concurrent code written in each of these languages/paradigms is. In the case of Java, it is almost never reusable unless it is completely encapsulated. In other words, if your component/system has any state that it will share with another component, then it will not be reusable. You will have to manually reorder locks, extend synchronization blocks, etc. Clojure is the clear winner in this arena. The absence of explicit locks and the automatic optimistic concurrency really shine here. Scala is a mixed bag. On one hand, the Actor model is very reusable. Any new component can just send the Actor a message. Of course you don't know if it will respond to the message or not, and that is a problem. The bigger problem is the lack of atomicity. If one Actor needs to send messages to two other Actors, there is no easy way to guarantee correctness.
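To make the Actor style concrete, here is a minimal sketch using the scala.actors library that shipped with Scala at the time. The account actor and the Deposit/GetBalance messages are names I made up for illustration; this is not code from either talk:
import scala.actors.Actor._

// Immutable messages: safe to share between actors because nothing can mutate them.
case class Deposit(amount: Int)
case object GetBalance

object AccountExample {
  def main(args: Array[String]): Unit = {
    // The actor owns its state; no other thread ever touches balance, so no locks are needed.
    val account = actor {
      var balance = 0
      loop {
        react {
          case Deposit(amount) => balance += amount
          case GetBalance      => reply(balance)
        }
      }
    }

    account ! Deposit(100)           // asynchronous, fire-and-forget send
    account ! Deposit(42)
    println(account !? GetBalance)   // synchronous send that waits for the reply
  }
}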
Back in June, I heard Martin Odersky say that he wants to add STM to Scala. I really think that this will be interesting. While I don't think STM is always the right solution, I think the bigger obstacle for adoption (on the Java platform) is the Clojure language itself. It's a big leap to ask people to give up on objects, and that is exactly what Clojure requires you to do. I think a Scala STM could be very attractive...

Tuesday, September 15, 2009

Scala Word Clouds

Just before I went to bed last night, I got an interesting tweet from Dave Briccetti. You can see from the link in the tweet what the problem was, and if you look at the bottom of the page, you can see the clever solution by Jorge Ortiz. Right after I read the tweet from Dave, I asked my wife how long was left in the TV show that she was watching. She said seven minutes. I decided that wasn't enough time, so I went to sleep.

I didn't look at the problem again until today during lunch. By then, I saw Jorge's solution and I saw no way to improve on it. However, this inspired a little experiment for me. Actually, it mostly reminded me of a programming problem that I was given a few years back by a well-known company that will remain anonymous, as they are very touchy about people talking about their interviewing process. The problem was essentially the same as Dave's problem -- take a bunch of sentences (instead of tweets) and figure out the most frequent words. There was a twist though -- I was not allowed to use any Java collection. Actually, I could not use anything outside of the java.lang package, so I was pretty much stuck with arrays...

At the time, I figured that I would have to implement my own hash table and a sort, but then I came up with a way to solve the problem with no hash table. So when I saw Dave's problem, I thought "wow, my old solution will be so much more awesome in Scala":

def createCloud(strings:Seq[String])={
  val x:List[String] = Nil
  var s = ("",0)
  val cloud = new scala.collection.mutable.ArrayBuffer[Tuple2[String,Int]]
  strings.map(_.split("\\s").toList.sort(_ < _)).foldLeft(x){mergeSorted(_, _)}.foreach( (a) => {
    if (a == s._1) 
      cloud(cloud.length - 1) = (a, s._2 + 1)
    else
      cloud += (a,1)
    s = cloud(cloud.length - 1)
  })
  cloud.toList.sort((a,b) => a._2 > b._2)
}
Ok, so I cheated this time and used an ArrayBuffer instead of array. In my solution years ago, I made an array whose size was the total number of words (worst case scenario.) I didn't have tuples, so I had to use two arrays. Anyways, the idea here is to split each sentence into a list of words, and then sort that list of words. Now merge together each sorted list, and keep things sorted using a merge sort. Of course you could turn everything into a big list first using flatMap, similar to Jorge's solution, and then sort. Anyways, you then roll up the sorted list, taking advantage of the fact that it is sorted. You never have to look up a word to see if it has been encountered before, because of the sorting. You just keep track of the last word encountered, and just compare the next word to the last word to determine if the word has been previously encountered and thus needs to be incremented. Of course you must be wondering about the aforementioned merge sort. Here is a Scala implementation of it:
def mergeSorted[A](comparator: (A,A) => Boolean)(x:List[A], y:List[A]):List[A] = {
  if (x.isEmpty) y
  else if (y.isEmpty) x
  else if (comparator(x.head, y.head)) x.head :: mergeSorted(comparator)(x.tail, y)
  else y.head :: mergeSorted(comparator)(x, y.tail)
}
  
def mergeSorted[A <% Ordered[A]](x:List[A], y:List[A]):List[A] = mergeSorted((a:A,b:A) => a < b)(x,y)
While writing this, I decided that I should touch up my knowledge of currying and partially applied functions in Scala, so I read about this on Daniel Spiewak's blog. The first version of mergeSorted is very generic and takes a comparator. This is what led me into the world of currying. Actually, my first version of the above looked like this:
class MergeSorter[A](comparator: (A,A) => Boolean){
  def apply(x:List[A], y:List[A]):List[A] = {
    if (x.isEmpty) y
    else if (y.isEmpty) x
    else if (comparator(x.head, y.head)) x.head :: apply(x.tail, y)
    else y.head :: apply(x, y.tail)    
  }
}
I created a class to encapsulate the sort. In some ways I like this syntax a little better, but it made the second version of the sort a little uglier. I wanted a "default" sort that could be used for anything implementing Scala's Ordered trait. It is this second version that I used in the createCloud function above, as strings can be implicitly converted to RichString which implements Ordered.

It is still easy to use mergeSorted with objects that do not implement Ordered, or to provide a different sort function for those that do:
// provide your own sort
val sList = 
  mergeSorted((s1:String, s2:String) => s1.toLowerCase < s2.toLowerCase)(List("aa","aB","Cb"), List("Ab", "ca"))
// create a reusable sort
val listSort = mergeSorted( (a:List[Any], b:List[Any]) => a.length < b.length) _
val z = listSort(List(List("a","b"), List(1,2,3)), List(List(true,true,true), List("x","y","z","w")))
In the first example, we create a new sort by providing a different sorting function that is case-insensitive. It is then applied to a list of strings. In the second example, we create the sorting function and assign it to a val. Notice the mysterious underscore _ at the end of the statement. This creates a partially applied function. If you forget the underscore, the Scala compiler actually provides a pretty useful error message about this. With this, we can then apply the function however we want.

Wednesday, April 22, 2009

ActionScript Quickie: Check for Required Params

You ever need to test that a bunch of variables are not null or empty, or that a bunch of conditions are true? Here is a quick and dirty way to do it in ActionScript 3.0:

private function validArgs(...args):Boolean {
  return args.every(function(arg:Object, ...crap):Boolean {
    return arg ? true : false;
  });
}

Usage:

var firstName:String = ...
var lastName:String = ...
var iq:int = ...
var friends:Array = ...
validArgs(firstName, lastName, iq > 100, friends.length);

Thursday, January 15, 2009

Syntax Matters: JavaFX Script

Last night I finished writing an article on JavaFX. It was a lot of fun, and it has convinced me that JavaFX has a strong future. A lot of its promise comes from its syntax. It is easy to get carried away with Rich Media! and Animation! and FPS! but then you forget that JavaFX is a new programming language on the JVM. Here are the highlights of its syntax.

The declarative constructors: When I first started looking at JavaFX, it looked like MXML but in a JSON format. Looks are deceiving. Essentially JavaFX supports named parameters in its constructors. The naming convention is foo : bar, where foo is the name of the parameter and bar is the value. You can put multiple parameters on one line by using a comma to separate them, like in most other languages, or you can put each on its own line and eschew the commas. This leads to the JSON-ish syntax. It is probably more accurate to describe it as JavaScript-ish. This becomes even more apparent when you start nesting constructed objects. It really looks a lot like object literals in JavaScript. What is great is that this is not just something you only use for UI elements, like MXML. It is part of the language syntax, so you can use this syntax anywhere you want. Imagine being able to write an MXML expression directly inside a block of ActionScript code...

Speaking of JavaScript: A lot of the syntax of JavaFX looks like JavaScript's more sophisticated cousin, ActionScript 3.0. You use var to declare local variables. You put the type after the variable name, separated with a colon. You use function to define methods. The major difference is using def to denote a constant. In ActionScript, you use const, like you would in C. This is actually kind of bad in my opinion. The def keyword is used in many other languages, like Ruby and Scala, to denote a function. There is no obvious connection between def and a constant. I think they should have used const instead of def. It would have made more sense to do that and maybe use def instead of function.

Functional Programming: A lot of folks are disappointed that closures are not being added to Java, at least not in Java 7. JavaFX does have closures. You can define a var to have a function type and can specify the signature of that function. The syntax is pretty nice. For example, you could do var join:function(String[], String):String to define a variable called join that is a function that takes in an array of strings and a string as input parameters and returns a string. I would like to see this in ActionScript. JavaFX also has support for list comprehensions. You could do var squares = for (i in [1..100]) { i*i } to get a sequence of the perfect squares up to 10,000. However, JavaFX does not make as much use of closures. You would think that its sequences would have common functional methods like filter, map, fold, etc. For filter and map, there are alternatives. For example, let's say you wanted the square of all of the odd integers less than 100. In Scala you would do something like val os = (1 until 100).filter(_ % 2 == 1).map(i => i*i). In JavaFX it would be var os = for (x in [1..100][i | i % 2 == 1]) { x*x }. It's a case of syntax over API. The second set of square brackets is like a select clause. I want to like it because it uses mathematics-inspired notation.
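Since the comparison keeps coming back to Scala, here is roughly what those two ideas (a function-typed variable and a filtered comprehension) look like on the Scala side. This is a quick sketch of my own, with made-up names:
object JavaFXComparison {
  def main(args: Array[String]): Unit = {
    // A variable with an explicit function type, analogous to
    // var join:function(String[], String):String in JavaFX
    val join: (Array[String], String) => String =
      (parts, sep) => parts.mkString(sep)

    println(join(Array("a", "b", "c"), ", "))   // a, b, c

    // Squares of the odd numbers below 100, once with filter/map
    // and once as a for comprehension with a guard
    val squares1 = (1 until 100).filter(_ % 2 == 1).map(i => i * i)
    val squares2 = for (i <- 1 until 100 if i % 2 == 1) yield i * i

    println(squares1 == squares2)   // true: both produce the same sequence
  }
}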

Other: There are a few other JavaFX syntax bits worth mentioning. First is bind and bound. These are for data binding expressions, and they can be very powerful. You can bind variables together, so that one changes when the other changes. Better still, you can bind to functions and list comprehensions. The other interesting syntax in JavaFX involves the keywords delete and insert. These give LINQ-ish syntax for sequences. In fact, if you combine the mathematical select-style syntax with insert/delete and with the declarative constructors, you get expressiveness that is pretty on par with LINQ, in my opinion. When you see everything working together, it kind of makes sense, but it does seem kind of random at first.

Tuesday, December 16, 2008

Is Scala too hard or are you just stupid?

In case you missed it, Python's BDFL, a.k.a. Guido van Rossum, filed some major complaints about Scala. After you read that, read a good rebuttal and especially observe the ensuing debate. I especially liked Tony Morris's closing salvo. He brings some validity to the ad hominem response: maybe you don't like Scala because you are too dumb to understand it. Please allow me to commit the most common fallacy of human thought and attempt to answer this question by only examining my own experience. In other words, if Scala seems too hard to me, then it is too hard. Period. If it seems simple enough to me, then anybody who finds it hard is just stupid.

Since I am going to make a statement of fact based entirely on my own experiences, maybe a little background would be relevant. I started programming 27 years ago when I was 7 years old. I programmed in BASIC, first on an Apple II at school and then on my very own Commodore 64. My first "real programming" was in Pascal when taking Advanced Placement Computer Science in high school. I probably still think in Pascal and translate everything into it first. I also learned FORTRAN and C while in high school.

In college, I learned a lot more C. For a while I double-majored in math and computer science. Then I took CS 10, the weed-out course for CS majors. For all of our programs in that class, we started with a predicate calculus definition of the problem, and then we had to logically derive the solution. We had a tool (written, I think, by the prof or somebody else in the department) that we used to transform the problem statement into a computer program. But not just any program, a Lisp program. I hated everything about this, including Lisp. That class successfully weeded me out, and I stuck to math.

In college I also learned Perl (for an economics class) and C++ (for a summer math research project). After college, I programmed mostly in Java, with some C++, Perl, Ruby, and C# sprinkled in along the way. So in summary, I've programmed in a lot of languages, but most of them are from the imperative-cum-OOP tree with C-based syntax. I had one major experience with a functional language, and hated it so much that I changed my major. So now on to my observations and emotions about Scala.

1.) Type inference sucks, but you (Java developers) gotta love it. I love having a statically typed language. I am definitely in that camp. However, I love not having to declare types very often. How often do you get to have your cake and eat it too? But this is, in my experience, the major source of pain in Scala. You are at the mercy of the cleverness of the compiler. In Java you are often at the mercy of the cleverness of the JVM, but it takes some pretty esoteric Java (the kind that you only see in job interview problems or in books by Josh Bloch and Neal Gafter) to produce code that looks like it should compile but will not. In Scala this is all too common. Here is an example.


// this compiles
object ItemMetaData extends Item with KeyedMetaMapper[Long, Item]{
  override def dbTableName = "items"
  // here is the problem line
  override def fieldOrder = List(name, description, reserve, expiration)
}

class Item extends KeyedMapper[Long, Item]{
  def getSingleton = ItemMetaData
  def primaryKeyField = id

  object id extends MappedLongIndex(this)
  object reserve extends MappedInt(this)
  object name extends MappedString(this, 100)
  object description extends MappedText(this)
  object expiration extends MappedDateTime(this)
}

// but this does not
object ItemMetaData extends Item with KeyedMetaMapper[Long, Item]{
  override def dbTableName = "items"
  // here is the problem line
  override def fieldOrder = name :: description :: reserve :: expiration :: Nil
}

class Item extends KeyedMapper[Long, Item]{
  def getSingleton = ItemMetaData
  def primaryKeyField = id

  object id extends MappedLongIndex(this)
  object reserve extends MappedInt(this)
  object name extends MappedString(this, 100)
  object description extends MappedText(this)
  object expiration extends MappedDateTime(this)
}

The two list expressions are equivalent, and yet one compiles and the other does not. Why? Is it a flaw in the compiler? Is it bad syntax? Is it a flaw in the language implementation (i.e. the List code and the :: code) ?

2.) Having no operators means that operators are everywhere! You can make the case that Scala's syntax is simpler than Java, Ruby, or even C++. Why? Because it has no operators. There is no operator overloading, because there are no operators. The flip side of this is that it is easy to simulate operators and everybody can do it. This is great for the wanna-be language designers in all of us, but it can really suck. Why? It's like you are constantly encountering new operators. It can make Scala's syntax feel infinite in size, even though it is actually quite small and simple.
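For example, nothing stops a library from defining a symbolic method that reads exactly like a brand new operator. Here is a little sketch with made-up names:
// A made-up Money class whose symbolic methods read like operators.
case class Money(cents: Long) {
  def +(other: Money): Money = Money(cents + other.cents)   // looks like a built-in operator
  def <=>(other: Money): Int = cents.compare(other.cents)   // a "new operator" the reader has to learn
}

object OperatorDemo {
  def main(args: Array[String]): Unit = {
    val a = Money(250)
    val b = Money(199)
    println(a + b)     // Money(449) -- really an infix call to the + method
    println(a <=> b)   // 1          -- really an infix call to the <=> method
  }
}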

3.) Golf is built in and is known as _. Scala uses the underscore all over the place to allow for shorthand. Once you have become reasonably competent at Scala, you begin to like this. Until then, you hate it. In other words, it steepens the learning curve. It is a feature for power users, but it is prevalent. Thus you either become a power user, or you quit (insert snide comment about Guido or Cedric here.)
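To give a flavor of the golf, here are a few of the common underscore shorthands, with the spelled-out equivalents in comments (my own examples):
object UnderscoreDemo {
  def main(args: Array[String]): Unit = {
    val nums = List(1, 2, 3, 4, 5)

    // _ as a placeholder for a lambda parameter
    val doubled = nums.map(_ * 2)             // same as nums.map(n => n * 2)
    val odds    = nums.filter(_ % 2 == 1)     // same as nums.filter(n => n % 2 == 1)

    // _ to turn a method into a function value (partial application)
    def add(a: Int, b: Int) = a + b
    val addFive = add(5, _: Int)              // same as (b: Int) => add(5, b)

    // _ as a wildcard in pattern matching
    val label = nums match {
      case Nil    => "empty"
      case _ :: _ => "non-empty"              // "don't care" positions
    }

    println((doubled, odds, addFive(3), label))
  }
}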

4.) Expressive is as expressive does. A lot of folks have thrown the term "readability" around. There are two aspects of readability. When you read code, do you know what it does? When you read code, do you know how it does it? The more expressive a language is, the easier it is for it to satisfy the first question. However, it may be harder for it to satisfy the second question. One could argue that if a language is better at both of these things than an alternative language, then you would probably rather program in that language. It is certainly easier to tell what idiomatic Scala will do than Java, for example. However, it is often harder to understand how it does it. You can substitute Python for Scala in the above two sentences and it is still true. But could you substitute Python for Java in those sentences? Probably not.

5.) Scala is not for creationists. Ok, maybe that's a bad joke by an atheistic evolutionist. What I mean is that object creation can be confusing in Scala. Constructors are not obvious at first. Case classes add confusion, and then you throw in companion objects and you might start screaming "Make it stop!" Oh, do I complain too much? Well, in most of the other "mainstream" languages you need to learn exactly one thing to figure out object construction. Sure, there may be other patterns (factories, singletons, builders, etc.) for augmenting this, but there is only one "natural" way to instantiate a class. In Scala you see standard stuff like val foo = new Foo("2ez") but also val foo = Foo("wtf"). The latter works if Foo is a case class or has a companion object with an apply method; if Foo is a plain class with no companion, it won't compile at all. You can't tell which situation you are in just by reading the call site.
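A small sketch of the two construction styles, with class names invented for illustration:
// A case class: the compiler generates a companion with an apply method,
// so both construction forms work.
case class User(name: String)

// A plain class with a hand-written companion object and apply method.
class Widget(val label: String)
object Widget {
  def apply(label: String): Widget = new Widget(label.trim)
}

object ConstructionDemo {
  def main(args: Array[String]): Unit = {
    val u1 = new User("alice")     // ordinary constructor call
    val u2 = User("bob")           // really User.apply("bob")

    val w1 = new Widget(" dial ")  // bypasses the companion's apply
    val w2 = Widget(" dial ")      // goes through apply, so the label is trimmed

    println((u1, u2, w1.label, w2.label))
  }
}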

So what is the conclusion? Well, for all of my struggles, I have managed to find my way around Scala. So there, it must not be too hard. I am no smarter than any other programmer out there, so if I can learn it, they can too. Plus, my job did not even depend on it. Usually when a programmer has to learn a language, their job depends on it. A motivated programmer can definitely learn Scala in no time!

Tuesday, July 08, 2008

Functional Programming

Inspired by Python, here are a few tastes.