Tuesday, November 24, 2009

Web Developers Are Stupid and Arrogant

The past couple of days have seen an amusing rant by PPK that then turned into a retraction and a call to action. The original rant included a condemnation of iPhone developers as being stupid and arrogant. Others have adequately refuted PPK, so I won't bother with any of that. His post made me realize that it is in fact web developers who are most commonly guilty of stupidity and arrogance. Here's what I mean.

It's easy to look at the world today and say that web applications have won. This is web developer arrogance. Stupidity is to think that web applications have won because web applications are superior to desktop applications. Smarter, but probably still arrogant, developers would point to web applications as disruptive technologies. This involves admitting that web applications are inferior, but good enough, and that they present enough other "cheaper" advantages to compensate for their inferiority.

So why is the "web apps have won" claim dubious? There are definitely a lot of awesome web applications out there. Many of them were created back in the mid/late 90's, and the "features" of these applications were the key to their success, not the user interface. These days, most of these web applications offer APIs/web services/RESTful interfaces/whatever you want to call them. In many cases it is possible to build desktop applications that tap into the same features as these web applications. However, this was certainly not the case 10-15 years ago.

So if APIs make it possible to build desktop apps that offer the same features as popular web applications, why haven't people switched? The first, most obvious answer is inertia. If you are used to accessing Google or Amazon on the web, that's probably the way that you will always use it. Something else to consider is that for many web applications, it does not make sense to offer all of their features through APIs, because doing so hurts their core business. This is most obviously true for advertising-based companies like Google, Yahoo, and Facebook. Their web applications not only provide very useful features to end users, but they serve ads that make money for the companies. If all of their users switched to desktop applications that offered the features with no ads, then the companies would lose their revenue streams. Their business is joined at the hip with their UI, so it is in their best interest to make sure people use their UI -- which is a web application.

However, there are other very successful web applications whose main revenue does not come from ads. Their business is distinct from their UI. E-commerce companies like Amazon and my employer, eBay, are obvious examples. For example, eBay offers trading APIs that provide almost all of the trading features of eBay. This is particularly true for selling on eBay. This makes sense: eBay does not need a seller to use the eBay UI to sell something, as the UI is not what makes money for eBay. As a result, around 50% of all items for sale on eBay come through 3rd party applications built on top of the eBay trading APIs. The vast majority of these (especially the popular ones) are desktop applications. Give people a choice, and a lot of people choose desktop applications.

For another interesting example, just look at Twitter. This is a company that came into existence after web APIs had become the norm, so Twitter has provided a comprehensive set of APIs since early in its existence. Further, they have not pursued an advertising model that would marry their web-based UI to any revenue streams, so they have kept their APIs in sync with their features. For example, they recently added list and retweet features to their site, and added them to the APIs at the same time. As a result, there are a huge number of desktop applications for accessing Twitter. Indeed, Twitter says that 80% of their traffic is from APIs -- either desktop or mobile applications. For most Twitter users, there have always been feature-equivalent desktop alternatives to Twitter's web-based UI, so many users chose desktop applications over the web.

Finally, let's look at one more example: existing desktop apps. There has been an incredible amount of money spent on creating web applications that provide similar functionality to traditional desktop applications: email, word processing, etc. Heck, Google has spent a lot in this space just by itself. These are useful applications, but it is rare for people to choose these apps over their desktop equivalents. In most cases these apps try to go the disruptive route, i.e. don't try to be as good, but be good enough and cheaper. They have had little success so far. Of course, inertia is a valid argument here, too. The one case where there has been success is GMail. In my opinion its success is not because people like its web UI over a desktop UI, or even that the web UI is "good enough" and cheaper. No, its success is because it has offered innovative features over other web and desktop based alternatives: fast search, stars/labels, threaded conversations, etc. Even given all of that, many people still choose to use desktop clients to access their GMail (I'm definitely not one of them). Anyways, once again it's the features, not the UI.

I am not going to sit here and claim that desktop wins over web all the time. I'm not arrogant or stupid enough to make such claims. However, be wary of claims that web apps win over desktop apps. One could argue that with the preponderance of APIs (especially spurred on by mobile apps) and the popping of the advertising-based web 2.0 bubble, the future will hold even more opportunities for desktop alternatives to web applications. Maybe web applications have jumped the shark. So don't put up with web developers who insist that web applications have won (especially if they try to extrapolate this flawed argument to the mobile world). They can go on and on about technology, standards, interoperability, etc. Just remind them that it's the users who matter, and when given a choice, the users do not always choose web applications. Time to polish up your MFC and Cocoa skills!

Saturday, November 21, 2009

Passing on the Duby

Last week I had a fun conversation with Charles Nutter. It was about "java.next" and, in particular, Charlie's requirements for a true replacement for Java. He stated his requirements shortly before that conversation, so let me recap:

1.) Pluggable compiler
2.) No runtime library
3.) Statically typed, but with inference
4.) Clean syntax
5.) Closures

I love this list. It is a short list, but just imagine if Java 7 were even close to this. Anyways, while I like the list as a whole, I strongly objected to #2 -- no runtime library. Now, I can certainly understand Charlie's objection to a runtime library. It is a lot of baggage to drag around. All of the JVM langs have this issue. JRuby, for example, has an 8 MB jar. Groovy has a 4 MB jar. Clojure has two jars totaling 4.5 MB. Scala is also around 4 MB. These become a real issue if you want to use any of these languages for something like Android development. I've written about this issue as well.

However, there is a major problem with this requirement. The building blocks of any programming language include primitives and built-in types, and those built-in types are part of the runtime library of the language. So if your JVM-based language cannot have a runtime library, then you will have to make do with primitives and Java's built-in types. Why is this so bad? Java's types (mostly) make sense in the context of Java, its design principles and syntax. I don't think they would make sense in the hypothetical java.next language described above. The easiest example of this is collection classes. Don't you want a list that can make use of closures (requirement #5 above) for things like map, reduce, filter, etc.? Similarly, if you have static typing, you probably have generics, and wouldn't you like some of your collections to be covariant? You can go even further and start talking immutability and persistent data structures, and all of the benefits these things bring to concurrent programming. This is just collections (though obviously collections are quite fundamental to any language), but similar arguments apply to things like IO, threading, XML processing, even graphics (I'd like my buttons to work with closures for event handling, thank you very much).
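
To make this concrete, here is a quick sketch in Clojure (one of the runtime-library-toting languages mentioned above) of the kind of closure-friendly collections I mean -- exactly the sort of thing you give up if you cannot ship a runtime library:
(def orders [{:id 1 :total 30} {:id 2 :total 75} {:id 3 :total 120}])

;; filter, map, and reduce all take closures; none of this works
;; against a raw java.util.List without extra plumbing
(reduce + (map :total (filter #(> (:total %) 50) orders)))
;; => 195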

One argument against this is that you can just include the runtime library of your choice. Pick your own list, hash map, thread pool, and file handler implementations. This is what I'd like to call the C++ solution -- only worse. At least in C++ you can generally count on the STL being available. The thing that is so bad about this is that it really limits the higher-order libraries that can be built. Java, Ruby, and Python all have a plethora of higher-order libraries that have been built and are widely used. These libraries make extensive use of the built-in types of those languages. Imagine building ORMs like Hibernate or ActiveRecord if you did not have built-in types (especially collection classes) that were consistent with the language. You could not do it. If all you could rely on were the Java built-in types, then at best your libraries would have a very Java-ish feel to them, and doesn't that defeat the purpose?

Charlie gave an alternative to this -- leverage the pluggable compiler. Now it is certainly true that with a pluggable compiler, you could do things like add a map method to the java.util.List interface, and to all of the JDK classes that implement this interface. It can be done. However, if you are going to build higher-order libraries on top of this language, then you need to be able to count on these enhancements being present. In other words, the compiler plugin needs to be a required part of the compiler. Fine. Now what is it that we were trying to avoid? Extra runtime baggage, right? Well, if you have a compiler that is going to enhance a large number of the classes in the JDK, hello baggage. Maybe it won't be as much baggage as a 4 MB runtime library, but it might not be a lot less either. Also, it raises the potential problem of interoperability with existing Java libraries. If I import a Hibernate jar that gives me a java.util.List, that list is not going to be enhanced by my compiler -- because it wasn't compiled with my javac.next, it was compiled with javac.old. What's going to happen here?

Now there is an obvious, happy way to deal with the runtime library question: rt.jar. If java.next were Java, and the super-revamped collections, IO, threading, etc. classes were part of the JDK, then there would be no need for a separate runtime library, since it would be included with the JDK. However, given the incredibly uncertain state of Java 7, not to mention the long-term prospects of Java, does this seem remotely possible to anyone? I don't think so. I think the heir apparent to Java cannot be Java, and because of that, it's gotta have a runtime library of its own.

Monday, November 16, 2009

Cross Browser Geolocation

You might have heard that Joe Hewitt has given Apple the finger and is moving on (or more accurately, moving back) to mobile web development. You can do a lot on mobile web browsers -- more than you can do on desktop browsers. This can seem counterintuitive at first. However, flavors of WebKit have a huge share of mobile web use, and they support a lot of features that you can't count on in the IE-dominated world of desktop browsers. One of those coveted features is geolocation. However, geolocation support is far from standardized, even among the high-end WebKit based browsers.

Geolocation on Mobile Safari is nice, or more accurately it is "standardized." It follows the W3C standard. All you have to do is access the navigator.geolocation object. Here is a super simple example:
function load(){
    var gps = navigator.geolocation;
    if (gps){
        gps.getCurrentPosition(successHandler);
    }
}

function successHandler(position){
  var lat = position.coords.latitude;
  var long = position.coords.longitude;
  doSomethingUseful(lat, long);
}
Pretty easy, huh? This will work on Mobile Safari running on iPhone OS 3+. As mentioned, it is standard. It will also work on newer versions of Firefox, which presumably includes Mobile Firefox, a.k.a. Fennec. However, that is the limit of the portability of this code. It will not work in Android's browser.
Android's flavor of WebKit does not support the standard. However, it does support geolocation via Google Gears. That's right, instead of supporting the open standard, it implements the same feature using a proprietary technology. Shocking, I know. So if you are going to run JS that accesses geolocation on both the iPhone and one of the zillion Android phones out there, you will need to include the gears_init.js bootstrap file. Then you can re-write the above like this:
function load(){
    var gps = navigator.geolocation;
    if (gps){
        gps.getCurrentPosition(successHandler);
    } else {
        if (window.google && google.gears){
            gps = google.gears.factory.create('beta.geolocation');
            gps.getCurrentPosition(function (position){
               successHandler({coords : position});
            });
        }
    }
}

function successHandler(position){
  var lat = position.coords.latitude;
  var long = position.coords.longitude;
  doSomethingUseful(lat, long);
}
So this is pretty close to the standard, but not quite. The Gears variant also supports the getCurrentPosition API. However, the object that it passes to the success function is different -- it is a flatter structure. So in the above example, we wrap it inside another object so that we can reuse the same handler.
I haven't tried the webOS browser or the Nokia WebKit variant yet. Here's hoping they are close to the standard used by Safari.

Tuesday, November 10, 2009

Clojure Primes Shootout

Back in May, I was working on my JavaOne talk on JVM language performance. The first comparison was a prime number sieve. If you look back at the slides, you might notice that I did not include a Clojure implementation. There were a couple of reasons for this. First, I was dumb. Not only dumb, but especially dumb when it came to Clojure. I had only recently learned it -- mostly on an airplane to Israel earlier that month. Second, realizing that I was dumb, I was at least smart enough to reach out to the Clojure community for help. I got some great help on the other algorithms, but the prime sieve was unsatisfactory. The implementations I got were so different from the algorithms used for the other languages that it did not seem like a good comparison. I have been meaning to go back and come up with a satisfactory implementation. And before you ask, I know about this blazingly fast one:
(defn lazy-primes []
  (letfn [(enqueue [sieve n step]
            (let [m (+ n step)]
              (if (sieve m)
                (recur sieve m step)
                (assoc sieve m step))))
          (next-sieve [sieve candidate]
            (if-let [step (sieve candidate)]
              (-> sieve
                (dissoc candidate)
                (enqueue candidate step))
              (enqueue sieve candidate (+ candidate candidate))))
          (next-primes [sieve candidate]
            (if (sieve candidate)
              (recur (next-sieve sieve candidate) (+ candidate 2))
              (cons candidate
                (lazy-seq (next-primes (next-sieve sieve candidate)
                            (+ candidate 2))))))]
    (cons 2 (lazy-seq (next-primes {} 3)))))
and this one that is very close to what I wanted:
(defn sieve
  "Returns a list of all primes from 2 to n"
  [n]
  (let [n (int n)
        root (int (Math/round (Math/floor (Math/sqrt n))))]
    (loop [i (int 3)
           a (int-array n)
           result (list 2)]
      (if (>= i n)
        (reverse result)
        (recur (+ i (int 2))
               (if (< i root)
                 (loop [arr a
                        step (+ i i)
                        j (* i i)]
                   (if (>= j n)
                     arr
                     (recur (do (aset arr j (int 1)) arr)
                            step
                            (+ j step))))
                 a)
               (if (zero? (aget a i))
                 (conj result i)
                 result))))))
and of course the one from contrib:
(defvar primes
  (concat
   [2 3 5 7]
   (lazy-seq
    (let [primes-from
          (fn primes-from [n [f & r]]
            (if (some #(zero? (rem n %))
                      (take-while #(<= (* % %) n) primes))
              (recur (+ n f) r)
              (lazy-seq (cons n (primes-from (+ n f) r)))))
          wheel (cycle [2 4 2 4 6 2 6 4 2 4 6 6 2 6  4  2
                        6 4 6 8 4 2 4 2 4 8 6 4 6 2  4  6
                        2 6 6 4 2 4 6 2 6 4 2 4 2 10 2 10])]
      (primes-from 11 wheel))))
  "Lazy sequence of all the prime numbers.")
Instead I came up with my own bad one...
(defn primes [max-num]
  (loop [numbers (range 2 max-num)
         primes []
         p 2]
    (if (= (count numbers) 1)
      (conj primes (first numbers))
      (recur (filter #(not= (mod % p) 0) numbers)
             (conj primes p)
             (first (rest numbers))))))
Why is this so bad? Well, it is horribly inefficient. It creates a list of integers up to max-num, the only input to the function. It then loops, each time removing all multiples of each prime -- just like a sieve is supposed to. However, most sieves take a prime p and remove 2p, 3p, 4p, etc. (often optimized to 3p, 5p, 7p, etc., since all the even numbers are removed at the beginning). This does the same thing, but it uses a filter to do it. So it goes through each member of the list to check if it is divisible by p. Thus it does way too many calculations. Next, it is constantly generating a new list of numbers to pass back into the loop. Maybe Clojure does some cleverness to make this not as memory-inefficient as it sounds, but I don't think so.
So what is it that I did not like about the many other examples out there or suggested by the community? Most of them used a lazy sequence. That's great and very efficient. However, it's not a concept that is easy to implement in other, less functional languages (Go Clojure!) On the other hand, the Rich Hickey example is much closer to what I wanted to do. However, it uses Java arrays. There is no duplication of the list, and the non-primes are eliminated efficiently. But using Java arrays just seems non-idiomatic, to say the least. The other examples I did for JavaOne all used variable-length lists instead of arrays.
Anyways, I am satisfied to at least have a seemingly equivalent, idiomatic Clojure implementation. I looked at optimizing it by using type hints. I actually had to use a lot of type hints (4) to get a 10% improvement in speed. Anyways, now that this is out there, others will improve it, as that shouldn't be too hard to do!
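
In case you have not seen type hints before, here is a minimal sketch of the idea (an illustration, not the actual four hints used above):
;; without a hint, the interop call below is resolved via reflection
;; at runtime; the ^String hint lets the compiler emit a direct call
(defn shout-slow [s] (.toUpperCase s))          ;; reflective lookup
(defn shout-fast [^String s] (.toUpperCase s))  ;; direct, much faster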

Tuesday, November 03, 2009

Persistent Data Structures

A few weeks ago I blogged about concurrency patterns and their relative tradeoffs. There were some really poor comments from Clojure fans -- the kind of abusive language and trolling that reminded me of Slashdot in its heyday. In fact, for the first time in a very long time, I had to actually delete comments... A lot of people were confused by the fact that I used Clojure as a representative for software transactional memory, and thought that I was talking about concurrency in Clojure. Anyways, I was flattered that Stuart Halloway commented on my blog. He said I needed to watch Rich Hickey's talk on persistent data structures. So I did. I wanted to embed it on my blog, but it didn't look like InfoQ supports that. So I stole his slides instead.
A little after all of this, I learned that persistent data structures were coming to Scala. So a lot of folks seem to think that these are pretty important. So why is that exactly? It was time to do some homework...

You can read some basic info on Wikipedia, or you can go to the original source, Phil Bagwell's paper. The latter is very important. What's amazing is that these are relatively new ideas, i.e. the science behind this is less than ten years old. Anyways, I would say that the important thing about persistent data structures is that they are data structures designed for versioning. You never delete or, for that matter, change anything in place; you simply create new versions. Thus it is always possible to "roll back" to a previous version. Now you can see how this is a key ingredient for software transactional memory.
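
A tiny Clojure example makes the idea concrete -- an "update" returns a new version and leaves the old one untouched:
(def v1 {:name "draft" :rev 1})
(def v2 (assoc v1 :rev 2))  ;; "modify" by creating a new version

v1  ;; => {:name "draft", :rev 1} -- the old version is still intact
v2  ;; => {:name "draft", :rev 2} -- rolling back is just using v1 again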

Going back to the original ideas... Bagwell's work on array mapped tries is the foundation for persistent vectors and persistent hashtables in Clojure. These form the foundation of Clojure's Multi-Version Concurrency Control, which in turn is the basis of its STM.
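
For a minimal sketch of how this surfaces in Clojure's STM:
(def account (ref 100))

;; all ref changes happen inside a transaction; readers always see a
;; consistent snapshot, courtesy of MVCC over persistent structures
(dosync (alter account + 50))

@account  ;; => 150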

These are not just persistent data structures, they are a very efficient implementation of the persistent data structure concept. Each incremental version of a list shares common data with the previous version. Of course you still pay a price for getting the built-in versioning, but this is minimized to some degree.
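
Here is a rough illustration of that sharing in Clojure: "updating" one slot of a million-element vector does not copy a million elements, it just builds a handful of new tree nodes.
(def big (vec (range 1000000)))
(def big2 (assoc big 0 :changed))  ;; near-constant time, not O(n)

(first big)   ;; => 0        -- original version untouched
(first big2)  ;; => :changed -- new version shares the rest of the tree
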

Scala fans should also be interested in persistent data structures. They are coming to Scala, which is supposed to be available in beta by the end of this month. The aforementioned Bagwell did his work at EPFL -- the same EPFL that is the epicenter of Scala. Indeed, Bagwell himself has contributed to improving the implementations of VLists in Scala. With VLists in tow, Scala STM is just around the corner.