Today TechCrunch "reported" that
Blaine Cook has left Twitter. I had the pleasure of interviewing Blaine and his fellow Twitter developer
Alex Payne last year. I was working on a book about Ruby on Rails, so they made the perfect people to talk to. After all, they had written a Ruby on Rails application that was being truly pushing the scalability limits of Rails. I was most interested in those limits and how they were attacking them. Let's get back to the present though.
Michael Arrington pins a lot of Twitter's notorious instability squarely on Blaine. He has a point. He points out that Blaine did a presentation at a Rails conference on how Twitter had scaled Rails. If you go out and say "here's how to scale Rails", your app is part of your proof. Facebook can go out and tout the scalability of PHP, MySQL, and memcache. You can dispute their rationale, but their results are hard to argue against. Blaine could have a great rationale behind how they scaled Rails, but the instability of Twitter discredits any argument.
Some of TechCrunch's reader object to this. They point out that Twitter only had three developers. Others say only ignorant non-programmers would blame Rails or Blaine. Maybe they are right. Here goes my explanation.
Take a look at
that presentation that Blaine did. In particular, look at slide 3. I quote "180 Rails instances (Mongrel). Growing fast." That is essentially 180 web servers. That is 180 instances of the Twitter application serving requests made by Twitter users. Keep in mind that this was a year ago, so you can only imagine what the numbers are like now. Now take a look at the next line in that slide "1 Database Server". No mention of that growing. I won't claim to know the intricacies of Twitter's operations, but this is an obvious bottleneck.
Since then Blaine & co. did a lot to alleviate the pressure on their bottleneck. They created an innovative messaging system called
Starling. They made heavier user of memcache. To my knowledge, they still have that 1 Database Server.
If you are familiar with Rails, then you know that this is a flaw in Rails (a flaw in ActiveRecor to be more precise.) RoR associates a class to a database connection. You can write code that alters this behavior, but it is very hard to do this efficiently. Now let's be fair. This is true of a lot of frameworks, maybe all of them. Java practitioners like me know that our favorite technology with similar functionality, Hibernate, has the same flaw. Google has kindly produced an extension,
Shards, to address this. Certainly the general JavaEE specification does not address this.
So it's not just a Rails problem, and we shouldn't blame Rails, right? Not so fast. Rails is the poster child for rapid, high productivity development frameworks. Remember how TC readers pointed out that Twitter only had three develoeprs? Rails is a big part of that story. In general, Rails is one of many technologies that seek to simplify web development as much as possible. This allows three developers to build such a popular site, but that's also the problem.
It's easier than ever to build a website. Social media makes it easier than ever to gain eyeballs. But it is just as hard to scale your site to handle massive traffic. Actually, maybe it is harder. When you rely heavily on frameworks and out-of-the-box goodness, it is that much harder to rip it out or reject it when you outgrow it. Rails magnifies this problem, because it is not just a technology, it is a philosophy. Rails developers pride themselves on the elegance and terseness of their code. If you are already writing ugly code, it's a lot easier to throw it away and write code that is an order of magnitude uglier but solves your scale problems. When you write beautiful code and don't know of a beautiful way to solve your scale problem, what do you do?
Finally, it is easy to trivialize Twitter's problems. "Oh they just need to use PHP" or whatever. Think about their application a little more before you say that. Let's think about a really naive approach to their application. You have a User who creates Updates. One User has many Updates. For example, I am a Twitter
User and today I have made
three Updates. I follow 65 other Users. The naive approach (also the one a DBA would tell you to use) is to have a join table to associate who you follow. So what has to happen when I load my home page? First, we need to figure out the 65 people I follow. Next we need to get their Updates. How many of their Updates? Well there are 20 Updates shown, but they are the 20 newest across all of my 65 friends. So do you grab all of the updates from all of my friends and then sort it? That is certainly the most naive approach. In that naive approach, we need one query to get my friends and then N queries to get all of their updates. We can put a sort on those N queries and then do a merge sort, quitting once we get the first 20 potentially. Still each of those N queries could be returning a lot of data. We all follow
@Scobleizer, right?
So even if we only needed "1 Database Server", things are non-trivial. What about splitting your database? Obviously you want to split the Updates table. Let's think about your home page. Ideally all of the Updates on that page would be in the same database, so all of your friends need to be the same database server. But what about the other people following your friends? You see where this is going. The
Kevin Bacon six degrees of separation problem is going to kick your ass.
The point is that even though Twitter might seem simple to the outsider, it is not. It is complex. Even if they didn't use Rails, it is non-trivial. Rails definitely does not help, at least in some regards, and that's the real story. It may be easier than ever to build something and, perhaps because of social networking, easier to get an audience. But all of the syntactic sugar in the world doesn't make it any easier to scale an application.