Tuesday, November 03, 2009

Persistent Data Structures

A few weeks ago I blogged about concurrency patterns and their relative tradeoffs. There were some really poor comments from Clojure fans -- the kind of abusive language and trolling that reminded me of Slashdot in its heyday. In fact, for the first time in a very long time, I had to actually delete comments... A lot of people get confused by the fact that I used Clojure as a representative for software transactional memory, and thought that I was talking about concurrency in Clojure. Anyways, I was flattered that Stuart Halloway commented on my blog. He said I needed to watch Rich Hickey's talk on persistent data structures. So I did. I wanted to embed it on my blog, but it didn't look like InfoQ supports that. So I stole his slides instead:
A little after all of this, I learned that persistent data structures were coming to Scala. So a lot of folks seem to think that these are pretty important. So why is that exactly? It was time to do some homework...

You can read some basic info on Wikipedia. You can go to the original source, Phil Bagwell's paper. The latter is very important. What's amazing is that these are relatively new ideas, i.e. the science behind this is less than ten years old. Anyways, I would say that the important thing about persistent data structures is that they are data structures designed for versioning. You never delete or for that matter change anything in place, you simply create new versions. Thus it is always possible to "rollback" to a previous version. Now you can see how this is a key ingredient for software transactional memory.

Going back to the original ideas... Bagwell's VLists are the foundation for persistent arrays and thus persistent hashtables in Clojure. These form the foundation of Clojure's Multi-Version Concurrency Control, which in turn is the basis of its STM.

These are not just persistent data structures, they are a very efficient implementation of the persistent data structure concept. Each incremental version of a list shares common data with the previous version. Of course you still pay a price for getting the built-in versioning, but this is minimized to some degree.

Scala fans should also be interested in persistent data structures. They are coming to Scala, which is supposed to be available in beta by the end of this month. The aforementioned Bagwell did his work at EPFL -- the same EPFL that is the epicenter of Scala. Indeed, Bagwell himself has contributed to improving the implementations of VLists in Scala. With VLists in tow, Scala STM is just around the corner.

3 comments:

  1. Hi Michael,

    Thanks for covering Clojure. I'm sorry you had some inappropriate comments first time around, that's not indicative of the Clojure community in general.

    A clarification - Clojure hash maps are not based upon Bagwell's VLists, but on his Array Mapped Hash Tries:

    http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.6279

    Also the link to the presentation video should be:

    http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey

    Regards,

    Rich

    ReplyDelete
  2. Anonymous2:46 AM

    FWIW, I've created a number of persistent data structures for Ruby: http://github.com/harukizaemon/hamster

    ReplyDelete
  3. We are a third party technical support service. Avast Customer Support is here to help you out with the whole procedure to Download Avast Antivirus online, We not only fix your Avast Support related issues but will guide with how to get started with your new Avast product once it gets installed successfully.We at Avast Tech Support provides service to protect your PC from potential online threats and external attacks like viruses, Trojans, malwares, spywares and phishing scams. And Avast Refund. Call on our Avast Phone Number.

    Gmail Customer service is a third party technical support service for Gmail users when they face any technical issue or error in their Gmail account. Our Gmail Customer Support team solves issues like forgot Gmail account password, Gmail configuration or Sync issues, recover deleted emails and many more. Toll Free number (800) 986-9271
    How you install or reinstall Office 365 or Office 2016 depends on whether your Office product is part of an Office for home or Office for business plan. If you're not sure what you have, see what office com setup products are included in each plan and then follow the steps for your product. The steps below also apply if you're installing a single, stand-alone Office application such as Access 2016 or Visio 2016. Need Help with office setup Enter Product Key? Call 1-800-000-0000 Toll Free
    Norton Tech Support is a third party service provider and not in any way associated with Norton or any of its partner companies. We offer support for Norton products and sell subscription based additional warranty on computer and other peripheral devices. Call our Toll Free number 1 855 966 3855
    Other Services

    ReplyDelete