Tuesday, October 13, 2009

Scala Manifests FTW

I came across a piece of code written by a colleague. It was a flexible XML/JSON parser that would turn an XML or JSON structure into a map. The keys were strings. The values were either strings, lists, or maps. The lists could be lists of strings, lists, or maps. The maps had strings as keys and (wait for it) strings, lists, or maps as values. We had run across a bug recently. Usually a particular web service returned data that looked something like:
{ "details" : { "a" : "x", "b" : "y" } }
So we had code that looked like:
val response = // code that called the parser
val foo = response("details").asInstanceOf[Map[String,String]]("a")
However, one day we got some bad data:
{ "details" : "" }
So of course the earlier code blew up. I wanted to have something like this:
trait SafeMapTrait {
    def getString(key:String):String
    def getList(key:String):List[AnyRef]
    def getMap(key:String):Map[String, AnyRef]
}
Now this can be accomplished pretty easily:
class EasySafeMap(val map:Map[String, AnyRef]){
  def getString(key:String):String = {
    if (map.contains(key)){
      if (map(key).isInstanceOf[String]) map(key).asInstanceOf[String] else null
    } else null
  }
  // etc.
}
There would be similar methods for lists and maps. I didn't like this, and thought I should be able to do better. Looking at the final solution, I'm not sure that I did. But I did learn some things about Scala Manifests... Before we get there, let's look at my first naive attempt to do better:
class NotSoSafeMap(val map:Map[String,AnyRef]){

  def getString(key:String):String = getType(key)
  def getList(key:String):List[AnyRef] = getType(key)
  def getMap(key:String):Map[String,AnyRef] = getType(key)

  private def getType[T](key:String):T  = {
    val value = map.getOrElse(key, null)
    if (value != null && value.isInstanceOf[T]) value.asInstanceOf[T] else null
  }
}

That would have been nice, huh? I really wanted to use a parameterized method for the extraction, comparison, and casting. The problem is that there is no way to know the type T at runtime. You could explicitly add the type parameter, i.e. getType[String](key), but it doesn't help because of erasure. I tried this instead:
class NotSoSafeMap(val map:Map[String,AnyRef]){

  def getString(key:String):String = getType(key,null)
  def getList(key:String):List[AnyRef] = getType(key,null)
  def getMap(key:String):Map[String,AnyRef] = getType(key,null)

  private def getType[T](key:String, default:T):T  = {
    val value = map.getOrElse(key, default)
    if (value.isInstanceOf[T]) value.asInstanceOf[T] else default
  }
}

I thought that this might be better because of the type information being given in the default value. This didn't work. Using the null default seemed dumb, but even adding defaults like the empty string, an empty list, etc. did not help. Erasure was once again kicking my ass. So it was time to learn about Manifests.
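To make the erasure problem concrete, here is a minimal sketch. The names ErasureDemo, looksLike, and cast are made up for illustration, and it assumes Scala 2.10 or later for scala.util.Try. Because T is erased, the isInstanceOf[T] test compiles with an "unchecked" warning and accepts any non-null reference, and the asInstanceOf[T] cast does nothing until the caller actually uses the value as a T.

```scala
import scala.util.Try

object ErasureDemo {
  // Mirrors the check in the attempts above. T is erased at runtime,
  // so this is effectively a "non-null reference" test.
  def looksLike[T](value: AnyRef): Boolean = value.isInstanceOf[T]

  // The cast to an erased T is a no-op inside this method.
  def cast[T](value: AnyRef): T = value.asInstanceOf[T]

  // A String happily passes as a List[AnyRef].
  val stringPassesAsList: Boolean = looksLike[List[AnyRef]]("not a list")

  // The failure only shows up at the call site, where the result is
  // used as a String and a ClassCastException is thrown.
  val castFailsLater: Boolean = Try {
    val s: String = cast[String](List("x"))
    s.length
  }.isFailure
}
```

Both values in this sketch come out true, which is exactly the behavior that made the two attempts above useless as a safety check.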
I had heard Jorge Ortiz talk about manifests previously. He has also written an excellent blog post about them. He told me that these were still "experimental" (i.e. undocumented) in Scala 2.7.x, but were officially part of the upcoming 2.8 release. Sounded good to me. Here is the solution I came up with:
class SafeMap(val map:Map[String,AnyRef]){
  import scala.reflect.Manifest

  def getString(key:String):String = getType[String](key) match {
    case Some(s:String) => s
    case _ => null
  }

  def getMap(key:String):Map[String, AnyRef] = getType[Map[String,AnyRef]](key) match {
    case Some(m:Map[String, AnyRef]) => m
    case _ => null
  }

  def getList(key:String):List[AnyRef] = getType[List[AnyRef]](key) match {
    case Some(list:List[AnyRef]) => list
    case _ => null
  }

  private def getType[T](key:String)(implicit m:Manifest[T]):Option[T] = {
    map.getOrElse(key, null) match {
      case a:AnyRef => if (m >:>  Manifest.classType(a.getClass)) Some(a.asInstanceOf[T]) else None
      case null => None
    }
  }
}
OK, a few things to note here. First, the local import of scala.reflect.Manifest. Again, it's not a documented class, but it's in there. Now my getType method. Notice that it uses the function_name (param:type) (param:type) syntax, i.e. two parameter lists. Also notice the implicit Manifest parameter. The callers don't pass this; the compiler adds it for you. Next, notice that it returns an Option. I wanted it to just return T. However, I could not have a case where it returned null if T was the declared return type of the method. So I went with Option.
Finally, notice the Manifest magic. That's the m >:> Manifest.classType(a.getClass). The right hand side of the call uses a factory method in the Manifest singleton object to create a Manifest for the (class of the) value coming back from the map. The >:> operator checks whether the right hand side represents a subtype of the left hand side. This is important. For the getMap method, the manifest will represent the Map trait (actually a Java interface in this case). The call to a.getClass gives you the runtime class of a. Of course this runtime class implements the Map trait, but it is not the same class, so an equality comparison would fail. Hence the >:> operator.
One last thing: notice that the getString method uses the explicit getType[String]. You would think that the compiler could infer this, since the method's return type is explicitly declared as String. It doesn't. When I tried it without the explicit type parameter, my manifest would always be Manifest[Nothing].
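For anyone reading this on a newer Scala: the >:> operator on manifests was deprecated in Scala 2.10, and ClassTag is the usual tool for this kind of runtime class check. The following is only a sketch of the same idea under that assumption, not the code from this post. The names TaggedSafeMap and SafeMapDemo are hypothetical, and the subtype test is swapped for runtimeClass.isInstance. Like the Manifest version, it only checks the erased class, so the contents of a list or map are not verified.

```scala
import scala.reflect.ClassTag

// Hypothetical ClassTag-based variant of SafeMap (assumes Scala 2.10+).
class TaggedSafeMap(val map: Map[String, AnyRef]) {
  def getString(key: String): String = getType[String](key).orNull
  def getList(key: String): List[AnyRef] = getType[List[AnyRef]](key).orNull
  def getMap(key: String): Map[String, AnyRef] = getType[Map[String, AnyRef]](key).orNull

  // The compiler supplies the ClassTag, just as it supplied the Manifest.
  // isInstance is false for null, so missing keys and null values give None.
  private def getType[T](key: String)(implicit tag: ClassTag[T]): Option[T] =
    map.get(key).collect {
      case v if tag.runtimeClass.isInstance(v) => v.asInstanceOf[T]
    }
}

// The two responses from the start of the post, as already-parsed maps.
object SafeMapDemo {
  val good = new TaggedSafeMap(Map("details" -> Map("a" -> "x", "b" -> "y")))
  val bad  = new TaggedSafeMap(Map("details" -> ""))

  val goodDetails: Map[String, AnyRef] = good.getMap("details") // the nested map
  val badDetails: Map[String, AnyRef]  = bad.getMap("details")  // null, no exception
}
```

With the bad data, getMap("details") gives back null instead of throwing a ClassCastException, which is the behavior the original code was after.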
