Bytes.Codes

  • Understanding Scala's Type Classes

    Over the last year or so, I have found myself making more and more use of Scala’s Type Class system to add flexibility to my code. This is especially evident in the MongoDB Scala Driver, http://github.com/mongodb/casbah, where the most recent work has been to simplify many features by migrating them to type classes.

    During this work however, I’ve found during that many otherwise adroit Scala engineers seem befuddled or daunted by the Type Class. It does me no good to take advantage of clever features that my users don’t understand, and many will benefit from introducing these concepts to their own code. So let’s take a look at what type classes are, as well as how & why we can utilize them.

    Wikipedia defines a Type Class as “… a type system construct that supports ad-hoc polymorphism. This is achieved by adding constraints to type variables in parametrically polymorphic types”. Admittedly, a bit of a mouthful – and not very helpful to those of us who are self taught and lack the benefit of a comprehensive academic Computer Science education (myself included). Surely, there must be a way to simplify this concept?

    In evaluating these ideas, I’ve found it easiest to think of a Type Class (in Scala, at least) as a special kind of adapter, which can impart additional capabilities upon a given type or set of types. In Scala the Type Class is communicated through implicits, and imparts one, or both, of two behaviors. First, a Type Class can be to utilized to filter what types are valid for a given method call (which I detailed in this earlier post). Second, a Type Class can impart additional features and behaviors upon a type at method invocation time. This latter is much along the lines of an enhanced kind of composition, rather than the weaker inheritance which often plagues similar behaviors in a language like Java.

    To better understand what I am describing, let’s compare a few concepts around the creation and interaction of custom domain objects. I have several sets of tasks I have had to accomplish in Scala in the past – and Scala solutions show some elegant Type Class oriented approaches which are rooted in the Standard Library. While this may seem a bit contrived, it is exactly the kind of problem through which I initially came to understand Type Classes –– and is thus an ideal lesson.

  • Forcing Scala Compiler 'Nothing' Checks

    Since early in its history, Casbah has had a helper method called getAs[T], where T is “Some type you’d like to fetch a particular field as”. Because of type erasure on the JVM, working with a Mongo Document can be annoying – the representation in Scala is the equivalent of a Map[String, Any]. If we were to work with the Map[String, Any] in a standard mode, fetching a field balance which is a Double would require manual casting.

    val doc: DBObject = MongoDBObject("foo" -> "bar", "balance" -> 2.5)
    
    val balance = doc.get("balance") 

    We have already hit another issue here – in Scala, invoking get on a Map returns Option[T] (Where, in this case, T is of type Any). Which means casting has become more complex: to get a Double we also have to unwrap the Option[Any] first. A lazy man’s approach might be something hairy like so:

    balance.getOrElse(null).asInstanceOf[Double]

    In the annals of history (when men were real men, and small furry creatures from Alpha Centauri were real small furry creatures from Alpha Centauri), the above became an annoyingly common pattern. A solution was needed - and so getAs[T] was born. The idea was not only to allow a shortcut to casting, but take care of the Option[T] wrapping for you as well. Invoking getAs[Double] will, in this case, return us an Option[Double].

    But not everything is perfect in the land of getAs[T] – if the type requested doesn’t match the actual type, runtime failures occur. Worse, if the user fails to pass a type, the Scala compiler substitutes Nothing, which guarantees a runtime failure. Runtime failures are bad – but fortunately, Miles Sabin & Jon-Anders Teigen came up with an awesome solution.

  • Casbah 2.3.0-RC1 Released

    Today, I published the first Release Candidate of Casbah 2.3.0, available for SBT users as "org.mongodb" % "casbah" % "2.3.0-RC1". My release announcement to implicit.ly contains the details on all of the bugs fixed – I will also be posting another set of blog entries shortly outlining the specific improvements to the code and demoing fetaures.

    It has been just about a year since the last major release of Casbah, which was version 2.1.5-1. A number of factors led to the delay in getting a major update out the door, for which I apologize. Amongst other things I have spent much of the last year since Casbah’s prior release on the road doing training, consulting and evangelization of MongoDB to users around the globe; I had less time for code among all these things than I expected! Additionally, after releasing the 2.1.x series of Casbah I embarked on what quickly morphed from Casbah 2.2.0 to 3.0.0 – a major refactoring and cleanup of 2+ years of API cruft and “I’m Gonna Learn Me Some Scala!” detritus. In all the excitement to release a perfect release to end all releases, I did a poor job of making it easy to backport and maintain a compatibility series for 2.x users – a harsh lesson in the importance of creating small, bite sized git commits that can be cherry picked.

    So What Happened to Casbah 3.0? And 2.2?

    Casbah 2.2.x is dead – 3.0.x is certainly not! When the work following 2.1.x was begun, I had published a number of early snapshots as Casbah 2.2.0-SNAPSHOT. During this development cycle I found a lot of the aforementioned detritus such as overloaded methods (Casbah was begun as a Scala 2.7.x project and I never fully moved its core APIs over to use named and default arguments - some of these are corrected in Casbah 2.3.0 but in the interest of backwards compatibility with prior releases of 2.x, not completely). As I worked on coding improvements around these things the API drifted further and further away from compatibility and I chose to kill off the 2.2.x series, planning the next release of Casbah as 3.0.0. In addition to that, I intended Casbah 3.0.0 to coincide with MongoDB 2.2.0 which will have additional features such as the New Aggregation Framework. As MongoDB 2.2 hasn’t been released yet, it became clear I needed to provide an updated Casbah release with many of the improvements but without many of the API breakages introduced in 3.0 - including a vastly improved build of the Casbah Query DSL which has stronger type safety and compiler checks thanks to the inimitable Jon-Anders Teigen.

    Casbah 3.0 is still very much alive and in development, with 2.3.0 representing a backporting of many of the changes and improvements from 3.0. Because of the abandonment of the original 2.2 development series, I felt it was saner to kill 2.2.x dead and bring the backports into a 2.3.x series. You can, if you wish, think of this as Casbah 2.3 - The Search for Casbah 2.2 (The long rumored sequel to Spaceballs has been said to be called Spaceballs 3: The Search for Spaceballs 2).

    I will continue to support and improve Casbah 2.3.x moving forward as well as completing Casbah 3.0 (still intended to coincide with the release of MongoDB 2.2). If you have any questions, please don’t hesitate to contact me!

    Later tonight or tomorrow I will post an entry or two detailing all of the wonderful changes in Casbah 2.3.0 and how to take advantage of them.

  • Immutability and Clever Variable Usage in the Land of Blocks and Branches

    Last night, I found myself unconciously refactoring some Scala code (I don’t recall if it was something I wrote or someone else did at this point). As I looked at what I was doing I realized that many Scala developers don’t seem entirely aware of one of my favorite features. What I’m talking about is effectively capturing values from multibranch block statements in Scala. Used correctly they can greatly decruft complicated code as well as helping us use mutability in places we might not expect an easy way to do so.

    In typical C-like languages (such as C, C++, and Java) we are restricted in our syntax should we wish to capture a value when running many branching blocks such as if-else statements, switch statements and even for/foreach constructs. When we find ourselves wanting to set the value of a variable within each possible condition or iteration, we need to declare a mutable variable before the block. We then mutate this variable within each condition or iteration. Take this example from Java:

    boolean valid = false;
    
    String status = null;
    if (valid) {
        status = "VALIDATED";
    } 
    else {
        status = "INVALID";
    }
  • User Configurable Type Filtering with Scala Type Classes

    When I woke up this morning and looked through my twitter mentions, I found this gem sitting there from the middle of the night:

    @rit when using "_id" $lt new ObjectId(timestamp) it throws ValidDateOrNumericType, but we might want to select records after id timestamp (from @justthor)

    The user in question is complaining that when using Casbah’s DSL, it doesn’t allow a MongoDB ObjectId as a valid type to the $lt operator. But as @justthor points out, it is entirely possible to use ObjectId with the $lt operator since it contains timestamp information (See the documentation for ObjectId if you want nitty gritty detail). When I wrote the code for $lt however, I needed to decide what types were valid and weren’t valid; I can’t exactly guarantee type safety wih a DSL like Casbah’s, but I can enforce type sanity. Whether I forgot that you can use ObjectId in $lt or just decided that most people wouldn’t need to is irrelevant — I had in this case blocked a user from accomplishing something valid that they needed to.

    It is a more than reasonable problem, and my initial reaction was “oh crap, I guess I need to patch that”. But what I forgot is that a few releases back, I rearchitected Casbah to obviate this kind of problem. Casbah now allows for a user definable (or, if you prefer, “adjustable”) type filter on any of its DSL operators. This is accomplished through a very simple application of Scala Type Classes, a term which gets batted around a lot in the Scala community, but few seem able to understand or articulate its meaning to us lesser mortals. Over the last few months I’ve come to understand Type Classes much more deeply than I think I ever expected, and applied these lessons to the design of my code. As I failed to document the power and usage of these features at the time, I am going to be writing some additional detailed articles about my understanding of Type Classes in the next few weeks, and this is the first of such explanations.

    So the question at hand is, how exactly does Casbah allow us to do this magical type filtering that I just mentioned, without patching the driver or creating a new release? First, let’s look at how Casbah used to do things before the introduction of the as-yet unexplained Type Class introduction.