I'll start with the story of how I got saved, since it's kind of relevant. Back when I was an
English Ph.D. student, I worked on a number of projects that involved natural language
processing, which meant doing a lot of counting trigrams or whatever in tens of thousands of text
files in giant messy directory trees. I was working primarily in Ruby at the time, after years
of Java, and at least back in 2008 it was a pain in the ass to do this kind of thing in
either Ruby or Java. You really want a library that provides the following features:
- Resource management: you don't want to have to worry about running out of file handles.
- Streaming: you shouldn't ever have to have all of the data in memory at once.
- Fusion: two successive mapping operations shouldn't need to traverse the data twice.
- Graceful error recovery: these tasks are all off-line, but you still don't want to have to
restart a computation that's been running for ten minutes just because the formatting in one file
Maybe there was such a library for Ruby or Java back then, but if there was, I didn't know about it.
I did have some experience with Haskell, though, and at some point in 2010 I heard about
iteratees, and they were exactly what I'd always wanted. I didn't really
understand how they worked at first, but with iteratee (and later
John Millikin's enumerator) I was able to write code that did what I wanted
and didn't make me think about stuff I didn't want to think about. I started picking Haskell
instead of Ruby for new projects, and that's how I accepted statically typed functional programming
into my life.
I've always really liked this passage from On the Genealogy of Morals:
[T]here is a world of difference between the reason for something coming into
existence in the first place and the ultimate use to which it is put, its
actual application and integration into a system of goals… anything which
exists, once it has come into being, can be reinterpreted in the service of
new intentions, repossessed, repeatedly modified to a new use by a power
superior to it.
A couple of months ago at LambdaConf I had a few conversations
with different people about why we like (or at least put up with) Scala when
there are so many better languages out there. Most of the answers were the usual
ones: the JVM, the ecosystem, the job market, the fact that you don't have to
deal with Cabal, etc.
For me it's a little more complicated than that. I like Scala in part because
it's a mess. It's not a "fully" dependently typed language, but you can get
pretty close with singleton types and path-dependent types. It provides
higher-kinded types, but you have to work around lots of bugs and gaps and
underspecified behaviors to do anything very interesting with them. And so
on—it's a mix of really good ideas and a few really bad ideas and you can put
them together in ways that the language designers didn't anticipate and probably
don't care about at all.
The rest of this blog post will be a long story about one example of this kind of
thing involving Scalaz's
Suppose we've got a simple representation of a user:
case class User(id: Long, name: String, email: String)
Now suppose we're writing a web service where we allow clients to post some
JSON to a resource to create a new user. We get to pick the id, not the client,
so we might accept something like this:
"name": "Foo McBar",
If we're using a type class-based JSON library like Argonaut, we'll
probably have written a codec instance for User (or we may be using a library
like argonaut-shapeless that derives instances for our case classes automatically).
The problem is that our User codec won't work on JSON like the
example above (since it's missing the
People say that Validation is Scalaz's gateway drug,
which might be accurate if you ignore the suggestion that there's
anything even remotely fun about validation. In my book, making sure that
your program doesn't choke on bad input is always a chore.
Applicative validation is at least a step in the right direction—it makes it easier to
write less code, introduce fewer bugs, and draw clearer lines
between data models and validation logic. Suppose for example that we have the
following case class in Scala:
case class Foo(a: Int, b: Char, c: String)
Suppose also that we have a form with three fields that we want to use to
create instances of Foo. We receive input from this form as
strings, and we want to be sure that these strings have certain properties.
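To make the idea concrete, here's a rough sketch of error-accumulating validation for Foo in plain Scala (2.13 or later). It is a hand-rolled stand-in, not Scalaz's Validation API: the Validated alias, the map3 helper, the parse functions, and the specific rules (a is an integer, b is a single character, c is non-empty) are all assumptions made up for illustration.

```scala
// Hand-rolled stand-in for an accumulating validation type; this is NOT Scalaz's API.
object FooValidation {
  case class Foo(a: Int, b: Char, c: String)

  // Left holds every error message collected; Right holds a valid value.
  type Validated[A] = Either[List[String], A]

  // Hypothetical rules, chosen only for illustration.
  def parseA(s: String): Validated[Int] =
    s.trim.toIntOption.toRight(List(s"a: '$s' is not an integer"))

  def parseB(s: String): Validated[Char] =
    if (s.length == 1) Right(s.head)
    else Left(List(s"b: '$s' is not a single character"))

  def parseC(s: String): Validated[String] =
    if (s.nonEmpty) Right(s) else Left(List("c: must not be empty"))

  // Combine three independent results, keeping all errors instead of
  // stopping at the first failure.
  def map3[A, B, C, Z](va: Validated[A], vb: Validated[B], vc: Validated[C])(
      f: (A, B, C) => Z
  ): Validated[Z] =
    (va, vb, vc) match {
      case (Right(a), Right(b), Right(c)) => Right(f(a, b, c))
      case _ => Left(List(va, vb, vc).collect { case Left(errs) => errs }.flatten)
    }

  // The data model (Foo) stays separate from the validation logic above.
  def validateFoo(a: String, b: String, c: String): Validated[Foo] =
    map3(parseA(a), parseB(b), parseC(c))(Foo.apply)
}
```

With this sketch, validateFoo("one", "xy", "") reports all three problems at once, whereas chaining the same checks in a for-comprehension over Either would stop at the first failure.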