Sunday, 10 June 2012

Unthreading

Multi-threading is hard. To fully understand a piece of multi-threaded code the reader must mentally interleave two or more competing timelines. Our poor primate brains have enough trouble managing one thread of execution.

But what if we didn't have to carry around the idea of time at all? What if we could move about the codebase without having to mentally pause, play, fast-forward and rewind a hypothetical execution instance of the program?

The venerable and venerated Structure and Interpretation of Computer Programs comments with sadness on the entrance of the mutation-snake into the Lisp Garden of Eden: 
With objects, we must be concerned with how a computational object can change and yet maintain its identity. This will force us to abandon our old substitution model of computation in favor of a more mechanistic but less theoretically tractable environment model of computation. The difficulties of dealing with objects, change, and identity are a fundamental consequence of the need to grapple with time in our computational models. These difficulties become even greater when we allow the possibility of concurrent execution of programs.
Immutability makes code easier to understand because it allows the reader to use a simpler mental model.

Code that exclusively employs immutable (immortal) variables is effectively unthreaded from the reader's point of view. The meaning of the code is no longer dependent on time at all. The value of a symbol is a function of its definition - history is irrelevant.

Single-threaded code is easier to understand than multi-threaded code. But unthreaded code is best of all.

6 comments:

  1. I would nitpick a little here: I'd argue it's referential transparency, rather than immutability, that allows unthreading. Referential transparency doesn't imply immutability, but it severely limits where it can occur. If a function is referentially transparent, it doesn't matter which thread of execution it runs on, even if internally it uses a little mutable state (as everything does at the hardware level).

    The same is true for referential transparency and side effects -- they're not mutually incompatible, but side effects are severely limited where referential transparency is to be upheld.

    Nevertheless, it's still a pretty high correlation between programs with lots of immutability and programs which don't care about concurrency.

    ReplyDelete
  2. Good point. Reading input isn't (normally) considered mutability, though it is referentially non-transparent and does force the reader to consider time.

    There's definitely a deep relationship between the two concepts. Perhaps you could consider immutability to be a special case of referential transparency for constant functions?

    ReplyDelete
  3. What's a constant function? Do you mean a function which takes no parameters (or Unit)?

    Reading input becomes referentially transparent in a lazy language which restricts access to standard input through something like an IO monad. The monad enforces an idea of time and sequentiality in a pure environment.

    ReplyDelete
  4. I mean a value is basically a function that takes no arguments. In Lisp, they're syntactically distinct because you need () for function invocation, but in Haskell there's really no difference. So I meant that an immutable value is indistinguishable from a function that takes no arguments and always returns the same answer.

    It's a good point about the I/O monad. My understanding of Haskell I/O is that it models I/O as a function from one state of the world to another, and that the world is threaded through I/O actions by the monad.

    In the end, Haskell I/O functions are referentially transparent because they don't do I/O - they just describe it. The non-referentially transparent stuff is delegated to the evaluation environment that invokes the main function.

    ReplyDelete
  5. In Haskell, there's a difference between Unit -> Int and Int, though not a major one.

    In Lisp, there are nullary functions which don't correspond to values -- (fn [] (rand-int 3)) or (fn [] (ref 4)) spring to mind.

    I keep finding new mental models for haskell I/O. I've been thinking recently about the fact that in "do line <- getLine; f line" the expression "f line" is necessarily evaluated *after* I/O has been performed - so Haskell programs manage to interleave evaluation of functions with performing I/O while retaining referential transparency at all points. I started thinking about this sort of thing while reading Wadler's paper "Comprehending Monads" -- highly recommended: readable, well articulated, metaphor-free.

    ReplyDelete
  6. Oh wait, I understand you now -- a referentially transparent nullary function is equivalent to an immutable value. Yes, I agree.

    ReplyDelete