Wednesday, May 14, 2008

Oh! my acking monad!

So, we just saw that, after reams of paper proving the three monad laws in inscrutable detail, actually using them is pretty easy. Fancy that, Hedda!

Haskell does provide several different kinds of monads for various uses, but sometimes one must roll one's own to fit the current need. I was exploring the Ackermann function, curious to know what A(4, 2) looked like [oh, yes, 19,730 digits in all its (gory) glory -- I went there!] (other than the cop-out of 2^65536 - 3 that is listed in Wikipedia), and whether my computer could calculate and display that value [obviously it could, but it took some work (see below) to arrive at that point]. The Ackermann function written in Haskell looks very much like its representation in standard mathematical notation:

a :: Integer → Integer → Integer
a m n | m ≡ 0 = n + 1
      | m > 0 ∧ n ≡ 0 = a (m - 1) 1
      | m > 0 ∧ n > 0 = a (m - 1) (a m (n - 1))
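
A couple of quick sanity checks (this snippet is my own, not from the original post; the expected values agree with the well-known closed forms A(2, n) = 2n + 3 and A(3, n) = 2^(n + 3) - 3):

-- small inputs come back instantly ...
testSmall :: Bool
testSmall = a 2 3 ≡ 9 ∧ a 3 3 ≡ 61
-- ... whereas a 4 2 recurses into oblivion, which brings us to the "problem":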



The "problem" with the Ackermann function is that even though it is easy to write, the recursion required to arrive at a solution from even small (single digit) inputs is staggering. My computer staggered and then killed the computation in mid-process (too bad Redjak didn't have that option). The Ackermann function, being not primitively recursive, is also highly redundantly recursive, so it would be nice to provide "hints" to the program, something like: "Oh, you've already computed A3,1 to be 13, so you don't need to recompute that entire branch any more."

This "hinting" has a name in computer science; it's called "memoization", and Haskell provides the State monad that can be used to cover that functionality quite nicely. As we're computing the values that build up the solution, we put the smaller-indexed solutions into a dictionary, something like:




A(3, 1) = 13
A(2, 2) = 7
A(2, 1) = 5
... and so forth ...


Such a dictionary in Haskell is called a Map, because we are mapping from the "word" (the indices of the Ackermann function) to the "definition" (the solution)...

-- (the imports needed by all the snippets below)
import Control.Monad.State
import Data.Map (Map)
import qualified Data.Map as Map

type Index = (Integer, Integer)
type AckMap = Map Index Integer


... and we wrap that Map with the State monad ...

type AckMapS = State AckMap Integer


... where AckMapS uses the AckMap dictionary to provide the preexisting (partial) solution, or, given there's none yet, stores the calculated solution (the Integer) into the dictionary at the current Index. Simple, yes? So, all we need is a utility function that does the lookup or updates the state ...

ackify :: Integer → Integer → AckMapS
ackify m n = get >>= λma . maybe (a' m n >>= update m n)
                                 return
                                 (Map.lookup (m, n) ma)


... the update function that actually does the housekeeping ...

update :: Integer → Integer → Integer → AckMapS
update m n ans = do mappo ← get
                    put (Map.insert (m, n) ans mappo)
                    return ans


... so that our monadic version of the Ackermann function is as follows:

a' :: Integer → Integer → AckMapS
a' m n | m ≡ 0 = return (n + 1)
       | m > 0 ∧ n ≡ 0 = ackify (m - 1) 1
       | m > 0 ∧ n > 0 = ackify m (n - 1) >>= ackify (m - 1)


which looks very much the same as the original function, with calls to a replaced by calls to the lookup-or-update ackify function, and nested function application (e.g. a (m - 1) (a m (n - 1))) replaced by the monadic composer, >>=. So, from the above demonstration we see that monads are not only easy to use, but also easy to "roll your own" and integrate into preexisting non-monadic code without much fuss.
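
All that remains is to run the stateful computation against an initially empty dictionary. A minimal driver might look like this (the name runAck is mine, not from the original post):

-- run the memoized Ackermann function, starting from an empty dictionary
runAck :: Integer → Integer → Integer
runAck m n = evalState (a' m n) Map.empty

so that, e.g., runAck 3 4 returns 125 without ever recomputing a branch.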

Critique

Although the incorporation of the State monad dramatically speeds up computing solutions, and makes some solutions computable that were unreachable under the naïve approach, there are at least two better generalizations:


  1. Use the fixpoint (or the Y-combinator) of the Ackermann function so that one can now decorate the engine of computation without altering its structure at all! Incredible! So instead of using a monad to abstract stateful decoration of computations, the fixpoint can be used to abstract any decoration of computation (see the sketch after this list)!

  2. Use the compiler to change the code, instead of having it programmed in. Many functional-language compilers have the option to memoize computations automagically. So, instead of writing code that memoizes partial results, the compiler intercepts the computation, replacing computation with previously calculated values (or running the computation if no such value exists and then storing that result into the runtime). No extra coding; no "troublesome" monads.
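
To give the flavor of the first option, here is a rough sketch (my own, not from the original post; it reuses the State plumbing from above, and the names ackF and memoFix are hypothetical). The recursion is factored out of the engine, and memoization is tied back in at the fixpoint:

-- the "engine": open recursion, calling back through the supplied self
ackF :: (Integer → Integer → AckMapS) → Integer → Integer → AckMapS
ackF self m n | m ≡ 0 = return (n + 1)
              | m > 0 ∧ n ≡ 0 = self (m - 1) 1
              | m > 0 ∧ n > 0 = self m (n - 1) >>= self (m - 1)

-- the "decoration": memoization, attached at the fixpoint
memoFix :: Integer → Integer → AckMapS
memoFix m n = get >>= λma . maybe (ackF memoFix m n >>= update m n)
                                  return
                                  (Map.lookup (m, n) ma)

Swap a different decoration (tracing, call-counting) into memoFix and the engine, ackF, never changes.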



Critique on the critique

On the other hand ("there are five fingers", as my favorite Mother-in-law enjoys rejoining),


  1. The fixpoint solution is more general and more elegant, but being more general, it suffers a slight decrease in efficiency: it more than likely will run more slowly than the specialized monadic state solution.

  2. With an explicit encoding of state, the programmer has direct say over what is memoized and what is not, but with a compiler directive, everything is memoized. Since the Ackermann function can be represented by simple closed-form functions for m < 4, the map can be replaced by a faux-map for those values (as sketched below), reducing the memory footprint of memoization to only a couple of cells for m == 4 (as opposed to over 150,000 cells for m == 4 and n == 1 using the naïve memoization scheme).
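
By way of illustration (this sketch is mine, not from the original post; the closed forms for m < 4 are the standard ones), the faux-map could answer small-m lookups directly, leaving the real Map to handle only m ≥ 4:

-- closed forms: A(0, n) = n + 1, A(1, n) = n + 2,
-- A(2, n) = 2n + 3, A(3, n) = 2^(n + 3) - 3
fauxMap :: Integer → Integer → Maybe Integer
fauxMap 0 n = Just (n + 1)
fauxMap 1 n = Just (n + 2)
fauxMap 2 n = Just (2 * n + 3)
fauxMap 3 n = Just (2 ^ (n + 3) - 3)
fauxMap _ _ = Nothing

ackify would then consult fauxMap before falling back to the real dictionary.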



... so the above-described solution may be "good enough" for the task at hand, after all. Put another way: when I said I was wrong, I might have been wrong. Caveat programmer.
