Wednesday, May 28, 2008

Monoid use

I've been reading a bit about Monoids, which are a generalization of the MonadPlus specialization of the Monad class. The generalization of Monoids is that whereas a particular Monad type carries some (hidden) value, a Monoid type has no such requirement.

Firstly, what is a Monoid type? It is something that has an empty representation (mempty :: a) and has the property that allows two values to be joined (appended) to form a new value (mappend :: a → a → a). This protocol sounds very much like the one for MonadPlus types, doesn't it? It does, indeed, and with good reason, for every MonadPlus type is also a Monoid type:

instance MonadPlus m ⇒ Monoid (m a) where
mempty = mzero
mappend = mplus


What are some examples of Monoid? Well, of course, the types that are MonadPlus types, like Maybe and specified list types (but with an important caveat: the generalized list container must be specialized to contain some particular type a). But besides these, Monoid gives us more types that do not fit within the MonadPlus protocol, for example, and importantly, integrals under addition or multiplication.

Ah! So if one were, say, doing an operation with some kind of list and then doing the exact same kind of thing, but simply marking results (counting, as [brilliantly!] demonstrated by the Peano series, is operating on integers under addition) without the Monoid abstraction, we would need to write two separate procedures (in the infamous style of copy-and-waste, er, -paste) to work with the two separate data types, but with Monoid we can have simply one procedure working with the Monoid type, giving us some measure of polytypic expressiveness. This brings us to where we were at the end of the previous entry on factorization, where we had developed a unifying function for factorization (where its rôles changed depending on what seed value it was given: it returned the factors when given a list or returned a count of the factors when given an integer). Recall that our helper function, mkadder, needed to be supplied with a translator function for the individual values processed and a concatenator function to tally those translated values:

showfactors x = factors x (mkAdder return mplus) []
countfactors x = factors x (mkadder (const 1) (+)) 0


The first using function, showfactors seized the advantage that lists are MonadPlus types, but the second, countfactors, could not do the same because integers are not monadic (they are not (polymorphic) containers), so this function had to provide its own versions of translation and concatenation.

This problem goes away, however, since both these particular types are Monoid types, right? Yes, well, there's the issue of how to enforce this relation — a unspecialized list is not a Monoid type, nor are integers, in general. For this particular case, we must demonstrate that our particular data are of the Monoid class (a list of integer for the former and integers under addition for the latter).

So, what needs to be done is that these types need to be declared instances of the Monoidic domain by injected their values into that domain (thanks to Dirk Thierbach's explanation):

class Injector m where
inject :: Integral a ⇒ a → m a

instance Injector [] where
inject x = [x]

instance Injector Sum where
inject x = Sum 1


The above instance declarations do just that, making an injector on generalized lists one that translates some integral value into a specified (i.e. monoidic) list and making an injector on (generalized) sums one that translates some integral value (again) into a specified (again, monoidic) sum -- in this case the value is always translated into 1, because we are summing the number of factors (each factor adding 1 to the total number of factors), not the value of factors.

With these instances, we can now employ the protocol of Monoid to generalize the mkadder function. Recall its definition ...

mkadder :: Integral a ⇒ 
(a → b) → (b → b → b) → a → a → (b → b)
mkAdder f g y z = g (if y ≡ z
then f y
else g (f y) (f z))


... where f is the translation function and g is the concatenation function. So, what mkadder does is to provide a concatenator for either just one of the values (if both values are the same) or for both values.

Now, we add the properties of the Injector as well as those of the Monoid to get a new mkadder function that can stand alone needing neither a translator nor a concatenator to be provided from using functions ...

mkadder :: (Injector m, Monoid (m a), Integral a) ⇒
a → a → (m a → m a)
mkAdder y z = mappend (if y ≡ z
then inject y
else inject y `mappend` inject z))


... where the generic functions f and g are replaced, respectively, by the inject function from the Injector class and the mappend function from the Monoid class. Note, also that the type-relation that was unspecific in the previous version, (a → b), now has an explicit relation, (a → m a), thanks to the relation of the types between the Injector and Monoid classes. This relationship gives us a weak equivalent of a functional dependency. With this change in place, the using functions now no longer need to specify these functions, so they, in turn, simplify to:

showfactors x = factors x mkAdder []
countfactors x = getSum (factors x mkadder (Sum 0))


So, if you find yourself doing the same thing to very different types, and they all are not monadic, then perhaps the monoidic form may give the generality to encode those disparate types under one solution.

3 comments:

Peter Berry said...

"class MonadPlus m ⇒ Monoid (m a)" should of course start "instance".

geophf said...

@peter berry, oops! Shame on me and good catch. Fixed.

Edward Kmett said...

A quick clarification:

Every instance of MonadPlus does not give rise to an instance of Monoid in Haskell, because of two concerns.

1.) Many other Monoids would require overlapping instances because the instance head (m a) is very general.

2.) instance MonadPlus m => Monoid (m a) isn't valid Haskell 98 since it requires flexible instances.

The definition above is admissable; whenever you have a particular MonadPlus you can define an instance of that form, but its not baked in by default as a general rule.