In an excellent piece of pedagogy, Gabriel Gonzalez explains how to implement composable streaming (space leak free) folds in Haskell. In this article I’ll explain how to make these composable folds even more composable!
Gabriel’s key idea is captured in his foldl package, where the notion of a left fold is abstracted into the
data Fold a b = forall x . Fold (x -> a -> x) x (x -> b)
Fold a b represents a left fold on a list of
a, updating a state of (existential) type
x, and eventually returning a final value of type
x -> a -> x) reads a value of type
afrom the list and updates the state
x) is the start state.
x -> b) maps the final state of type
xto a final state of type
There is an
Applicative instance for
Fold a which combines two separate folds into a single fold that combines their state. In this way we can avoid space leaks when running two separate folds on the same list. For example
average = (/) <$> L.sum <*> L.genericLength L.fold average [1..10000000]
only walks the list once and does not leak space.
Suppose I have a value of type
[(Bool, Double)] and I want to run
and on the first component (returns
True if all elements are
False otherwise), and
average (as defined above) on the second component. The simple approach of projecting out each component and running the folds separately is flawed.
myList :: [(Bool, Double)] L.fold L.and (map fst myList) L.fold average (map snd myList)
This exhibits exactly the kind of space leaks we are trying to avoid. The spine of
myList is fully forced and kept around between the two calls to
L.fold. There is a solution, and that is to use a
p is like a
Functor with two arguments, but the first is contravariant meaning that it acts as a receiver of values. The class definition for
class Profunctor p where lmap :: (a -> a') -> p a' b -> p a b rmap :: (b -> b') -> p a b -> p a b' -- rmap is just like fmap on the right-hand type variable
(For a more in-depth look at
Profunctor see my 24 Days of Hackage post on the subject).
Fold is a
Fold a b receives values of type
a and emits a value of type
b. Using the
Profunctor instance we can combine two
Folds of different argument types into one:
(***!) :: (Applicative (p (a, a')), Profunctor p) => p a b -> p a' b' -> p (a, a') (b, b') p ***! p' = (,) <$> lmap fst p <*> lmap snd p' andWithAverage :: L.Fold (Bool, Double) (Bool, Double) andWithAverage = L.and ***! average
Now we can run
L.fold andWithAverage myList without space leaks!
One could write a typeclass to capture the essence of what we have just implemented
class Profunctor p => ProductProfunctor p where (***!) :: p a b -> p a' b' -> p (a, a') (b, b') empty :: p () ()
(***!) can be implemented exactly as given above, and
empty is just
pure (). So what is the point of using a typeclass rather than simply using the
Profunctor instances? The class ensures that the
Applicative instance for
p a is independent of a. I’m not certain this is necessary, but it does seem to be a useful sanity check.
Profunctor instance for
Fold increases the composability of Gabriel’s composable streaming folds. The
ProductProfunctor is an interesting concept.
foldl library doesn’t actually supply a
Profunctor instance, but Edward Kmett’s similar
folds library does have