In an excellent piece of pedagogy, Gabriel Gonzalez explains how to implement composable streaming (space leak free) folds in Haskell. In this article I’ll explain how to make these composable folds even more composable!
Gabriel’s key idea is captured in his foldl package, where the notion of a left fold is abstracted into the Fold datatype.
data Fold a b = forall x . Fold (x -> a -> x) x (x -> b)
A Fold a b represents a left fold on a list of a, updating a state of (existential) type x, and eventually returning a final value of type b. The three fields of the constructor play the following roles.
The first field (x -> a -> x) reads a value of type a from the list and updates the state x.
The second field (x) is the start state.
The third field (x -> b) maps the final state of type x to a final value of type b.
There is an Applicative instance for Fold a which combines two separate folds into a single fold that combines their state. In this way we can avoid space leaks when running two separate folds on the same list. For example
import qualified Control.Foldl as L
average :: Fractional a => L.Fold a a
average = (/) <$> L.sum <*> L.genericLength
L.fold average [1..10000000]
only walks the list once and does not leak space.
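
To make the single-pass claim concrete, here is a minimal, self-contained sketch of how such an Applicative instance could be written. It does not import foldl: the Fold type is redefined locally, and the names Pair, fold, sumF and lengthF are stand-ins chosen for this illustration. The library's real definitions may differ in detail.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.List (foldl')

-- Local stand-in for the Fold type quoted above (not imported from foldl).
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)

-- Strict pair, so neither half of a combined state accumulates thunks.
data Pair x y = Pair !x !y

-- Run a fold over a list using a strict left fold.
fold :: Fold a b -> [a] -> b
fold (Fold step begin done) xs = done (foldl' step begin xs)

instance Functor (Fold a) where
  -- Only the final extraction step changes.
  fmap f (Fold step begin done) = Fold step begin (f . done)

instance Applicative (Fold a) where
  -- A fold that ignores its input and returns a constant.
  pure b = Fold (\() _ -> ()) () (\() -> b)
  -- Both folds see every element; their states travel together in a Pair.
  (Fold stepL beginL doneL) <*> (Fold stepR beginR doneR) =
    Fold step (Pair beginL beginR) done
    where
      step (Pair xL xR) a = Pair (stepL xL a) (stepR xR a)
      done (Pair xL xR) = doneL xL (doneR xR)

-- Hypothetical counterparts of L.sum and L.genericLength.
sumF :: Num a => Fold a a
sumF = Fold (+) 0 id

lengthF :: Num b => Fold a b
lengthF = Fold (\n _ -> n + 1) 0 id

average :: Fractional a => Fold a a
average = (/) <$> sumF <*> lengthF
```

Loaded into GHCi, fold average [1 .. 10] evaluates to 5.5. The list is traversed once, with the running sum and the running count updated together at each element.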
Suppose I have a value of type [(Bool, Double)] and I want to run L.and on the first component (it returns True if all elements are True, False otherwise), and average (as defined above) on the second component. The simple approach of projecting out each component and running the folds separately is flawed.
myList :: [(Bool, Double)]
L.fold L.and (map fst myList)
L.fold average (map snd myList)
This exhibits exactly the kind of space leak we are trying to avoid. The spine of myList is fully forced by the first call to L.fold and must be kept in memory until the second call has consumed it. There is a solution, and that is to use a Profunctor instance.
A Profunctor p is like a Functor with two type arguments, but the first is contravariant, meaning that it acts as a receiver of values. The class definition for Profunctor is
class Profunctor p where
  lmap :: (a -> a') -> p a' b -> p a b
  rmap :: (b -> b') -> p a b -> p a b'
  -- rmap is just like fmap on the right-hand type variable
(For a more in-depth look at Profunctor, see my 24 Days of Hackage post on the subject.)
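
As a small illustration of the "receiver of values" reading, the sketch below gives an instance for plain functions, which receive an a and emit a b. The class is copied locally so the example needs nothing beyond base; describe and describeLength are hypothetical names introduced only for this example.

```haskell
-- Local copy of the class quoted above, so the sketch needs no extra packages.
class Profunctor p where
  lmap :: (a -> a') -> p a' b -> p a b
  rmap :: (b -> b') -> p a b -> p a b'

-- Plain functions: the argument is received, the result is emitted.
instance Profunctor (->) where
  lmap f g = g . f   -- adapt the input before g sees it
  rmap f g = f . g   -- adapt the output after g has produced it

-- A hypothetical consumer of Ints ...
describe :: Int -> String
describe n = "got " ++ show n

-- ... reused as a consumer of Strings by measuring their length first.
describeLength :: String -> String
describeLength = lmap length describe
```

Note the direction of lmap: a function from String to Int turns a consumer of Int into a consumer of String, which is what contravariance in the first argument means. Here describeLength "abc" evaluates to "got 3".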
Fold is a Profunctor, because Fold a b receives values of type a and emits a value of type b. Using the Profunctor instance we can combine two Folds of different argument types into one:
(***!) :: (Applicative (p (a, a')), Profunctor p) =>
          p a b -> p a' b' -> p (a, a') (b, b')
p ***! p' = (,) <$> lmap fst p <*> lmap snd p'

andWithAverage :: L.Fold (Bool, Double) (Bool, Double)
andWithAverage = L.and ***! average
Now we can run L.fold andWithAverage myList without space leaks!
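
For readers who want to try this without installing anything, the following is a rough end-to-end sketch using only base. Fold, Pair, fold, the Profunctor class and andF are all local stand-ins written for this illustration (andF plays the role of L.and); they are assumptions about how such pieces might look, not the libraries' actual source.

```haskell
{-# LANGUAGE ExistentialQuantification, FlexibleContexts #-}

import Data.List (foldl')

-- Local stand-ins for the fold machinery.
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)
data Pair x y = Pair !x !y

fold :: Fold a b -> [a] -> b
fold (Fold step begin done) = done . foldl' step begin

instance Functor (Fold a) where
  fmap f (Fold step begin done) = Fold step begin (f . done)

instance Applicative (Fold a) where
  pure b = Fold (\() _ -> ()) () (\() -> b)
  (Fold stepL beginL doneL) <*> (Fold stepR beginR doneR) =
    Fold step (Pair beginL beginR) done
    where
      step (Pair xL xR) a = Pair (stepL xL a) (stepR xR a)
      done (Pair xL xR) = doneL xL (doneR xR)

class Profunctor p where
  lmap :: (a -> a') -> p a' b -> p a b
  rmap :: (b -> b') -> p a b -> p a b'

instance Profunctor Fold where
  -- Preprocess each incoming element before the step function sees it.
  lmap f (Fold step begin done) = Fold (\x a -> step x (f a)) begin done
  rmap = fmap

-- The combinator from the text, unchanged.
(***!) :: (Applicative (p (a, a')), Profunctor p)
       => p a b -> p a' b' -> p (a, a') (b, b')
p ***! p' = (,) <$> lmap fst p <*> lmap snd p'

-- Hypothetical counterpart of L.and.
andF :: Fold Bool Bool
andF = Fold (&&) True id

average :: Fold Double Double
average = (/) <$> Fold (+) 0 id <*> Fold (\n _ -> n + 1) 0 id

andWithAverage :: Fold (Bool, Double) (Bool, Double)
andWithAverage = andF ***! average
```

With these definitions, fold andWithAverage [(True, 1), (True, 2), (False, 6)] evaluates to (False, 3.0), and the list of pairs is consumed in one traversal.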
One could write a typeclass to capture the essence of what we have just implemented:
class Profunctor p => ProductProfunctor p where
  (***!) :: p a b -> p a' b' -> p (a, a') (b, b')
  empty :: p () ()
(***!) can be implemented exactly as given above, and empty is just pure (). So what is the point of using a typeclass rather than simply using the Applicative and Profunctor instances? The class ensures that the Applicative instance for p a is independent of a. I’m not certain this is necessary, but it does seem to be a useful sanity check.
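
One possible instance of this class for folds is sketched below. As a deliberate variation, (***!) is written directly against the constructor rather than through the Applicative instance, which makes it visible that nothing in the definition depends on the input type. As before, Fold, Pair and fold are local stand-ins defined for this illustration only.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.List (foldl')

-- Local stand-ins for the fold machinery.
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)
data Pair x y = Pair !x !y

fold :: Fold a b -> [a] -> b
fold (Fold step begin done) = done . foldl' step begin

class Profunctor p where
  lmap :: (a -> a') -> p a' b -> p a b
  rmap :: (b -> b') -> p a b -> p a b'

instance Profunctor Fold where
  lmap f (Fold step begin done) = Fold (\x a -> step x (f a)) begin done
  rmap f (Fold step begin done) = Fold step begin (f . done)

class Profunctor p => ProductProfunctor p where
  (***!) :: p a b -> p a' b' -> p (a, a') (b, b')
  empty  :: p () ()

instance ProductProfunctor Fold where
  -- Each fold consumes its own component of the incoming pair.
  (Fold stepL beginL doneL) ***! (Fold stepR beginR doneR) =
    Fold step (Pair beginL beginR) done
    where
      step (Pair xL xR) (a, a') = Pair (stepL xL a) (stepR xR a')
      done (Pair xL xR) = (doneL xL, doneR xR)
  -- The trivial fold: unit state, unit input, unit output.
  empty = Fold (\() () -> ()) () id
```

For example, fold (Fold (&&) True id ***! Fold (+) 0 id) [(True, 1), (False, 2)] evaluates to (False, 3).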
The Profunctor instance for Fold increases the composability of Gabriel’s composable streaming folds. ProductProfunctor is an interesting concept in its own right: it names the ability to run two such profunctors side by side on paired inputs.
At the time of writing, the foldl library doesn’t actually supply a Profunctor instance, but Edward Kmett’s similar folds library does have Profunctor instances.