Every so often someone bemoans the space
leaks
that can arise due to Haskell’s laziness. A frequently touted remedy
is to make data stricter by turning on `BangPatterns`

, by defining
data structures with explicitly strict fields, or by creating
implicitly strict fields with the `StrictData`

extension. Each of
these approaches leaves something to be desired. In this article I’ll
explain how the approaches work, what they leave to be desired, and a
suggest a reasonably general alternative. The alternative seems
lightweight enough for Haskell programmers to adopt when they define
strict data structures.

Consider the function `pairFoldBad`

:

```
pairFoldBad :: (Integer, Integer)
= foldl' f (0, 0) [1..million]
pairFoldBad where f (count, theSum) x = (count + 1, theSum + x)
```

Strict `foldl'`

was the correct thing to use here, rather than lazy
`foldl`

, so why is the function bad? Because each pair component
(`count`

and `theSum`

) is not necessarily an evaluated `Integer`

,
merely a *thunk which can be evaluated to an Integer*. Each time

`f`

processes a list element `x`

the thunk `count`

has a `+ 1`

operation added on top of it and the thunk `theSum`

has a `+ x`

operation added on top of it. After `foldl'`

has finished, the return
value of `pairFoldBad`

will be a pair of two thunks, each one million
elements deep! In other words, it has a space leak.A typical solution is to use bang patterns to make sure `count`

and
`theSum`

are evaluated on the way in to `f`

, as in `pairFoldBangs`

.

```
pairFoldBangs :: (Integer, Integer)
= foldl' f (0, 0) [1..million]
pairFoldBangs where f (!count, !theSum) x = (count + 1, theSum + x)
```

Each time around the loop `f`

returns two thunks of depth 1. The
subsequent call to `f`

takes them as arguments. The bang patterns
(i.e. the `!`

symbols) evaluate each of the thunks to evaluated
`Integer`

s. The overall return value of the `foldl'`

is a pair of
depth 1 thunks.

This does the job. It’s a little bit weird that `f`

produces thunks
of depth 1 because that means the `foldl'`

produces thunks of depth 1
and we really want evaluated `Integer`

s. They are evaluated
immediately to `Integers`

as soon as we use them and there’s no space
leak but it feels like we’re doing something not exactly right.

An alternative that produces a pair of evaluated `Integer`

s is
`pairFoldBangsAwkward`

. It ensures that the pair components are
evaluated on the way *out* (i.e. when the pair is created) not on the way
*in* (i.e. when the pair is inspected).

```
pairFoldBangsAwkward :: (Integer, Integer)
= foldl' f (0, 0) [1..million]
pairFoldBangsAwkward where f (count, theSum) x = let !count' = count + 1
!theSum' = theSum + x
in (count', theSum')
```

This form is rather unwieldy though. No less unwieldy is use of the
strict function application operator `$!`

:

```
...
where f (count, theSum) x = ((,) $! count + 1) $! theSum + x
```

The major drawback of using `BangPatterns`

to solve this problem is
that we have to remember to do so! The type system does not guide us
to write our program correctly. The program is type correct even if
we omit all the bang patterns.

To get some help from the type system we can switch out Haskell’s standard pair type for one we define ourselves, with strict fields:

`data StrictPair a b = StrictPair !a !b`

Then when we write `pairFoldStrictPair`

as below there is no space
leak.

```
pairFoldStrictPair :: StrictPair Integer Integer
= foldl' f (StrictPair 0 0) [1..million]
pairFoldStrictPair where f (StrictPair count theSum) x = StrictPair (count + 1) (theSum + x)
```

Why is there no space leak? This code looks exactly the same as the
original problematic code `pairFoldBad`

, except we are using the
`StrictPair`

type we defined ourselves instead of Haskell’s built-in
pair. Why is it different? It is different because whenever a value
is constructed using a constructor with a strict field (i.e. a field
with a `!`

in front of it in the `data`

declaration, such as the
fields of `StrictPair`

above) the compiler inserts code to evaluate
that
field.
In the case of `pairFoldStrictPair`

the code that is generated is the
same as if we had written the desugared form
`pairFoldStrictPair_Desugared`

below.

```
pairFoldStrictPair_Desugared :: StrictPair Integer Integer
= foldl' f (StrictPair 0 0) [1..million]
pairFoldStrictPair_Desugared where f (StrictPair count theSum) x = let !count' = count + 1
!theSum' = theSum + x
in StrictPair count' theSum'
```

This is helpful: we now cannot avoid being strict. If we use the
`StrictPair`

type then we can’t forget to evaluate the components.

The major drawback of defining strict data types to replace the more
familiar lazy ones is that they really are completely different types
with completely different associated libraries (if any). We can’t use
the standard `fst`

and `snd`

functions on `StrictPair`

, for example
(although libraries do
exist
that provide this functionality). It is necessary to explicitly
convert back and forth between `(,)`

and `StrictPair`

.

A further problem with strict data fields is that users often think
that they provide more benefit than the reality. For example, from
the above we know not to write `maybeFoldBad`

:

```
maybeFoldBad :: (Integer, Maybe Integer)
= foldl' f (0, Nothing) [1..million]
maybeFoldBad where f (i, Nothing) x = (i + 1, Just x)
Just j) x = (i + 2, Just (j + x)) f (i,
```

Perhaps we should try writing it with a `StrictPair`

instead:

```
maybeFoldStillBad :: StrictPair Integer (Maybe Integer)
= foldl' f (StrictPair 0 Nothing) [1..million]
maybeFoldStillBad where f (StrictPair i Nothing) x = StrictPair (i + 1) (Just x)
StrictPair i (Just j)) x = StrictPair (i + 2) (Just (j + x)) f (
```

This is still no good! The problem is that although the `Maybe Integer`

in the second component of the `StrictPair`

is strictly
evaluated that only means that evaluating the constructor of the
`StrictPair`

evaluates the *constructor* of the `Maybe`

. The payload
of the `Just`

is not evaluated so we build up a layer of thunk each
time around the loop.

It is common in the Haskell world to see strict data field definitions like

```
data MyData = MyData { field1 :: !String
field2 :: ![Double]
, field3 :: !(Maybe Bool)
, }
```

Those strict fields probably don’t do what the author hoped! They are
almost entirely pointless. The bang annotations on the `String`

and
list mean that those fields are only evaluated one cons cell deep.
The tail of the data structure is left unevaluated, as is the first
element. Similarly the `Maybe Bool`

is only evaluated to a `Nothing`

or `Just`

. If it’s the latter then its payload is unevaluated.

Having noted this caveat we can find a way of addressing the problem
in our case. `maybeFoldBangs`

is just too painful to write by hand,
and besides, we might leave out a bang accidentally. Instead we can
repeat the strict data type creation process and define `StrictMaybe`

(indeed this has already been done for
us)
and write `maybeFoldStrictMaybe`

, a function without space leaks.

```
maybeFoldBangs :: (Integer, Maybe Integer)
= foldl' f (0, Nothing) [1..million]
maybeFoldBangs where f (!i, Nothing) x = (i + 1, Just x)
!i, Just !j) x = (i + 2, Just (j + x))
f (
data StrictMaybe a = StrictNothing | StrictJust !a
maybeFoldStrictMaybe :: StrictPair Integer (StrictMaybe Integer)
= foldl' f (StrictPair 0 StrictNothing) [1..million]
maybeFoldStrictMaybe where f (StrictPair i StrictNothing) x = StrictPair (i + 1) (StrictJust x)
StrictPair i (StrictJust j)) x = StrictPair (i + 2) (StrictJust (j + x)) f (
```

This works fine, but we’re going down a path where we will have to deal with two universes of data types: one lazy universe and one strict universe.

Can we do better than two distinct universes? Yes, I think we can!
Let’s define a `newtype`

`Strict`

with which we will represent the
invariant: “when it is evaluated all its immediate children are evaluated
too”. By way of convenience we can define a typeclass `Strictly`

to
allow us to create `Strict`

values and a pattern synonym `Strict`

to
allow us to extract values from the `newtype`

(we should be careful
with the actual constructor because it should be used only in ways
which preserve the invariant).

```
-- Any value of `Strict` should satisfy the invariant that when it is
-- evaluated then all its immediate children are evaluated too.
--
-- The constructor is "unsafe" in the sense that if you don't ensure
-- the invariant holds when you use it then you will violate the
-- expectations of the consumer.
newtype Strict a = MkStrictUnsafe a
pattern Strict a <- MkStrictUnsafe a
class Strictly a where
strict :: a -> Strict a
instance Strictly (a, b) where
-- This is a safe use of MkStrictUnsafe because it satisfies the
-- invariant! We know a and b are evaluated at the point we
-- construct the pair.
!a, !b) = MkStrictUnsafe (a, b) strict (
```

Now let’s see an example of using our “`Strict`

pair” to write a pair
fold. In `pairFoldStrict`

the `Strict`

type guides us to write
correct, space leak free, code, which was the benefit of `StrictPair`

.
On the other hand we don’t have the downside of `StrictPair`

: there is
no new data type. We can interoperate freely with the existing
ecosystem!

```
pairFoldStrict :: Strict (Integer, Integer)
= foldl' f (strict (0, 0)) [1..million]
pairFoldStrict where f (Strict (count, theSum)) x = strict (count + 1, theSum + x)
```

We can also freely compose `Strict`

types. After defining a standard
`Strictly`

instance for `Maybe`

the fold with `Maybe`

can be written,
space leak free, as `maybeFoldStrict`

.

```
instance Strictly (Maybe a) where
= \case
strict -- This is a safe use of MkStrictUnsafe because it satisfies
-- the invariant. Nothing has no children. Just has one child
-- which we know is evaluated when we construct the Strict Maybe.
Nothing -> MkStrictUnsafe Nothing
Just !x -> MkStrictUnsafe (Just x)
maybeFoldStrict :: Strict (Integer, Strict (Maybe Integer))
= foldl' f (strict (0, strict Nothing)) [1..million]
maybeFoldStrict where f (Strict (i, Strict Nothing)) x = strict (i + 1, strict (Just x))
Strict (i, Strict (Just j))) x = strict (i + 2, strict (Just (j + x))) f (
```

`Strict`

buy us in practice?`Strict`

could buy us the ability to conveniently define strict nested
data types without requiring a parallel universe of strict types. We
now know that writing

```
data MyData = MyData
field1 :: !(Either Int Bool)
{ field2 :: !(Maybe Double, Data.Map.Strict.Map Int Float) ,
```

doesn’t make a data type as strict as we probably hoped. Instead of the parallel universe version

```
data MyData = MyData
field1 :: !(StrictEither Int Bool)
{ field2 :: !(StrictPair (StrictMaybe Double)
,Data.Map.Strict.Map Int Float)) (
```

we can use `Strict`

with the existing universe of data types

```
data MyData = MyData
field1 :: !(Strict (Either Int Bool))
{ field2 :: !(Strict (Strict (Maybe Double),
,Data.Map.Strict.Map Int Float))
```

If `strict`

is inlined then the compiler ought to be able to determine
whether constructor arguments have already been evaluated and thus
avoid redundantly evaluating them again.

*New 2020-11-04*

Although inlining `strict`

allows us to avoid redundant evaluations
when constructing I don’t think the simple form above avoids redundant
evaluation when destructing. For example, if we write

```
case strictMaybe of
Strict (Just x) -> let !x' = x in f x'
...
```

then we would like the compiler to be able to elide the evaluation of
`x`

, as below, because, given our implementation, `x`

has already been
evaluated.

```
case strictMaybe of
Strict (Just x) -> f x
...
```

To achieve efficient destruction we need to use a more complicated, and somewhat hairy, setup. The class becomes

```
class Strictly a where
data Strict a
strict :: a -> Strict a
matchStrict :: Strict a -> a
unstrict :: Strict a -> a
```

and the `Maybe a`

instance becomes

```
instance Strictly (Maybe a) where
newtype Strict (Maybe a) = StrictMaybe (Data.Strict.Maybe a)
= unsafeCoerce $ case x of
strict x Nothing -> x
Just !_ -> x
StrictMaybe x) = case x of
matchStrict (Data.Strict.Just j -> Just j
Data.Strict.Nothing -> Nothing
= unsafeCoerce
unstrict
pattern Strict :: Strictly a => a -> Strict a
pattern Strict x <- (matchStrict->x)
```

Note that `Strict (Maybe a)`

is now a separate data type to `Maybe a`

,
but representationally equivalent, so we can `unsafeCoerce`

between
them at zero run time cost. How do we use this class and its methods?

To create a

`Strict (Maybe a)`

we use`strict :: Maybe a -> Strict (Maybe a)`

. The contents (if any) of the`Maybe a`

are evaluated but nothing new is allocated. If`strict`

is inlined and the compiler knows that the contents are*already*evaluated, then the`strict`

call should compile to a no-op!To use a

`Strict (Maybe a)`

we do one of two things:If we simply want to unwrap the

`Maybe a`

and pass it to another function then we can use`unstrict`

, which is a no-op!If we want to match on the contents of the

`Maybe a`

then we use the`Strict`

pattern, for example:`case sm of Strict (Just a) -> ... Strict Nothing -> ...`

The

`Strict`

pattern is implemented in terms of the function`matchStrict`

. If`matchStrict`

is inlined then a case-of-case transformation will eliminate the allocation and make`matchStrict`

a no-op!

The new setup is hairy because it requires us to be very careful with
`unsafePerformIO`

. However, I believe it gives us a convenient API to
strict data types at zero additional run time cost.

`Strict`

buy us?I don’t see how `Strict`

could help much with large lazy data
structures such as lists (including `String`

s). The only way that I
can see to use `Strict`

with standard lists whilst satisfying its
invariant would be to walk the whole list, which is prohibitively
inefficient. Instead I recommend not using large lazy data structures
anywhere one desires strictness. Instead use strict data structures
such as strict `Text`

, `ByteString`

, `Map`

, `Vector`

or `Array`

(I’m
not sure of the strictness characteristics of `Seq`

and I haven’t
validated the strictness guarantees of `Vector`

or `Array`

. That will
have to be another article.)

Defining strict fields that contain lazy types is almost completely pointless:

```
data MyData = MyData { field1 :: !String
field2 :: ![Double]
, field3 :: !(Maybe Bool)
, }
```

As an alternative, it is an open question whether `Strict`

, as defined
in this article, can prove general enough to achieve widespread use or
whether the solution is a parallel universe of strict data types.

Have you seen or used anything like `Strict`

before? If so please
contact me.