# Linearity, Uniqueness, and Haskell

Posted on January 8, 2017

There is a group of people consisting of Tweag I/O, ghc head-quarters and JP Bernardy working on a proposal for adding linear types to Haskell. In this blog post I will attempt to put this proposal in some context. I will explain what linearity is and how it relates to its close cousin uniqueness typing, and I will compare the proposal to some other ways of doing things. My hope is that this will make the work more accessible.

Note on syntax: I will attempt to use a consistent syntax throughout this article. Although I will be citing examples from various programming languages (Haskell, Clean, Idris, the Haskell linearity proposal), my syntax will not necessarily match the concrete syntax used in any of those languages. My goal is not to provide a tutorial on using any of those specific languages, but rather to provide the necessary background to understand all of them and the differences between them.

# Substructural type systems

Consider the following two Haskell functions:

``````wasteful :: a -> b -> a
wasteful x y = x

frugal :: a -> (a, a)
frugal x = (x, x)``````

Although I might have given them unusual names, they have perfectly normal Haskell types. However, they illustrate two properties of the Haskell type system:

• Functions don’t have to use all their arguments: `wasteful` throws `y` away
• Functions can reuse their arguments: `frugal` uses `x` twice

From a formal type system perspective, this corresponds to the structural rules weakening and contraction, and we call type systems in which these rules do not apply substructural type systems.

If weakening does not apply, `wasteful` would not be type correct: functions must use all their arguments. If contraction does not apply, `frugal` would not be type correct: functions cannot use their arguments more than once. If neither weakening nor contraction applies, functions must use their arguments exactly once.

# Examples

To give you an intuition why substructurality is useful, we will consider two examples in this section. Note that although these two examples may look quite similar, we will see later that they are actually of quite a different nature.

An update to a mutable array in Haskell is a monadic operation:

``write :: IOVector Double -> Int -> Double -> IO ()``

But why exactly? Why can’t we just have

``write :: Int -> Double -> Vector Double -> Vector Double``

with the understanding that `write` updates the vector in-place (i.e., really modifying it rather than returning a new array)? Well, if we had such a function, we could then define

``````haveAndEat :: Vector Double -> (Vector Double, Vector Double)
haveAndEat arr = (write 0 2.3 arr, arr)``````

We can’t have our cake and eat it: the definition of `haveAndEat` suggests that it returns two arrays, one of which is updated and one of which is not; but since `write` updates the array in-place, both of those two arrays are the same array: before the call to `write` is evaluated, both would be the original array and after the call to `write` is fully evaluated both would be modified. The value of that “second” array `arr` now depends on when another expression (`write 0 2.3 arr`) is evaluated, and hence it has become a stateful value: we have lost purity and all its benefits.

Here’s the thing: trouble started because `haveAndEat` used its argument `arr` twice (it’s like `frugal`): if we had a type system in which contraction does not apply, we would not be able to define `haveAndEat` and all would be well.

## Protocols

The basic way to open a file in Haskell is

``openFile :: FilePath -> IOMode -> IO Handle``

There are two problems with this function, as illustrated by the following two programs:

``````careless :: FilePath -> IO ()
careless fp = do
handle <- openFile fp ReadMode
return ()

anxious :: FilePath -> IO ()
anxious fo = do
handle <- openFile fp ReadMode
hClose handle
hClose handle``````

In `careless` the handle returned by `openFile` is never closed, and in `anxious` it is closed twice. We can deal with the first problem by having the garbage collector close handles (through finalizers), and we can deal with the second problem by maintaining some per-handle state even after its closed (alternatively, for applications where lexical scoping is not too restrictive, we can use a function such as `withFile` which itself takes care of closing the `Handle`).

But wouldn’t it be nicer if we could solve these problems at the type level? As you might have realized, `careless` is a bit like `wasteful`: it throws away the handle. Similarly, `anxious` is a bit like `frugal`: it uses the handle twice. In a type system in which weakening and contraction do not apply, both `careless` and `anxious` would be type incorrect. Thus, a substructural type system allows us to express protocols: the handle must eventually be closed, it cannot be used anymore after it has been closed, etc.; we will see another example later.

# Limiting the applicability of the structural rules ..

Typically we want to restrict weakening and contraction for some arguments, but not for others. For example, consider

``````writeTwoValues :: Int -> Int -> Double -> Vector Double -> Vector Double
writeTwoValues i j x arr = write i x (write j x arr)``````

Although `writeTwoValues` uses argument `x` twice, this should be unproblemantic: the important thing is that it doesn’t use argument `arr` twice. In other words, the structural rules (contraction and weakening) should be applicable to `x`, but not to `arr`.

Let’s agree to call values to which the structural rules do not apply substructural values; this nomenclature is not standard but I think it will help.

## .. through types

To a Haskell programmer, the most obvious way to separate structural values from substructural values is through types. This is the approach taken in the functional programming language Clean.

As an example, we might give `write` the following type:

``write :: ω:Int -> ω:Double -> 1:(Vector ω:Double) -> 1:(Vector ω:Double)``

A value of type `ω:a` is an “ordinary” value which we can reuse or throw away at will (i.e., the structural rules apply as normal); by contrast, a value of type `1:a` is a substructural value which we can not reuse or throw away (the structural rules do not apply). (Technically speaking we are missing some annotations here; here and elsewhere when that happens you can assume a `ω:` annotation in such cases, unless otherwise indicated. Note that I’m not using Clean syntax to keep the syntax uniform throughout this article.)

This means that we cannot define

``````haveAndEat :: 1:(Vector ω:Double) -> (1:(Vector ω:Double), 1:(Vector ω:Double))
haveAndEat arr = (write 0 2.3 arr, arr)``````

because `write` insists that the array is substructural, and hence the structural rules do not apply.

Similarly, we can define

``````frugal :: ω:a -> (ω:a, ω:a)
frugal x = (x, x)``````

but not

``````frugal :: 1:a -> (1:a, 1:a) -- incorrect
frugal x = (x, x)``````

## .. through kinds

Another closely related approach, used in the functional programming language Idris, is to separate structural from substructural values through kinds.

Specifically, we split the universe of types into two: the types of structural values (types like `Int` or `Double`), and the types of substructural values (types like `Vector`). In this case, the type of write is just

``write :: Int -> Double -> Vector Double -> Vector Double``

but with the implicit understanding that the kind of `Int` and `Double` is different from the kind of `Vector Double`. The type of `frugal` becomes

``````frugal :: forall (a :: Type). a -> (a, a)
frugal x = (x, x)``````

where `Type` is the Idris name for the kind of “normal” types (structural rules do apply). If we then tried to apply `frugal` to an array, we would get a kind error rather than a type error.

## .. through variable bindings

There is alternative possibility: we can instead mark variable bindings with whether or not the substructural rules should apply to those variables; this is the approach that the Haskell linearity proposal takes.

For example, `write` would turn into something like

``write (i ::ω Int) (x ::ω Double) (arr ::1 Vector Double) = ...``

We can introduce a convenient shorthand as follows:

``````write :: Int -> Double -> Vector Double ⊸ Vector Double
write i x arr = ...``````

This `⊸` operator is pronounced “lollipop”; a function of type `(a ⊸ b)` is really a function `(\(x ::1 a) -> ...)`; a “normal” function of type `(a -> b)` would then be shorthand for `(\(x ::ω a) -> ...)`.

This means we cannot define

``````haveAndEat :: Vector Double ⊸ (Vector Double, Vector Double) -- incorrect
haveAndEat arr = (write 0 2.3 arr, arr)``````

and the type of `wasteful` becomes

``````wasteful :: a ⊸ b -> a
wasteful x y = x``````

### Let bindings

Of course, if we mark only variables this raises the question what happens to larger expressions. One rule is that when we let-bind a variable

``let x = expr in ...``

and in `expr` we use some variable `(arr ::1 ...)`, the the binding of `x` must also become `(x ::1 ...)`. For example the following function would also be rejected:

``````writeThenDup :: Vector Double ⊸ (Vector Double, Vector Double) -- incorrect
writeThenDup arr = let x = write 0 2.3 arr in (x, x)``````

Note that this applies no matter what the result type is; for example, the following function is just as incorrect:

``````dupLength :: Vector Double ⊸ (Int, Int) -- incorrect
dupLength arr = let len = length arr in (len, len)``````

### Function results

If we define

``````dup :: a -> (a, a)
dup x = (x, x)``````

then clearly we should also reject

``````writeThenDup :: Vector Double ⊸ (Vector Double, Vector Double) -- incorrect
writeThenDup arr = dup (write 0 2.3 arr)``````

as it is semantically equivalent to the version in the previous section. The rule is here that we can apply a function of type `(a -> b)` only to arguments constructed without using substructural values. One intuition here is that the result of applying a function to a substructural value is itself a substructural value.

We will see more rules later.

# Uniqueness versus Linearity

So far I have avoided the terms unique or linear, but we are now at a point where we should start to distinguish.

## Uniqueness

A unique value is one which has not been shared: uniqueness is a guarantee. We relied on uniqueness in the mutable array example: we can only safely apply `write` to an array if that array is unique. It should be safe to forget guarantees, and in uniqueness typing this is formalized through a subtyping relationship; this means that the following function is type correct:

``````forgetUnique :: 1:(Vector ω:Double) -> ω:(Vector ω:Double)
forgetUnique arr = arr``````

Of course, once we forget that an array is unique we can no longer modify it.

## Linearity

Conversely, a linear value is one which may not be shared: linearity is a restriction. We relied on linearity in the `Handle` example: if `openFile` returns a linear `Handle`, then functions which close that `Handle` twice are not type correct.

In many ways linearity is dual to uniqueness typing, and we can see this quite clearly in the subtyping relation. Although we cannot forget restrictions, we can impose them. For example:

``````restrict :: ω:(Chan ω:Int) -> 1:(Chan ω:Int)
restrict chan = chan``````

In `restrict` we take a “normal” channel (with no restrictions) and return a linear channel. As an example use case, consider

``````master :: IO Int
master = do
chan <- newChan
forkIO \$ slave1 (restrict chan)
forkIO \$ slave2 (restrict chan)
x <- readChan chan
y <- readChan chan
return \$ x + y``````

Here we have some `master` process which creates a new (unrestricted) channel. It then uses `restrict` to pass this channel as a linear channel to two slave processes `slave1` and `slave2`, and is therefore guaranteed that the two slave processes will each send precisely one value on this channel (we might need to `newtype` the channel so that sending a value is the only applicable operation).

# Pairs

I hope that by now you have a reasonable intuition of the usefulness of limiting the applicability of the structural rules contraction and weakening, and of the difference between uniqueness and linearity.

However, the examples we considered so far were carefully constructed to avoid the complications that arise from propagation. This is where things get a bit more subtle.

## Uniqueness

Consider updating a pair of arrays. In standard Haskell we’d write something like

``````writeTwoArrays ::    (Vector Double, Vector Double)
-> IO (Vector Double, Vector Double)
writeTwoArrays (arr1, arr2) = do
arr1' <- write 0 2.3 arr1
arr2' <- write 1 4.5 arr2
return (arr1', arr2')``````

The use of the `IO` monad forced us to decide whether we want to write `arr1` or `arr2` first. The compiler is not free to rearrange these two calls, much less execute them in parallel. This is unfortunate: after all, updates to two arrays are independent.

In a language with uniqueness typing we can write something like this (we shall see shortly that this type is not entirely correct):

``````writeTwoArrays :: (1:(Vector ω:Double), 1:(Vector ω:Double)) -- not quite correct
-> (1:(Vector ω:Double), 1:(Vector ω:Double))
writeTwoArrays (arr1, arr2) = (write 0 2.3 arr1, write 1 4.5 arr2)``````

Apart from the additional information in the type, this is just ordinary functional programming. However, now consider

``haveAndEatTwo arrs = (writeTwoArrays arrs, arrs)``

Just like `haveAndEat`, we should reject `haveAndEatTwo` for much the same reasons: we cannot the updated pair of arrays and the non-updated pair of arrays, because the arrays are updated in-place.

But how can we see this from the types? After all, `write` wants a unique array, but `writeTwoArray` gets a pair of two unique arrays and so it seems type correct; `haveAndEatTwo` duplicates the pair, but not the arrays and hence seems type correct as well.

This is where uniqueness propagation comes in: if we want to extract a unique element from a pair (like `writeTwoArrays` does), then that pair itself must be unique. After all, if two people have a reference to the pair, then even if the only references to the elements of that pair are from that pair, there are still multiple references to those elements:

This means that the type of `writeTwoArrays` is actually

``````writeTwoArrays :: 1:(1:(Vector ω:Double), 1:(Vector ω:Double)) -- correct
-> 1:(1:(Vector ω:Double), 1:(Vector ω:Double))
writeTwoArrays (arr1, arr2) = (write 0 2.3 arr1, write 1 4.5 arr2)``````

recording that the pair itself must be unique, too. This means that `haveAndEatTwo` is no longer type correct, because it would need to duplicate a unique array, which is not allowed.

Note that it’s perfectly fine to construct a non-unique array with unique elements:

``````mkPair :: 1:a -> 1:b -> ω:(1:a, 1:b) -- correct but useless
mkPair x y = (x, y)``````

However, such a pair is just not very useful as we cannot extract those unique elements.

## Linearity (in types)

In linearity the situation is dual: there isn’t necessarily a problem with destructing a pair with linear elements (as long as those elements are then used linearly, of course). This means the following function is fine:

``````swap :: ω:(1:a, 1:b) -> 1:(1:b, 1:a) -- correct but useless
swap (x, y) = (y, x)``````

However, we cannot construct such pairs; the following is incorrect:

``````mkPair :: 1:a -> 1:b -> ω:(1:a, 1:b) -- incorrect
mkPair x y = (x, y)``````

We are not allowed to construct a non-linear pair from linear arguments.

## Linearity (in variable bindings)

In the linearity-in-types world, the elements of the pair have their own linearity annotations; this means that a function such as

``````first :: 1:(1:a, 1:b) -> 1:a -- incorrect
first (x, y) = x``````

is type incorrect, because the type of `y` is `1:b` and hence weakening does not apply: we cannot throw `y` away.

When we track linearity only through variable bindings, we have to be more explicit out this propagation: the rule for pattern matching is that in a case statement such as

``case expr of (x, y) -> ...``

if we use any linear variables inside `expr` (i.e., variables with binding `::1`) then `x` and `y` must also be linear: an obligation to use a pair linearly translates to two linear obligations to use the elements of the pair linearly. Note that this rule talks about the variables inside `expr`, not the type of `expr`; there is no other choice, because types do not track linearity.

This means that

``first :: (a, b) ⊸ a -- incorrect``

is type incorrect, though

``first :: (a, b) -> a -- correct``

is fine. As a more concrete example, consider the `Chan` use case again: if we pass a pair of channels to a slave process, the slave must send exactly one value on both channels.

(This means that these pairs are multiplicative, not additive. The Haskell linearity does not consider additive pairs at all, so I will not discuss them in this article either.)

# Higher order functions

In some ways the situation for higher order functions is very similar to the situation for pairs; however, there are some additional complications.

## Uniqueness Typing

Consider the following attempt to outsmart the type system by partially applying `const`:

``````const :: 1:a -> ω:b -> 1:a -- not quite correct
const x y = x

sneakyDup :: 1:a -> 1:(1:a, 1:a)
sneakyDup x = let f = const x in (f (), f ())``````

Obviously we should rule out `sneakyDup` as it claims it can duplicate unique values; but how? Here’s the rule: just like a pair must be unique if we want to extract a unique value from it, a function with unique elements in its closure must be unique if we want to execute it.

However, since we don’t typically record the types of the elements in a function closure, we cannot know whether or not a function has unique elements in its closure when we apply it. Therefore, we have to approximate and instead say that when we construct a function with unique elements in its closure that function itself must be unique. Hence, the type of `const` becomes

``````const :: 1:a -> 1:(ω:b -> 1:a) -- correct
const x y = x``````

This allows us to rule out `sneakyDup` because it uses `f` twice.

You might object at this point that `sneakyDup` could use subtyping to forget the uniqueness guarantee of `f`, and you would be right. For this reason subtyping does not apply to functions. This non-uniformity of the subtyping relation causes all kinds of trouble, not least of which loss of principal types. A large part of my PhD thesis is dedicated to solving this problem, but further discussion of this topic is beyond the scope of this article.

## Linearity (in types)

By analogy to the rule for pairs, you might expect that in linearity there isn’t necessarily a problem with applying non-linear functions with linear values in their closure, but instead we cannot construct such functions. Indeed, this is the case. This means that in a linear type system, we get

``````const :: 1:a -> 1:(ω:b -> 1:a) -- correct
const x y = x``````

and `sneakyDup` gets rejected for the same reason as in the uniqueness typing example.

This might not look dual to the case for uniqueness typing, but that’s only because in uniqueness typing we had to make some unpleasant conservative approximations, making the uniqueness case identical rather than dual to the linear case. Again, see my thesis for details.

## Linearity (in variable bindings)

Let’s consider the `sneakyDup` example one more time. If we track linearity through bindings only, we get

``````const :: a ⊸ b -> a
const x y = x

sneakyDup :: a ⊸ (a, a) -- incorrect
sneakyDup x = let f = const x in (f (), f ())``````

In this case we are rescued by the rule for let-bindings which we discussed above: any let-bound variable must be used linearly if we use any linear variables inside the body. No need to take special precautions with partial application in this case.

But there is a subtlety in a different case. Consider this alternative attempt to outsmart the type system:

``````(.) :: (b -> c) ⊸ (a -> b) ⊸ a -> c -- incorrect
(f . g) x = f (g x)

dup :: a -> (a, a)
dup x = (x, x)

sneakyDup :: a ⊸ (a, a)
sneakyDup x = (dup . const x) ()``````

Something clearly went wrong here, but it’s somewhat subtle. It went wrong in the definition of `(.)`; as we saw before, the result of applying a linear function (a function of type `(a ⊸ b)`) must itself be treated linearly. However, the same goes for the result of a non-linear function if we have a linear assumption `g ::1 (a -> b)`. This is somewhat comparable to the rule in Clean that functions with unique elements in their closure must themselves be unique, except that here it extends to the results of those functions too. In the typing rules this is simply a consequence of the rule we saw already: we cannot apply a non-linear function (in this example, `f`) to an argument which is constructed using linear assumptions (in this case, `g`).

# Polymorphism

One of the main obstacles to adding support for uniqueness or linearity to a language like Haskell is that the most general (principal) types of many functions get very complicated. For example, in `Clean` we have

``````const :: u:a -> w:(v:b -> u:a), [w <= u]
const x = \y -> x``````

After you stare at it for a while, this type makes sense: `const` receives two arguments of type `a` and `b`, and returns something of type `a`; moreover, the first argument may or may not be unique (`u`), and similarly for the second argument (`v`); but the result will be unique only if the first argument is; and moreover, if that first argument is unique, then if we partially apply `const` to just one argument then the resulting function must itself be unique (`w <= u`), due to the rule about functions with unique elements in their closure, discussed above.

Sensible or not, it certainly is a lot more complicated than the type of `const` in a regular functional programming language

``````const :: a -> b -> a
const x y = x``````

As another example, the type of `fst` is

``````fst :: w:(u:a, v:b) -> u:a, [w <= u]
fst (x, y) = x``````

which similarly reflects that the pair itself must be unique if we want to extract a unique element from it.

In Idris these two functions look like

``````const : {a : AnyType} -> {b : AnyType} -> a -> b -> a
const x y = x

fst : {a : AnyType} -> {b : AnyType} -> UPair a b -> a
fst (MkUPair x y) = x``````

where `AnyType` is the Idris kind which covers both unique and non-unique types. These functions look much simpler than the functions in Clean, but that’s just because the kinding rules are not spelled out (function space has kind `UniqueType` if argument has; pair has kind `UniqueType` if either of the elements has).

One of the benefits of tracking linearity in bindings instead, like in the Haskell linearity proposal, is that these types become much simpler:

``````const :: a ⊸ b -> a
const x y = x

fst :: (a, b) -> a
fst (x, y) = x``````

However, we don’t get away completely scot-free. For example, consider composition one last time:

``(f . g) x = f (g x)``

Does `(f . g)` use `x` linearly or not? Well, this clearly depends on `f` and `g`: `(f . g)` uses `x` linearly only if both `f` and `g` do. We can express this as follows:

``(.) :: (b ->:u c)  ⊸  (a ->:v b)  ->:u  a  ->:uv  c``

where `(a ->:v b)` is a generalization of both `(a -> b)` (letting `v = ω`) and `(a ⊸ b)` (letting `v = 1`), and `uv` is type-level multiplication of these variables such that `uv = ω` if either `u` or `v` is (I don’t know what concrete syntax is proposed here; the above syntax is my own).

# Conclusions

If we want to retrofit uniqueness typing or linearity to Haskell, we need to find a way to do this without changing everything about the language. Both the uniqueness-through-kinds approach used in Idris and the linearity-through-bindings approach used in the Haskell linearity proposal are reasonable ways to do this.

Nonetheless, if we want to give functions their most general type we’d still end up with different, more complicated, types than the types we had before the introduction of uniqueness/linearity. Syntactic conventions can go some way to hide this complexity. Moreover, we don’t of course have to give functions their most general type. We can still use

``````const :: a -> b -> a
(.)   :: (b -> c) -> (a -> b) -> a -> c``````

and so on; this would limit the applicability of these functions in code that does rely on linearity, but I think that’s a reasonable tradeoff. It does raise some questions about type inference, and principal types, but those are not insurmountable problems.

As I have argued elsewhere (for example, see my paper Modelling Unique and Affine Typing using Polymorphism), I do prefer a substructural type system with no subtyping relation. As we have seen, without subtyping linearity and uniqueness become pretty much the same thing, and this gives library authors the freedom to use the type system to express what they want. That said, a linear type system can encode the absence of sharing. For example, we can do something like

``withMutableArray :: (1:(Vector Double) -> a) -> a``

This function creates a new mutable array, and then passes that as a linear object to the callback. In this setup, the definition of `withMutableArray` (though not its type) guarantees that we start with a single (unique!) reference to a mutable array, and once we pass that array to the callback linearity guarantees that the array will never be duplicated. In a way we are using this negative occurrence of a linear type to model uniqueness typing.

To me a system based on types, rather than variable bindings, is easier to understand; but I’ll readily admit I am biased, having worked on systems in that style for many years. That said, Haskell programmers are used to thinking in terms of types, so a type or kind based approach would seem to fit better with the language. Moreover, we can abstract over types, but we cannot abstract over the syntactic shape of arguments, so if we track linearity in bindings only we will lose some expressiveness.

It’s hard to evaluate a type system without an implementation however. We will need more practical experience to be able to really compare the approaches. Either way, I hope this article has managed to place the Haskell linearity in some context and make it a bit easier to understand. No matter what the final system looks like, I look forward to having support for substructural types in Haskell!

### Acknowledgements

Thanks to Arnaud Spiwack of Tweag I/O, one of the primary authors of the Haskell linearity proposal, for valuable feedback on an early draft of this article.