Comprehensive Haskell Sandboxes

Serious Haskell developers will want to test their packages against different versions of ghc, different versions of libraries, etc. It is therefore useful to have multiple Hackage development environments, or sandboxes. A comprehensive sandbox contains:

  1. An installation of ghc
  2. A package database
  3. cabal configuration (for instance, so that we can pin packages)

Moreover, good sandboxes are

  1. isolated from each other: only the active sandbox can be modified
  2. relocatable: it should be possible to copy the sandbox ~/env/ghc762 to ~/env/client1 to use as the basis for a new sandbox, or to copy the entire set of sandboxes (~/env) from one machine to another (suitably similar) machine (this means that paths in config files or baked into executables should never mention the name of the sandbox)
  3. self-contained: there should be no need for a system wide install of ghc
  4. transparent: we should be able to use the standard tools (ghc, cabal) as they are

There are existing tools for Haskell sandboxes (such as cabal-dev or hsenv), but none satisfy all of the above requirements (to the best of my knowledge). However, it is easy enough to set up our own.

The basic approach is:

  1. We are going to keep our sandboxes in ~/env/. For instance, we might have ~/env/ghc742 and ~/env/ghc762.
  2. The active sandbox is a symlink ~/env/active → ~/env/ghc742
  3. We set up three “root” symlinks:
    • ~/.cabal → ~/env/active/dot-cabal for cabal
    • ~/.ghc → ~/env/active/dot-ghc for ghc
    • ~/local → ~/env/active/local for software installed in this sandbox (for instance, we will have ghc as ~/local/bin/ghc).
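
As a minimal sketch, the three root symlinks could be created with a small shell function. The name link_roots and its optional home-directory argument are hypothetical, introduced only for illustration; the function assumes ~/env/active already exists and refuses to overwrite anything that is not already a symlink.

```shell
# Create the three root symlinks inside a home directory (defaults to $HOME).
# link_roots is a hypothetical helper name, not part of any existing tool.
link_roots() {
  local home_dir="${1:-$HOME}"
  local pair link target
  for pair in .cabal:dot-cabal .ghc:dot-ghc local:local; do
    link="$home_dir/${pair%%:*}"
    target="$home_dir/env/active/${pair#*:}"
    # Leave real files and directories alone; only symlinks are replaced.
    if [ -e "$link" ] && [ ! -L "$link" ]; then
      echo "refusing to replace existing $link" >&2
      return 1
    fi
    rm -f "$link"
    ln -s "$target" "$link"
  done
}
```

Because the links go through ~/env/active, they never need to change again: switching sandboxes only means repointing ~/env/active.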

That’s basically it. The devil, of course, is in the details. In the discussion below we assume that the user’s home directory is /Users/dev (we can’t use ~ everywhere because some tools require an absolute path).

Bash prompt

I find it useful if my bash prompt tells me which development environment is active. I therefore add

to my ~/.profile. This will give me a prompt such as

[ghc762 ~/project/cloud-haskell]
#
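
One way to get a prompt like this, as a sketch rather than the exact lines from my profile: read the name of the active sandbox from the ~/env/active symlink and embed it in PS1. The helper name active_env and the ENV_ROOT override are assumptions made for illustration.

```shell
# Name of the active sandbox: the last path component of whatever
# ~/env/active points to. ENV_ROOT is a hypothetical override that
# defaults to ~/env.
active_env() {
  basename "$(readlink "${ENV_ROOT:-$HOME/env}/active")"
}

# Two-line bash prompt: "[<sandbox> <cwd>]" and then "# " on the next line.
# The single quotes delay the command substitution until the prompt is drawn,
# so the prompt follows the symlink when the active sandbox changes.
PS1='[$(active_env) \w]\n# '
```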

Procedure for adding a new development environment

  1. Create a new directory in ~/env/foo, and create subdirectories ~/env/foo/local, ~/env/foo/dot-cabal and  ~/env/foo/dot-ghc.
  2. Activate the new sandbox by symlinking ~/env/active to ~/env/foo.
  3. Install ghc from bindist
  4. Install cabal-install from source, making sure to install profiling libraries (if you want them)
  5. Run cabal update
  6. Modify ~/.cabal/config ; enable library profiling and pin ghc dependencies (see below)
  7. Use cabal install to install standard useful tools such as alex, happy, cabal-dev (cabal-dev 0.9.1 does not compile with ghc 7.6.1; install from github instead), hlint, hasktags, haddock.
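
Steps 1 and 2 are mechanical and can be sketched as a shell function. The name new_env and the ENV_ROOT override are hypothetical; steps 3 to 7 still have to be done by hand after this.

```shell
# Create the directory skeleton for a new sandbox and make it the active one.
# ENV_ROOT is a hypothetical override that defaults to ~/env.
new_env() {
  local root="${ENV_ROOT:-$HOME/env}"
  local name="$1"
  if [ -z "$name" ]; then
    echo "usage: new_env <name>" >&2
    return 1
  fi
  # Step 1: the sandbox directory and its three subdirectories.
  mkdir -p "$root/$name/local" "$root/$name/dot-cabal" "$root/$name/dot-ghc"
  # Step 2: repoint the "active" symlink at the new sandbox.
  rm -f "$root/active"
  ln -s "$root/$name" "$root/active"
}
```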

It is also possible to copy the “cabal” binary from a different sandbox, so that you don’t have to build it from scratch. This is not necessarily true for other binaries though, unless you also copy some other data; for instance, utilities like haddock and alex rely on data installed in ~/.cabal.

Pinning ghc dependencies

Since I often want to link against the ghc package I find it useful to pin all of ghc’s direct dependencies to avoid the diamond dependency problem. For instance, suppose that we have ghc 7.6.1, linked against containers-0.5.0.0. Without pinning ghc’s dependencies, if we later installed package X, and package X got linked against containers-0.5.2.1, we would not be able to install package Y which requires both X (compiled against containers-0.5.2.1) and ghc-api (compiled against containers-0.5.0.0). Of course we can reinstall packages at this point, but simply pinning ghc’s dependencies avoids these problems.

If there is a handy way to find all of a package’s (transitive) dependencies using ghc-pkg, then I am not aware of it. I use the following script instead (AllDeps.sh):
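
A minimal sketch of what such a script might look like (a stand-in, not necessarily the original). It relies on “ghc-pkg field <pkg> depends” and assumes GHC 7.x-style installed package ids of the form name-version-<32 hex digits>. The function name all_deps and the GHC_PKG override are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for AllDeps.sh.
# GHC_PKG can be overridden to point at a specific ghc-pkg binary.
GHC_PKG=${GHC_PKG:-ghc-pkg}

# Print the package itself, then recurse into every entry of its "depends"
# field. Shared dependencies are printed more than once.
all_deps() {
  local pkg="$1"
  local dep
  echo "$pkg"
  for dep in $("$GHC_PKG" field "$pkg" depends 2>/dev/null \
                 | sed 's/^depends://' \
                 | tr -s ' \t' '\n' \
                 | sed -E 's/-[0-9a-f]{32}$//'); do
    # The sed above strips the hash suffix so that the remaining
    # name-version string can be passed back to ghc-pkg.
    all_deps "$dep"
  done
}

# Usage: ./AllDeps.sh ghc-7.4.2
if [ "$#" -gt 0 ]; then
  all_deps "$1"
fi
```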

Running AllDeps.sh ghc-7.4.2 will give you a big long list of all of ghc-7.4.2’s transitive dependencies, with lots of duplication in it. Run the result through sort and then uniq to remove these duplicates, and then pin these dependencies. For ghc-7.4.2 that means adding these lines to ~/.cabal/config (which is now really ~/env/ghc742/dot-cabal/config, so that we can have different cabal configurations in different development environments):

Or for ghc 7.6.2:

Even if you never link anything against the ghc package, you might still want to pin template-haskell, because upgrading that without upgrading ghc will almost certainly not work.
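
Turning a list of name-version strings into pin lines is mechanical, so it can be sketched as a small filter. The function name to_constraints is hypothetical, and the sketch assumes that the cabal config field is spelled “constraint:”; check this against the config file generated by your version of cabal-install.

```shell
# Read "name-version" lines on stdin and print one pin per package.
# Lines without a trailing numeric version (such as builtin_rts) are dropped.
# Assumption: the cabal config field is called "constraint:".
to_constraints() {
  sort -u | sed -n -E 's/^(.*)-([0-9]+(\.[0-9]+)*)$/constraint: \1 == \2/p'
}
```

The output can then be reviewed and appended to ~/.cabal/config by hand.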

The Haskell Platform

If you want to install a sandbox for the Haskell platform, start by creating a sandbox with the appropriate ghc compiler and the appropriate cabal-install (the Haskell Platform changelog should tell you what the appropriate versions are). When installing cabal-install, modify the bootstrap.sh script so that it installs the appropriate versions of the libraries needed for cabal-install itself. For instance, for HP 2012.4.0.0, we have

Next we want to pin all the platform packages — not much point in having a sandbox for a Haskell Platform version to test our package if cabal installs different versions of the platform packages. Download the .cabal file for the Haskell Platform (with a bit of luck the Haskell Platform page itself will give you a link to the appropriate .cabal file; otherwise you might find it in the github repository) and copy its dependencies as constraints to your cabal config (like we did above) — don’t forget to enable library profiling too, if you want the profiling libraries available. Next I would use cabal install to install the appropriate versions of alex and happy (the HP cabal file tells you which versions to use). Finally, you can install all platform packages by running

(or you can install them “lazily” as they are needed, relying on the constraints in the cabal config to install the right versions). If you have cabal-install 1.16 or higher you might want to use the “-j” flag to enable parallel builds.

Switching quickly

To switch quickly between two environments, you could create the following script and save it as “chenv” somewhere in your PATH:
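
A minimal sketch of such a script, assuming the layout described above. The ENV_ROOT override is hypothetical and defaults to ~/env.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of "chenv": repoint ~/env/active at another sandbox.
ENV_ROOT=${ENV_ROOT:-$HOME/env}

chenv() {
  local name="$1"
  if [ -z "$name" ] || [ "$name" = active ] || [ ! -d "$ENV_ROOT/$name" ]; then
    echo "usage: chenv <sandbox> (no such sandbox: $name)" >&2
    return 1
  fi
  # Remove the old link first, so that ln does not create the new link
  # inside the directory that the old link points to.
  rm -f "$ENV_ROOT/active"
  ln -s "$ENV_ROOT/$name" "$ENV_ROOT/active"
}

# Usage: chenv ghc762
if [ "$#" -gt 0 ]; then
  chenv "$1"
fi
```

Since ~/.cabal, ~/.ghc and ~/local all go through ~/env/active, changing this one link switches the compiler, the package database and the cabal configuration together.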

And if you wanted to get really fancy you could add bash autocompletion support by adding this to your ~/.profile:

Compiling GHC

If you want to create an additional build environment with a ghc compiled from source, there is a little twist you need to take care of, since building ghc requires ghc. Here’s how you can do it:

  1. Make sure ~/env/active points to a working ghc compiler, let’s say ~/env/ghc762
  2. Create a new environment, let’s say ~/env/ghc762src, and create the necessary subdirectories (local, dot-cabal and dot-ghc). Do not activate the new sandbox.
  3. Unpack ghc in ~/env/ghc762src/local/src. Follow the usual build instructions, but configure with

    This makes sure that we bake in the right paths, even though ~/local is currently pointing to ~/env/ghc762/local, which is not where we want to install our new compiler.
  4. Once everything has been built, install with

    This puts the files in the right place but all the paths refer to ~/local. (You might think you can run the “make install” without overriding “prefix” by activating the sandbox at this point — that will not work, however; make will try and fail to rebuild stuff.)
  5. At this point you can change ~/env/active to ~/env/ghc762src to start using your new compiler.
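
Going by the description in steps 3 and 4, the two commands are presumably a configure with the prefix set to ~/local and a “make install” with the prefix overridden to the new sandbox. The following sketch only prints the commands so that they can be checked before running them in the ghc source directory; the function name ghc_src_commands is hypothetical.

```shell
# Print (rather than run) the build commands for a from-source sandbox.
# $1 is the absolute home directory, $2 the name of the new sandbox.
ghc_src_commands() {
  local home_dir="$1"
  local sandbox="$2"
  # Bake in the path the compiler will eventually be reached through.
  echo "./configure --prefix=$home_dir/local"
  echo "make"
  # Copy the files into the new sandbox without reconfiguring.
  echo "make install prefix=$home_dir/env/$sandbox/local"
}
```

For the example in the text this would be called as ghc_src_commands /Users/dev ghc762src.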

If you want to hack on ghc you probably don’t want to do the install step at all, and just use ~/local/src/ghc-7.6.2/inplace/bin/ghc-stage2 directly; either use cabal’s --with-ghc option or (more conveniently) create a handful of symlinks (mutatis mutandis):

  • ~/local/bin/ghc → ~/local/src/ghc-7.6.2/inplace/bin/ghc-stage2
  • ~/local/bin/ghc-7.6.2 → ~/local/src/ghc-7.6.2/inplace/bin/ghc-stage2
  • ~/local/bin/ghc-pkg → ~/local/src/ghc-7.6.2/inplace/bin/ghc-pkg
  • ~/local/bin/hsc2hs → ~/local/src/ghc-7.6.2/inplace/bin/hsc2hs
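
The four symlinks above can be created in one go. This is a sketch with a hypothetical function name, link_inplace_ghc, which takes the local directory and the ghc version as arguments.

```shell
# Point the sandbox's bin/ entries at an in-place ghc build.
# $1 is the local directory (for example ~/local), $2 the ghc version.
link_inplace_ghc() {
  local local_dir="$1"
  local version="$2"
  local inplace="$local_dir/src/ghc-$version/inplace/bin"
  mkdir -p "$local_dir/bin"
  ln -sf "$inplace/ghc-stage2" "$local_dir/bin/ghc"
  ln -sf "$inplace/ghc-stage2" "$local_dir/bin/ghc-$version"
  ln -sf "$inplace/ghc-pkg"    "$local_dir/bin/ghc-pkg"
  ln -sf "$inplace/hsc2hs"     "$local_dir/bin/hsc2hs"
}
```

For the example in the text: link_inplace_ghc ~/local 7.6.2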

If you want to hack on ghc now, you can simply go to ~/local/src/ghc-7.6.2, make changes, and as long as you don’t need the stage 0 compiler (the bootstrap compiler) you can just compile without changing your sandbox. For example, if you wanted to make a change to the ghc package itself, run “make 2” in the compiler/ directory (which should be very quick) and then just relink your own package and you’ll have the modified ghc package (as long as you are using the in-place ghc).

Note that if you use the in-place compiler the sandbox is not relocatable (the ghc script will include explicit references to the full path of the sandbox).

Building GTK

(This section is Mac OS X specific.) The GTK libraries themselves are independent of our Haskell environment, of course, but I don’t like polluting my top-level directory with all the directories that the standard installation procedure creates. To avoid too much pollution, you can make the following changes.

  1. Create a directory ~/env/gtk and download gtk-osx-build-setup.sh there. Then modify that script and change SOURCE to $HOME/env/gtk/Source
  2. Create a directory ~/env/gtk/dot-local and a corresponding symlink ~/.local. Add ~/.local/bin to your PATH.
  3. Run the script
  4. Modify ~/.jhbuildrc-custom and add
  5. Run the jhbuild incantations from the GTK build instructions to complete your GTK installation. For Haskell purposes it is also useful to install librsvg

This gives us the GTK libraries, but not yet their Haskell bindings. The Haskell bindings will of course be relative to whatever development environment is currently active. Now inside a jhbuild shell:

At this point you should have the Haskell GTK libraries available, and you should be able to exit the jhshell and install other Hackages that rely on Gtk2Hs (such as threadscope) without the jhshell. (If you do install svgcairo, you’ll want to install graphviz too.)

Known Limitations

The sandboxes are designed to be completely isolated from each other. If you activate sandbox A, you will not be able to use the package DB from another sandbox, even if you specify the right paths. The problem is that the package registrations (use ghc-pkg describe to see them) point to ~/.cabal; so if you activate a different sandbox (i.e., change what ~/.cabal points to) then these paths in the package registrations will be incorrect. You can change this by setting

in your ~/.cabal/config for someSandbox; this will make sure that the package registrations will use absolute paths instead. Of course, this means that the sandbox is no longer relocatable because you are now hardcoding paths (it might be possible to resolve these conflicting goals with the use of $pkgroot, but Cabal support for that is still limited).