This is another piece of news in my quest for hermetic, ephemeral, and reproducible builds (“HER”). If you have read my articles over the past few months, you may have noticed that I am looking for good ways of creating builds that are completely hermetic, ephemeral, and reproducible, yet also practical.
In the past, I offered two approaches to HER builds that I believe take a stab at providing HER properties, while also remaining practical for everyday build engineers. I’ll discuss them below, as well as a proposal for improving them.
Before that, let’s reiterate some definitions first, so we know what problem we’re trying to solve.
Definitions
This is adapted from an earlier article for convenience.
Hermetic
A build is said to be hermetic, if each build step only has access to the set of input artifacts that are minimally required for the build to succeed.
- Hermeticity is useful to ensure build correctness – as it allows us to confirm that each build dependency step is well captured, and that none are omitted.
- Hermeticity also makes a build parallelizable, because the minimum set of build dependencies is known and can be farmed out to multiple build servers if needed.
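As a rough illustration of the hermetic property, here is a minimal Python sketch, not how Bazel or any other tool implements sandboxing. It runs a step in a scratch directory that contains copies of the declared inputs only, so a step that reads an undeclared file by relative path fails. The names `run_step` and `demo` are hypothetical, and a real sandbox would also have to block absolute paths, the network, and ambient tools.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_step(declared_inputs, argv):
    """Run argv in a scratch directory holding copies of the declared inputs only."""
    with tempfile.TemporaryDirectory() as scratch:
        for src in declared_inputs:
            shutil.copy(src, Path(scratch) / Path(src).name)
        return subprocess.run(argv, cwd=scratch, capture_output=True, text=True)


def demo():
    """Return exit codes for an under-declared and a fully declared step."""
    # The step reads two files by relative path.
    step = [sys.executable, "-c", "open('a.txt').read(); open('b.txt').read()"]
    with tempfile.TemporaryDirectory() as src:
        a, b = Path(src) / "a.txt", Path(src) / "b.txt"
        a.write_text("alpha")
        b.write_text("beta")
        missing = run_step([a], step).returncode      # b.txt was not declared
        complete = run_step([a, b], step).returncode  # both inputs declared
    return missing, complete
```

The under-declared run fails because `b.txt` never makes it into the scratch directory, which is the kind of omitted dependency that hermeticity is meant to surface.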
Ephemeral
A build is said to be ephemeral, if the build relies only on the tooling that the build process installs or builds itself.
- No pre-installed build-specific system state is necessary. When a build is invoked, it provides all the resources it needs.
Reproducible
A build is said to be reproducible, if two identical instantiations A and B of the same build process P produce bit-identical artifacts.
- For example, build systems that rely on pre-installed tools may end up not being reproducible. This could be due to inadequate version granularity of build dependencies, coupled with non-ephemeral build installations.
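One way to check the bit-identical requirement is to hash the output trees of two build runs and compare the digests. The sketch below assumes each run wrote its artifacts into its own directory. The helper names `tree_digest`, `is_reproducible`, and `demo` are hypothetical, and the check ignores file modes and timestamps, which a stricter comparison might include.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root):
    """Return a SHA-256 digest over the relative paths and contents under root."""
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix().encode()
        data = path.read_bytes()
        # Length-prefix each field so that file boundaries are unambiguous.
        digest.update(len(rel).to_bytes(8, "big") + rel)
        digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()


def is_reproducible(out_a, out_b):
    """Two output trees count as reproducible if their digests match."""
    return tree_digest(out_a) == tree_digest(out_b)


def demo():
    """Compare two identical output trees, then flip one byte and compare again."""
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        for out in (a, b):
            (Path(out) / "bin").mkdir()
            (Path(out) / "bin" / "app").write_bytes(b"\x7fELF-pretend-binary")
        same = is_reproducible(a, b)
        (Path(b) / "bin" / "app").write_bytes(b"\x7fELF-pretend-binarz")
        different = is_reproducible(a, b)
    return same, different
```

Any source of nondeterminism in the build, such as an embedded timestamp or a tool whose version drifted, shows up here as a digest mismatch.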
Practical
A build is said to be practical if its setup does not require extreme adaptation of the build process to satisfy any of the previous properties.
- A build is not practical if it, for example, requires bringing in dependencies in a bespoke fashion one by one, to a not well defined depth.
Prior art
This is a short inventory of prior art intended to contrast conventional approaches to building software with HER-oriented build processes. I am rushing through the history of build systems to get to
Make builds
I use the term “make builds” for all “conventional” build approaches. This includes the original make, its cousins and descendants, today most typically GNU Make, and similar utilities, such as cmake, ninja, gn, scons and others, including the clever hacks do, redo and similar.
All the above focus on building software artifacts (most immediately C and C++ programs), assuming an ambient development environment which is provided outside of the build process.
Container builds
These are builds which use the hermeticity of Docker and similar confined environments.
Self-sufficient builds
Self-sufficient builds are builds with partial or full provision of development environment, such as one can achieve using bazel, buck{,2}, pants and similar.
These builds do a partial to full setup of the build environment. However, full HER builds are not a given: usually an involved, bespoke setup is required to bring in all the needed dependencies in a HER manner.
The role of Bazel
Bazel is my tool of choice for achieving HER builds. I am sure that with the right amount of care anything I do here can be achieved with any of the tools mentioned above. However, I chose Bazel for somewhat subjective reasons, which I believe still to be good:
- I know Bazel somewhat well, and have experience with it from my daily work.
- I believe that Bazel provides good tooling for multi-language, self-sufficient builds out of the box.
- Bazel works uniformly well for builds which contain many non-uniform parts. If you need to generate files, if you need multi-language builds, all the tooling is there.
- Bazel has proven very useful for my work on programmable hardware. Working on programmable hardware brings in an entire zoo of tools, all of which are wrangled in exactly the same way by bazel. A single command line causes a minimal rebuild of all software and programmable hardware components, in the correct ordering, from gate layer to application layer. Doing this in any other build system that I’m aware of would require an immense amount of work.
My work on HER builds
As hinted in the intro, I provided two approaches that, to an extent, deliver the ideals espoused here. These are, ordered from older to newer:
- Build-in-docker (“BID”): this approach allows running a bazel build action inside a predefined Docker container. This allows bringing in any binaries into the build by adding a Docker container.
- Nix+Bazel (“N+B”): this approach uses an ephemeral installation of Nix to set up a hermetic build environment.
Both approaches work, and I use them today for various parts of my workflow. I had a few years of running both of them side by side, depending on the project in question and the sophistication of the setup required. However, during that time, I also noticed some issues which threaten the longevity of both approaches.
BID
Pros
- Easy to prepare binaries. You can build your favorite Docker container, or take any of the predefined ones from various docker registries.
- Once done, the container can be reused many times.
- Practical, once set up.
Cons
- Build rules are complicated and can be brittle in face of updates and changes.
- Not everyone likes to run Docker.
- Not everyone is allowed to run Docker.
- Building Docker containers, while hermetic and ephemeral, is not really reproducible. If you build a container, you had better hold on to it if you want it not to change.
- Cannot easily support gigantic containers. Programmable hardware dev packages clock in at hundreds of gigabytes. Such packages are not servable from regular registries.
- The approach has started to struggle against seemingly new limitations on Docker environments on GitHub.
N+B
Pros
- Fully hermetic.
- The largest repository of free software on the planet.
- Practical once set up.
Cons
- Nix is an ecosystem unto itself, with its own intricacies and complexities. Bringing Nix into a Bazel environment, which is complex on its own, results in a very complex setup.
- Nix is great when everything works. Good luck if there is a typo somewhere in your setup.
- Ephemeral nix has proven to be somewhat flaky. The nix-portable approach I use sometimes causes the entire build process to block forever. I traced this to bazel spawning its daemon in a proot environment, which then sometimes causes the build process not to terminate.
- Some software that I want to use is not available in Nix repos, and is more or less impossible to bring in.
- Requires running bazel in a chrooted environment, which sometimes fights with bazel’s native sandboxing.
Enter Bazel Modules
Despite the shortcomings listed above, I was quite pleased with the dev setups I achieved using the above two techniques. I currently use a combination of both as appropriate. I enjoyed the setup I had, until it became obvious that a migration from Bazel workspaces to Bazel modules was imminent. At the time of writing, Bazel 9, which is scheduled to be released in late 2025, will completely disable the use of workspaces.
This means that anyone who wishes to keep up with the latest features of Bazel (which should be most of us) will want to migrate their setups to Modules.
The upside of Modules is the promise of greatly simplified dependency management. The downside of Modules is the required “boil the ocean” migration, and the introduction of the Bazel Central Registry (“BCR”). In theory, this is the place where you would be able to get the latest and greatest of all the dependencies you need.
In practice, however, someone has to build this registry out, and that someone is you (among others). With a sufficiently niche interest, chances are that BCR will not have what you need. In the best of worlds, if your work has a unique value proposition, chances are that you will need something that nobody else has needed before. At that point, it will be up to you to create it. Unfortunately, due to how open source software builds on itself, you might find yourself needing to unravel your dependencies all the way down to the most basic of modules. And most likely this means creating them yourself.
Congratulations, you are now a software repository manager.
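To make the unraveling concrete, here is a small Python sketch that walks a dependency graph and collects every transitive dependency that a registry does not provide. The graph, the module names, and the registry contents are all made up for illustration; this does not model how Bazel or the BCR actually resolve modules.

```python
def missing_modules(root, deps, registry):
    """Return the transitive dependencies of root that the registry lacks."""
    missing, seen, stack = set(), set(), [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name != root and name not in registry:
            missing.add(name)
        # Keep descending: a missing module's own dependencies may be missing too.
        stack.extend(deps.get(name, ()))
    return missing


# Hypothetical dependency graph and registry contents, for illustration only.
DEPS = {
    "my_project": ["fpga_toolkit", "common_lib"],
    "fpga_toolkit": ["vendor_sim", "common_lib"],
    "vendor_sim": ["low_level_io"],
}
REGISTRY = {"common_lib"}
```

With this made-up data, `missing_modules("my_project", DEPS, REGISTRY)` reports `fpga_toolkit`, `vendor_sim`, and `low_level_io`: one niche direct dependency drags in two more modules that someone would have to package.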
As luck would have it, both of my approaches, BID and N+B, are somewhat annoying to port, so much so that instead of making that change, I started thinking about how to avoid it altogether.
Which led me to the natural next step… stay tuned.