Summary
The bazel build system has rules available to build C or C++ code from external repositories that use other more “conventional” build tools. By this I mean the usual suspects: GNU autotools, GNU make and cmake.
This rules repository is called rules_foreign_cc. While it is somewhat documented, and even has examples, both leave a lot to be desired.
I think it would be really helpful to have at least some sort of a guide for how exactly they work, so that we can use them more effectively. Such a guide, however, does not seem to exist. At least, I was not able to find one.
So I decided to write it.
If you think talk is cheap, you can skip right away to the example code. However, I think it would be useful to spend some time also understanding the mechanism by which the rules work.
The worked example is only for the make rule, i.e. for builds that use neither autotools nor cmake. The approach in other rules seems similar, but I haven’t tested all of them, so this text is limited to the use of the make rule. You may be able to extrapolate it so that it is useful for other rules.
What’s this about?
One of the main issues I had with bazel is that, awesome as it is, hardly anyone uses it in the open source world. This means that if you would like to use code from external repositories, it will most likely not be written with bazel compilation in mind.
So, to make such a library available to bazel, you have several options.
- You could build the library externally, then lug the files into bazel in some form, either as an external repository or a //third_party bit of code.
- You could write a BUILD file by hand and compile everything purely in bazel.
- Or, if you think the previous two approaches are brittle or not reproducible enough, you can use the “foreign” C++ build rules to have your library be compiled from source with the tools they are intended to be built with.
The rules in rules_foreign_cc are written to allow bazel to ‘absorb’ C or C++ artifacts from non-bazel code repositories and make them available as bazel targets. However, the documentation is sparse, and the examples do not seem to clarify how the rules work. Recently I spent some time figuring out specifically how the make() rule works in rules_foreign_cc. This text is the result of that code spelunking. However, nothing here assumes any particular compilation toolchain.
The example uses non-hermetic builds. I.e. if your make uses nonstandard tools, you’d need to provide them either as preinstalled binaries on your machine, or using some other means. That makes it perhaps less useful in “production” settings, but I suppose that if you are looking into this for work, you already have another expert handy for your toolchains.
Prerequisites
Here are some things that would be useful to do before you attempt to use rules_foreign_cc.
Try a local compilation first
The rules in rules_foreign_cc are very sensitive to the actual underlying build process that they are invoking. For a make(...) rule invocation to be effective you will need to know exactly how the underlying make process behaves, what flags it takes and their effects. Knowing this, it will be easier for you to figure out what you need to do.
For this reason, I recommend first compiling the program you want to build locally, using the regular build approach, outside of bazel.
Some things to note here are:
- It is very hard to debug compilation errors in a bazel sandbox. This is why you want to make a practice run outside of bazel first. Make sure you can compile the program on its own before attempting to add another layer of complexity.
- Does your program have dependencies? If yes, you will need to provide them to bazel. Refer to the rules documentation on how to do that. Providing all the dependencies may prove to be very onerous, so this step will be very tricky to achieve. Some programs, with deep dependency trees, will be a nightmare to compile in bazel.
- Does your program need to be built in the same directory as the source? Building in the source directory is not a good idea for large programs, so many large programs do not do that. Bazel also expects the build to happen in a separate directory, but many smaller programs build in place. Refer to the rules documentation on how to make the compilation happen in the source directory.
- Does your program have an equivalent of make install? If not, you must provide one, otherwise your attempts to build the program will fail. See below for details.
- Once you compile the program, do make install and note the structure of the install directory very well. You will have to rely on your knowledge of that directory tree to make sure that bazel can find the resulting build artifacts.
- Beware the dynamically linked programs. Such programs may only operate from within the bazel environment, and may not be portable even to the same machine outside of bazel, unless separate packaging steps are done after the fact. Explaining how to do this correctly is out of scope for this article.
- Check where your different file types end up. For example, by default, bazel expects all binary outputs to be in the directory $TOP/bin, where $TOP will be the directory that bazel will tell you to install the build artifacts in. A similar approach is taken for other types of artifacts, such as static libraries, dynamic libraries, headers and so forth. If your project has a standard structure, all of this will work out of the box for you. From this perspective, the configure_make rule is much better if your project supports it, because it almost guarantees this standardized directory structure. However, there still are programs that do not use GNU Autoconf, and just provide various make files. You will need to figure out what to do with such programs. If worse comes to the worst, you can create patch files, and apply them after the program is downloaded. I do not recommend doing this, but I have been known to do it if my back was against a wall and I was out of ideas.
- If your files are in standard locations, or can be coerced into being there, and especially if you have a C or C++ program that you just compiled, bazel will ensure that later uses of the make target provide appropriate sources, headers and libraries to any downstream attempts to use them. This is very nice when it works.
- Pay special attention to files in nonstandard locations. Those might be hard to coerce into being where they are supposed to be. I personally had major issues with C++ include directories that had sub-directories.
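Part of the checklist above can be automated once the stand-alone build has been installed into some prefix directory. What follows is only a minimal sketch: the function name check_layout is made up for this illustration, and the three directory names are the conventional ones (bin, include, lib) discussed in this text.

```shell
#!/bin/sh
# check_layout PREFIX
# For each conventional install directory (bin, include, lib), report
# whether it exists under PREFIX and how many entries it holds.
# The function name is hypothetical, chosen for this example only.
check_layout() {
    prefix="$1"
    for d in bin include lib; do
        if [ -d "$prefix/$d" ]; then
            # ls -A also counts dotfiles; tr strips the padding that
            # some wc implementations add.
            count=$(ls -A "$prefix/$d" | wc -l | tr -d ' ')
            echo "ok: $d ($count entries)"
        else
            echo "missing: $d"
        fi
    done
}
```

You would call it as check_layout /path/to/install/prefix after a stand-alone make install. A "missing" line is a hint that the corresponding output directory will need extra attention in the bazel rule.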
Initializing a bazel repository
Create a new directory that will be initialized as a bazel workspace.
mkdir test
cd test
Bazel needs a WORKSPACE file, so we create one with the contents below. Most of the file is importing the required rules. The dtc git repository is a sample external library that has a Makefile-based build. There is nothing special about this particular library. It was the library I needed to compile recently and it makes for a nice case study.
load("@bazel_tools//tools/build_defs/repo:git.bzl", "new_git_repository")
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "rules_foreign_cc",
    strip_prefix = "rules_foreign_cc-0.9.0",
    url = "https://github.com/bazelbuild/rules_foreign_cc/archive/0.9.0.tar.gz",
    sha256 = "2a4d07cd64b0719b39a7c12218a3e507672b82a97b98c6a89d38565894cf7c51",
)

load("@rules_foreign_cc//foreign_cc:repositories.bzl", "rules_foreign_cc_dependencies")

rules_foreign_cc_dependencies()

# Device Tree Compiler
new_git_repository(
    name = "dtc",
    commit = "ccf1f62d59adc933fb348b866f351824cdd00c73",
    remote = "https://github.com/dgibson/dtc",
    build_file = "//third_party/dtc:dtc.BUILD.bazel",
    shallow_since = "1686217671 +1000",
)
The WORKSPACE file is fairly standard. We add the library rules_foreign_cc using its archive, and check out a library called dtc from GitHub. This library uses a custom make file for compilation, so it is a fairly good vehicle to explain how foreign compilation works in bazel.
Note that we define a custom build file with build_file = "//third_party/dtc:dtc.BUILD.bazel". This file does not exist yet; we will add it in later steps. The build file named in this parameter to new_git_repository will be inserted into the top level directory of the dtc external repository once the repository is downloaded. Its full label will then be @dtc//:BUILD.bazel. In general, bazel will allow you to insert build files into external repositories you download, and will also allow you to apply patch files if you need to make local changes.
We also initialize the top level BUILD.bazel file:
echo > BUILD.bazel
This file is empty, but is used to denote the top of the bazel package hierarchy.
Creating the files for DTC compilation
mkdir -p third_party/dtc
echo > third_party/dtc/BUILD.bazel
Again, the BUILD.bazel file will be used only to create a package at //third_party/dtc, so that we can refer to the file dtc.BUILD.bazel as //third_party/dtc:dtc.BUILD.bazel. If this file was not there, then we’d have to use //:third_party/dtc/dtc.BUILD.bazel, provided that we don’t run into other issues that would possibly prevent us from doing so. Placing BUILD.bazel files in directories to convert them into packages is an easy way to stop worrying about these ambiguities.
We now create the file //third_party/dtc/dtc.BUILD.bazel. This was the file we referred to earlier in the WORKSPACE file. Here are its contents in full (except for comments), and we’ll discuss each snippet in turn.
load("@rules_foreign_cc//foreign_cc:defs.bzl", "make")

filegroup(
    name = "all_files",
    srcs = glob(["**"]),
)

make(
    name = "libdtc",
    targets = [
        "PREFIX=$$INSTALLDIR$$ " +
        "WARNINGS=\"-Wall -Wpointer-arith -Wcast-qual -Wnested-externs " +
        "-Wsign-compare -Wstrict-prototypes -Wmissing-prototypes " +
        "-Wredundant-decls -Wshadow -Wwrite-strings\" " +
        " install",
    ],
    lib_source = ":all_files",
    out_binaries = ["dtc", "fdtdump"],
    visibility = ["//visibility:public"],
)

filegroup(
    name = "dtc",
    srcs = [":libdtc"],
    output_group = "dtc",
    visibility = ["//visibility:public"],
)
The snippet below makes the make rule available from the external repository rules_foreign_cc.
load("@rules_foreign_cc//foreign_cc:defs.bzl", "make")
If you work with bazel, you have probably seen the load statement before.
Filegroups and caveats
Next, we will make all source files available for use in bazel.
filegroup(
    name = "all_files",
    srcs = glob(["**"]),
)
The glob will cause all files in all subdirectories of the external repository to be grouped into the filegroup all_files.
There could be situations in which you would not want to add all source files into a single filegroup target. This could be the case if the source you are trying to build already has files with names relevant to bazel. For example, if the source has a file named BUILD.bazel in a subdirectory, such a file will chop off any sources in its directory and subdirs from the glob(["**"]). This is because of how glob works: it will gather all the files from the current bazel package, where packages are defined as a subtree starting from a directory where a BUILD.bazel file exists, then extending into all the subdirs, until any other BUILD.bazel files are found. As a specific example, the ICU library for international text support has a vestigial bazel build that will hit you this way if you are not careful. Don’t ask how I know.
But this is not one of those situations, so we just dump all the files recursively into there. Since the file dtc.BUILD.bazel will become @dtc//:BUILD.bazel and will be the only BUILD file in the external repository named dtc, all files recursively will be relative to the package @dtc//. So for example, if the project dtc had a file with a path libfdt/fdt.c (relative to its top level directory), then you’d refer to that file in bazel as @dtc//:libfdt/fdt.c. I mention this as you may find it useful in some other situations; however at the moment we don’t need to do anything of the sort.
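To check up front whether a source tree contains nested BUILD files that would cut the glob short, a small shell helper is enough. This is a minimal sketch: the function name find_nested_build_files is made up for this example, and it assumes a find that supports -mindepth (GNU and BSD find both do).

```shell
#!/bin/sh
# find_nested_build_files SRCDIR
# List BUILD and BUILD.bazel files that sit below the top level of
# SRCDIR. Each hit marks a directory that starts its own bazel package,
# so a top-level glob(["**"]) will not descend into it.
# The function name is hypothetical, chosen for this example only.
find_nested_build_files() {
    # -mindepth 2 skips SRCDIR itself and the files directly in it,
    # including the BUILD file we insert at the top level.
    find "$1" -mindepth 2 -type f \( -name BUILD -o -name BUILD.bazel \)
}
```

Run it against a local checkout of the library you intend to build. No output means a single top-level glob(["**"]) should see every file in the tree.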
Finally the make rule
The workhorse of the entire build is the invocation of the make rule.
make(
    name = "libdtc",
    targets = [
        "PREFIX=$$INSTALLDIR$$ " +
        "WARNINGS=\"-Wall -Wpointer-arith -Wcast-qual -Wnested-externs " +
        "-Wsign-compare -Wstrict-prototypes -Wmissing-prototypes " +
        "-Wredundant-decls -Wshadow -Wwrite-strings\" " +
        " install",
    ],
    lib_source = ":all_files",
    out_binaries = ["dtc", "fdtdump"],
    visibility = ["//visibility:public"],
)
The issue that cost me much time to figure out is that the make rule works off of a few undocumented conventions. Not knowing them will make it very hard for you to use the make rule effectively.
The make rule will provide the build with a bazel-created directory that is expected to contain the results of the build once the build completes. You can refer to it as INSTALLDIR in the targets list, as was done above. You need to put it between two sets of two dollar signs so that the value is properly substituted by bazel.
This is not the directory where the make rule will build this external repository. This is the directory where the rule expects to find the built files at the very end of the build, and from where it will hand them off to bazel. The build directory is completely separate, which means that your make process must do an equivalent of make install into INSTALLDIR.
In the example above, the dtc library uses the make variable PREFIX to receive the installation directory name. We use this knowledge to pass the install directory generated for us by bazel into the make invocation by specifying:
targets = [
    "PREFIX=$$INSTALLDIR$$ "
    # ...other stuff here
]
Here, the single entry in targets is one long string of text that has been split up into lines for readability only. Each entry in the targets list is the argument list for one make invocation. A hermetic build of GNU make will be used for building; this is hard-coded in rules_foreign_cc. This means that you do not need to have GNU Make already installed on your system. Instead, bazel will download its own copy and use that one to build.
The targets specification above translates roughly to a make invocation that looks something like this:
make \
    PREFIX=/some/directory/that/bazel/creates \
    WARNINGS="<that long string of options>" \
    install
The intention of this command line is to install the build artifacts into that PREFIX directory. (The WARNINGS variable was used to remove a compiler option that my compiler didn’t seem to support for some reason.)
If you don’t know what the structure of the directory is after make install, I recommend that you try building your library outside of bazel by hand and note the contents of the installation directory. You can assume that the contents and the structure of the installation directory will be the same as in the stand-alone installation example.
Here’s a session transcript of such a stand-alone build of the dtc library. We build into the subdirectory dtc/foo:
$ cd $HOME
$ git clone git@github.com:dgibson/dtc # creates subdir $HOME/dtc
$ cd dtc
$ make PREFIX=foo install # installs into $HOME/dtc/foo
# (wait wait wait)
You may need to install a few prerequisites if you don’t have them on your system already, such as bison.
Now, you can check out the contents of the installation directory:
$ cd foo
$ tree
.
├── bin
│   ├── convert-dtsv0
│   ├── dtc
│   ├── dtdiff
│   ├── fdtdump
│   ├── fdtget
│   ├── fdtoverlay
│   └── fdtput
├── include
│   ├── fdt.h
│   ├── libfdt.h
│   └── libfdt_env.h
└── lib
    ├── libfdt-1.7.0.so
    ├── libfdt.a
    ├── libfdt.so -> libfdt.so.1
    └── libfdt.so.1 -> libfdt-1.7.0.so
The bin/ directory contains the built binaries, the include/ directory contains the include files, and lib/ contains both the shared and the static libraries.
The following parameter just lists all files that are part of the build.
lib_source = ":all_files",
It is important to tell bazel rules where all the input files are, because otherwise they will not be visible to the build rule in the build sandbox that bazel creates.
Excavating build artifacts from the install directory
This was the most under-documented part of this process.
out_binaries = [ "dtc", "fdtdump", ],
The out_binaries parameter specifies the names of the binaries of interest. But, from the available documentation, it was unclear how bazel knows exactly where these files are, since they could in principle be arbitrarily nested within the output directory.
The answer is in a bit of (undocumented) convention. As we’ve seen before, bazel will look for all build artifacts from the make rule in the directory INSTALLDIR. And we’ve seen how that directory’s file tree looks in the specific case of the dtc library.
It turns out that the parameter out_bin_dir tells bazel which directory to look in for the binaries, and its default value is bin. This means that when you have
out_binaries = [ "dtc", "fdtdump", ],
then bazel will look for these files under the paths:
$INSTALLDIR/<out_bin_dir>/dtc
$INSTALLDIR/<out_bin_dir>/fdtdump
If your build process puts binaries into some different directory, you may need to figure out the correct value of out_bin_dir such that bazel would know where your built binaries are. Or, if you have an option, you could tweak the Makefile contents of your external repository (either upstream, or via a patch) to match the convention used in rules_foreign_cc.
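The lookup convention can be rehearsed against a stand-alone install tree before involving bazel at all. The sketch below is only an illustration: the function name expected_binaries is made up, and it merely mimics the path convention $INSTALLDIR/<out_bin_dir>/<name> described above. It does not call into rules_foreign_cc.

```shell
#!/bin/sh
# expected_binaries PREFIX OUT_BIN_DIR NAME...
# Print the path PREFIX/OUT_BIN_DIR/NAME for every NAME and return a
# non-zero status if any of them is not an executable regular file.
# The function name is hypothetical; PREFIX stands in for INSTALLDIR.
expected_binaries() {
    prefix="$1"
    bin_dir="$2"
    shift 2
    status=0
    for name in "$@"; do
        path="$prefix/$bin_dir/$name"
        if [ -f "$path" ] && [ -x "$path" ]; then
            echo "found: $path"
        else
            echo "not found: $path"
            status=1
        fi
    done
    return "$status"
}
```

For the stand-alone dtc build from the transcript above, the call would be expected_binaries "$HOME/dtc/foo" bin dtc fdtdump. If a binary is reported as not found, either out_bin_dir or the name given in out_binaries would need adjusting.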
Making the binaries available to bazel rules
Now that you have built your binaries, how do you refer to them in a bazel build? Presumably you want to mention them in a genrule, or a custom build rule. You need a label that refers to each binary.
Here is how it is done for the dtc binary.
filegroup(
    name = "dtc",
    srcs = [":libdtc"],
    output_group = "dtc",
    visibility = ["//visibility:public"],
)
Here the undocumented part is the output_group bit. Each binary output produced by the make rule invocation is placed into a separate output group, named after that binary. You can use that convention to excavate the dtc binary from the build artifacts. The dtc binary is now available to the rest of the bazel build as @dtc//:dtc.
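As an illustration of using that label, here is an untested sketch of a genrule that runs the freshly built dtc to compile a device tree source file. The package directory demo, the file name board.dts and the target name board_dtb are all made up for this example. The shell snippet only writes the BUILD file, in the same spirit as the echo commands earlier.

```shell
#!/bin/sh
# Create a hypothetical package //demo that uses @dtc//:dtc as a tool.
mkdir -p demo
cat > demo/BUILD.bazel <<'EOF'
# Compile board.dts (a placeholder name) into a device tree blob.
genrule(
    name = "board_dtb",
    srcs = ["board.dts"],
    outs = ["board.dtb"],
    # $(location ...) expands to the path of the single file in the
    # @dtc//:dtc filegroup; $< is the only source, $@ the only output.
    cmd = "$(location @dtc//:dtc) -I dts -O dtb -o $@ $<",
    tools = ["@dtc//:dtc"],
)
EOF
```

Listing the binary under tools, and not under srcs, tells bazel that it is something to run during the build, so it gets built for the machine executing the action.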
You can repeat this exercise to expose any other binaries that you need. You do not need to expose all of them if you don’t need all of them.
Making the include files and libraries available to bazel rules
We’ve seen how to excavate the binaries from bazel rules. How do we do the same for include files and libraries?
It turns out, you do not need to do that. The make rule invocation will generate a C++ provider which will already carry all the include files and the libraries. This means you only need to do:
cc_binary(
    name = "your_binary",
    srcs = [...],
    deps = [
        "@dtc//:libdtc",
    ],
)
and bazel will know what to do from there. That means the downstream users of the target @dtc//:libdtc will have the appropriate compiler flags (e.g. -L and -I) set correctly by bazel.
The unstated part here is that the includes and the libraries need to be findable in INSTALLDIR. This is achieved similarly to how it is done for binaries.
Except, in this case, the make parameters out_include_dir (default value include) and out_lib_dir (default value lib), together with out_static_libs and out_shared_libs, govern where bazel looks.
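Here is a sketch of how those parameters might be spelled out, assuming the library you want downstream targets to link is the libfdt.a seen in the install tree earlier. The file name dtc_with_libs.BUILD.bazel is made up for this example, the WARNINGS override is left out for brevity, and the snippet is untested. The shell part only writes the file.

```shell
#!/bin/sh
# Write a hypothetical variant of the build file that also names the
# static library found at lib/libfdt.a in the install tree.
mkdir -p third_party/dtc
cat > third_party/dtc/dtc_with_libs.BUILD.bazel <<'EOF'
load("@rules_foreign_cc//foreign_cc:defs.bzl", "make")

filegroup(
    name = "all_files",
    srcs = glob(["**"]),
)

make(
    name = "libdtc",
    lib_source = ":all_files",
    targets = ["PREFIX=$$INSTALLDIR$$ install"],
    out_binaries = ["dtc", "fdtdump"],
    # Looked up as $INSTALLDIR/<out_lib_dir>/libfdt.a.
    out_static_libs = ["libfdt.a"],
    # These two are the default values, spelled out for clarity.
    out_include_dir = "include",
    out_lib_dir = "lib",
    visibility = ["//visibility:public"],
)
EOF
```

To try it, you would point build_file in the WORKSPACE at this variant. A shared library would be named the same way through out_shared_libs.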
Conclusion
I hope that this text and the example explain how you can use the make build rule in bazel yourself.
The bazel examples didn’t quite help me understand the mechanisms at play here, so I wrote this explanation up in hope it is useful to someone in the future.
The above rules will use the default C++ toolchain to compile. This means that if you use a non-hermetic C++ toolchain, your system’s C++ compiler will be used; if you use a hermetic toolchain, that toolchain will be used.
While it may be of interest to explore how you can build with exactly the right toolchain, that explanation is out of scope of this post.
References
- https://bazelbuild.github.io/rules_foreign_cc/main/make.html
- https://bloggerbust.ca/post/adding-a-dependency-based-on-autotools-to-a-bazel-project/
- https://groups.google.com/g/bazel-discuss/c/q3hMXvv2zh8
- https://stackoverflow.com/questions/75248131/why-does-bazels-rules-foreign-cc-make-not-find-create-the-artifacts
- https://www.google.com/search?q=how+the+make()+build+rule+works+in+rules_foreign_cc