How to use the make rule from rules_foreign_cc repository for bazel

2023/06/13

Summary

The bazel build system has rules available to build C or C++ code from external repositories that use other more “conventional” build tools. By this I mean the usual suspects: GNU autotools, GNU make and cmake.

This rules repository is called rules_foreign_cc. While it is somewhat documented, and even has examples, both leave a lot to be desired.

I think it would be really helpful to have at least some sort of a guide for how exactly they work, so that we can use them more effectively. Such a guide, however, does not seem to exist. At least, I was not able to find one.

So I decided to write it.

If you think talk is cheap, you can skip right away to the example code. However, I think it would be useful to spend some time also understanding the mechanism by which the rules work.

The worked example covers only the make rule, i.e. builds that use neither autotools nor cmake. The approach in the other rules seems similar, but I haven’t tested all of them, so this text is limited to the make rule. You may be able to extrapolate from it to the other rules.

What’s this about?

One of the main issues I had with bazel is that, awesome as it is, hardly anyone uses it in the open source world. This means that if you would like to use code from external repositories, it will most likely not have been written with bazel compilation in mind.

This means, to make the library available to bazel, you have one of several options.

The rule set rules_foreign_cc is written to allow bazel to ‘absorb’ C or C++ artifacts from non-bazel code repositories and make them available as bazel targets. However, the documentation is sparse, and the examples do not seem to clarify how the rules work. Recently I spent some time figuring out specifically how the make() rule works in rules_foreign_cc. This text is the result of that code spelunking. Nothing here assumes any particular compilation toolchain.

The example uses non-hermetic builds. I.e. if your make uses nonstandard tools, you’d need to provide them either as preinstalled binaries on your machine, or using some other means. That makes it perhaps less useful in “production” settings, but I suppose that if you are looking into this for work, you already have another expert handy for your toolchains.

Prerequisites

Here are some things that would be useful to do before you attempt to use rules_foreign_cc.

Try a local compilation first

The rules in rules_foreign_cc are very sensitive to the actual underlying build process that they are invoking. For a make(...) rule invocation to be effective you will need to know exactly how the underlying make process behaves, what flags it takes and their effects. Knowing this, it will be easier for you to figure out what you need to do.

For this reason, I recommend trying to compile the program you wanted to compile locally, using the regular build approaches, outside of bazel.

Some things to note here are:

Initializing a bazel repository

Create a new directory that will be initialized as a bazel workspace.

mkdir test
cd test

Bazel needs a WORKSPACE file, so we create one with the contents below. Most of the file imports the required rules. The dtc git repository is a sample external library that has a Makefile-based build. There is nothing special about this particular library. It was the library I needed to compile recently and it makes for a nice case study.

load("@bazel_tools//tools/build_defs/repo:git.bzl", "new_git_repository")
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "rules_foreign_cc",
    strip_prefix = "rules_foreign_cc-0.9.0",
    url = "https://github.com/bazelbuild/rules_foreign_cc/archive/0.9.0.tar.gz",
    sha256 = "2a4d07cd64b0719b39a7c12218a3e507672b82a97b98c6a89d38565894cf7c51",
)
load("@rules_foreign_cc//foreign_cc:repositories.bzl", "rules_foreign_cc_dependencies")
rules_foreign_cc_dependencies()

# Device Tree Compiler
new_git_repository(
    name = "dtc",
    commit = "ccf1f62d59adc933fb348b866f351824cdd00c73",
    remote = "https://github.com/dgibson/dtc",
    build_file = "//third_party/dtc:dtc.BUILD.bazel",
    shallow_since = "1686217671 +1000",
)

The WORKSPACE file is fairly standard. We add the library rules_foreign_cc using its archive, and check out a library called dtc from GitHub. This library uses a custom Makefile for compilation, so it is a fairly good vehicle to explain how foreign compilation works in bazel.

Note that we define a custom build file with build_file = "//third_party/dtc:dtc.BUILD.bazel". This file does not exist yet; we will add it in later steps. The build file named in this parameter to new_git_repository will be inserted into the top level directory of the dtc external repository once the repository is downloaded. Its full label will then be @dtc//:BUILD.bazel. In general, bazel will allow you to insert build files into external repositories you download, and will also allow you to apply patch files if you need to make local changes.
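As a rough sketch of the patching option mentioned above: new_git_repository accepts patches and patch_args attributes. The patch file name below is hypothetical. It is not part of this example, and you would have to create it yourself.

```starlark
# WORKSPACE fragment: the same dtc repository, plus a local patch.
load("@bazel_tools//tools/build_defs/repo:git.bzl", "new_git_repository")

new_git_repository(
    name = "dtc",
    commit = "ccf1f62d59adc933fb348b866f351824cdd00c73",
    remote = "https://github.com/dgibson/dtc",
    build_file = "//third_party/dtc:dtc.BUILD.bazel",
    shallow_since = "1686217671 +1000",
    # Hypothetical patch file, assumed to live in the //third_party/dtc package.
    patches = ["//third_party/dtc:local-fix.patch"],
    # Assumes the patch was produced with `git diff`, hence the -p1 strip level.
    patch_args = ["-p1"],
)
```

The worked example in this text does not need a patch, so this is only relevant if you have to change the upstream Makefile or sources.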

We also initialize the top level BUILD.bazel file:

echo > BUILD.bazel

This file is empty, but is used to denote the top of the bazel package hierarchy.

Creating the files for DTC compilation

mkdir -p third_party/dtc
echo > third_party/dtc/BUILD.bazel

Again, the BUILD.bazel will be used only to create a package at //third_party/dtc, so that we can refer to the file dtc.BUILD.bazel as //third_party/dtc:dtc.BUILD.bazel. If this file were not there, we’d have to use //:third_party/dtc/dtc.BUILD.bazel, provided that we don’t run into other issues that would prevent us from doing so. Placing BUILD.bazel files in directories to convert them into packages is an easy way to stop worrying about these ambiguities.

We now create the file //third_party/dtc:dtc.BUILD.bazel. This is the file we referred to earlier in the WORKSPACE file. Here are its contents in full (except for comments), and we’ll discuss each snippet in turn.

load("@rules_foreign_cc//foreign_cc:defs.bzl", "make")
filegroup(
    name = "all_files",
    srcs = glob(["**"]),
)
make(
  name = "libdtc",
  targets = [
    "PREFIX=$$INSTALLDIR$$ " +
    "WARNINGS=\"-Wall -Wpointer-arith -Wcast-qual -Wnested-externs " +
    "-Wsign-compare -Wstrict-prototypes -Wmissing-prototypes " +
    "-Wredundant-decls -Wshadow -Wwrite-strings\" " +
    " install",
  ],
  lib_source = ":all_files",
  out_binaries = [ "dtc", "fdtdump", ],
  visibility = ["//visibility:public"],
)
filegroup(
    name = "dtc",
    srcs = [":libdtc" ],
    output_group = "dtc",
    visibility = ["//visibility:public"],
)

The snippet below makes the make rule available from the external repository rules_foreign_cc.

load("@rules_foreign_cc//foreign_cc:defs.bzl", "make")

If you work with bazel, you have probably seen the load statement before.

Filegroups and caveats

Next, we will make all source files available for use in bazel.

filegroup(
    name = "all_files",
    srcs = glob(["**"]),
)

The glob will cause all files in all subdirectories of the external repository to be grouped into the filegroup all_files.

There could be situations in which you would not want to add all source files into a single filegroup target. For example, the source you are trying to build may already contain files with names that are meaningful to bazel. If the source has a file named BUILD.bazel in some subdirectory, that file will chop off all sources in its directory and subdirectories from the glob(["**"]). This is because of how glob works: it gathers files only from the current bazel package, where a package is the subtree that starts at a directory containing a BUILD.bazel file and extends into all subdirectories until another BUILD.bazel file is found. As a specific example, the ICU library for international text support has a vestigial bazel build that will hit you this way if you are not careful. Don’t ask how I know.

But this is not one of those situations, so we just dump all the files recursively in there. Since the file dtc.BUILD.bazel will become @dtc//:BUILD.bazel and will be the only BUILD file in the external repository named dtc, all files, recursively, will be relative to the package @dtc//. So for example, if the project dtc had a file with the path libfdt/fdt.c (relative to its top level directory), you’d refer to that file in bazel as @dtc//:libfdt/fdt.c. I mention this because you may find it useful in other situations; at the moment we don’t need to do anything of the sort.

Finally the make rule

The workhorse of the entire build is the invocation of the make rule.

make(
  name = "libdtc",
  targets = [
    "PREFIX=$$INSTALLDIR$$ " +
    "WARNINGS=\"-Wall -Wpointer-arith -Wcast-qual -Wnested-externs " +
    "-Wsign-compare -Wstrict-prototypes -Wmissing-prototypes " +
    "-Wredundant-decls -Wshadow -Wwrite-strings\" " +
    " install",
  ],
  lib_source = ":all_files",
  out_binaries = [ "dtc", "fdtdump", ],
  visibility = ["//visibility:public"],
)

The issue that cost me much time to figure out is that the make rule works off of a few undocumented conventions. Not knowing them (though they are undocumented!) will make it very hard for you to use the make rule effectively.

The make rule will provide the build with a bazel-created directory that is expected to contain the results of the build once the build completes. You can refer to it as INSTALLDIR in the targets list, as was done above. You need to put it between two sets of two dollar signs so that the value is properly substituted by bazel.

This is not the directory where the make rule will build this external repository. This is the directory where the rule will copy the files it built at the very end of the build, and from where it will hand them off to bazel. The build directory is completely separate, which means that your make process must do an equivalent of make install into INSTALLDIR.

In the example above, the dtc Makefile uses the make variable PREFIX to set the installation directory. We use this knowledge to pass the directory that bazel generates for us into the make invocation by specifying:

targets = [
    "PREFIX=$$INSTALLDIR$$ "
    # ...other stuff here
]

Here, targets is a list with a single entry: one long string that has been split across lines for readability only. Each entry in the targets list is the argument list for one make invocation. A hermetic build of GNU make is used for building; this is hard-coded in rules_foreign_cc. This means that you do not need to have GNU Make already installed on your system. Instead, bazel will download its own copy and use that one to build.
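To illustrate the one-invocation-per-entry behavior, here is a sketch with two entries in targets. It assumes that the Makefile has a default target that builds everything, and it leaves out the WARNINGS override for brevity. The target name libdtc_two_steps is made up for this illustration.

```starlark
load("@rules_foreign_cc//foreign_cc:defs.bzl", "make")

make(
    name = "libdtc_two_steps",  # hypothetical name, for illustration only
    lib_source = ":all_files",
    targets = [
        # First invocation: no explicit target, so make builds its default target.
        "PREFIX=$$INSTALLDIR$$",
        # Second invocation: installs the results into the bazel-provided directory.
        "PREFIX=$$INSTALLDIR$$ install",
    ],
    out_binaries = ["dtc", "fdtdump"],
)
```

For dtc a single install entry is enough, since install already builds everything it needs. Splitting is mostly useful for Makefiles where the build and install steps take different arguments.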

The targets specification above translates roughly to a make invocation that looks something like this:

make \
  PREFIX=/some/directory/that/bazel/creates \
  WARNINGS="<that long string of options>" \
  install

The intention of this command line is to install the build artifacts into that PREFIX directory. (The WARNINGS variable was overridden to remove a compiler option that my compiler didn’t seem to support for some reason.)

If you don’t know what the structure of the directory is after make install, I recommend that you try building your library outside of bazel by hand and note the contents of the installation directory. You can assume that the contents and the structure of the installation directory will be the same as in the stand-alone installation example.

Here’s a session transcript of such a stand-alone build of the dtc library. We install into the subdirectory dtc/foo:

$ cd $HOME
$ git clone git@github.com:dgibson/dtc # creates subdir $HOME/dtc
$ cd dtc
$ make PREFIX=foo install # installs into $HOME/dtc/foo
# (wait wait wait)

You may need to install a few prerequisites if you don’t have them on your system already, such as bison.

Now, you can check out the contents of the installation directory:

$ cd foo 
$ tree
.
├── bin
│   ├── convert-dtsv0
│   ├── dtc
│   ├── dtdiff
│   ├── fdtdump
│   ├── fdtget
│   ├── fdtoverlay
│   └── fdtput
├── include
│   ├── fdt.h
│   ├── libfdt.h
│   └── libfdt_env.h
└── lib
    ├── libfdt-1.7.0.so
    ├── libfdt.a
    ├── libfdt.so -> libfdt.so.1
    └── libfdt.so.1 -> libfdt-1.7.0.so

The bin/ directory contains the built binaries, the include/ directory contains the include files, and the lib/ directory contains the shared and the static libraries.

The following parameter just lists all files that are part of the build.

lib_source = ":all_files",

It is important to tell bazel rules where all the input files are, because otherwise they will not be visible to the build rule in the build sandbox that bazel creates.

Excavating build artifacts from the install directory

This was the most under-documented part of this process.

out_binaries = [ "dtc", "fdtdump", ],

The out_binaries parameter specifies the names of the binaries of interest. But, from the available documentation, it was unclear how bazel knows exactly where these files are, since they could in principle be arbitrarily nested within the output directory.

The answer is in a bit of (undocumented) convention. As we’ve seen before, bazel will look for all build artifacts from the make rule in the directory INSTALLDIR. And we’ve seen how that directory’s file tree looks in the specific case of the dtc library.

It turns out that the parameter out_bin_dir tells bazel in which directory to look for the binaries, and its default value is bin. This means that when you have

out_binaries = [ "dtc", "fdtdump", ],

then bazel will look for these files under the paths:

$INSTALLDIR/<out_bin_dir>/dtc
$INSTALLDIR/<out_bin_dir>/fdtdump

If your build process puts binaries into some different directory, you may need to figure out the correct value of out_bin_dir such that bazel would know where your built binaries are. Or, if you have an option, you could tweak the Makefile contents of your external repository (either upstream, or via a patch) to match the convention used in rules_foreign_cc.
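As a sketch of the first option, suppose a Makefile installs its binaries into $PREFIX/sbin instead of $PREFIX/bin. Everything here (the target name, the binary name and the sbin layout) is hypothetical; only the out_bin_dir and out_binaries parameters come from the make rule itself.

```starlark
load("@rules_foreign_cc//foreign_cc:defs.bzl", "make")

make(
    name = "some_tool_build",  # hypothetical target
    lib_source = ":all_files",
    targets = ["PREFIX=$$INSTALLDIR$$ install"],
    # Hypothetical layout: binaries end up in $INSTALLDIR/sbin.
    out_bin_dir = "sbin",
    # bazel will then look for $INSTALLDIR/sbin/some_tool.
    out_binaries = ["some_tool"],
)
```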

Making the binaries available to bazel rules

Now that you have built your binaries, how do you refer to them in a bazel build? Presumably you want to mention them in a genrule, or a custom build rule. You need a label that refers to each binary.

Here is how it is done for the dtc binary.

filegroup(
    name = "dtc",
    srcs = [":libdtc" ],
    output_group = "dtc",
    visibility = ["//visibility:public"],
)

Here the undocumented part is the output_group bit. Each binary output produced by the make rule invocation is placed into a separate output group, named after that binary. You can use that convention to excavate the dtc binary from the build artifacts. The dtc binary is now available to the rest of the bazel build as @dtc//:dtc.
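For instance, a genrule in your own workspace could use the label as a tool. This is a minimal sketch: the target name and the file board.dts are hypothetical, and the dtc flags shown are the usual ones for compiling a device tree source into a blob.

```starlark
# BUILD.bazel in your own workspace, not in the dtc repository.
genrule(
    name = "board_dtb",    # hypothetical target name
    srcs = ["board.dts"],  # hypothetical device tree source file
    outs = ["board.dtb"],
    # $(location ...) expands to the path of the single file in the filegroup.
    cmd = "$(location @dtc//:dtc) -I dts -O dtb -o $@ $<",
    tools = ["@dtc//:dtc"],
)
```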

You can repeat this exercise to expose any other binaries that you need. You do not need to expose all of them if you don’t need all of them.
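Following the same convention, exposing fdtdump would look roughly like this. The sketch relies on fdtdump being listed in out_binaries, as it is in the make invocation above.

```starlark
# Goes into third_party/dtc/dtc.BUILD.bazel, next to the dtc filegroup.
filegroup(
    name = "fdtdump",
    srcs = [":libdtc"],
    # One output group per entry in out_binaries, named after the binary.
    output_group = "fdtdump",
    visibility = ["//visibility:public"],
)
```

The binary would then be available as @dtc//:fdtdump.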

Making the include files and libraries available to bazel rules

We’ve seen how to excavate the binaries from bazel rules. How do we do the same for include files and libraries?

It turns out that you do not need to. The make rule invocation produces a C++ provider (CcInfo) that already carries the include files and the libraries. This means you only need to list the make target in deps:

cc_binary(
  name = "your_binary",
  srcs = [...],
  deps = [
    "@dtc//:libdtc",
  ],
)

and bazel will know what to do from there. That means the downstream users of the target @dtc//:libdtc will have the appropriate compiler flags (e.g. -L and -I) set correctly by bazel.

The silent part here is that the includes and the libraries need to be findable in INSTALLDIR. This is achieved similarly to how it is done for binaries.

In this case, however, the governing make parameters are out_include_dir (default value include), out_lib_dir, and out_shared_libs.
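Here is a sketch of how this might look if you wanted the static library from the install tree shown earlier. I have not tested this exact target against dtc. The name libfdt_lib is made up, the WARNINGS override is left out for brevity, and out_static_libs is the make rule's parameter for static libraries, the counterpart of out_shared_libs. The directory parameters are spelled out even though they probably match the defaults.

```starlark
load("@rules_foreign_cc//foreign_cc:defs.bzl", "make")

make(
    name = "libfdt_lib",  # hypothetical target name
    lib_source = ":all_files",
    targets = ["PREFIX=$$INSTALLDIR$$ install"],
    # Headers are picked up from $INSTALLDIR/include.
    out_include_dir = "include",
    # Libraries are looked up in $INSTALLDIR/lib.
    out_lib_dir = "lib",
    # File name as seen in the stand-alone install tree.
    out_static_libs = ["libfdt.a"],
    visibility = ["//visibility:public"],
)
```

A cc_binary that lists this target in deps should then get both the headers and the library.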

Conclusion

I hope that this text and the example explain how you can use the make build rule in bazel yourself.

The bazel examples didn’t quite help me understand the mechanisms at play here, so I wrote this explanation up in hope it is useful to someone in the future.

The above rules will use the default C++ toolchain to compile. This means that if you use a non-hermetic C++ toolchain, your system’s C++ compiler will be used. If you use a hermetic toolchain, then that toolchain will be used.

While it may be of interest to explore how you can build with exactly the right toolchain, that explanation is out of scope of this post.
