bazoekt: easy code indexing and search for your Bazel projects

2023/12/17

https://github.com/filmil/bazoekt

Since I can’t remember anything these days, I often find myself sifting through my code looking for specific keywords. Sometimes grep isn’t quite enough, and something more is needed.

I am also a fan of Bazel, and have recently decided to use it as much as possible to automate away the pains of building. Bazel is nice because it removes almost all of the project setup effort. The flip side is the complexity, which has been criticized before.

It occurred to me recently that we have the tools to add code indexing easily to any Bazel project. H.-W. Nijenhuis (incidentally also one of the original authors of Bazel) has published a program by the tongue-in-cheek name “zoekt” (“you seek” in Dutch, but apparently there’s a back story to it) that does fast trigram-based indexing and search. Development of the program stopped a long time ago, but it still works and does its job well. Which, frankly, is a rarity these days. Sourcegraph has taken over the development of “zoekt”, but the last time I checked, their repository was broken. The original repository, however, compiles and runs without a hitch.
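For the curious, here is a toy sketch of what trigram-based indexing and search means. It is not zoekt's actual implementation (zoekt is written in Go and is far more sophisticated); the function names are made up for illustration. The idea is to record which files contain each 3-character substring, use that to narrow down the candidate files for a query, and then verify the candidates with a plain substring check.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(files):
    """Map each trigram to the set of file names that contain it.

    `files` is a dict of {file name: file contents}.
    """
    index = defaultdict(set)
    for name, content in files.items():
        for gram in trigrams(content):
            index[gram].add(name)
    return index


def search(index, files, query):
    """Return the sorted names of the files that contain `query`."""
    grams = trigrams(query)
    if not grams:
        # Queries shorter than 3 characters have no trigrams: scan everything.
        candidates = set(files)
    else:
        # A file can only match if it contains every trigram of the query.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Trigram hits are necessary but not sufficient, so verify each candidate.
    return sorted(name for name in candidates if query in files[name])
```

With `files = {"a.go": "func main() {}", "b.py": "def main(): pass"}` and `index = build_index(files)`, calling `search(index, files, "func")` returns `["a.go"]`. The final substring check matters because a file can contain all the trigrams of a query without containing the query itself.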

One thing I appreciate about zoekt is how uncomplicated it is, especially compared to tools like Kythe or Sourcegraph, which are next to impossible to set up. And for sure not worth setting up on a local machine.

It occurred to me that I could automate indexing of Bazel projects by adding a few well-chosen scripts to an existing Bazel repo, and feeding the default settings to zoekt for further processing. You could then easily add the result to any Bazel project at a moment’s notice. Zoekt has other approaches and options, such as indexing git repositories, or pulling and indexing public git repos. But most of those are not required if the only thing you want is to index the code you are working on right now.

And after some 30 minutes of tinkering (and frankly, a few hours resolving bizarre bazel issues), “bazoekt” was born. Once you make it part of your Bazel project setup (e.g. in your WORKSPACE file), it will download and prepare itself the first time you run any of the commands below. Like magic.

The details are available at https://github.com/filmil/bazoekt, but in short you need to add the following stanza to your WORKSPACE file:

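# BEGIN: bazoekt
# http_archive must be loaded before use; skip this line if your WORKSPACE
# already loads it.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "bazoekt",
    sha256 = "",
    strip_prefix = "bazoekt",
    urls = [
        "https://github.com/filmil/bazoekt/releases/download/0.0.7/bazoekt-linux-amd64.zip",
    ],
)
# END: bazoekt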

The archive is a binary distribution, so it will not introduce any transitive dependencies that could potentially mess up your workspace. (Incidentally, that was one of the issues that took me the longest time to resolve: a repo that I wanted to use bazoekt on had a dependency conflict that could not be resolved by the obvious interventions in the WORKSPACE file. I am told Bazel modules will solve this class of problems, but it remains to be seen.)
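A side note on the empty sha256 field in the stanza above: with no checksum, Bazel cannot verify the archive it downloads, so it is worth filling in before you rely on the setup. A minimal sketch of computing the value yourself, assuming you have already downloaded the release archive to a local file (the helper name is mine):

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `print(sha256_of_file("bazoekt-linux-amd64.zip"))` on the downloaded file gives the string to paste into the sha256 field.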

To use the setup, first run the indexer with this short command line:

bazel run @bazoekt//:index

Once that is done, start the web server with:

bazel run @bazoekt//:serve

By default the web server listens on port 6070, so you can get there by visiting the URL:

http://localhost:6070

While the web server is running, the above URL will show you zoekt’s web interface. Typing any words into the uncomplicated search bar will find you their occurrences in mere milliseconds, and give you links to the respective sources.

The default settings are fixed so that the above commands will automatically find each other and do the right thing. Both invocations have a few flags you can use, but the idea is that you normally won’t need anything besides the basic setup, and that the defaults will be enough.

To invoke help, you can issue one of the following commands:

bazel run @bazoekt//:index -- --help # note the double-dash in the middle
bazel run @bazoekt//:serve -- --help

These allow you to change some of the trivial default settings to other trivial settings. Those are a tiny subset of the configuration options offered by “zoekt”, but I contend that changing them will rarely be needed.
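If you find yourself re-running these commands often, a tiny wrapper can save some typing. The sketch below is my own illustration and not part of bazoekt; the function names are hypothetical. It also shows where the double dash goes: everything before it is for Bazel, everything after it is passed on to the tool.

```python
import subprocess


def bazoekt_command(target, *tool_args):
    """Build the argument list for `bazel run @bazoekt//:<target>`."""
    argv = ["bazel", "run", "@bazoekt//:" + target]
    if tool_args:
        # Everything after "--" goes to the tool rather than to Bazel.
        argv += ["--", *tool_args]
    return argv


def reindex_and_serve():
    """Rebuild the index, then start the web server (blocks until stopped)."""
    subprocess.run(bazoekt_command("index"), check=True)
    subprocess.run(bazoekt_command("serve"), check=True)
```

For example, `bazoekt_command("serve", "--help")` builds the same command line as the second help invocation above. Calling `reindex_and_serve()` from the workspace directory runs the indexer and then the server in sequence.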

While this is something I cobbled together for my own use, I would love to hear from anyone who may visit https://github.com/filmil/bazoekt and decide to give it a try.