llvm-dev — development environment overlay for llvm-project

A just + podman-based, container-first workflow for upstream LLVM development. Ships as an overlay: you clone this repo as .dev/ inside an existing llvm-project checkout.

See CONTRIBUTING.md to send a patch, AGENTS.md for AI-agent / contributor conventions, CHANGELOG.md for release history, and SECURITY.md to report vulnerabilities.

Install

From the root of an llvm-project clone:

git clone https://github.com/ruifm/llvm-dev.git .dev
just -f .dev/justfile init

init drops a ./justfile symlink to .dev/justfile and appends the overlay's gitignore patterns to .git/info/exclude (so this setup stays out of the public tree). After that, just <recipe> works from the repo root or any subdirectory.

The only host prerequisites are just and podman; everything else (clang, cmake, ninja, ccache, lldb, copilot CLI, ...) runs inside the container, which the first just invocation builds.

To update the overlay later: just overlay-update.

Run just --list for the full recipe catalog. The rest of this document covers the bits that aren't obvious from the recipe list.

Layout

.dev/
  Containerfile              # Fedora-based dev image (clang/lld/lldb/cmake/ninja/ccache/mold/...)
  cmake/
    common/
      Common.cmake           # Base shared by every build (targets=all, runtimes on, clang+mold)
      CommonDev.cmake        # Shared by every "dev-like" build (assertions, werror, coverage, ...)
    caches/                  # `-C`-able build-type caches (one per `just switch <name>`)
      Dev.cmake              # RelWithDebInfo + assertions + shared libs (default)
      Debug.cmake
      Release.cmake
      Profiling.cmake        # Release + frame pointers + debug info
      Sanitizers.cmake       # ASan + UBSan (combined "quick safety net")
      TSan.cmake / MSan.cmake

Pick a build via just switch <name> (it matches the cache file name). The active build lives at builds/current/ and all tooling (clangd, ninja, cmake) points at it.

Environment variable overrides

Every knob is overridable via CLI (just name=value recipe) or env var:

| Env var | Default | Purpose |
|---|---|---|
| LLVM_DEFAULT_BUILD_TYPE | Dev | Build to use when no builds/current exists. |
| LLVM_GIT_DIR | repo root | Bind-mount root for the container. |
| LLVM_BUILD_HOME | $repo/builds | Where build directories live. |
| LLVM_DEFAULT_SHELL | bash | Shell used by a bare just rsh. |
| LLVM_BASE_REMOTE | origin | Remote for verify / lint-diff / format. Set to upstream for forks. |
| LLVM_USE_TTY | true | Attach a TTY to podman exec. Disable for editor/LSP invocations. |
| LLVM_CONTAINER_MEMORY | 0 (unlimited) | Memory limit (e.g. 32g). |
| LLVM_BUILD_ENV | podman | Set to native to bypass the container. |
| LLVM_COVERAGE | false | Runtime toggle: capture coverage profile data during run/test. See Coverage. |
| CCACHE_REMOTE_STORAGE | (empty) | Forwarded into the container as CCACHE_REMOTE_STORAGE. |
| EXTRA_PODMAN_RUN_ARGS | (empty) | Extra args for podman run (e.g. bind-mount dotfiles). |
| EXTRA_PODMAN_EXEC_ARGS | (empty) | Extra args for podman exec (e.g. extra env vars). |
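
To make the override order concrete, here is a minimal Python sketch of how one knob could be resolved. It is illustrative only and is not taken from the justfile. It assumes a CLI override beats the environment variable, which in turn beats the default, and that the CLI name is the env var name lowercased without the LLVM_ prefix (as with LLVM_COVERAGE and coverage). The resolve function and the DEFAULTS table are hypothetical names.

```python
import os

# Defaults copied from the table above; only a few knobs are shown.
DEFAULTS = {
    "LLVM_DEFAULT_BUILD_TYPE": "Dev",
    "LLVM_DEFAULT_SHELL": "bash",
    "LLVM_BASE_REMOTE": "origin",
    "LLVM_USE_TTY": "true",
    "LLVM_BUILD_ENV": "podman",
}


def resolve(env_name, cli_overrides=None, environ=None):
    """Return the effective value of one knob.

    Assumed precedence: CLI override, then environment variable, then default.
    """
    cli_overrides = cli_overrides or {}
    environ = os.environ if environ is None else environ
    # Assumed naming rule: LLVM_USE_TTY on the env side is use_tty on the CLI.
    prefix = "LLVM_"
    short = env_name[len(prefix):] if env_name.startswith(prefix) else env_name
    cli_name = short.lower()
    if cli_name in cli_overrides:
        return cli_overrides[cli_name]
    return environ.get(env_name, DEFAULTS[env_name])
```

For example, resolve("LLVM_BASE_REMOTE", environ={"LLVM_BASE_REMOTE": "upstream"}) yields "upstream", while passing {"base_remote": "fork"} as cli_overrides would take priority over both the environment and the default.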

clangd integration

compile_commands.json is symlinked at the repo root pointing at builds/current/compile_commands.json, so clangd's default upward search works out of the box.

Invoke clangd inside the container so it uses the same toolchain as the build. A neovim LSP snippet:

cmd = { "just", "use_tty=false", "rsh", "clangd", "--background-index" }

Worktrees

git worktree layouts are handled: the justfile bind-mounts both the worktree directory and the shared --git-common-dir when they differ.
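
The mount decision can be pictured with a short Python sketch. This is an assumption-laden illustration, not the justfile's code: it assumes each path is mounted at the same location inside the container, and it reads "when they differ" as "the shared git directory lives outside the worktree". The bind_mounts helper is hypothetical.

```python
from pathlib import PurePosixPath


def bind_mounts(worktree, git_common_dir):
    """Return podman --volume arguments for a checkout.

    The shared git directory gets its own mount only when it is not already
    covered by the worktree mount (assumed interpretation).
    """
    worktree = PurePosixPath(worktree)
    common = PurePosixPath(git_common_dir)
    args = ["--volume", f"{worktree}:{worktree}"]
    if worktree != common and worktree not in common.parents:
        args += ["--volume", f"{common}:{common}"]
    return args
```

In a plain clone the common dir is <repo>/.git, which the first mount already covers, so only one --volume pair is produced. In a linked worktree the common dir sits under the main checkout, so a second pair is added.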

Profiling

just profile llvm-tblgen --version

Builds the Profiling cache variant, runs perf record inside the container, and writes /tmp/flamegraph.svg. Copy it out with podman cp <container>:/tmp/flamegraph.svg . (the trailing dot is the destination directory).

just bench <target> [args…] runs the same Profiling build under hyperfine --warmup 3 --runs 10 for a quick wall-clock measurement.

just ab-test <refA> <refB> <target> [args…] compares two git refs head-to-head: spins up a worktree + build directory for each side, builds target under Profiling, and feeds both binaries to hyperfine with named outputs. Worktrees are cleaned up on exit (via trap); your current checkout and build are untouched.

Testing

just test [target] [filter] dispatches to the right backend:

  • Unittest source file or dir under */unittests/* (e.g. llvm/unittests/Support/PathTest.cpp) — walks up to the nearest CMakeLists.txt with add_*_unittest(<NAME> ...), builds <NAME>, and runs the resulting gtest binary with filter forwarded as --gtest_filter (gtest glob syntax, wrapped in *…* for substring matching — pass is_absolute to match any test whose name contains it).
  • Path under llvm/test/, clang/test/, mlir/test/, etc. — runs llvm-lit --filter=<filter> after building the matching *-test-depends target.
  • Bare project name (all, llvm, clang, clang-tools-extra, mlir, lld, lldb, flang, polly, bolt) — expanded to the corresponding check-<project> ninja target.
  • Anything else — treated as a raw ninja target (gtest binaries, custom check-* aliases, etc.).

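On the lit and ninja paths, filter is a Python regex: the lit path forwards it as --filter, and the ninja path exports it as LIT_FILTER, which check-* targets pick up automatically. On the unittest-source path it is a gtest glob, as described above. gtest binaries run as raw ninja targets, and other non-lit targets, ignore it.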

just test                                   # default: check-clang
just test clang                             # check-clang
just test mlir                              # check-mlir
just test all                               # check-all
just test clang 'CodeGen.*'                 # check-clang, filtered
just test clang/test/Sema/enum.cpp          # lit, single file
just test clang/test/Sema/ '.*enum.*'       # lit, directory + filter
just test llvm/unittests/Support/PathTest.cpp            # gtest binary (SupportTests)
just test llvm/unittests/Support/PathTest.cpp 'is_absolute'
just test ClangSemaTests                    # gtest binary (ninja)
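
The dispatch rules can be summarised as a small Python sketch. It is a simplified model of the four cases listed above, not the recipe's actual implementation; the classify and gtest_filter functions and the backend labels are hypothetical, and the glob patterns are an assumed reading of "under */unittests/*" and "under <project>/test/".

```python
from fnmatch import fnmatchcase

# Bare project names listed above; each expands to check-<project>.
PROJECTS = {"all", "llvm", "clang", "clang-tools-extra", "mlir",
            "lld", "lldb", "flang", "polly", "bolt"}


def classify(target):
    """Map a `just test` target to (backend, detail), checking rules in order."""
    if fnmatchcase(target, "*/unittests/*"):
        return ("gtest-source", target)   # build the owning unittest binary
    if fnmatchcase(target, "*/test/*"):
        return ("lit", target)            # run llvm-lit on the path
    if target in PROJECTS:
        return ("ninja", f"check-{target}")
    return ("ninja", target)              # raw ninja target


def gtest_filter(substring):
    """Wrap a filter in *...* so gtest matches any test name containing it."""
    return f"--gtest_filter=*{substring}*"
```

With this model, classify("mlir") gives ("ninja", "check-mlir"), classify("ClangSemaTests") falls through to the raw ninja case, and gtest_filter("is_absolute") gives --gtest_filter=*is_absolute*.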

Sanitizer runs

Switch to a sanitizer build type (just switch Sanitizers / TSan / MSan), then use just san-run <target> [args…]. It asserts the current build is a sanitizer variant and forwards to run, inheriting the canonical *SAN_OPTIONS that _cmd_prefix exports in those build types (abort on error, stack traces, leak detection, etc.). Pass extra_podman_exec_args='--env ASAN_OPTIONS=…' to override.

Pre-push gate

just presubmit [base=main] [test-target=default]

Runs format-diff, lint-diff, and test <target> in order, failing fast. Intended as a manual pre-push check, not a git hook.

git bisect

git bisect start BAD GOOD
git bisect run just bisect-run <target> [filter]

bisect-run rebuilds <target> and runs just test <target> <filter>. Build failures exit 125 (bisect's "skip"); the test's exit code determines good/bad. Use a ninja target or a bare project name for target, not a source-file path (otherwise build is a no-op and only the test gets re-run).
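
The exit-code contract can be sketched in Python as follows. The first function models the behaviour described above; the second encodes how git bisect run interprets a script's exit status (0 is good, 125 is skip, other codes from 1 to 127 are bad, anything else aborts the bisect). Both function names are hypothetical.

```python
def bisect_run_exit(build_rc, test_rc):
    """Exit code for one bisect step: 125 on build failure, else the test's code."""
    return 125 if build_rc != 0 else test_rc


def bisect_verdict(exit_code):
    """How `git bisect run` interprets a script's exit code."""
    if exit_code == 0:
        return "good"
    if exit_code == 125:
        return "skip"
    if 1 <= exit_code <= 127:
        return "bad"
    return "abort"
```

A commit that fails to compile therefore maps to "skip" rather than "bad", which keeps unrelated build breakage from being blamed for the regression under investigation.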

Rebase-verify

just verify <base> <test-target>

Interactive rebase onto base_remote/base with test <test-target> run after every commit to catch mid-series breakage. Both arguments are required so nobody accidentally reruns the default check-<project> across a long series.

Coverage

Source-based coverage is always compiled in for every dev-shape build (Dev, Debug, Sanitizers, TSan, MSan — anything that inherits CommonDev.cmake). Release and Profiling are not instrumented. Instrumentation costs ~10–20% runtime and ~15–30% binary size; the trade was made once so every build is coverage-ready.

Collection is runtime-gated by LLVM_COVERAGE / the coverage just-var:

  • coverage=false (default) — LLVM_PROFILE_FILE=/dev/null, all profile writes discarded. Zero files on disk, zero maintenance.
  • coverage=true — profile data lands in /tmp/coverage/cov-%p-%m.profraw inside the container.
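
As a rough illustration of the gate, the following Python sketch returns the LLVM_PROFILE_FILE value for each setting. The two paths come from the list above; the function names and the rule that only the literal string "true" enables collection are assumptions.

```python
def coverage_enabled(environ):
    """Assumed rule: collection is on only when LLVM_COVERAGE is exactly 'true'."""
    return environ.get("LLVM_COVERAGE", "false") == "true"


def profile_env(coverage):
    """Return the LLVM_PROFILE_FILE setting injected for a run."""
    if coverage:
        # In the profile runtime, %p expands to the process ID and %m to a
        # per-binary signature, so concurrent processes do not clobber files.
        return {"LLVM_PROFILE_FILE": "/tmp/coverage/cov-%p-%m.profraw"}
    # Instrumented binaries still write profiles; /dev/null discards them.
    return {"LLVM_PROFILE_FILE": "/dev/null"}
```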

When coverage=true, run and test automatically wipe /tmp/coverage before the body and emit an HTML + text report after. Manual workflow (accumulate multiple runs, report once) stays available via the public recipes:

# One-shot: fresh data → run → HTML + text summary.
just cov FileCheck foo.txt          # == just coverage=true run FileCheck foo.txt
just test-cov check-clang           # == just coverage=true test check-clang

# Restrict the report to one binary.
just coverage-report FileCheck

# Manual collect-then-report.
just coverage-reset
just coverage=true run FileCheck a.txt
just coverage=true run FileCheck b.txt
just coverage-report FileCheck

HTML output: /tmp/coverage/html/index.html inside the container (just rsh ls /tmp/coverage/html to inspect; podman cp to export).

just coverage-diff [base=main] [binaries…] restricts the report to files changed vs base_remote/base. It reuses an existing /tmp/coverage/merged.profdata, so run just test-cov (or just cov) first to populate it.
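
For readers who want to see what a report step typically involves, here is a Python sketch that builds the standard llvm-profdata and llvm-cov command lines. It is not the recipes' implementation: the helper names are hypothetical, and restricting the report by passing changed files as trailing source arguments is an assumption about how a diff-scoped report could be produced.

```python
def merge_cmd(profraw_files, out="/tmp/coverage/merged.profdata"):
    """argv that merges raw profiles into one indexed profile."""
    return ["llvm-profdata", "merge", "-sparse", *profraw_files, "-o", out]


def html_report_cmd(binary, profdata="/tmp/coverage/merged.profdata",
                    out_dir="/tmp/coverage/html", sources=()):
    """argv that renders an HTML report, optionally limited to some sources."""
    return ["llvm-cov", "show", binary,
            f"-instr-profile={profdata}",
            "-format=html", f"-output-dir={out_dir}",
            *sources]
```

Passing sources=["llvm/lib/Support/Path.cpp"] (an example path) appends that file to the llvm-cov show invocation, which limits the rendered output to it.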

Caveats:

  • If run/test exits non-zero (e.g. a lit failure), Just skips the post-dep, so the auto-report is not emitted — run just coverage-report manually. Failures are surfaced rather than papered over.
  • exec_native=true bypasses _cmd_prefix and therefore the LLVM_PROFILE_FILE injection, so coverage is not captured on host-exec runs.
  • /tmp/coverage lives inside the dev container and is wiped by just reset.

Multi-clone isolation

Each clone gets its own container name (hashed from the repo path), its own clangd index volume, and its own builds/ directory. just stop only stops the current clone's container; just stop-all stops every clone's container.
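
One way to derive a per-clone container name is sketched below in Python. The hash function, the 12-character truncation, and the llvm-dev prefix are all assumptions made for illustration; the justfile may use a different scheme.

```python
import hashlib


def container_name(repo_path, prefix="llvm-dev"):
    """Derive a stable, per-clone container name from the repo path."""
    digest = hashlib.sha256(repo_path.encode("utf-8")).hexdigest()[:12]
    return f"{prefix}-{digest}"
```

The same path always yields the same name, so repeated just invocations reuse one container, while two clones at different paths get distinct names.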

ccache is shared across clones (safe thanks to CCACHE_BASEDIR + CCACHE_NOHASHDIR + LLVM_USE_RELATIVE_PATHS_IN_DEBUG_INFO).

Dev-tooling hygiene

just dev-fmt / just dev-lint / just dev-check format and lint the files that drive this setup (justfile, .dev/Containerfile, .dev/cmake/caches/). All tools run inside the container:

  • just --unstable --fmt for the justfile
  • hadolint for the Containerfile
  • cmake-format / cmake-lint (via cmakelang) for the cache files

dev-check is an alias for dev-lint, suitable for CI or a pre-push hook.

Documentation builds

just doc                        # default: docs-llvm-html
just doc docs-clang-html
just doc docs-lldb-html
just doc docs-mlir-doc

Output lives under builds/current/docs/html/index.html for the LLVM docs target and builds/current/tools/<project>/docs/html/ for each subproject.

Housekeeping

  • just status — show the active build type, build dir, and container.
  • just switch <type> — activate a different cmake-cache build type.
  • just gc — delete every directory under build_home except the currently-active build. Guarded by a confirmation prompt.
  • just reset — stop the container and wipe /tmp/coverage inside.
  • just stop / just stop-all — stop this clone's container / all.

The dev setup appends its own ignore patterns (.dev/gitignore) to .git/info/exclude during _init, so the tracked .gitignore stays clean of developer-local generated files (compile_commands.json, etc.).

Copilot CLI

just copilot [extra-args...] launches the GitHub Copilot CLI inside the dev container with --autopilot --enable-all-github-mcp-tools --yolo --continue.

  • Auth passthrough: if gh is logged in on the host, gh auth token is captured and forwarded as GH_TOKEN into the container. Otherwise run /login inside the CLI on first use; the token then lives in the named volume.
  • Persistence: /home/dev/.copilot is backed by a podman named volume (llvm-project.copilot). Sessions, agents, MCP/LSP config, and the session-store.db survive container rebuilds.
  • Reset: podman volume rm llvm-project.copilot.

Distributed compilation with distcc

Optional. Offload compilation to spare machine(s) on your LAN.

The Containerfile defines two stages:

  • worker — minimal: pinned clang (ARG CLANG_VERSION in the Containerfile) + distcc-server. That's it. Nothing else runs on a worker because distcc only dispatches compile (cc1) invocations; preprocessing and linking stay client-local.
  • dev (FROM worker, default final stage) — full interactive environment built on top. Inherits the pinned clang automatically, so the .o a worker produces is bit-compatible with what the dev box would have produced locally.

Only clang is pinned. Everything else (cmake, ninja, lld, mold, clangd overlay, dev tooling, distcc protocol daemon) floats to latest-stable. To bump clang: edit ARG CLANG_VERSION in .dev/Containerfile and rebuild both sides.

ccache sits in front of distcc via CCACHE_PREFIX=distcc, so cache hits never cross the network.

Client setup

Set LLVM_DISTCC_HOSTS in your shell rc:

export LLVM_DISTCC_HOSTS="localhost/16 laptop.lan/6,lzo"

Format: host/jobs[,lzo]. lzo enables compression, which is worth it on anything slower than wired gigabit. Pick jobs conservatively per helper: LLVM translation units can peak at 2–4 GB each, so /6 on a 16 GB helper leaves headroom as long as most units stay near the low end of that range (6 × 2 GB = 12 GB). For the local machine, use localhost/N with N ≈ local core count.
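
The host list format is easy to check mechanically. The Python sketch below parses only the subset described here (host/jobs with an optional ,lzo suffix) and is not a full distcc host-spec parser. The Host type and parse_hosts function are hypothetical names.

```python
from typing import NamedTuple


class Host(NamedTuple):
    name: str
    jobs: int
    lzo: bool


def parse_hosts(spec):
    """Parse a whitespace-separated list of host/jobs[,lzo] entries."""
    hosts = []
    for entry in spec.split():
        target, _, options = entry.partition(",")
        name, _, jobs = target.partition("/")
        # int() raises ValueError if the /jobs part is missing or malformed.
        hosts.append(Host(name, int(jobs), "lzo" in options.split(",")))
    return hosts
```

For the example value above, this gives two entries: localhost with 16 jobs and no compression, and laptop.lan with 6 jobs and lzo enabled, for 22 job slots in total.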

Invoke just build normally. When LLVM_DISTCC_HOSTS is empty, everything behaves as before.

Worker setup

On the helper machine, the only prerequisites are git, podman, and just:

git clone <your-llvm-project-remote>
cd llvm-project
just distccd

Runs in the foreground; Ctrl-C stops it. The minimal worker image is built on first run (no dev tooling, no clangd overlay — small and fast). By default the daemon accepts any client (LLVM_DISTCC_ALLOW=0.0.0.0/0); tighten it to a LAN subnet (e.g. 192.168.0.0/16) if the helper is reachable from untrusted networks. Override the port with LLVM_DISTCC_PORT (default 3632).

Running as a systemd user service

For a dedicated helper you probably want distccd to come up automatically and restart on failure. Three recipes wrap a systemd --user template unit (.dev/systemd/llvm-distccd@.service, instanced per checkout path):

just distccd-install    # link + daemon-reload + enable --now
just distccd-log        # journalctl --user -f -u ...
just distccd-uninstall  # disable --now

To have the service start at boot (no login required):

loginctl enable-linger "$USER"

Caveats

  • Plain distcc mode only. Pump mode is known to break on LLVM's include graph; don't enable it.
  • Linking and tablegen stay local — distcc only helps compilation.
  • distccd here does not use TLS. Keep the --allow subnet narrow; don't expose port 3632 beyond your LAN.
  • Troubleshooting: DISTCC_VERBOSE=1 just build ... on the client, or tail the just distccd stderr on the helper (or just distccd-log if you installed the systemd service). distcc --show-hosts lists the active host list.

Updating the Copilot CLI

The GitHub Copilot CLI moves fast. To pick up its latest release without rebuilding the whole dev image:

just update-copilot

The Copilot install is the final RUN of the dev stage, guarded by ARG COPILOT_CACHEBUST. Mutating the arg invalidates only that one layer, so the rebuild takes seconds.

License

Apache License 2.0 with LLVM Exceptions. See LICENSE.TXT. This is the same license used by upstream llvm-project, so you can freely copy recipes, cmake snippets, or container bits between the two.
