Fix #184 transitive external repo change now reaches main-repo consumers#350
Open
tinder-maxwellelliott wants to merge 2 commits into
Open
Fix #184 transitive external repo change now reaches main-repo consumers#350tinder-maxwellelliott wants to merge 2 commits into
tinder-maxwellelliott wants to merge 2 commits into
Conversation
a6af68a to
6741202
Compare
…cted The user reported that modifying a file inside an external repo C does not propagate to an internal target A, when A depends on external repo B and B in turn depends on C. The original example uses rules_python's pip_parse where requirements.txt resolves into a separate external repo per package: A is a py_test rule, B is @Moto, C is @Cryptography. Bumping cryptography in requirements.txt does not surface A as impacted. This is functionally the same root behaviour as #197's still-unfixed case (external-repo wrapping), but the #184 user described it as a deeper transitive build-time chain, so a separate fixture documents that shape. New workspace `bzlmod_transitive_external`: //:consume genrule that depends on @outer//:lib @outer//:lib genrule in module `outer` that depends on @inner//:data @inner//:data filegroup wrapping inner/data.txt `outer` and `inner` are real bzlmod modules brought in via the root module's `local_path_override`s. outer's MODULE.bazel just declares `bazel_dep(name = "inner")` (only the root module is allowed to use local_path_override). `.bazelignore` excludes the two sub-trees so they are not also treated as packages of the main module. Verified by hand against the locally built CLI (Bazel 9.1): inner/data.txt = "hello" -> "world" $ bazel-diff generate-hashes -w A ... ; bazel-diff generate-hashes -w B ... $ diff <(jq -S '.hashes' A) <(jq -S '.hashes' B) # empty $ bazel-diff get-impacted-targets ... # impacted file is empty The expected behaviour: //:consume should be impacted because the build-time content it transitively consumes through @outer / @inner has changed. I also briefly removed @ignore locally to confirm the new test fails for the right reason on current bazel-diff: testTransitiveExternalRepoChangeImpactsConsumer_reproducerForIssue184 AssertionFailedError: //:consume should be impacted when inner/data.txt changes (transitive external dep via @outer -> @inner per #184); got: [] Restored @ignore so the test is skipped under `bazel test //cli:E2ETest`. Remove @ignore once the fix lands. Refs #184 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this change, BazelQueryService.queryBzlmodRepos synthesised one //external:<repo> target per bzlmod-managed repo, hashed by repo *metadata* only. There was no rule_input edge between those synthetic targets and no content hash of the underlying directory, so a change inside @inner never propagated to @outer or to //:consume when: //:consume -> @outer//:lib @outer//:lib -> @inner//:data @inner//:data -> inner/data.txt The user in #184 saw this with rules_python's pip_parse where @Cryptography sits behind @Moto behind a py_test rule. The same shape is what's left unfixed in #197 (alias re-export through an external repo). Two changes in BazelQueryService: 1. queryBzlmodRepos now parses `bazel mod graph --output=json` (via BazelModService, newly injected) and, for each repo, computes the list of direct bzlmod deps. For each dep, it emits an addRuleInput("//external:<dep_apparent>") on the synthetic target so RuleHasher follows the chain //:consume -> //external:outer -> //external:inner during digest computation. We bridge module name -> canonical name by stripping the trailing `+<version>` suffix that bzlmod's canonical-name scheme uses; this works for the no-version-conflict case and skips extension-generated repos (e.g. "rules_jvm_external++maven+maven") because they do not appear in `bazel mod graph` anyway. (Build.Repository.module_key is not populated by current Bazel, so we cannot key off that.) 2. repositoryToTarget now content-hashes the on-disk directory of every `local_repository`-rule repo (which is what `local_path_override` lowers to) and attaches the digest as a synthetic `_bazel_diff_content_hash` attribute. MODULE.bazel.lock is skipped because bazel regenerates it indeterministically. BCR-fetched repos are left alone -- their version is already in the metadata, so a version bump flips the synthetic target's hash via that channel. ModuleGraphParser gains a parseModuleGraphDepEdges helper that returns module_name -> [dep_module_names] from the same JSON parseModuleGraph already consumes. Module name is used as the key because it is present on every entry in `bazel mod graph` output and on every Repository returned by `bazel mod show_repo`, while module_key is not. The reproducer test loses its @ignore, picks up the standard Bazel 8.6.0+ skip guard (the fix relies on mod show_repo --output=streamed_proto), and is renamed from testTransitiveExternalRepoChangeImpactsConsumer_reproducerForIssue184 to ...regressionForIssue184. Verified: * Manual end-to-end on Bazel 9.1 against the bzlmod_transitive_external fixture: editing inner/data.txt with no extra flags now lists //:consume in the impacted-targets output (was empty before). * bazel test //cli:E2ETest --test_filter=...regressionForIssue184 PASSED in 2.9s. * Full unit-test sweep (12 kt_jvm_test targets) PASSED. * Broad non-Maven E2ETest sweep (testTransitiveExternalRepo*, testBzlmodLocalPathOverride*, testModuleBazelCommentOnly*, testGoModUpdate*, testMacroBzlChange*, testBzlmodShowRepoDetects*, testExcludeExternalTargets*, testFineGrainedHashBzlMod, testBzlmodTransitiveDepsQuery, testGenerateHashesWithCquery, ...) PASSED in 113.6s. Refs #184 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6741202 to
61a3a30
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #184. The same root cause sits behind the still-open part of #197 (alias-wrap chain), so this fix benefits both shapes.
Before.
BazelQueryService.queryBzlmodRepossynthesised one//external:<repo>target per bzlmod-managed repo, hashed by repo metadata only. There was norule_inputedge between those synthetic targets and no content hash of the underlying directory, so a change inside@innernever propagated to@outeror to//:consume:The user in #184 saw this with
rules_python'spip_parsewhere@cryptographysits behind@motobehind apy_testrule.After. Two changes in
BazelQueryService:queryBzlmodReposnow parsesbazel mod graph --output=json(viaBazelModService, newly injected) and, for each repo, computes the list of direct bzlmod deps. For each dep, it emitsaddRuleInput("//external:<dep_apparent>")on the synthetic target soRuleHasherfollows the chain//:consume → //external:outer → //external:innerduring digest computation.Module name → canonical name is bridged by stripping the trailing
+<version>suffix from canonical names (works for the no-version-conflict case; skips extension-generated repos likerules_jvm_external++maven+mavenbecause they don't appear inbazel mod graphanyway).Build.Repository.module_keyis not populated by current Bazel, so we cannot key off that.repositoryToTargetnow content-hashes the on-disk directory of everylocal_repository-rule repo (which is whatlocal_path_overridelowers to) and attaches the digest as a synthetic_bazel_diff_content_hashattribute.MODULE.bazel.lockis skipped because bazel regenerates it indeterministically. BCR-fetched repos are left alone — their version is already in the metadata, so a version bump flips the synthetic target's hash via that channel.ModuleGraphParsergains aparseModuleGraphDepEdgeshelper that returnsmodule_name → [dep_module_names]from the same JSONparseModuleGraphalready consumes.The reproducer test loses its
@Ignore, picks up the standard Bazel 8.6.0+ skip guard (the fix relies onmod show_repo --output=streamed_proto), and is renamed…_reproducerForIssue184→…_regressionForIssue184.Test plan
bzlmod_transitive_externalfixture: editinginner/data.txtwith no extra flags now lists//:consumein the impacted-targets output (was empty before).bazel test //cli:E2ETest --test_filter=testTransitiveExternalRepoChangeImpactsConsumer_regressionForIssue184→ PASSED in 2.9s.kt_jvm_testtargets → all 12 PASSED.E2ETestsweep (testTransitiveExternalRepo*,testBzlmodLocalPathOverride*,testModuleBazelCommentOnly*,testGoModUpdate*,testMacroBzlChange*,testBzlmodShowRepoDetects*,testExcludeExternalTargets*,testFineGrainedHashBzlMod,testBzlmodTransitiveDepsQuery,testGenerateHashesWithCquery, …) → PASSED in 113.6s.🤖 Generated with Claude Code