Skip to content

Yo DAWGS, I heard you like rewrites#70

Open
zinic wants to merge 1 commit intoSpecterOps:mainfrom
zinic:main
Open

Yo DAWGS, I heard you like rewrites#70
zinic wants to merge 1 commit intoSpecterOps:mainfrom
zinic:main

Conversation

@zinic
Copy link
Copy Markdown
Contributor

@zinic zinic commented May 4, 2026

Tighten CySQL traversal translation so variable-length and shortest-path expansions avoid broad frontier work unless the query gives the translator a safe anchor.

  • Gate bidirectional shortest-path translation behind bound endpoint pairs, static endpoint ID equality, or pair-aware local endpoint constraints that clear the bidirectional anchor threshold on both sides. Label-only endpoints and broad predicates now stay on the unidirectional path.
  • Add materialized traversal root, terminal, and endpoint-pair filters and pass them through unidirectional and bidirectional harness overloads. Bound endpoint frames and safe local endpoint predicates can now reduce the root and terminal universe before BFS starts.
  • Add a bidirectional shortest-path harness with per-direction visited tables, pair resolution pruning, path-limit support, and targeted pathspace indexes so shortestPath can stop after the requested number of paths while preserving correctness.
  • Seed recursive expansions from distinct root CTEs, use UNION ALL when the seed shape makes duplication safe, and move recursive edge and node lookups behind lateral subqueries with OFFSET 0 to keep PostgreSQL on correlated index lookups instead of broad flattened scans.
  • Push down LIMIT into eligible traversal CTEs and shortest-path harness calls when the tail projection is transparent, non-aggregate, unsorted, non-distinct, and contains only safe endpoint inequality filters.
  • Add a source-level reference collector and use it to prune traversal CTE exports for unused fixed-step node, edge, and path bindings while preserving repeated MATCH declarations, pattern predicates, and values needed by later traversal steps.
  • Reduce path projection overhead by reusing already-materialized path bindings and building fixed path composites directly from known node and edge composites.
  • Extend the PgSQL AST and formatter with DISTINCT select rendering and lateral subquery joins needed by the new expansion plans. Update PostgreSQL schema upgrade and downgrade SQL, golden translations, and integration and unit coverage for selectivity gating, reference liveness, LIMIT pushdown, lateral formatting, and shortest-path harness behavior.

Summary by CodeRabbit

  • New Features

    • Bidirectional shortest‑path search, expanded shortest‑path options, UNWIND and CREATE query support, LATERAL subquery formatting, and SELECT DISTINCT output
  • Bug Fixes

    • More robust traversal/recursive query generation, improved update/delete translations, and smarter limit pushdown for shortest‑path queries
  • Tests

    • Large expansion of unit and integration tests covering traversal, harnesses, formatting, and limit‑pushdown
  • Chores

    • Database migration and traversal harness schema updates; .gitignore extended

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds lateral-subquery AST/formatting, seed-driven recursive expansion with lateral lookup joins, bidirectional/pair-aware shortest-path decision logic and harness parameterization (filters + path_limit), limit-pushdown into harness calls, many regenerated SQL translation fixtures, and accompanying unit/integration tests and migrations.

Changes

Bidirectional shortest-path, seed-driven expansion & harness parameterization

Layer / File(s) Summary
Data / AST shape
cypher/models/pgsql/model.go, cypher/models/pgsql/functions.go, .gitignore
Adds exported LateralSubquery type; changes Join.Table from TableReference to Expression; adds FunctionBidirectionalSPHarness, FunctionNextValue, FunctionPGGetSerialSequence identifiers; .gitignore adds .zed.
Formatter
cypher/models/pgsql/format/format.go, cypher/models/pgsql/format/format_test.go
Formatter: emits lateral (<subquery>) with optional binding; supports select distinct; tests added for select distinct and lateral join formatting.
Translation model & capability checks
cypher/models/pgsql/translate/model.go, cypher/models/pgsql/translate/constraints_test.go
Expansion gains flags (UseMaterializedTerminalFilter, UseMaterializedEndpointPairFilter, UseBidirectionalSearch); new traversal decision APIs added (CanExecuteBidirectionalSearch, CanExecuteSelectiveBidirectionalSearch, CanExecutePairAwareBidirectionalSearch) and tests for those decisions.
Expansion builder & core translation
cypher/models/pgsql/translate/expansion.go, cypher/models/pgsql/translate/pattern.go, cypher/models/pgsql/translate/projection.go, cypher/models/pgsql/translate/selectivity.go
Introduces expansionSeed CTE abstraction, lateral edge/node lookup-join helpers, terminal-satisfaction locality, harness parameter plumbing (depth + bound filters), ExpansionBuilder.UseUnionAll and Build(...commonCTEs), and selectivityBidirectionalAnchorThreshold.
Limit pushdown wiring & projection changes
cypher/models/pgsql/translate/projection.go, cypher/models/pgsql/translate/limit_pushdown_test.go
Adds detection and pushdown of tail LIMIT into shortest-path harness calls or source CTEs; ensures at least one projection (literal 1) when empty; unit tests verify allowed/blocked pushdown and harness-parameter append.
Query part referencing & pruning
cypher/models/pgsql/translate/references.go, cypher/models/pgsql/translate/references_test.go, cypher/models/pgsql/translate/traversal.go
Adds conservative collector collectReferencedIdentifiers, QueryPart.SetReferencedIdentifiers, and helpers to prune/unexport unused frame exports; tests for identifier collection and binding references.
Shortest-path translation & harness selection
cypher/models/pgsql/translate/expansion.go, cypher/models/pgsql/translate/pattern.go, cypher/models/pgsql/translate/query.go
Shortest-path translation builds seed CTEs and decides between bidirectional/unidirectional roots based on model flags; prepareSinglePartQueryPart/multipart prep now collect referenced identifiers.
CREATE / UNWIND translation and mutation wiring
cypher/models/pgsql/translate/create.go, cypher/models/pgsql/translate/unwind.go, cypher/models/pgsql/translate/node.go, cypher/models/pgsql/translate/relationship.go
Adds CREATE translation machinery (frames, sequence value, inserts, projection frames), UNWIND translation and unwind-from-clause helpers, and routing for create-path node/edge patterns.
Harness SQL and schema migrations
drivers/pg/query/sql/schema_up.sql, drivers/pg/query/sql/schema_down.sql
Reworks unidirectional/bidirectional harness SQL to accept array/text filter overloads and path_limit; adds traversal filter tables, visited keyed by (root_id,id), pair-filter support, and corresponding down-migration drops.
Regenerated SQL translation fixtures
cypher/models/pgsql/test/translation_cases/*
Many SQL test fixtures rewritten to seed-driven CTEs, lateral joins, path composites, post-expansion terminal filtering, and new harness parameterization (multipart, pattern_binding, pattern_expansion, shortest_paths, nodes, stepwise_traversal, quantifiers, create, unwind, update, delete).
Tests & integration
cypher/models/pgsql/test/validation_integration_test.go, cypher/models/pgsql/translate/expansion_test.go, cypher/models/pgsql/translate/limit_pushdown_test.go, cypher/models/pgsql/format/format_test.go, cypher/models/pgsql/translate/references_test.go, integration/testdata/*`
Adds integration test TestBidirectionalASPHarnessOverloads, expansion/limit-pushdown/references/format unit tests; pool init updated to drivers.DatabaseConfiguration; many JSON integration cases added (create_inline, unwind_inline, multipart_inline).

Sequence Diagram

sequenceDiagram
    participant Client
    participant Translator
    participant CapCheck as CapabilityChecker
    participant ExpBuilder as ExpansionBuilder
    participant DB as DBHarness

    Client->>Translator: Translate request (traversal/shortest-path)
    Translator->>CapCheck: Evaluate bidirectional/selective/pair-aware eligibility
    CapCheck-->>Translator: Decision (UseBidirectionalSearch true/false)
    alt UseBidirectionalSearch = true
        Translator->>ExpBuilder: Build seed CTEs, lateral lookups, harness params (filters, path_limit)
        ExpBuilder-->>Translator: CTEs + bidirectional harness call
        Translator->>DB: Call bidirectional_*_harness(filter/array, path_limit)
        DB->>DB: Populate traversal_filter tables, expand fronts, resolve pairs, emit paths
        DB-->>Translator: Paths result
    else
        Translator->>ExpBuilder: Build primer/recursive expansion with seed CTEs
        ExpBuilder-->>Translator: CTEs + unidirectional harness call
        Translator->>DB: Call unidirectional_*_harness(array/filter, path_limit)
        DB->>DB: Expand frontier, prune visited, emit paths
        DB-->>Translator: Paths result
    end
    Translator-->>Client: Final result set
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • SpecterOps/DAWGS#45: Touches overlapping translation/SQL generation and harness logic; likely related.

Suggested reviewers

  • urangel

Poem

🐰 I hop through lateral fields and seeds,
Filters planted where the harness leads,
Pairs and paths stitched left and right,
Limits tucked in harness light,
Small hops become a bidirectional flight.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
cypher/models/pgsql/translate/expansion.go (1)

722-723: 💤 Low value

Consider checking the error from type conversion.

The error from pgsql.As[pgsql.SelectItem] is being ignored. While the conversion should always succeed for expression types, silently discarding errors can mask issues if the API changes.

♻️ Suggested fix
-	satisfiedSelectItem, _ := pgsql.As[pgsql.SelectItem](satisfied)
-	return satisfiedSelectItem
+	satisfiedSelectItem, err := pgsql.As[pgsql.SelectItem](satisfied)
+	if err != nil {
+		// This should never happen for expression types
+		return satisfied
+	}
+	return satisfiedSelectItem

The same pattern appears at line 765-766 in backwardTerminalSatisfaction.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/expansion.go` around lines 722 - 723, The code
ignores the error returned by pgsql.As[pgsql.SelectItem] when converting
satisfied (satisfiedSelectItem, _ := pgsql.As[pgsql.SelectItem](satisfied));
capture the second return value (err) and handle it instead of discarding it:
check err after calling pgsql.As[pgsql.SelectItem](satisfied) and return or
propagate a descriptive error from the enclosing function (or fail fast)
depending on the function's error handling pattern. Apply the same change inside
backwardTerminalSatisfaction where pgsql.As[pgsql.SelectItem] is called so both
satisfiedSelectItem conversions are checked and errors are handled consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cypher/models/pgsql/translate/model.go`:
- Around line 298-313: The method CanExecutePairAwareBidirectionalSearch
currently returns true for pair-aware bidirectional search whenever expansion
and endpoint constraints exist; update it to measure endpoint selectivity and
ensure at least one endpoint qualifies as a bidirectional anchor before
returning true: call endpointSelectivity(...) for the left and right endpoints
(using s.Expansion and s.LeftNode.Identifier / s.RightNode.Identifier), check
for and propagate any error, then call isBidirectionalSearchAnchor(...) on each
selectivity and only return true if at least one endpoint is an anchor;
otherwise return false. Ensure you reference and use the existing functions
endpointSelectivity and isBidirectionalSearchAnchor and handle their
error/return values instead of unconditionally returning true.

In `@drivers/pg/query/sql/schema_down.sql`:
- Around line 19-50: The rollback script is missing drops for functions added in
schema_up; update drivers/pg/query/sql/schema_down.sql to mirror schema_up by
adding DROP FUNCTION IF EXISTS entries for the new overloads and helper builders
— specifically add drops for the unidirectional_sp_harness overloads that
include the path_limit parameter (the new signatures of
unidirectional_sp_harness(..., path_limit)), the unidirectional_asp/sp helper
overloads if any added, and the helper table-creation functions
create_unidirectional_shortest_path_tables() and
create_bidirectional_pathspace_tables() so that
create_unidirectional_shortest_path_tables,
create_bidirectional_pathspace_tables, and the new
unidirectional_sp_harness(...) signatures are removed on downgrade; ensure the
signatures in the DROP FUNCTION lines exactly match the signatures introduced in
schema_up for functions like unidirectional_sp_harness and the two
create_*_tables helpers.

In `@drivers/pg/query/sql/schema_up.sql`:
- Around line 1771-1779: The three early-return branches in
bidirectional_sp_harness (the forward-hit, backward-hit and midpoint-match exits
that currently "return query" from forward_front/backward_front/midpoint
results) ignore path_limit; modify each of those return paths to honor the
path_limit guard used elsewhere (apply the same conditional that truncates
results when path_limit is set — e.g., add the path_limit-based WHERE/ROW_NUMBER
or LIMIT predicate used in the other exit) so those branches cannot return more
rows than path_limit; update the return query for the forward_front (alias r),
backward_front and midpoint-match branches to include that same path_limit
check/pushdown logic (the same pattern applied at the other locations referenced
in the comment).

---

Nitpick comments:
In `@cypher/models/pgsql/translate/expansion.go`:
- Around line 722-723: The code ignores the error returned by
pgsql.As[pgsql.SelectItem] when converting satisfied (satisfiedSelectItem, _ :=
pgsql.As[pgsql.SelectItem](satisfied)); capture the second return value (err)
and handle it instead of discarding it: check err after calling
pgsql.As[pgsql.SelectItem](satisfied) and return or propagate a descriptive
error from the enclosing function (or fail fast) depending on the function's
error handling pattern. Apply the same change inside
backwardTerminalSatisfaction where pgsql.As[pgsql.SelectItem] is called so both
satisfiedSelectItem conversions are checked and errors are handled consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6bf4d424-c97d-49a9-9d5b-9e1697489ecd

📥 Commits

Reviewing files that changed from the base of the PR and between 1eace19 and 197c065.

📒 Files selected for processing (18)
  • .gitignore
  • cypher/models/pgsql/format/format.go
  • cypher/models/pgsql/format/format_test.go
  • cypher/models/pgsql/functions.go
  • cypher/models/pgsql/model.go
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/translation_cases/shortest_paths.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/constraints_test.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/pattern.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/selectivity.go
  • drivers/pg/query/sql/schema_down.sql
  • drivers/pg/query/sql/schema_up.sql

Comment thread cypher/models/pgsql/translate/model.go Outdated
Comment thread drivers/pg/query/sql/schema_down.sql
Comment thread drivers/pg/query/sql/schema_up.sql Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cypher/models/pgsql/translate/expansion.go (1)

769-840: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Bound shortest-path seeds can generate out-of-scope predicates.

Line 779 tries to “deframe” bound root predicates before pushing them into the seed CTE, but that helper only rewrites a narrow subset of expression nodes; Line 930 does not even do that for bound terminal predicates. These seed queries only range over traversal_root_filter/traversal_terminal_filter plus node, so any bound endpoint constraint that still references the previous frame or another bound symbol will format invalid SQL. Reuse the same locality split used in buildExpansionPatternRoot, and keep non-local terms out of the seed query unless you also join the previous frame into the seed.

Also applies to: 922-986

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/expansion.go` around lines 769 - 840, The seed
CTEs (created in prepareForwardFrontPrimerQuery via
newExpansionRootIDsParameterSeed/newExpansionNodeSeed when
traversalStep.LeftNodeBound is true and usesBoundRootIDs() is true) can receive
predicates that reference out-of-scope/bound symbols; instead of calling
deframeExpression() blindly, split primerNodeConstraints into local vs non-local
terms using the same locality-split logic used by buildExpansionPatternRoot,
pass only the local terms into the seed creation (rootIDs/node seed), and keep
non-local terms out of the seed (or join the previous frame into the seed if you
intentionally need them). Apply the same locality filtering to terminal
predicates before adding them to any seed or join (the code paths controlled by
expansionModel.TerminalNodeConstraints, UseMaterializedTerminalFilter and
UseMaterializedEndpointPairFilter), and ensure any place that uses
deframeExpression is replaced with this locality-aware split so seeds only
contain predicates that reference symbols present in the seed CTE.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@cypher/models/pgsql/translate/expansion.go`:
- Around line 769-840: The seed CTEs (created in prepareForwardFrontPrimerQuery
via newExpansionRootIDsParameterSeed/newExpansionNodeSeed when
traversalStep.LeftNodeBound is true and usesBoundRootIDs() is true) can receive
predicates that reference out-of-scope/bound symbols; instead of calling
deframeExpression() blindly, split primerNodeConstraints into local vs non-local
terms using the same locality-split logic used by buildExpansionPatternRoot,
pass only the local terms into the seed creation (rootIDs/node seed), and keep
non-local terms out of the seed (or join the previous frame into the seed if you
intentionally need them). Apply the same locality filtering to terminal
predicates before adding them to any seed or join (the code paths controlled by
expansionModel.TerminalNodeConstraints, UseMaterializedTerminalFilter and
UseMaterializedEndpointPairFilter), and ensure any place that uses
deframeExpression is replaced with this locality-aware split so seeds only
contain predicates that reference symbols present in the seed CTE.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b02fbfc5-bccb-432b-ac66-9ab31466e0a2

📥 Commits

Reviewing files that changed from the base of the PR and between 197c065 and c843ac2.

📒 Files selected for processing (18)
  • .gitignore
  • cypher/models/pgsql/format/format.go
  • cypher/models/pgsql/format/format_test.go
  • cypher/models/pgsql/functions.go
  • cypher/models/pgsql/model.go
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/translation_cases/shortest_paths.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/constraints_test.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/pattern.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/selectivity.go
  • drivers/pg/query/sql/schema_down.sql
  • drivers/pg/query/sql/schema_up.sql
✅ Files skipped from review due to trivial changes (3)
  • cypher/models/pgsql/functions.go
  • .gitignore
  • cypher/models/pgsql/translate/selectivity.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • cypher/models/pgsql/format/format_test.go
  • cypher/models/pgsql/translate/pattern.go
  • drivers/pg/query/sql/schema_down.sql
  • cypher/models/pgsql/translate/constraints_test.go
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/test/translation_cases/shortest_paths.sql

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
cypher/models/pgsql/translate/model.go (1)

313-323: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pair-aware anchoring still requires both endpoints to clear the threshold.

This branch now returns false as soon as the primer side is below threshold, and then still requires the terminal side to pass. In practice that means pair-aware bidirectional search only turns on when both sides are anchors, so one-sided-anchor shortest-path queries fall back to the slower unidirectional plan.

Suggested fix
-	if primerSelectivity, err := s.endpointSelectivity(scope, s.Expansion.PrimerNodeConstraints, s.LeftNodeBound); err != nil {
-		return false, err
-	} else if !isBidirectionalSearchAnchor(primerSelectivity) {
-		return false, nil
-	}
-
-	if terminalSelectivity, err := s.endpointSelectivity(scope, s.Expansion.TerminalNodeConstraints, s.RightNodeBound); err != nil {
-		return false, err
-	} else {
-		return isBidirectionalSearchAnchor(terminalSelectivity), nil
-	}
+	primerSelectivity, err := s.endpointSelectivity(scope, s.Expansion.PrimerNodeConstraints, s.LeftNodeBound)
+	if err != nil {
+		return false, err
+	}
+
+	terminalSelectivity, err := s.endpointSelectivity(scope, s.Expansion.TerminalNodeConstraints, s.RightNodeBound)
+	if err != nil {
+		return false, err
+	}
+
+	return isBidirectionalSearchAnchor(primerSelectivity) ||
+		isBidirectionalSearchAnchor(terminalSelectivity), nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/model.go` around lines 313 - 323, The current
branch short-circuits on the primer side and prevents one-sided anchors from
enabling pair-aware bidirectional search; instead, compute both endpoint
selectivities (call endpointSelectivity for s.Expansion.PrimerNodeConstraints
with s.LeftNodeBound and for s.Expansion.TerminalNodeConstraints with
s.RightNodeBound), handle errors for each, and then return the OR of
isBidirectionalSearchAnchor(primerSelectivity) ||
isBidirectionalSearchAnchor(terminalSelectivity) so that a single-side anchor
can enable pair-aware search.
🧹 Nitpick comments (2)
cypher/models/pgsql/translate/references_test.go (2)

51-57: 💤 Low value

Test name slightly misrepresents the scenario

TestCollectReferencedIdentifiersIncludesRepeatedMatchPatternDeclarations focuses on b being present while silently encoding the equally important assertion that a is not present (Line 54). The name suggests the function includes things from repeated MATCHes broadly, whereas the actual invariant is that cross-clause reference liveness is tracked correctly — unreferenced pattern variables like a are excluded. A name like TestCollectReferencedIdentifiersTracksInterClauseReferenceLiveness would better capture both sides.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/references_test.go` around lines 51 - 57,
Rename the test function
TestCollectReferencedIdentifiersIncludesRepeatedMatchPatternDeclarations to a
name that reflects the liveness semantics (e.g.,
TestCollectReferencedIdentifiersTracksInterClauseReferenceLiveness) and update
the function identifier accordingly so the test reads: referencedIdentifiers :=
referencedIdentifiersForQuery(...) and the same assertions remain; ensure any
test registrations or references to the old function name are updated to avoid
broken references.

13-25: 💤 Low value

referencedIdentifiersForQuery silently skips multi-part queries

The helper always accesses regularQuery.SingleQuery.SinglePartQuery without asserting that SinglePartQuery is non-nil. If a test accidentally passes a query that maps to a multi-part structure (e.g., containing WITH), collectReferencedIdentifiers would be called with nil, likely panicking or returning an opaque error rather than a clear test failure.

🛡️ Proposed fix
 regularQuery, err := frontend.ParseCypher(frontend.NewContext(), cypherQuery)
 require.NoError(t, err)
 require.NotNil(t, regularQuery.SingleQuery)
 require.NotNil(t, regularQuery.SingleQuery.SinglePartQuery)
+require.Nil(t, regularQuery.SingleQuery.MultiPartQuery, "helper only handles single-part queries; use a dedicated helper for multi-part queries")

 referencedIdentifiers, err := collectReferencedIdentifiers(regularQuery.SingleQuery.SinglePartQuery)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/references_test.go` around lines 13 - 25, The
helper referencedIdentifiersForQuery currently dereferences
regularQuery.SingleQuery.SinglePartQuery without asserting it's present; add a
require.NotNil(t, regularQuery.SingleQuery.SinglePartQuery, "expected
single-part query (received multi-part/WITH) before calling
collectReferencedIdentifiers") immediately after the existing require.NotNil for
SingleQuery so collectReferencedIdentifiers is never called with nil; this makes
the test fail fast and provides a clear message when a test accidentally
supplies a multi-part query.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cypher/models/pgsql/test/translation_cases/multipart.sql`:
- Around line 38-39: The final query has a cartesian product issue because it
includes s1 and s3 in the FROM clause without correlating them on s3.root_id and
(s1.n0).id, causing duplicate paths and inflated counts in the count(c) or
adminCount. To fix it, either remove s1 from the final FROM clause or add an
explicit join condition correlating s1 and s3 on (s1.n0).id = s3.root_id within
the relevant subquery that produces s2, ensuring each recursive path corresponds
correctly to its seed row.
- Around line 92-93: The translation loses OPTIONAL MATCH semantics because s1
builds from inner joins (edge e0 join node n0 join node n1), so nodes with no
matching e0 are filtered out; change the inner query that defines s1 to start
from the candidate node n0 and use LEFT JOIN for e0 (and LEFT JOIN for n1
conditioned on e0.start_id) so n0 rows are preserved when no edge exists, then
keep the aggregation in s0 (count(e0) will yield 0 for nulls) so s0.i0 = 0
correctly selects nodes with zero relationships; locate and update the joins in
the s1 definition around e0, n0, n1 to implement this left-join seed.

In `@cypher/models/pgsql/test/translation_cases/quantifiers.sql`:
- Around line 41-45: Remove the duplicated quantifier fixtures: delete the two
SQL case blocks that begin with "-- case: MATCH (m:NodeKind1) WHERE
m.unconstraineddelegation = true ..." and "-- case: MATCH (m:NodeKind1) WHERE
ANY(name in m.serviceprincipalnames WHERE name CONTAINS "PHANTOM") ..." so only
the original occurrences remain; ensure you remove the entire WITH ... SELECT
... statement for each duplicate (the s0/s2 CTE blocks and their final SELECT)
so the fixture loader no longer sees byte-for-byte duplicate cases.

In `@cypher/models/pgsql/translate/projection.go`:
- Around line 573-599: The helper directUnidirectionalShortestPathHarnessFrame
only recognizes pgsql.FunctionUnidirectionalSPHarness so endpoint-alias
discovery fails for bidirectional CTEs; update this routine
(directUnidirectionalShortestPathHarnessFrame) to also detect
pgsql.FunctionBidirectionalSPHarness (or the bidirectional harness enum/name
used in the project) when scanning selectBody.From.function calls, treat it the
same as FunctionUnidirectionalSPHarness for alias discovery, and keep the
existing duplicate-harness check and return semantics so
shortestPathLimitPushdownTransparentWhere can correctly push limits for
bidirectional shortest-path harnesses as well.

---

Duplicate comments:
In `@cypher/models/pgsql/translate/model.go`:
- Around line 313-323: The current branch short-circuits on the primer side and
prevents one-sided anchors from enabling pair-aware bidirectional search;
instead, compute both endpoint selectivities (call endpointSelectivity for
s.Expansion.PrimerNodeConstraints with s.LeftNodeBound and for
s.Expansion.TerminalNodeConstraints with s.RightNodeBound), handle errors for
each, and then return the OR of isBidirectionalSearchAnchor(primerSelectivity)
|| isBidirectionalSearchAnchor(terminalSelectivity) so that a single-side anchor
can enable pair-aware search.

---

Nitpick comments:
In `@cypher/models/pgsql/translate/references_test.go`:
- Around line 51-57: Rename the test function
TestCollectReferencedIdentifiersIncludesRepeatedMatchPatternDeclarations to a
name that reflects the liveness semantics (e.g.,
TestCollectReferencedIdentifiersTracksInterClauseReferenceLiveness) and update
the function identifier accordingly so the test reads: referencedIdentifiers :=
referencedIdentifiersForQuery(...) and the same assertions remain; ensure any
test registrations or references to the old function name are updated to avoid
broken references.
- Around line 13-25: The helper referencedIdentifiersForQuery currently
dereferences regularQuery.SingleQuery.SinglePartQuery without asserting it's
present; add a require.NotNil(t, regularQuery.SingleQuery.SinglePartQuery,
"expected single-part query (received multi-part/WITH) before calling
collectReferencedIdentifiers") immediately after the existing require.NotNil for
SingleQuery so collectReferencedIdentifiers is never called with nil; this makes
the test fail fast and provides a clear message when a test accidentally
supplies a multi-part query.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6e9d9113-8e78-40f4-a3b7-92f6e9740e6c

📥 Commits

Reviewing files that changed from the base of the PR and between c843ac2 and 4ab00d4.

📒 Files selected for processing (28)
  • .gitignore
  • cypher/models/pgsql/format/format.go
  • cypher/models/pgsql/format/format_test.go
  • cypher/models/pgsql/functions.go
  • cypher/models/pgsql/model.go
  • cypher/models/pgsql/test/translation_cases/delete.sql
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/nodes.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql
  • cypher/models/pgsql/test/translation_cases/shortest_paths.sql
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/update.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/constraints_test.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/limit_pushdown_test.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/pattern.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/query.go
  • cypher/models/pgsql/translate/references.go
  • cypher/models/pgsql/translate/references_test.go
  • cypher/models/pgsql/translate/selectivity.go
  • cypher/models/pgsql/translate/traversal.go
  • drivers/pg/query/sql/schema_down.sql
  • drivers/pg/query/sql/schema_up.sql
✅ Files skipped from review due to trivial changes (3)
  • cypher/models/pgsql/functions.go
  • cypher/models/pgsql/translate/selectivity.go
  • cypher/models/pgsql/format/format_test.go
🚧 Files skipped from review as they are similar to previous changes (8)
  • .gitignore
  • cypher/models/pgsql/translate/pattern.go
  • cypher/models/pgsql/model.go
  • cypher/models/pgsql/translate/constraints_test.go
  • drivers/pg/query/sql/schema_down.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/expansion.go
  • drivers/pg/query/sql/schema_up.sql

Comment thread cypher/models/pgsql/test/translation_cases/multipart.sql Outdated
Comment thread cypher/models/pgsql/test/translation_cases/multipart.sql Outdated
Comment thread cypher/models/pgsql/test/translation_cases/quantifiers.sql
Comment thread cypher/models/pgsql/translate/projection.go Outdated
@zinic
Copy link
Copy Markdown
Contributor Author

zinic commented May 5, 2026

@CodeRabbit full review, please.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 5, 2026

Tip

For best results, initiate chat on the files or code changes.

@zinic: Sure! Kicking off a full review of PR #70 now.

[full_review]

@zinic zinic force-pushed the main branch 2 times, most recently from 8461b20 to 2404e30 Compare May 5, 2026 18:06
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
cypher/models/pgsql/translate/expansion.go (1)

14-2172: ⚖️ Poor tradeoff

LGTM — refactor is cohesive and aligns with the regenerated fixtures.

The new abstractions hang together well:

  • expansionSeed + newExpansionNodeSeed/newExpansionBoundNodeSeed/newExpansionNodeFilterSeed cleanly centralize the four primer-source shapes (free, bound-from-prior-frame, parameter-filter, parameter-filter-with-constraints), with Materialized: false matching the not materialized seeds in the SQL fixtures.
  • recursiveExpansionEdgeLookupJoin / expansionNodeLookupJoin correctly emit JOIN LATERAL (... OFFSET 0) ... ON true, and the OFFSET 0 rationale is documented inline.
  • The split between local and external satisfaction (expansionTerminalSatisfactionLocality + DeferredNodeSatisfactionConstraint) keeps cross-frame predicates out of the recursive body and reapplies them at the projection — consistent with the new where s2.satisfied placement plus terminal predicates being applied post-aggregation in multipart.sql.
  • applyBoundEndpointProjectionConstraints correctly prepends the previous frame and ANDs the bound endpoint equalities only when at least one endpoint is bound, before the harness CTE join chain.
  • shortestPathVisitedPruningCondition and the forward/backward visited-table selection (expansionVisited vs expansionForwardVisited/expansionBackwardVisited) cleanly gate the new bidirectional harness path, and translateShortestPathTraversal wires UseBidirectionalSearch from CanExecutePairAwareBidirectionalSearch, matching the dispatch in cypher/models/pgsql/translate/pattern.go (snippets 1–2).

A couple of small observations you may want to consider as follow-ups (non-blocking):

  • buildBiDirectionalShortestPathsHarnessCall only sets UseMaterializedEndpointPairFilter; if canMaterializeEndpointPairFilter returns false, bidirectional mode won't fall back to UseMaterializedTerminalFilter even when canMaterializeTerminalFilter would allow it. That's a missed optimization rather than a correctness issue.
  • prepareForwardFrontPrimerQuery/prepareBackwardFrontPrimerQuery and shortestPathsParameters/bidirectionalAllShortestPathsParameters share substantial structure that could be parameterized by direction; deferrable.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/expansion.go` around lines 14 - 2172, No code
changes are required for correctness; the refactor is cohesive and LGTM, but as
optional follow-ups: (1) consider adding a fallback in
buildBiDirectionalShortestPathsHarnessCall to use canMaterializeTerminalFilter
when canMaterializeEndpointPairFilter returns false (look at
buildBiDirectionalShortestPathsHarnessCall, canMaterializeEndpointPairFilter,
canMaterializeTerminalFilter, and where
UseMaterializedEndpointPairFilter/UseMaterializedTerminalFilter are set), and
(2) factor duplicated logic between
prepareForwardFrontPrimerQuery/prepareBackwardFrontPrimerQuery and
shortestPathsParameters/bidirectionalAllShortestPathsParameters into a shared
helper to reduce repetition.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@cypher/models/pgsql/translate/expansion.go`:
- Around line 14-2172: No code changes are required for correctness; the
refactor is cohesive and LGTM, but as optional follow-ups: (1) consider adding a
fallback in buildBiDirectionalShortestPathsHarnessCall to use
canMaterializeTerminalFilter when canMaterializeEndpointPairFilter returns false
(look at buildBiDirectionalShortestPathsHarnessCall,
canMaterializeEndpointPairFilter, canMaterializeTerminalFilter, and where
UseMaterializedEndpointPairFilter/UseMaterializedTerminalFilter are set), and
(2) factor duplicated logic between
prepareForwardFrontPrimerQuery/prepareBackwardFrontPrimerQuery and
shortestPathsParameters/bidirectionalAllShortestPathsParameters into a shared
helper to reduce repetition.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 483409c1-e69c-4c15-8d1a-3c4261b39c00

📥 Commits

Reviewing files that changed from the base of the PR and between 4ab00d4 and 2404e30.

📒 Files selected for processing (29)
  • .gitignore
  • cypher/models/pgsql/format/format.go
  • cypher/models/pgsql/format/format_test.go
  • cypher/models/pgsql/functions.go
  • cypher/models/pgsql/model.go
  • cypher/models/pgsql/test/translation_cases/delete.sql
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/nodes.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql
  • cypher/models/pgsql/test/translation_cases/shortest_paths.sql
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/update.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/constraints_test.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/limit_pushdown_test.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/pattern.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/query.go
  • cypher/models/pgsql/translate/references.go
  • cypher/models/pgsql/translate/references_test.go
  • cypher/models/pgsql/translate/selectivity.go
  • cypher/models/pgsql/translate/traversal.go
  • drivers/pg/query/sql/schema_down.sql
  • drivers/pg/query/sql/schema_up.sql
  • integration/testdata/cases/multipart_inline.json
✅ Files skipped from review due to trivial changes (4)
  • cypher/models/pgsql/functions.go
  • cypher/models/pgsql/model.go
  • cypher/models/pgsql/format/format_test.go
  • cypher/models/pgsql/translate/traversal.go
🚧 Files skipped from review as they are similar to previous changes (13)
  • .gitignore
  • cypher/models/pgsql/translate/selectivity.go
  • cypher/models/pgsql/translate/limit_pushdown_test.go
  • cypher/models/pgsql/translate/references_test.go
  • cypher/models/pgsql/test/translation_cases/nodes.sql
  • cypher/models/pgsql/translate/constraints_test.go
  • cypher/models/pgsql/translate/references.go
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • drivers/pg/query/sql/schema_down.sql
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/test/translation_cases/update.sql
  • drivers/pg/query/sql/schema_up.sql

@zinic zinic force-pushed the main branch 2 times, most recently from e2eb55c to 98aac63 Compare May 5, 2026 18:54
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
cypher/models/pgsql/translate/model.go (1)

313-322: ⚠️ Potential issue | 🟠 Major

Pair-aware bidirectional gating still requires both endpoints to anchor.

This path only returns true when the primer side clears the anchor threshold and the terminal side clears it too. The pair-aware gate is supposed to open once either endpoint is selective enough, so anchored+broader endpoint pairs still fall back to the unidirectional harness here.

Suggested fix
-	if primerSelectivity, err := s.endpointSelectivity(scope, s.Expansion.PrimerNodeConstraints, s.LeftNodeBound); err != nil {
-		return false, err
-	} else if !isBidirectionalSearchAnchor(primerSelectivity) {
-		return false, nil
-	}
-
-	if terminalSelectivity, err := s.endpointSelectivity(scope, s.Expansion.TerminalNodeConstraints, s.RightNodeBound); err != nil {
-		return false, err
-	} else {
-		return isBidirectionalSearchAnchor(terminalSelectivity), nil
-	}
+	primerSelectivity, err := s.endpointSelectivity(scope, s.Expansion.PrimerNodeConstraints, s.LeftNodeBound)
+	if err != nil {
+		return false, err
+	}
+
+	terminalSelectivity, err := s.endpointSelectivity(scope, s.Expansion.TerminalNodeConstraints, s.RightNodeBound)
+	if err != nil {
+		return false, err
+	}
+
+	return isBidirectionalSearchAnchor(primerSelectivity) ||
+		isBidirectionalSearchAnchor(terminalSelectivity), nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/model.go` around lines 313 - 322, The logic in
the block using endpointSelectivity/PrimerNodeConstraints/LeftNodeBound and
endpointSelectivity/TerminalNodeConstraints/RightNodeBound incorrectly requires
both endpoints to be anchors; change it so that after computing
primerSelectivity you immediately return true,nil if
isBidirectionalSearchAnchor(primerSelectivity) is true, otherwise fall through
to compute terminalSelectivity and return
isBidirectionalSearchAnchor(terminalSelectivity),nil (preserving error
propagation on each endpointSelectivity call). Ensure you reference and update
the code paths around endpointSelectivity, s.Expansion.PrimerNodeConstraints,
s.LeftNodeBound, s.Expansion.TerminalNodeConstraints, s.RightNodeBound, and
isBidirectionalSearchAnchor.
🧹 Nitpick comments (6)
drivers/pg/query/sql/schema_down.sql (1)

19-19: 💤 Low value

Add empty parentheses to bare-name DROP FUNCTION statements for explicit clarity.

drop function if exists create_traversal_filter_tables; (line 19) targets the zero-argument overload specifically. While this works correctly and the three overloads with arguments are dropped explicitly on lines 20–22, adding empty parentheses makes the intent unambiguous and protects against future changes that might alter the overload set:

Suggested change
-drop function if exists create_traversal_filter_tables;
+drop function if exists create_traversal_filter_tables();

The same applies to create_bidirectional_shortest_path_tables and create_bidirectional_pair_pathspace_indexes on lines 49–50.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@drivers/pg/query/sql/schema_down.sql` at line 19, Change the bare-name DROP
FUNCTION statements to explicitly target the zero-argument overload by adding
empty parentheses: replace occurrences like "drop function if exists
create_traversal_filter_tables;" with "drop function if exists
create_traversal_filter_tables();", and do the same for
create_bidirectional_shortest_path_tables and
create_bidirectional_pair_pathspace_indexes so the DROP unambiguously refers to
the zero-argument overload.
cypher/models/pgsql/test/testcase.go (1)

208-208: ⚡ Quick win

Use the provided ctx instead of a fresh context.Background() in the live test path.

AssertLive accepts ctx context.Context and forwards it to driver.Run on line 213, but the new translate.Translate call here still uses context.Background(). In a live test the caller's context is the only way to enforce timeouts/cancellation around translation, and it should be honored consistently with the runtime call.

♻️ Proposed change
-		if translation, err := translate.Translate(context.Background(), regularQuery, driver.KindMapper(), s.CypherParams, defaultGraph.ID); err != nil {
+		if translation, err := translate.Translate(ctx, regularQuery, driver.KindMapper(), s.CypherParams, defaultGraph.ID); err != nil {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/test/testcase.go` at line 208, The translate.Translate
call is using context.Background() instead of the test's provided ctx; update
the call to pass the incoming ctx (the same context that AssertLive receives and
that is passed to driver.Run) so translation honors cancellation/timeouts.
Locate the translate.Translate invocation (it currently passes
context.Background(), regularQuery, driver.KindMapper(), s.CypherParams,
defaultGraph.ID) and replace the context.Background() argument with ctx.
cypher/models/pgsql/translate/limit_pushdown_test.go (1)

1-206: 💤 Low value

Solid coverage of the pushdown gating logic.

The helpers cleanly separate frame/CTE construction from the predicate shapes, and the test matrix exercises the main "allow" cases (forward & reversed endpoint inequality, bidirectional harness) plus two "block" cases (mixed where predicate, pre-filtered source CTE). The two pushDownShortestPathLimit* tests pin the user-visible contract (a single Int8-cast harness parameter equal to part.Limit) for both unidirectional and bidirectional harnesses.

One small gap worth considering as a follow-up: there's no negative-path coverage for the bidirectional harness equivalent of TestLimitPushdownTailSourceBlocksMixedShortestPathWherePredicate / TestLimitPushdownTailSourceBlocksFilteredShortestPathSource. If the bidirectional harness ever diverges in its gating heuristics, the unidirectional-only block tests won't catch it. Easy to add by parameterizing the existing block tests over both pgsql.FunctionUnidirectionalSPHarness and pgsql.FunctionBidirectionalSPHarness.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/limit_pushdown_test.go` around lines 1 - 206,
Duplicate/parameterize the two "block" tests
(TestLimitPushdownTailSourceBlocksMixedShortestPathWherePredicate and
TestLimitPushdownTailSourceBlocksFilteredShortestPathSource) to run for both
pgsql.FunctionUnidirectionalSPHarness and pgsql.FunctionBidirectionalSPHarness:
call limitPushdownTestPart with each harness, build the same tailSelects via
limitPushdownTestTail and limitPushdownTestEndpointInequality (or manipulated
source CTE), invoke limitPushdownTailSource, and require.False on canPushDown
for both harnesses; this ensures the bidirectional harness is covered without
changing existing helpers like limitPushdownTestPart, limitPushdownTestTail, or
limitPushdownTailSource.
cypher/models/pgsql/translate/unwind.go (1)

10-24: 💤 Low value

Minor: delete happens before DefineNew may fail.

prepareUnwindTarget removes variable from s.unwindTargets on line 15 unconditionally, but only later attempts s.scope.DefineNew(...) on line 17. If DefineNew fails, the translator already lost the "this is an unwind target" bit, so any retry/recovery path (or higher-level diagnostics that walks unwindTargets) will see the variable as already consumed. Safer to delete only after a successful DefineNew/Alias, e.g.:

♻️ Proposed change
 func (s *Translator) prepareUnwindTarget(variable *cypher.Variable) (*BoundIdentifier, bool, error) {
 	if _, isUnwindTarget := s.unwindTargets[variable]; !isUnwindTarget {
 		return nil, false, nil
 	}
 
-	delete(s.unwindTargets, variable)
-
 	binding, err := s.scope.DefineNew(pgsql.UnsetDataType)
 	if err != nil {
 		return nil, true, err
 	}
 
 	s.scope.Alias(pgsql.Identifier(variable.Symbol), binding)
+	delete(s.unwindTargets, variable)
 	return binding, true, nil
 }

In practice this is unlikely to matter (DefineNew rarely fails for a fresh untyped binding) — flagging for hardening only.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/unwind.go` around lines 10 - 24,
prepareUnwindTarget currently deletes the entry from s.unwindTargets before
calling s.scope.DefineNew and s.scope.Alias; if DefineNew fails the unwind
target is lost. Change the logic in prepareUnwindTarget so you only delete from
s.unwindTargets after DefineNew returns successfully and after s.scope.Alias is
called (i.e., move the delete(s.unwindTargets, variable) to after the binding is
created and aliased) and ensure you still return the correct (binding, true,
err) on error without mutating s.unwindTargets.
cypher/models/pgsql/translate/expansion_test.go (1)

94-140: 💤 Low value

Useful pruning coverage; consider tightening the string-Contains assertions.

The two tests pin the right invariant — seed-local predicates (lower(nX.id)::text = ...) stay inside the recursive primer while external predicates ((s0.x).id) get hoisted to the outer harness call. That's exactly the pruning gate this PR introduces.

A couple of follow-up hardenings worth considering:

  • require.NotContains(t, primerQuery, "s0") on line 108/133 is overbroad — any future identifier that incidentally contains "s0" (e.g. "s0_seed", a column literal, a generated suffix) would silently violate the intent without failing. The line 109/134 (s0.x).id check is the real assertion; line 108/133 could be removed or replaced with a more specific frame-reference regex.
  • require.Contains on the formatted SQL is fragile to whitespace/parenthesization tweaks in the formatter. If format.Statement ever normalizes spacing differently, these tests will break for unrelated reasons. A normalization helper (collapse whitespace, lowercase) before substring matching would make them more robust, similar to what testcase.go already does.

Neither blocks merge — just easy resilience wins.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/expansion_test.go` around lines 94 - 140, The
NotContains assertions that check primerQuery/backwardPrimerQuery for the
substring "s0" are overbroad and fragile; in the tests
TestBoundRootShortestPathPrimerKeepsOnlySeedLocalConstraints and
TestBoundTerminalShortestPathPrimerKeepsOnlySeedLocalConstraints replace/remove
those broad checks on primerQuery/backwardPrimerQuery (derived from
expansionModel.PrimerQueryParameter and
expansionModel.BackwardPrimerQueryParameter) and instead assert a more specific
frame reference (e.g. assert absence of "(s0." or use a word-boundary regex like
`\bs0\b`) and keep the existing explicit `(s0.x).id` check; also make the
formatted SQL assertions around format.Statement more robust by normalizing the
formattedQuery (collapse whitespace and lowercase) before doing require.Contains
on strings like "n0.id = (s0.x).id" and "(s0.n0).id = s1.root_id" (and the n1
variants) to avoid whitespace/parenthesization brittleness.
cypher/models/pgsql/translate/create.go (1)

416-430: 💤 Low value

Defensive check: kindIDs[0] can panic if AssertKinds returns an empty slice without error.

The length guards above (len(edgeCreate.Kinds) == 0 / > 1) constrain input, but kindIDs[0] still assumes the mapper returned a non-empty slice. If AssertKinds ever returns (nil, nil) for an unmapped kind, this will index out of range.

🛡️ Optional defensive guard
 	if kindIDs, err := s.kindMapper.AssertKinds(edgeCreate.Kinds); err != nil {
 		return nil, fmt.Errorf("failed to translate edge kind: %w", err)
+	} else if len(kindIDs) == 0 {
+		return nil, fmt.Errorf("kind mapper returned no IDs for edge kind: %s", edgeCreate.Kinds[0])
 	} else {
 		return pgsql.NewLiteral(kindIDs[0], pgsql.Int2), nil
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cypher/models/pgsql/translate/create.go` around lines 416 - 430, In
buildEdgeKindIDExpression, guard against AssertKinds returning an empty slice:
after calling s.kindMapper.AssertKinds(edgeCreate.Kinds) check err and then
verify len(kindIDs) > 0, returning a descriptive error (e.g., "no kind IDs
returned for kinds: %v") if empty; only then use kindIDs[0] when constructing
pgsql.NewLiteral. This references the Translator.buildEdgeKindIDExpression
function, the AssertKinds call, the kindIDs slice, and pgsql.NewLiteral.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cypher/models/pgsql/test/translation_cases/update.sql`:
- Around line 50-54: The test file contains a duplicated test case for "match
(a)-[r]->(:NodeKind1) set a.name = '123', r.is_special_outbound = true" (the
identical WITH...SELECT SQL block appears twice); remove the duplicate block or
replace one with the intended unique case, ensuring only one occurrence of the
SQL block that selects/updates node n0 and edge e0 (the WITH s0 ... s1 ... s2
... select 1; sequence) remains, and run tests to confirm no other duplicates
exist.

In `@cypher/models/pgsql/translate/projection.go`:
- Around line 697-699: The current check in shortestPathEndpointAliases
(projection.go) treats any selectBody.Where as opaque and returns false,
preventing shortestPathLimitPushdownTransparentWhere from recognizing CTEs that
only contain the injected bound-endpoint and seed predicates; update
shortestPathEndpointAliases to allow a non-nil selectBody.Where when the WHERE
clause consists solely of the known injected predicate shapes (the
bound-endpoint and seed predicates inserted by the shortest-path builders) by
inspecting the WHERE expression(s) and returning true if every conjunct matches
those predicate patterns, otherwise keep returning false; reference
selectBody.Where, shortestPathEndpointAliases, and
shortestPathLimitPushdownTransparentWhere when locating and implementing this
conditional whitelist logic.

In `@drivers/pg/query/sql/schema_up.sql`:
- Around line 629-631: In swap_forward_front and swap_backward_front the
dead-end pruning uses the wrong edge direction: update the NOT EXISTS predicate
so swap_forward_front deletes rows only when no outgoing edges exist by changing
the inner condition to e.start_id = r.next_id, and update swap_backward_front to
check for incoming edges by changing its inner condition to e.end_id =
r.next_id; keep the rest of the delete predicates (is_cycle and satisfied
checks) unchanged.

---

Duplicate comments:
In `@cypher/models/pgsql/translate/model.go`:
- Around line 313-322: The logic in the block using
endpointSelectivity/PrimerNodeConstraints/LeftNodeBound and
endpointSelectivity/TerminalNodeConstraints/RightNodeBound incorrectly requires
both endpoints to be anchors; change it so that after computing
primerSelectivity you immediately return true,nil if
isBidirectionalSearchAnchor(primerSelectivity) is true, otherwise fall through
to compute terminalSelectivity and return
isBidirectionalSearchAnchor(terminalSelectivity),nil (preserving error
propagation on each endpointSelectivity call). Ensure you reference and update
the code paths around endpointSelectivity, s.Expansion.PrimerNodeConstraints,
s.LeftNodeBound, s.Expansion.TerminalNodeConstraints, s.RightNodeBound, and
isBidirectionalSearchAnchor.

---

Nitpick comments:
In `@cypher/models/pgsql/test/testcase.go`:
- Line 208: The translate.Translate call is using context.Background() instead
of the test's provided ctx; update the call to pass the incoming ctx (the same
context that AssertLive receives and that is passed to driver.Run) so
translation honors cancellation/timeouts. Locate the translate.Translate
invocation (it currently passes context.Background(), regularQuery,
driver.KindMapper(), s.CypherParams, defaultGraph.ID) and replace the
context.Background() argument with ctx.

In `@cypher/models/pgsql/translate/create.go`:
- Around line 416-430: In buildEdgeKindIDExpression, guard against AssertKinds
returning an empty slice: after calling
s.kindMapper.AssertKinds(edgeCreate.Kinds) check err and then verify
len(kindIDs) > 0, returning a descriptive error (e.g., "no kind IDs returned for
kinds: %v") if empty; only then use kindIDs[0] when constructing
pgsql.NewLiteral. This references the Translator.buildEdgeKindIDExpression
function, the AssertKinds call, the kindIDs slice, and pgsql.NewLiteral.

In `@cypher/models/pgsql/translate/expansion_test.go`:
- Around line 94-140: The NotContains assertions that check
primerQuery/backwardPrimerQuery for the substring "s0" are overbroad and
fragile; in the tests
TestBoundRootShortestPathPrimerKeepsOnlySeedLocalConstraints and
TestBoundTerminalShortestPathPrimerKeepsOnlySeedLocalConstraints replace/remove
those broad checks on primerQuery/backwardPrimerQuery (derived from
expansionModel.PrimerQueryParameter and
expansionModel.BackwardPrimerQueryParameter) and instead assert a more specific
frame reference (e.g. assert absence of "(s0." or use a word-boundary regex like
`\bs0\b`) and keep the existing explicit `(s0.x).id` check; also make the
formatted SQL assertions around format.Statement more robust by normalizing the
formattedQuery (collapse whitespace and lowercase) before doing require.Contains
on strings like "n0.id = (s0.x).id" and "(s0.n0).id = s1.root_id" (and the n1
variants) to avoid whitespace/parenthesization brittleness.

In `@cypher/models/pgsql/translate/limit_pushdown_test.go`:
- Around line 1-206: Duplicate/parameterize the two "block" tests
(TestLimitPushdownTailSourceBlocksMixedShortestPathWherePredicate and
TestLimitPushdownTailSourceBlocksFilteredShortestPathSource) to run for both
pgsql.FunctionUnidirectionalSPHarness and pgsql.FunctionBidirectionalSPHarness:
call limitPushdownTestPart with each harness, build the same tailSelects via
limitPushdownTestTail and limitPushdownTestEndpointInequality (or manipulated
source CTE), invoke limitPushdownTailSource, and require.False on canPushDown
for both harnesses; this ensures the bidirectional harness is covered without
changing existing helpers like limitPushdownTestPart, limitPushdownTestTail, or
limitPushdownTailSource.

In `@cypher/models/pgsql/translate/unwind.go`:
- Around line 10-24: prepareUnwindTarget currently deletes the entry from
s.unwindTargets before calling s.scope.DefineNew and s.scope.Alias; if DefineNew
fails the unwind target is lost. Change the logic in prepareUnwindTarget so you
only delete from s.unwindTargets after DefineNew returns successfully and after
s.scope.Alias is called (i.e., move the delete(s.unwindTargets, variable) to
after the binding is created and aliased) and ensure you still return the
correct (binding, true, err) on error without mutating s.unwindTargets.

In `@drivers/pg/query/sql/schema_down.sql`:
- Line 19: Change the bare-name DROP FUNCTION statements to explicitly target
the zero-argument overload by adding empty parentheses: replace occurrences like
"drop function if exists create_traversal_filter_tables;" with "drop function if
exists create_traversal_filter_tables();", and do the same for
create_bidirectional_shortest_path_tables and
create_bidirectional_pair_pathspace_indexes so the DROP unambiguously refers to
the zero-argument overload.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0a7d75d9-17bb-4a02-8d5e-ad8f84afb072

📥 Commits

Reviewing files that changed from the base of the PR and between 2404e30 and 3fab9d8.

📒 Files selected for processing (44)
  • .gitignore
  • cypher/models/pgsql/format/format.go
  • cypher/models/pgsql/format/format_test.go
  • cypher/models/pgsql/functions.go
  • cypher/models/pgsql/model.go
  • cypher/models/pgsql/test/testcase.go
  • cypher/models/pgsql/test/translation_cases/create.sql
  • cypher/models/pgsql/test/translation_cases/delete.sql
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/nodes.sql
  • cypher/models/pgsql/test/translation_cases/pattern_binding.sql
  • cypher/models/pgsql/test/translation_cases/pattern_expansion.sql
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql
  • cypher/models/pgsql/test/translation_cases/shortest_paths.sql
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/unwind.sql
  • cypher/models/pgsql/test/translation_cases/update.sql
  • cypher/models/pgsql/test/validation_integration_test.go
  • cypher/models/pgsql/translate/constraints_test.go
  • cypher/models/pgsql/translate/create.go
  • cypher/models/pgsql/translate/expansion.go
  • cypher/models/pgsql/translate/expansion_test.go
  • cypher/models/pgsql/translate/hinting.go
  • cypher/models/pgsql/translate/limit_pushdown_test.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/node.go
  • cypher/models/pgsql/translate/pattern.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/query.go
  • cypher/models/pgsql/translate/references.go
  • cypher/models/pgsql/translate/references_test.go
  • cypher/models/pgsql/translate/relationship.go
  • cypher/models/pgsql/translate/selectivity.go
  • cypher/models/pgsql/translate/tracking.go
  • cypher/models/pgsql/translate/translator.go
  • cypher/models/pgsql/translate/traversal.go
  • cypher/models/pgsql/translate/unwind.go
  • cypher/models/pgsql/translate/with.go
  • drivers/pg/query/sql/schema_down.sql
  • drivers/pg/query/sql/schema_up.sql
  • drivers/pg/transaction.go
  • integration/testdata/cases/create_inline.json
  • integration/testdata/cases/multipart_inline.json
  • integration/testdata/cases/unwind_inline.json
✅ Files skipped from review due to trivial changes (1)
  • integration/testdata/cases/unwind_inline.json
🚧 Files skipped from review as they are similar to previous changes (3)
  • cypher/models/pgsql/translate/references.go
  • cypher/models/pgsql/translate/constraints_test.go
  • cypher/models/pgsql/test/translation_cases/quantifiers.sql

Comment thread cypher/models/pgsql/test/translation_cases/update.sql
Comment thread cypher/models/pgsql/translate/expansion.go
Comment thread cypher/models/pgsql/translate/projection.go Outdated
Comment thread drivers/pg/query/sql/schema_up.sql Outdated
@zinic zinic force-pushed the main branch 5 times, most recently from 2856185 to c92ade8 Compare May 6, 2026 05:36
@zinic
Copy link
Copy Markdown
Contributor Author

zinic commented May 6, 2026

@CodeRabbit blast me - review for correctness

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 6, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the Join.Table type change and all callers
rg -n "Join{" --type=go -A5 -B2 | head -120

Length of output: 74


🏁 Script executed:

#!/bin/bash
# Check Translate() variadic graphIDs usage - verify caller safety
ast-grep --pattern 'translate.Translate($$$)' | head -60

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check expansion.go Build signature change - variadic commonTableExpressions
ast-grep --pattern 'func (s *ExpansionBuilder) Build($$$)' | head -30

Length of output: 2112


🏁 Script executed:

#!/bin/bash
# Look at the bidirectional SP harness wiring and path_limit logic
grep -n "path_limit\|pathLimit\|PathLimit" drivers/pg/query/sql/schema_up.sql | head -40

Length of output: 2853


🏁 Script executed:

#!/bin/bash
# Check for potential nil dereferences in create.go
grep -n "\.Binding\.\|\.Frame\.\|\.Expansion\." cypher/models/pgsql/translate/create.go | head -60

Length of output: 990


🏁 Script executed:

#!/bin/bash
# Check unwind.go for correctness issues
cat cypher/models/pgsql/translate/unwind.go

Length of output: 2157


🏁 Script executed:

#!/bin/bash
# Check references.go for correctness in liveness analysis
cat cypher/models/pgsql/translate/references.go

Length of output: 5336


🏁 Script executed:

#!/bin/bash
# Check the Join struct change in model.go
grep -n "Join\|LateralSubquery" cypher/models/pgsql/model.go | head -40

Length of output: 664


🏁 Script executed:

#!/bin/bash
# Check schema_up.sql for correctness in bidirectional SP harness
grep -n "bidirectional_sp_harness\|_bidirectional_sp_harness\|use_array_parameters" drivers/pg/query/sql/schema_up.sql | head -60

Length of output: 4339


🏁 Script executed:

#!/bin/bash
# Check limit pushdown correctness
cat cypher/models/pgsql/translate/limit_pushdown_test.go

Length of output: 10921


🏁 Script executed:

#!/bin/bash
# Check expansion.go for UseUnionAll correctness and UNION ALL seeding
grep -n "UseUnionAll\|UNION ALL\|UnionAll" cypher/models/pgsql/translate/expansion.go | head -40

Length of output: 223


🏁 Script executed:

#!/bin/bash
# Check for potential issues in selectivity gating
cat cypher/models/pgsql/translate/selectivity.go | head -80

Length of output: 4014


🏁 Script executed:

#!/bin/bash
# Check schema_down.sql for completeness - all added functions should be dropped
grep "drop function" drivers/pg/query/sql/schema_down.sql | head -60

Length of output: 4136


🏁 Script executed:

#!/bin/bash
# Check references.go matchPatternDeclarationDepth logic for correctness
grep -n "matchPatternDeclaration\|depth\|Depth" cypher/models/pgsql/translate/references.go | head -40

Length of output: 744


🏁 Script executed:

#!/bin/bash
# Check for potential issues in projection.go - buildInternalProjection empty projection
grep -n "buildInternalProjection\|LastProjection\|directNode\|directEdge" cypher/models/pgsql/translate/projection.go | head -60

Length of output: 1226


🏁 Script executed:

#!/bin/bash
# Check format.go for how Join.Table (now Expression) is formatted - need type switch
grep -n "Join\|Table\|LateralSubquery\|formatNode\|formatJoin" cypher/models/pgsql/format/format.go | head -60

Length of output: 3673


🏁 Script executed:

#!/bin/bash
# Check the full formatNode/formatJoin implementation around Join.Table
cat cypher/models/pgsql/format/format.go | grep -A 30 "case pgsql.Join"

Length of output: 865


🏁 Script executed:

#!/bin/bash
# Check the references.go Variable case - does it get called for variables nested inside NodePattern/RelationshipPattern?
# We need to understand the walk.Cypher traversal order
grep -n "Variable\|addVariable\|addMatch" cypher/models/pgsql/translate/references.go

Length of output: 525


🏁 Script executed:

#!/bin/bash
# Check translateUnwind pop order vs the Cypher UNWIND tree structure
grep -n "PopOperand\|PushOperand\|unwind\|Unwind" cypher/models/pgsql/translate/translator.go | head -40

Length of output: 1334


🏁 Script executed:

#!/bin/bash
# Look at the full _bidirectional_sp_harness for correctness around pair_filter and path_limit interaction
sed -n '1626,1970p' drivers/pg/query/sql/schema_up.sql

Length of output: 14481


🏁 Script executed:

#!/bin/bash
# Check schema_down.sql for missing drops - compare new overloads in schema_up
grep -n "create or replace function public\." drivers/pg/query/sql/schema_up.sql | grep -v "^--" | head -80

Length of output: 4684


🏁 Script executed:

#!/bin/bash
# Check expansion.go BuildBiDirectionalShortestPathsRoot for correctness
grep -n "BuildBiDirectional\|BuildShortestPaths\|BuildAllShortest" cypher/models/pgsql/translate/expansion.go | head -20

Length of output: 352


🏁 Script executed:

#!/bin/bash
# Find BuildBiDirectionalShortestPathsRoot and read its implementation
sed -n '$(grep -n "func.*BuildBiDirectionalShortestPathsRoot" cypher/models/pgsql/translate/expansion.go | head -1 | cut -d: -f1)p' cypher/models/pgsql/translate/expansion.go

Length of output: 52


🏁 Script executed:

#!/bin/bash
# Find and display BuildBiDirectionalShortestPathsRoot
grep -n "func.*BuildBiDirectional\|func.*BuildShortest\|func.*BuildAll" cypher/models/pgsql/translate/expansion.go

Length of output: 352


🏁 Script executed:

#!/bin/bash
# Check format.go LateralSubquery formatting
grep -n "LateralSubquery\|lateral\|Lateral" cypher/models/pgsql/format/format.go | head -30

Length of output: 160


🏁 Script executed:

#!/bin/bash
# Check the Join struct and how Table field is formatted now
ast-grep --pattern 'case pgsql.Join:
  $$$' | head -50

Length of output: 387


🏁 Script executed:

#!/bin/bash
# Check expansion.go around line 1743 for UseUnionAll logic correctness
sed -n '1730,1780p' cypher/models/pgsql/translate/expansion.go

Length of output: 1777


🏁 Script executed:

#!/bin/bash
# Check if references.go correctly handles Variable nodes nested inside NodePattern/RelationshipPattern 
# by looking at the walk.Cypher AST structure
grep -rn "func.*Walk\|func.*Cypher\b\|VisitChildren\|Children" cypher/models/walk/ | head -40

Length of output: 418


🏁 Script executed:

#!/bin/bash
# Check for missing schema_down drops - specifically the new unidirectional_sp_harness overloads with path_limit
# schema_up adds: unidirectional_sp_harness with (text, text, int4, root_ids int8[], path_limit int8) 
# and (text, text, int4, path_limit int8)
grep "unidirectional_sp_harness" drivers/pg/query/sql/schema_down.sql

Length of output: 648


🏁 Script executed:

#!/bin/bash
# Check the swap_forward_front and swap_backward_front changes for correctness
grep -n "swap_forward_front\|swap_backward_front\|delete from\|DELETE FROM" drivers/pg/query/sql/schema_up.sql | head -40

Length of output: 3393


🏁 Script executed:

#!/bin/bash
# Check walk_cypher.go NodePattern children - does it visit Variable?
grep -n "NodePattern\|RelationshipPattern\|Variable" cypher/models/walk/walk_cypher.go | head -40

Length of output: 177


🏁 Script executed:

#!/bin/bash
# Check walk_pgsql.go Join case - confirm Table is not in Branches
sed -n '250,270p' cypher/models/walk/walk_pgsql.go

Length of output: 584


🏁 Script executed:

#!/bin/bash
# Check if limit pushdown uses AST walking to find function calls
grep -n "walk\|Walk\|findCTE\|FunctionCall\|harness" cypher/models/pgsql/translate/projection.go | head -40

Length of output: 2518


🏁 Script executed:

#!/bin/bash
# Check the full LateralSubquery case in format.go including binding handling
sed -n '280,300p' cypher/models/pgsql/format/format.go

Length of output: 678


🏁 Script executed:

#!/bin/bash
# Check unidirectional SP path_limit correctness in non-pair-filter mode  
sed -n '668,770p' drivers/pg/query/sql/schema_up.sql

Length of output: 3883


🏁 Script executed:

#!/bin/bash
# Check how bidirectional_sp_harness path_limit applies in array mode
sed -n '1971,2015p' drivers/pg/query/sql/schema_up.sql

Length of output: 1964


🏁 Script executed:

#!/bin/bash
# Check the expansion.go for how it wires path_limit into harness parameters
grep -n "path_limit\|PathLimit\|limitParam\|Int8\|int8" cypher/models/pgsql/translate/expansion.go | grep -i "limit\|path" | head -30

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check CanExecutePairAwareBidirectionalSearch in model.go for correctness
grep -n "CanExecutePairAware\|CanExecuteSelective\|UseBidirectional" cypher/models/pgsql/translate/model.go | head -20

Length of output: 329


🏁 Script executed:

#!/bin/bash
# Verify how walk_pgsql handles Join.Table - if it's missing could affect countLimitPushdown
grep -n "countLimitPushdown\|isLimitPushdown\|appendLimit\|pushDown" cypher/models/pgsql/translate/projection.go | head -20

Length of output: 1069


🏁 Script executed:

#!/bin/bash
# Check if bidirectional_asp_harness drops in schema_down cover all new overloads
grep "bidirectional_asp_harness" drivers/pg/query/sql/schema_down.sql

Length of output: 459


🏁 Script executed:

#!/bin/bash
# Check BuildBiDirectionalShortestPathsRoot and BuildBiDirectionalAllShortestPathsRoot implementations
sed -n '1515,1690p' cypher/models/pgsql/translate/expansion.go

Length of output: 6692


🏁 Script executed:

#!/bin/bash
# Check the resolved_pairs PRIMARY KEY constraint - root_id,next_id could collide for multiple paths
grep -n "resolved_pairs\|primary key\|on conflict" drivers/pg/query/sql/schema_up.sql | head -30

Length of output: 1909


🏁 Script executed:

#!/bin/bash
# Check expansion.go newExpansionSeed types and constructors
sed -n '1,120p' cypher/models/pgsql/translate/expansion.go

Length of output: 3514


@zinic: Here is my correctness-focused review of PR #70. The architectural changes are well-structured overall, but I found three concrete correctness issues ranging from medium to low severity.


Correctness Review

🔴 Issue 1 — walk_pgsql.go: Join.Table missing from walker Branches (Medium)

File: cypher/models/walk/walk_pgsql.go, lines 255–259

After Join.Table was promoted from TableReference to Expression (to accommodate LateralSubquery), the SQL AST walker was not updated to include it:

case pgsql.Join:
    return &Cursor[pgsql.SyntaxNode]{
        Node:     node,
        Branches: []pgsql.SyntaxNode{typedNode.JoinOperator}, // ← Join.Table omitted
    }, nil

Any code using walk.PgSQL to traverse the SQL AST will silently skip everything inside Join.Table — including inner queries of LateralSubquery joins. The present aggregate-detection call in projection.go:592 doesn't currently suffer because harnesses live in CTEs, not joins. However this is a latent correctness gap: the walk silently produces an incomplete view of the AST for any future analysis that needs to inspect lateral join contents (e.g., reference collection, expression rewriting, or pushdown safety checks).

Fix:

case pgsql.Join:
    return &Cursor[pgsql.SyntaxNode]{
        Node:     node,
        Branches: []pgsql.SyntaxNode{typedNode.Table, typedNode.JoinOperator},
    }, nil

🔴 Issue 2 — schema_up.sql: path_limit silently ignored in _bidirectional_sp_harness when use_array_parameters = true (Medium)

File: drivers/pg/query/sql/schema_up.sql, _bidirectional_sp_harness

The path_limit loop guard is:

(path_limit <= 0 or resolved_pairs_count < path_limit)

resolved_pairs_count is only incremented inside if use_pair_filter then blocks. When use_array_parameters = true, use_pair_filter is false (it is defined as not use_array_parameters and length(pair_filter) > 0), so resolved_pairs_count stays at 0 for the entire run. The guard is therefore always satisfied, and path_limit has no effect on the BFS loop termination.

The final limit path_limit at the end of the function body is also inside the if use_pair_filter block:

if use_pair_filter then
    if path_limit > 0 then
        return query select * from resolved_pairs ... limit path_limit;
    else
        ...
    end if;
end if;

So when callers use bidirectional_sp_harness(root_ids int8[], terminal_ids int8[], path_limit int8) (which delegates to _bidirectional_sp_harness(..., use_array_parameters := true)), the result is unbounded regardless of path_limit. The non-pair-filter return paths (return query ... exit;) have no limit applied.

Suggested fix: Apply path_limit to the return query ... exit; paths in the non-pair-filter branch, for example:

-- In the forward satisfied-path return (non-pair-filter):
return query select ... from forward_front r where r.satisfied ...
             limit case when path_limit > 0 then path_limit else null end;

or accumulate non-pair-filter results into resolved_pairs and apply the unified final limit.


🟡 Issue 3 — schema_down.sql: Missing DROP for bidirectional_sp_harness(text, text, text, text, int4, text, text) (Low)

File: drivers/pg/query/sql/schema_down.sql

schema_up.sql creates the two-text-filter, no-path-limit wrapper overload (the 0::int8 default delegate):

create or replace function public.bidirectional_sp_harness(
    forward_primer text, forward_recursive text,
    backward_primer text, backward_recursive text,
    max_depth int4,
    root_filter text, terminal_filter text)   -- ← no path_limit param

schema_down.sql drops the (text, text, text, text, int4, text, text, int8) variant (with path_limit) but has no corresponding:

drop function if exists bidirectional_sp_harness(text, text, text, text, int4, text, text);

Rolling back the migration will leave this function dangling.


✅ Confirmed Correct

  • format.go LateralSubquery binding order: The exprStack is LIFO; the rendering order resolves to lateral (QUERY) alias, which matches the test expectation ... join lateral (...) e on true. ✓
  • UseUnionAll gating: true when no external primer constraints (seed rows already distinct per root) or for bound-node seeds; false when an external CTE cross-join is needed (requiring deduplication). Logic is sound. ✓
  • references.go *cypher.Variable case: Since walk_cypher.go does not include NodePattern.Variable or RelationshipPattern.Variable as separate branch nodes (as verified by the passing TestCollectReferencedIdentifiersIgnoresPatternDeclarations), the *cypher.Variable Enter case fires only for standalone uses outside MATCH declarations. The conservative liveness analysis is correct. ✓
  • resolved_pairs PK (root_id, next_id): Intentionally allows only one (shortest) path per pair in SP mode; on conflict do nothing silently keeps the first/shallowest. Semantically correct for shortest-path (not all-shortest-paths). ✓
  • boundEndpointFilterParameters: When hasPairFilter is true, rootFilter/terminalFilter remain empty strings and pairFilter carries the full constraint — correct because the pair filter subsumes individual root/terminal filters. ✓

@zinic zinic force-pushed the main branch 2 times, most recently from 464418d to a18cd0e Compare May 6, 2026 06:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant