Hotfix: apply LIMIT before CALL subqueries in dataset/transgene queries by Robbie1977 · Pull Request #44 · VirtualFlyBrain/VFBquery

Robbie1977 · 2026-05-29T20:50:33Z

Hotfix for the perf-test failure on main after PR #43 merge: https://github.com/VirtualFlyBrain/VFBquery/actions/runs/26659171834/job/78577045325

PR #43 added CALL subqueries to get_aligned_datasets, get_all_datasets, and get_transgene_expression_here, but LIMIT was appended at the END of the constructed query. Cypher applies LIMIT after the CALL subqueries fire, so every candidate ds/ep gets enriched through 4 (or 2) CALL subqueries before being trimmed.

For AlignedDatasets that meant 86 datasets × 4 subqueries (one of which is count(DISTINCT img) over has_source edges). For AllDatasets, 130 datasets. For TransgeneExpressionHere on mushroom body, 2,340 EPs with a 5-hop thumbnail join.
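To put rough numbers on that, here is a back-of-the-envelope sketch of how many CALL subquery executions each query triggers with LIMIT at the end versus LIMIT before the CALLs. The row counts and limits are the ones quoted in this PR; the per-row subquery counts for AllDatasets (4) and TransgeneExpressionHere (2) are assumptions read off the "4 (or 2)" wording, and `subquery_invocations` is a hypothetical helper, not part of VFBquery.

```python
def subquery_invocations(rows, subqueries_per_row, limit=None):
    """Count CALL executions when `limit` (if any) is applied before the CALLs."""
    kept = rows if limit is None else min(rows, limit)
    return kept * subqueries_per_row


# (candidate rows, subqueries per row, LIMIT used in the dry-run)
# Subquery counts for the last two entries are assumptions.
CASES = {
    "AlignedDatasets": (86, 4, 10),
    "AllDatasets": (130, 4, 20),
    "TransgeneExpressionHere": (2340, 2, 10),
}

for name, (rows, subs, limit) in CASES.items():
    before = subquery_invocations(rows, subs)         # LIMIT at the end: every row enriched
    after = subquery_invocations(rows, subs, limit)   # LIMIT first: only kept rows enriched
    print(f"{name}: {before} -> {after} subquery executions")
```

Under those assumptions AlignedDatasets drops from 344 to 40 subquery executions, which is in line with the timings reported in the dry-run.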

The fix moves LIMIT to directly after WITH DISTINCT, ahead of the CALL subqueries, so only the kept rows are enriched. It also drops the ORDER BY name from _dataset_return_clause and moves it next to LIMIT in each caller (the query can't have two ORDER BYs).
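The reordering can be sketched as a small query builder. This is an illustration of the clause order only, not the actual VFBquery code: `build_dataset_query`, the `DataSet` label, the `ds.label` property, and the direction of the `has_source` pattern are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical enrichment subquery; the node/edge pattern is illustrative only.
IMAGE_COUNT_CALL = (
    "CALL {\n"
    "  WITH ds\n"
    "  OPTIONAL MATCH (ds)<-[:has_source]-(img)\n"
    "  RETURN count(DISTINCT img) AS image_count\n"
    "}"
)


def build_dataset_query(match_clause: str, limit: Optional[int] = None) -> str:
    """Assemble a Cypher string that trims rows before enriching them."""
    parts = [
        match_clause,
        "WITH DISTINCT ds",
        "ORDER BY ds.label",  # ordering sits next to LIMIT, not in the RETURN clause
    ]
    if limit is not None:
        parts.append(f"LIMIT {int(limit)}")  # placed ahead of the CALL subquery
    parts.append(IMAGE_COUNT_CALL)
    parts.append("RETURN ds.label AS name, image_count")
    return "\n".join(parts)


query = build_dataset_query("MATCH (ds:DataSet)", limit=10)
print(query)
```

Because LIMIT appears before the CALL block in the generated text, the subquery only runs for the rows that survive the trim, and the single ORDER BY lives in one place per caller.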

Dry-run against pdb.v4 (public read-only):

AlignedDatasets LIMIT 10 — 1.64 s (was timing out before fix)
AllDatasets LIMIT 20 — 1.10 s
TransgeneExpressionHere LIMIT 10 on mushroom body — 0.51 s

All comfortably under their thresholds (3 s, 3 s, 15 s).

PR #43 broke THRESHOLD_MEDIUM (3 s) on AlignedDatasets / AllDatasets and THRESHOLD_SLOW (15 s) on TransgeneExpressionHere because LIMIT was appended at the end of each Cypher and applied AFTER the four (or two) CALL subqueries fired for every candidate ds/ep. One of the dataset subqueries does count(DISTINCT img) across has_source edges; the transgene one traverses a 5-hop image join inside the CALL.

Move LIMIT after `WITH DISTINCT ds` (or `WITH DISTINCT ep`) and before the CALL subqueries so only the rows we keep get enriched. Drop the ORDER BY from `_dataset_return_clause` and move it next to LIMIT in each caller, since you can't ORDER BY twice in the same query.

Dry-run against pdb.v4 (public read-only):

AlignedDatasets LIMIT 10 -> 1.64 s
AllDatasets LIMIT 20 -> 1.10 s
TransgeneExpr... LIMIT 10 -> 0.51 s

All under their respective thresholds.

Robbie1977 merged commit f8d8618 into main May 29, 2026
5 of 6 checks passed

Robbie1977 merged 1 commit into main from fix/perf-limit-before-call
