C++ search by ms609 · Pull Request #210 · ms609/TreeSearch

ms609 · 2026-03-19T18:56:59Z

other optimizations + features

Manual testing underway; shiny app in particular has some usability issues.

- Add pipeline step 5a to driven search list (post-ratchet XSS+RSS+CSS) - Add 'Post-ratchet sectorial pass' subsection in sectorial search section - Fix stale consensusStableReps preset documentation (T-264 disabled it) - Update AGENTS.md pipeline with steps 5a, 6 (NNI-perturb), 8 (PCSA), 10-11

…einsert

…cid-cons conflict

…(S-RED focus 10) When divided_steps puts s > info_max_steps, compute_profile() caps the cost at info_amounts[info_max_steps-1], but precompute_profile_delta used old_cost = 0.0 for any s outside [1, info_max_steps]. This overestimated the delta, causing overly conservative candidate rejection in profile TBR screening. Fix: mirror the capping logic from compute_profile(). Conservative bug: no incorrect trees accepted, only missed improvements on unusually homoplastic characters. Very low practical impact. Regression test added (test-ts-iw-profile-red10.R).

…date AGENTS.md worktree table, triage u.005 (interleaved sectorial design rationale added to T-269 notes)

…tasks entry

…FORM) — no bugs

…ps, fresh lib for TreeSearch build)

…b-path dep issue

… T-235 SPR fix verified, flat_blocks.active_mask latent noted

…2/T-268 wrapped

…spatched

T-266: Taxon pruning-reinsertion perturbation

…n (hamilton-hpc skill, model)

…search, cid-consensus conflict resolved

In R CMD check, R runs as a non-interactive subprocess with captured stdout. R_FlushConsole() calls fflush() on that pipe; when the buffer fills the call blocks indefinitely, causing the 6 h GHA timeout seen on every ubuntu runner for PR #210. Gate the \r-overwrite progress line and the flush behind R_Interactive (FALSE in batch/check contexts). Interactive sessions are unchanged. At verbosity >= 2 in batch mode, emit plain \n-terminated lines so diagnostic logs still carry progress detail without the flush risk. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…taset S-RED area-5 finding: ts_sector.cpp's build_reduced_dataset() did not copy flat_blocks or all_weight_one from the full DataSet. Since use_flat = ds.all_weight_one is checked at the top of every tbr_search() call, sector TBR always used the slower non-flat path even when all characters have weight=1. Fixing all_weight_one without flat_blocks would have triggered UB (flat functions dereference flat_blocks.data()); both fixed together. Also: - profiling round 1: Ratchet area (#2) PROFILED — 62% of search time via verbosity=2 phase data; VTune not available for per-function hotspot; T-300 (lazy rescore) is the actionable follow-up - focus-areas.md: NNI-perturb → AT-LIMIT (disabled via T-274), Ratchet → PROFILED with 2026-05-18 baseline (2.80 s/rep Zhu2013 thorough) - red-team.md last_focus → area 5 with full findings Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

In R CMD check, R runs as a non-interactive subprocess with captured stdout. R_FlushConsole() calls fflush() on that pipe; when the buffer fills the call blocks indefinitely, causing the 6 h GHA timeout seen on every ubuntu runner for PR #210. Gate the \r-overwrite progress line and the flush behind R_Interactive (FALSE in batch/check contexts). Interactive sessions are unchanged. At verbosity >= 2 in batch mode, emit plain \n-terminated lines so diagnostic logs still carry progress detail without the flush risk. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replace full_rescore() on the accept path with two incremental downpass+uppass pairs from nz (clip grandparent) and nx (spare/regraft node) — the two stale chains after apply_tbr_move in the SPR case. TBR moves (with non-trivial rerooting) still fall back to full_rescore. DEBUG_RESCORE guard enabled: asserts incremental == full_rescore on every accepted SPR move. Remove #define once GHA passes cleanly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ash on linux Incremental SPR rescore systematically underestimates score by 3 steps (observed on ubuntu GHA; diff=-3 consistent across all test cases). Root cause not identified; full_rescore restored as the safe fallback. T-300 to be redesigned with per-node local_cost instrumentation offline. R_Interactive guard in ts_parallel.cpp is kept (separate fix, working). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Four tests in test-ts-stopping.R and one in test-ts-xpiwe.R loaded inapplicable.datasets but then used inapplicable.phyData, a different object. Tests only passed because a prior test had already loaded inapplicable.phyData into the environment. Changed to load the correct object (inapplicable.phyData) so tests are self-contained. Found by red-team area #8 review. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ailure ts_drift.cpp:680 — second drift_apply_tbr_move call in the EW RFD re-apply path lacked drift_restore_topology() before build_postorder() on failure. Mirrors the correct pattern at line 589. Path is theoretically unreachable (same args succeed on first apply → succeed on second), but guards against partial step-1 detach leaving an invalid tree if the path were ever reached. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replace full_rescore() on the SPR accept path with a single-pass postorder walk over the union of paths nz→root and nx→root. Each affected node is visited exactly once, reading current children's prelims — no shared- ancestor double-read, no IW arithmetic on stale state. This supersedes the reverted overlap-chain implementation (1e3fc9a), which underestimated EW length by exactly 3 steps on every accepted move. Root cause of the −3 was never identified analytically; the dirty-set algorithm sidesteps the question by construction. New functions in ts_fitch.cpp: fitch_dirty_downpass(tree, ds, start_a, start_b) — postorder rescore over the dirty set; returns EW length delta. fitch_dirty_uppass(tree, ds, start_a, start_b) — companion final_ pass, seeded from the same start points, propagating downward. Wired into tbr_search at ts_tbr.cpp:1138, gated on is_spr && !has_na. TBR moves with non-trivial rerooting and datasets with inapplicable characters continue to use full_rescore (NA variant deferred). DEBUG_RESCORE cross-check left enabled: every accepted SPR move asserts incremental == full_rescore. Remove the define in a follow-up after one clean GHA cycle. Local validation (Zhu2013, congreveLamsdellMatrices, synthetic 30×25 EW): - 92 tests passed across test-ts-tbr-search, test-ts-tbr-symmetry, test-ts-ratchet-search, test-ts-ratchet-opt. - 0 DEBUG_RESCORE mismatches across EW + IW paths with multiple SPR accepts per replicate (scores 173, 192, 9.140673). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Profiling round 2 — area #4 (TBR full-rescore at acceptance): - VTune confirmed full_rescore is 19.2% self-time (fitch_na_score 0.585 s + load_tip_states 0.032 s) on Zhu2013 thorough preset, n_threads=1. - All time flows through tbr_search line 1138 (accept path), confirming T-300 as the right target. Zhu2013 has inapplicable characters so the initial T-300 commit (EW only) does not benefit it; an NA variant is the next follow-up. - Secondary candidate identified: ts::simd::any_hit_reduce_avx2 at 9.6%. Adds dev/profiling/drivers/tbr-rescore.R (Zhu2013 driver, ~3.9 s bare). Updates baselines.md, log.md, focus-areas.md, findings.md. RESUME.md refreshed for hand-off after T-300 push. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds a temporary full_rescore cross-check on every NNI incremental delta, gated to EW (IW has a separate pre-existing arithmetic bug, see below). Together with DEBUG_RESCORE in tbr_search, this tests whether the fitch_incremental_downpass single-chain primitive is the source of the −3 that took down the previous T-300 attempt. Local validation (Vinther2008 EW + synthetic 30×25 EW, 5 replicates each): 0 DEBUG_NNI_RESCORE assertions on the EW path. Interpretation: the single-chain primitive is correct. The −3 in the reverted T-300 must have arisen from the two-chain composition itself, not from a bug in fitch_incremental_downpass. Both my analytical argument (deltas cancel) and the dirty-set empirical result are consistent with the primitive being sound; some interaction *between* chains (e.g. shared save_node_state ordering, or a read at LCA I haven't isolated) produced the systematic underestimate. The dirty-set algorithm avoids the question by construction. Separate finding while wiring this assertion: nni_search at line 83 computes `new_score = best_score + delta` unconditionally — but `delta` is an integer EW step delta from fitch_incremental_downpass, and `best_score` for IW is a float weighted score. The IW NNI path therefore reports wrong intermediate scores (diffs of 0.01–2.0 against full_rescore in local testing). Impact is bounded — nni_search() ends with a final full_rescore, so the returned score is correct, but intermediate accept/reject decisions under IW use a wrong predicate. Filed as a follow-up to fix via extract_char_steps + compute_weighted_score (same pattern as tbr_search accept path). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The NNI accept-path in nni_search() added an integer EW step delta to a float weighted score whenever ds.concavity was finite, producing wrong intermediate scores under IW/profile parsimony. Empirically, a 30x25 synthetic EW phyDat with concavity=10 triggered 17+ DEBUG_NNI_RESCORE mismatches per replicate (non-integer diffs like 0.96, -1.94); the EW path emitted zero. After fitch_incremental_downpass, tree.local_cost is correct for the whole tree (NNI only changes children at edge c; off-chain nodes retain valid local_cost from the score_tree at function entry). Mirror the T-300 SPR accept-path pattern from ts_tbr.cpp: under std::isfinite( ds.concavity), call extract_char_steps + compute_weighted_score (which dispatches IW vs profile by ds.scoring_mode). EW path keeps best_score + delta. Also remove the EW-only gate around DEBUG_NNI_RESCORE — it was there purely to mask the now-fixed IW bug. Leaving the assertion live on both paths catches any regression in either branch. Local: dev/plans/t300-test.R clean on both EW and IW; full test-ts-(nni|search|driven|tbr) testthat suites pass with no DEBUG_NNI_RESCORE emissions. T-300 follow-up. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Two GHA cycles (R-CMD-check 26078069315, ASAN 26078069318, post-NNI-fix combined 26078652587) confirm zero assertions across all platforms on both EW and IW paths. The cross-check scaffolds have served their purpose: the dirty-set rescore in tbr_search and the corrected NNI incremental in nni_search both match full_rescore exactly. Removes: - src/ts_tbr.cpp:23 #define DEBUG_RESCORE - src/ts_tbr.cpp:1166–1176 #ifdef block (full_rescore + actual = ref) - src/ts_search.cpp:20 #define DEBUG_NNI_RESCORE - src/ts_search.cpp:103–119 #ifdef block (save/full_rescore/restore) The actual rescore work (extract_char_steps + compute_weighted_score branch for IW, dirty-set delta for EW) is unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds NA-aware dirty-set Pass 1 + Pass 2 (fitch_na_dirty_downpass and fitch_na_dirty_uppass) mirroring the EW dirty-set, and wires them into tbr_search's SPR accept path under has_inapplicable. Pass 3 still runs over the full tree because it populates internal down2 (read by extract_char_steps) and counts NA-block steps; savings come from skipping Pass 1 + Pass 2 on off-dirty nodes. Bug found during validation: full_rescore for NA EW data routes through fitch_score_ew which adds ds.ew_offset (topology-independent step count) on top of fitch_na_score's return. My accept path was reading fitch_na_pass3_score directly without the offset — producing a systematic diff=−3 on Vinther2008 (where ew_offset = 3). Fix: add ds.ew_offset in the EW branch (line 1183). IW path is unaffected because compute_weighted_score handles per-pattern step contributions directly. Validation: DEBUG_NA_RESCORE cross-check kept for one GHA cycle. Local Vinther2008 run, 5 replicates each EW + IW: DEBUG_RESCORE (EW, IW): 0 / 0 mismatches DEBUG_NA_RESCORE (NA-EW, NA-IW): 0 / 0 mismatches Also includes dev/plans/t300-test.R (the four-path test harness) and dev/plans/pr244-examples-hang-brief.md (background-agent brief for the separate PR #244 hang investigation). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Threads ConcordanceTable() into the PaintCharacters example so the two functions appear together in the rendered example block. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

R CMD check flagged R_Interactive as a non-API entry point. Switch to isatty(fileno(stdout)) (POSIX) / _isatty(_fileno(stdout)) (Windows), which is the public equivalent: a captured pipe is never a TTY, an interactive console is. Preserves the deadlock fix from b186e80 (no flush in batch mode) without using internal R symbols. Also wrap the slow TreeLength examples (profile parsimony, random-tree scoring, HSJ block) in \donttest{}: the example block was clocking >5 s elapsed on CI runners.

Fixes negative-delta scoring in LengthAdded() by unwrapping qmApp to a scalar when multiple all-applicable rows exist in the contrast matrix. Also keys the contrast-row validity check on used tokens (so stale zero-sum rows for unused tokens no longer error) and adds a regression test for the qmApp fix and a 6-token error-message check via Asher2005[, 67]. The examples-hang on devel/arm/macOS that blocked merge for several days was caused by R_Interactive being a non-API symbol; the portable fix landed on cpp-search as 2b6b6be and was pulled in via merge f59c490.

GHA across all platforms confirmed the NA dirty-set rescore matches full_rescore exactly (zero DEBUG_NA_RESCORE mismatches). Drop the guarded cross-check block and the define, mirroring the EW cleanup in 44a4ebe.

Drops the NA-aware dirty-set rescore (commit 014ccde + cleanup 5b210fd) from 3.88 s to 3.29 s median on Zhu2013 / 12 ratchet reps — a 15.2 % wall-time reduction, against an 18.2 % upper bound predicted from VTune's fitch_na_score share of DLL time. Score 647 matches baseline on every replicate. Baseline confirmed via dev/profiling/t300_na_bench.R against an A/B of five contemporaneous lib snapshots installed from short-lived worktrees, ruling out shared-dep contamination (TreeTools, Rcpp). Records the driver script and updates focus-areas.md to mark area #4 DONE. .gitignore tweaked to exclude the throwaway .bench-libs/ and .rds files the driver emits.

…te (#245) Closes the coverage gap that let the IW NNI rescore bug (3df9088) ship. The existing IW NNI test (test-ts-spr-nni-opt.R) only asserted result$score >= 0 and topology shape — it never cross-checked that the returned score matched the IW score of the returned topology. Pre-fix, nni_search did new_score = best_score + integer_delta unconditionally, so under IW the returned score was the accumulated wrong float and the existing assertions still passed. New file mirrors the EW pattern from "NNI: 15-tip random dataset" (result$score == ts_score(rt, ds)) but under concavity=10, sweeps a few concavities, and exercises multistate + NA cases. Profile parsimony isn't reached because ts_nni_search's Rcpp bridge doesn't forward infoAmounts; the IW path through compute_weighted_score is the same codepath the fix unified, so this catches both modes at the C++ level. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

ms609 marked this pull request as draft March 25, 2026 14:21

ms609 added 29 commits March 27, 2026 07:01

chore: add T-269 fine-grained sectorial interleaving benchmark (P3)

a961fdf

fix: include <climits> for INT_MAX on Linux/ARM

4c756c8

docs: regenerate SearchControl.Rd with pruneReinsert params

4b27602

chore: T-266 GHA run 3 after climits + Rd fixes (Agent A)

88ad4b5

chore: park T-150 — SPIC added (E), GHA 23636944848

24bc323

chore: S-COORD round 30

d46302e

chore: agent-A T-266 PR #235 opened; T-267 deleted (completed)

bdada25

chore: S-COORD round 31 — T-150 GHA failure, T-270/T-272 filed

3dec740

chore: T-270 complete (vignette + AGENTS.md docs)

540dda6

chore: agent-A log T-270 done

d190205

Merge remote-tracking branch 'origin/cpp-search' into feature/prune-r…

ed5b351

…einsert

chore: S-PR round 31 — rebased #235/#216, closed #178 (T-272), noted …

42ae8a5

…cid-cons conflict

chore: S-RED focus 10 complete — profile delta fix logged

6a3afcb

chore: agent-A log — S-RED focus 10, S-PR, T-270 complete

48c8b11

chore(T-268): branch housekeeping — prune 11 stale local branches, up…

838b14c

…date AGENTS.md worktree table, triage u.005 (interleaved sectorial design rationale added to T-269 notes)

chore: T-268 done — remove from to-do, update agent-F log, completed-…

48af269

…tasks entry

chore: S-PR (T-204 GHA failure noted) + S-RED focus 1 (Fitch, AVX2, X…

26c2406

…FORM) — no bugs

chore(T-252): add t252_v2.sh — fix lib paths (use lib-baseline for de…

de80d18

…ps, fresh lib for TreeSearch build)

chore: T-252 PARKED (SLURM 16599543) — t252_v2.sh submitted, fixed li…

8a0d3a9

…b-path dep issue

chore: S-RED focus 2 complete (F) — T-263 snapshot hoisting verified,…

addc46a

… T-235 SPR fix verified, flat_blocks.active_mask latent noted

chore: S-COORD round 32 (F) — T-273 filed, agent status updated, T-25…

2d902e9

…2/T-268 wrapped

fix: add 'warmup' and 'config' to WORDLIST (CI spelling check R 4.1)

de9e521

chore: T-204 re-dispatched (timing issue), T-266 spelling fix + re-di…

3221765

…spatched

Merge pull request #235 from ms609/feature/prune-reinsert

c91b00d

T-266: Taxon pruning-reinsertion perturbation

chore: commit S-RED focus 2 notes to red-team.md; update settings.jso…

c737fbe

…n (hamilton-hpc skill, model)

chore: T-266 complete; T-204 GHA redispatch (23643078732)

c4366f7

chore: claim T-273 (F), S-PR round 32 — PR branches updated from cpp-…

1ae972f

…search, cid-consensus conflict resolved

ms609 and others added 5 commits May 18, 2026 13:07

sssshhh

7a996d6

Merge branch 'main' into cpp-search

995a13a

/profile init

fde200b

ms609 and others added 22 commits May 18, 2026 23:09

docs: update RESUME.md post T-300 (b67db1a)

9b9b170

PaintCharacters() draft

1bd1346

Concordance paint-swatch

2148483

docs(PaintCharacters): add ConcordanceTable example call

df7dd54

Threads ConcordanceTable() into the PaintCharacters example so the two functions appear together in the rendered example block. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Remotes: ms609/TreeTools,

012b6d2

Spelling

b2652a0

spell

7185b54

chore(T-300 NA): remove DEBUG_NA_RESCORE scaffolding

5b210fd

GHA across all platforms confirmed the NA dirty-set rescore matches full_rescore exactly (zero DEBUG_NA_RESCORE mismatches). Drop the guarded cross-check block and the define, mirroring the EW cleanup in 44a4ebe.

This was referenced May 19, 2026

Support parallelization in ratchet iterations #185

Open

Incremental character evaluation #172

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

C++ search#210

C++ search#210
ms609 wants to merge 660 commits into
mainfrom
cpp-search

ms609 commented Mar 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ms609 commented Mar 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant