perf(shard): hot-path lock quick wins + event-driven SPSC wake (removes the ~1ms cross-shard floor)#172
Conversation
Onboard the repo onto the ADD (AI-Driven Development) state-tracked workflow. Foundation mapped from code (PROJECT/CONVENTIONS/GLOSSARY/ allowlist), baseline locked by Tin Dang. Milestone v1-shared-nothing implements the 2026-06 architecture review priorities 1/2/5: restore shared-nothing integrity, remove the 1ms monoio cross-shard reply floor, and the hot-path lock quick-wins batch. First task hotpath-lock-quickwins: contract FROZEN @ v1 (8 quick wins, behavior parity frozen, both runtimes). author: Tin Dang
Red/green TDD suite written and confirmed red BEFORE the build: - quickwins_red.rs: qw7 key-map pruning (runtime-red, 100 vs 40 entries) plus four parity pins (backlog golden model vs the per-byte reference, WAL false-on-full durability gate, replication offset exactness under 4-thread concurrency, INFO total_commands exactness under 4-thread concurrency). - quickwins_red_api.rs: compile-red against the two new surfaces the frozen contract introduces (server::socket_opts:: apply_client_socket_opts, ReplicationState::offset_handle). Task: .add/tasks/hotpath-lock-quickwins (contract FROZEN @ v1). author: Tin Dang
…wins Implements the 8-item quick-wins batch from the 2026-06 architecture review (priority 5) — every change behavior-preserving, parity pinned by tests/quickwins_red.rs: - QW1 TCP_NODELAY on every accepted client socket. New server::socket_opts::apply_client_socket_opts helper; wired into the shared tokio/monoio accept-opts hook (extends the existing SAFETY-commented BorrowedFd block - no new unsafe), the uring register path, and the multishot-accept LISTENER (Linux copies the nonagle flag to accepted sockets; raw CQE fds have no safe hook). Redis defaults tcp-nodelay yes; we never set it (finding 3.2). - QW2 WAL sender Mutex<Option<Sender>> -> OnceLock: set once before accept, lock-free per-write reads; try_wal_append_required false-on-full durability gate unchanged (finding 1.8). - QW3 replication offsets: shard_offsets/master_repl_offset Arc'd and distributed as a lock-free OffsetHandle at event-loop startup; the SPSC drain no longer read-locks RwLock<ReplicationState> per write (finding 1.4). The repl_state param in wal_append_and_fanout and the drain chain now carries the handle. - QW4 TOTAL_COMMANDS: single global AtomicU64 (one cache line bounced across all shard cores per command) -> 64 padded per-thread slots, exact sum on INFO/stats reads (finding 1.6). - QW5 ReplicationBacklog::append: per-byte eviction loop -> bulk drain+extend; state machine byte-identical (golden-tested), the same buffer that cost ~21% CPU on 8-shard SET p=64 (finding 1.5). - QW6 uring inflight_sends: Vec::remove(0) per SendComplete CQE -> VecDeque::pop_front, O(1) FIFO reclaim (finding 3.6). - QW7 vector key_hash_to_key / key_hash_to_global_id pruned in mark_deleted_for_key - maps now track live keys, not historical inserts (finding 6.3: unbounded growth under key churn). - QW8 client registry: per-pipeline-batch global write lock and per-batch is_killed read lock -> lock-free Arc<ClientLiveState> handle (db/last_cmd/flags/kill_flag atomics) returned by register(); CLIENT LIST/INFO/KILL semantics unchanged (finding 1.3). Verified: quickwins suites 7/7 green, lib 3566 green, tokio+jemalloc matrix 2945 green, clippy -D warnings clean on both feature sets. Task: .add/tasks/hotpath-lock-quickwins author: Tin Dang
…ence Gate PASS (auto-resolved, autonomy:auto): both-runtime suites green, 132/132 consistency on Linux VM, lock-inventory audit clean, wiring + dead-code deep checks recorded. Bench (moon-dev, fresh servers, no persistence): 4-shard P=16 SET +22% / GET +40% vs main baseline; 1-shard flat within run noise (3-run recheck). Observe deltas recorded; milestone v1 task 1/3 done. author: Tin Dang
ADD task spsc-wake-floor, contract FROZEN @ v1: remove the ~1ms monoio cross-shard latency floor by riding monoio 0.2.4's `sync` cross-thread wake (race2 tick+notify in the shard loop, direct recv_async replies, drain-cap self-re-notify, 2 INFO Stats counters). Red/pin state recorded before any build work: - spsc_wake_floor_red.rs: swf1/swf2 RED (INFO fields spsc_notify_wakes / spsc_drain_renotify absent); swf0 PIN PASSED on BOTH drivers (macOS kqueue + Linux io_uring on moon-dev VM) — proves cross-thread Waker::wake() DOES reach parked monoio tasks, disproving the stale pending_wakers relay comments; swf_a3 PIN (flume re-queues undelivered notify token on RecvFut drop); swf3 PIN (PX expiry + kill-9 WAL recovery cadence, green pre-build). - spsc_wake_floor_red_api.rs: compile-RED (E0432) on moon::runtime::race and the four metrics_setup counter fns. footer: ADD .add/tasks/spsc-wake-floor/TASK.md §1-§4; gitignore target-linux (VM cross-build dir, avoids Mach-O/ELF clobbering in shared target/). author: Tin Dang
Replace the monoio shard loop's single 1ms-tick await with an allocation-free two-arm race (timer | spsc Notify): cross-shard messages now wake the consumer the moment they arrive instead of waiting for the next periodic tick, and cross-shard replies are received directly (pinned OneshotReceiver future raced against a 100ms chunked sleep, 30s total timeout) instead of being swept by the pending_wakers relay once per tick. Mechanism (frozen contract, .add/tasks/spsc-wake-floor/TASK.md s3 v1): - src/runtime/race.rs (new): hand-rolled Race2 future — no monoio::select! (leaks ~100B per re-entry), no per-iteration allocation. - src/shard/event_loop.rs: monoio loop awaits race2(tick, notified); every-wake body = drain + drain-cap re-notify + cdc + waker sweep + snapshot/migration message handling; cadence work (WAL flush, sub-timers, cached clock) remains timer-exclusive. Tokio notify/periodic arms get the same drain-cap re-notify. - src/shard/spsc_handler.rs: drain_spsc_shared returns hit-cap/snapshot-seen so a full-ring drain (256-message cap) self-re-notifies instead of stranding the tail until the next tick. - src/admin/metrics_setup.rs + src/command/connection.rs: INFO Stats gains spsc_notify_wakes / spsc_drain_renotify observability counters. - src/shard/uring_handler.rs: fix latent task-1 compile break (VecDeque .push -> .push_back x4 + collapsed if-let) — this Linux+tokio-only path never compiled on macOS, so task 1 would have failed CI. - Stale "monoio cannot see cross-thread wakes" comments corrected: monoio 0.2.4's sync feature delivers them (eventfd/kqueue unpark), proven at runtime by swf0 on both drivers. Evidence (OrbStack Linux VM, monoio release, vs task-1 baseline 9861407): - 4-shard c=1 SET (cross-shard path): p99 4.071ms -> 0.071ms (57x), 562 -> 12,786 rps. Milestone exit criterion p99 < 1ms: PASS. - 4-shard pipelined P16: SET +368%, GET +19% — no regression (A2 gate). - swf suite green on macOS kqueue AND Linux io_uring; swf2b 700-conn drain-cap e2e green on both platforms (universal 8-hash-tag stimulus). - test_txn_commit_wal_crash_recovery flake investigated: pre-existing, fails 1/15 on BOTH baseline and this build (binary-level A/B), not introduced here. refs: .add/tasks/spsc-wake-floor/TASK.md, bench-results.txt author: Tin Dang
Gate PASS (auto-resolved on complete evidence, .add/tasks/spsc-wake-floor/TASK.md s6): - Tests: macOS (3570 lib + suites) and Linux VM (release monoio 30 suites; tokio 2968 lib + integration, 0 failed) all green; swf red-suite fully green on both platforms incl. the 700-conn swf2b drain-cap load test. - Bench (bench-results.txt): 4-shard c=1 SET p99 4.071ms -> 0.071ms (57x); pipelined 4-shard SET +368% / GET +19% — milestone criterion p99 < 1ms PASS, A2 no-regression gate PASS. - Consistency: branch vs main IDENTICAL (15 pre-existing FAILs + Phase-152 early-exit exist on main too; zero new failures). Residue recorded in s7. - Flake triage: test_txn_commit_wal_crash_recovery fails 1/15 on baseline AND build — pre-existing, follow-up noted. s7 lessons recorded: CI-matrix-before-done (task-1 Linux-only compile gap), PIPESTATUS on gating pipelines, stale second VM checkout trap, diskfull guard vs near-full shared volume, platform-portable load stimuli, binary-level flake A/B method. author: Tin Dang
…raps CLAUDE.md's OrbStack commands pointed at /Users/tindang/workspaces/tind-repo/ moon — a stale second checkout stuck at the hash-ttl era. VM runs that followed those commands silently tested months-old code (this produced a false regression alarm during spsc-wake-floor verify). The live repo is /Volumes/Games/tindang-repo/moon, exposed at the same path inside moon-dev. Also documents the traps discovered while verifying task 2: - pin MOON_BIN for server-spawning integration tests (find_moon_binary falls back to target/release/moon, unknown provenance) - CARGO_TARGET_DIR=target-linux for VM builds of the shared checkout - Moon's diskfull guard (<5% free) vs the near-full /Volumes/Games volume: run consistency/persistence suites from a VM-local clone - no macOS-side source edits while a VM build compiles the same checkout author: Tin Dang
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
Warning Review limit reached
More reviews will be available in 51 minutes and 55 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (6)
📝 WalkthroughWalkthroughProject docs, templates, and skill guides were expanded; milestone/task state and benchmarks were added. Core runtime and shard paths changed for lock-free client/replication counters, socket options, event-loop wake/drain, WAL sender init, and io_uring in-flight buffering. Tests validate new behaviors and metrics. ChangesADD foundation and execution flow
Sequence Diagram(s)sequenceDiagram
participant MonoioEventLoop
participant Race2
participant SpscHandler
participant MetricsSetup
MonoioEventLoop->>Race2: race timer tick against spsc_notify_local.notified()
Race2-->>MonoioEventLoop: timer_fired or notify wake
MonoioEventLoop->>SpscHandler: drain_spsc_shared(..., repl_offsets)
SpscHandler-->>MonoioEventLoop: hit_cap bool
MonoioEventLoop->>MetricsSetup: bump_spsc_notify_wake() / bump_spsc_drain_renotify()
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
✨ Finishing Touches🧪 Generate unit tests (beta)
|
Lint gate requires a CHANGELOG entry: documents the event-driven SPSC wake (57x cross-shard p99), hot-path lock quick wins, the new INFO counters, and the uring_handler VecDeque compile fix. author: Tin Dang
There was a problem hiding this comment.
Actionable comments posted: 19
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/shard/event_loop.rs (1)
52-2471: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy liftSplit this event-loop module to satisfy the repository file-size limit.
This file is now well beyond the 1500-line cap and this PR adds more hot-path logic to it. Please split tokio/monoio loop bodies into submodules and keep this file as orchestration only.
As per coding guidelines, “No single
.rsfile should exceed 1500 lines. Split into submodules if approaching this limit.”🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shard/event_loop.rs` around lines 52 - 2471, The run() function in event_loop.rs is over the 1500-line limit; split the large tokio and monoio loop bodies into smaller submodules so this file remains orchestration-only. Extract the tokio select! branch (all code inside the #[cfg(feature = "runtime-tokio")] tokio::select! block and its helper logic such as io_uring handling, conn accept branches, SPSC drain handling, persistence_tick bodies, timers, and WAL/CDC logic) into a new submodule (e.g., event_loop::tokio_loop) with a function like tokio_loop::run_tokio_loop(...) that accepts the same local state (shard_databases, wal_writer/wal_v3_writer, snapshot_state, timers, pubsub_arc, spsc_notify_local, etc.) and returns updated state where needed; likewise extract the monoio periodic/tick body (the #[cfg(feature = "runtime-monoio")] block including race2 await, monoio_tick_counter dispatch, and its SPSC drain/persistence work) into event_loop::monoio_loop::run_monoio_loop(...). In event_loop.rs keep only the top-level setup (variable initialization, configuration, PageCache, WAL setup, and final teardown) and call into the new run_tokio_loop/run_monoio_loop functions under the appropriate cfgs, passing ownership/refs of the required symbols (dispatch_tx, shard_databases, pubsub_arc, wal_append_rx, wal_writer, wal_v3_writer, checkpoint_manager, shard_manifest, spill_thread, spsc_notify_local, pending_migrations, pending_cdc_subscribes, autovacuum_daemon, timers, cached_clock, etc.); ensure any moved helper logic (e.g., persistence_tick::* interactions, uring_handler calls) remains reachable via use/re-exports and update visibility as needed so compilation and semantics remain unchanged.Source: Coding guidelines
src/shard/spsc_handler.rs (1)
1-3439: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy liftSplit this module to satisfy the repository’s Rust file-size rule.
src/shard/spsc_handler.rsis now far above the 1500-line cap, which violates the project’s explicit Rust module-size policy and makes further changes harder to review safely.As per coding guidelines: “No single
.rsfile should exceed 1500 lines. Split into submodules if approaching this limit.”🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shard/spsc_handler.rs` around lines 1 - 3439, File is over the 1500-line module-size limit; split large logical units into smaller submodules. Move the big, self-contained units into separate files: extract drain_spsc_shared and its tests into a new spsc_drain.rs (keep symbol drain_spsc_shared); extract handle_shard_message_shared plus ShardMessage dispatch arms into spsc_dispatch.rs (keep symbol handle_shard_message_shared and ShardMessage usage); move vector-related helpers (dispatch_vector_command, auto_index_hset, auto_index_hset_public/_txn, find_vector_blob, handle_vector_insert, handle_vector_insert_field, update_metadata_only, index_payload_field, parse_geo_value) into spsc_vector.rs (preserve pub(crate) visibility where used); move wal_append_and_fanout and its wal_append_tests into spsc_wal.rs. In the original mod file replace the moved definitions with pub(crate) mod declarations (pub(crate) mod spsc_drain; etc.), adjust use paths (crate::shard::spsc_* or super::spsc_*) and re-export symbols if callers expect them at the old location; ensure private helpers remain private or use pub(crate) as needed, update any crate:: references/imports, run cargo build/test and fix visibility/import errors.Source: Coding guidelines
src/shard/shared_databases.rs (1)
144-149:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winRemove stale unsafe-transmutation documentation from
set_wal_append_tx.Line 145–146 still claims “unsafe transmutation of the Arc,” but this path now uses
OnceLock::setand contains no unsafe block. Please update/remove this stale note to keep safety docs accurate.Suggested doc cleanup
- /// Called once during event loop startup. Uses interior mutability via - /// unsafe transmutation of the Arc — safe because this is called exactly - /// once per shard before any connections are accepted. - /// Set the WAL append channel sender for a shard. - /// Called once during event loop startup before connections are accepted. + /// Called once during event loop startup before connections are accepted.As per coding guidelines, verify all safety-related comments remain accurate after implementation changes.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shard/shared_databases.rs` around lines 144 - 149, The doc comment for set_wal_append_tx incorrectly mentions "unsafe transmutation of the Arc" even though the implementation now uses OnceLock::set with no unsafe; update the documentation in the set_wal_append_tx function to remove or reword the stale unsafe-transmutation claim and instead describe the current safe initialization behavior (e.g., that the WAL append channel sender is set once during event loop startup using OnceLock::set and no unsafe code is used).Source: Coding guidelines
src/shard/uring_handler.rs (1)
55-59:⚠️ Potential issue | 🟠 Major | ⚡ Quick winOnly enqueue
InFlightSendafter a successful send submission.Right now Lines 55–59 and 68–73 record in-flight entries even when
submit_send_fixed/submit_sendfails. That can desynchronize FIFO bookkeeping from CQEs and strand entries.Suggested patch
- let _ = driver.submit_send_fixed(conn_id, buf_idx, resp_len as u32); - inflight_sends - .entry(conn_id) - .or_default() - .push_back(InFlightSend::Fixed(buf_idx)); - return; + match driver.submit_send_fixed(conn_id, buf_idx, resp_len as u32) { + Ok(()) => { + inflight_sends + .entry(conn_id) + .or_default() + .push_back(InFlightSend::Fixed(buf_idx)); + return; + } + Err(e) => { + tracing::warn!("submit_send_fixed failed for conn {}: {}", conn_id, e); + driver.reclaim_send_buf(buf_idx); + } + } @@ - let _ = driver.submit_send(conn_id, ptr, len); - inflight_sends - .entry(conn_id) - .or_default() - .push_back(InFlightSend::Buf(resp_buf)); + match driver.submit_send(conn_id, ptr, len) { + Ok(()) => { + inflight_sends + .entry(conn_id) + .or_default() + .push_back(InFlightSend::Buf(resp_buf)); + } + Err(e) => { + tracing::warn!("submit_send failed for conn {}: {}", conn_id, e); + } + }Also applies to: 68-73
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shard/uring_handler.rs` around lines 55 - 59, The code enqueues InFlightSend entries into inflight_sends unconditionally even when driver.submit_send_fixed or driver.submit_send fails; change both sites (where submit_send_fixed and submit_send are called) to check the Result from driver.submit_send_fixed(...) / driver.submit_send(...) and only call inflight_sends.entry(...).or_default().push_back(InFlightSend::Fixed(...)) or push_back(InFlightSend::Owned(...)) after the submission returns Ok; on Err, do not enqueue and handle or log the error instead so FIFO bookkeeping stays synchronized with CQEs and strand entries.
🧹 Nitpick comments (1)
src/vector/store.rs (1)
1-2448: 🏗️ Heavy liftFile exceeds length guideline by 948 lines.
This file is 2448 lines, which exceeds the 1500-line limit by 948 lines. As per coding guidelines: "No single .rs file should exceed 1500 lines. Split into submodules if approaching this limit."
While this PR does not introduce the violation (it's pre-existing), the guideline recommends splitting into submodules. Consider breaking this into:
store/core.rs- VectorStore and basic operationsstore/compaction.rs- compaction/merge logicstore/persistence.rs- warm/cold tier operationsstore/tests.rs- test module🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/vector/store.rs` around lines 1 - 2448, The file is too large (>1500 lines); split it into smaller submodules and re-export as needed: extract core index/store types and APIs (VectorStore, VectorIndex, IndexMeta, FieldSegments, VectorFieldMeta, related impls like create_index, drop_index, mark_deleted_for_key, begin_background_compactions/poll_install_compactions, resident_bytes, enforce_mmap_budget_all, try_warm_transitions_all/try_cold_transitions_all) into store/core.rs; move compaction/merge-related logic (compact_segments, begin_background_compact/_due, poll_install_compaction, begin_background_merge/_due, poll_install_merge, force_merge_index_with_tolerance, run_vacuum_pass, needs_merge) into store/compaction.rs; move persistence/tier transitions and registration (try_warm_transitions, try_cold_transitions, register_warm_segments, register_cold_segments, enforce_segment_holder_budget, warm/cold helpers and SegmentHandle usage) into store/persistence.rs; and move the #[cfg(test)] module into store/tests.rs; update mod declarations and any use paths to reference the new modules and re-export public types from store/mod.rs so callers of VectorStore, VectorIndex, IndexMeta keep the same symbols.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.add/docs/04-step-2-scenarios.md:
- Line 8: Remove the blank line inside the blockquote and add missing language
tags to fenced code blocks: update the blockquote to have no empty line between
its lines, ensure the first fenced block containing "Scenario: <short name>"
uses ```gherkin as the opening fence and a matching closing fence, and change
the second fenced block that starts with "Scenario: successful transfer" to also
open with ```gherkin and have a proper closing fence; locate these by the
scenario text ("Scenario: <short name>" and "Scenario: successful transfer") and
the surrounding blockquote.
In @.add/docs/05-step-3-contract.md:
- Line 8: Remove the blank line inside the blockquote on Line 8 and add language
identifiers to the fenced code blocks (e.g., change ``` to ```text) so
markdownlint MD028/MD040 are satisfied; specifically, edit the blockquote in
.add/docs/05-step-3-contract.md to eliminate the internal blank line and update
each triple-backtick fence around the contract examples (the blocks showing "#
contracts/<name>.md" and the "POST /transfers ..." examples) to use a language
tag like "text".
In @.add/docs/06-step-4-tests.md:
- Line 8: Remove the blank line inside the markdown blockquote that triggered
MD028 by deleting the empty line between the consecutive lines that start with
'>' so the blockquote has no blank lines inside; ensure each paragraph within
the blockquote begins with '>' and re-run markdownlint to confirm the MD028
warning is resolved.
In @.add/docs/07-step-5-build.md:
- Line 21: The markdown in .add/docs/07-step-5-build.md contains fenced code
blocks missing language identifiers (MD040); update the two offending blocks
(the opening triple-backticks that currently have no language tag around the
blocks containing the SPEC/test instructions and the AI pipeline snippet) to
include a language identifier (e.g., add "text" after the opening ```). Locate
the two occurrences referenced in the review (around the snippets starting with
"Read SPEC.md..." and "AI writes code → pipeline runs...") and change the
opening ``` to ```text for each so markdownlint MD040 is satisfied.
In @.add/docs/appendix-a-templates.md:
- Line 12: Update each plain fenced code block in the appendix-a-templates
markdown by adding a language identifier to the opening triple-backtick (change
``` to ```text) so the fences pass markdownlint MD040; search for the raw ```
openings in the file (the ones flagged in the review) and replace them with
```text for each occurrence.
In @.add/docs/appendix-b-prompts.md:
- Line 9: The heading "playbook/1_specify.md" is using ### which jumps from the
higher-level # header earlier; change that heading to the correct level (e.g.,
##) to restore proper heading hierarchy in the document, and add explicit
language identifiers to all fenced code blocks referenced (the backtick fences
near the occurrences shown) so they pass markdownlint (use appropriate tags like
```md, ```markdown, ```bash, or ```json depending on the block content). Ensure
the updated heading text remains exactly "`playbook/1_specify.md`" and update
each fenced block's opening fence to include the correct language token.
In @.add/docs/appendix-d-worked-example.md:
- Line 11: The four fenced code blocks in the worked example that currently use
plain triple backticks should include explicit language identifiers (e.g.,
```text, ```gherkin, ```http) to satisfy markdownlint and improve rendering;
edit each opening fence (the blocks that are just ``` with no language) and add
the appropriate language token matching the block content (use `text` for plain
output/example text, `gherkin` for feature/scenario snippets, `http` for
request/response examples) so the fenced blocks at those locations render and
lint correctly.
In @.add/docs/appendix-f-requirements-matrix.md:
- Line 153: The markdown fenced code block used for the traceability snippet is
missing a language tag; update the triple-backtick fence for that traceability
block to include the language identifier `text` (i.e., replace ``` with ```text)
so markdownlint passes and formatting is explicit.
In @.add/docs/appendix-g-references.md:
- Line 36: Update the inline code span in the markdown table row that currently
reads "joined by `; `" to remove the space inside the code span; change it to
something like "joined by `;` and a space" so the code span contains only the
semicolon (fixing MD038) — locate the table row with "several at once | joined
by `; ` |" and replace the middle cell accordingly.
In @.add/docs/README.md:
- Line 3: The heading "A complete, practical book on building software when AI
writes the code" is at level ### which jumps from the top-level # and triggers
markdownlint MD001; change that heading to level ## (or another appropriate
level consistent under the existing top-level title) so the document maintains a
proper hierarchical order and satisfies MD001.
In @.add/PROJECT.md:
- Around line 37-40: Add a blank line between the heading "## Key Decisions
(append-only)" and the start of the table, and also ensure there's a blank line
after the table (the row starting with "| date | decision | why | outcome |") so
the table is separated from surrounding content to satisfy MD058; update the
section around those symbols accordingly.
In @.add/state.json:
- Around line 17-26: The task state for "spsc-wake-floor" in state.json is
marked as phase: "done" and gate: "PASS" but MILESTONE.md still shows the
checkbox for spsc-wake-floor unchecked; reconcile these two sources of truth by
either (A) updating MILESTONE.md to check the box for spsc-wake-floor and mark
any related exit-criteria lines (lines referencing that task) as complete, or
(B) if the task is not actually complete, revert the entry in .add/state.json
for the spsc-wake-floor task (the object keyed "spsc-wake-floor") to the correct
phase/gate values and flag_verified status; verify other exit criteria mentioned
(the entries corresponding to the same task in MILESTONE.md) are updated to
match and ensure timestamps/updated fields in state.json reflect the correct
state.
- Line 32: The "stage" field in state.json is set to "mvp" but must match
MILESTONE.md's declared stage "production"; update the JSON "stage" key value
from "mvp" to "production" so the milestone stage is consistent across files and
run a quick validation to ensure the JSON remains valid.
In @.claude/skills/add/adopt.md:
- Around line 21-23: Several fenced code blocks in the ADD-skill docs (e.g., in
adopt.md) are untyped and trigger markdownlint MD040; update each
triple-backtick fence to include a language identifier (use `text` for plain
samples and `bash` for executable snippets) so every code fence is typed and CI
linting passes.
In @.claude/skills/add/report-template.md:
- Around line 18-22: Two unlabeled fenced-code blocks (the one starting with
"ARC goal: <the milestone / project goal this decision serves>" and the block
starting with "SUMMARY one line: intent + target + where we are") must be
labeled to satisfy markdownlint MD040; update both opening fences from ``` to a
language tag such as ```text so each fenced block includes a language
identifier, leaving the block contents unchanged and preserving
spacing/indentation.
In @.claude/skills/add/run.md:
- Around line 139-141: Add a language tag to the fenced code block containing
the line "autonomy: auto | conservative" (the triple-backtick block starting at
the line with that text) to satisfy markdownlint MD040; change the opening fence
from ``` to something like ```text (or ```yaml) so the block is
language-specified while keeping the block contents unchanged.
In @.claude/skills/add/streams.md:
- Around line 31-36: The fenced code block in .claude/skills/add/streams.md (the
unlabeled triple-backtick block showing the pipeline diagram) triggers
markdownlint MD040; add a language tag (e.g., ```text) to the opening fence so
the block is labeled and linting passes, ensuring the diagram lines remain
unchanged and only the fence is updated.
In `@src/server/conn/handler_sharded/dispatch.rs`:
- Around line 123-128: The code is using stale in-memory bits from e.live.flags
when calling e.live.touch(...), causing CLIENT INFO to show outdated flags;
change the flags source to reflect the current connection state instead of
e.live.flags. Replace the e.live.flags.load(...) call passed into
ClientFlags::from_bits with a value derived from the current conn (e.g., read
conn's live/flags field or call a helper that computes current flags from conn
state) so that e.live.touch(conn.selected_db, ClientFlags::from_bits(<current
conn flags>)) uses up-to-date flags; ensure you load atomically with the same
Ordering used elsewhere.
In `@src/shard/uring_handler.rs`:
- Around line 382-386: The call to
crate::server::socket_opts::apply_client_socket_opts(&std_listener) is currently
ignored, losing failures silently; change it to handle errors explicitly by
propagating the Result (use the ? operator on apply_client_socket_opts) or, if
the surrounding function signature cannot return a Result, at minimum log the
error and take a clear fallback path before calling std_listener.into_raw_fd().
Ensure the fix references apply_client_socket_opts and the code that calls
std_listener.into_raw_fd() so failures are not dropped silently.
---
Outside diff comments:
In `@src/shard/event_loop.rs`:
- Around line 52-2471: The run() function in event_loop.rs is over the 1500-line
limit; split the large tokio and monoio loop bodies into smaller submodules so
this file remains orchestration-only. Extract the tokio select! branch (all code
inside the #[cfg(feature = "runtime-tokio")] tokio::select! block and its helper
logic such as io_uring handling, conn accept branches, SPSC drain handling,
persistence_tick bodies, timers, and WAL/CDC logic) into a new submodule (e.g.,
event_loop::tokio_loop) with a function like tokio_loop::run_tokio_loop(...)
that accepts the same local state (shard_databases, wal_writer/wal_v3_writer,
snapshot_state, timers, pubsub_arc, spsc_notify_local, etc.) and returns updated
state where needed; likewise extract the monoio periodic/tick body (the
#[cfg(feature = "runtime-monoio")] block including race2 await,
monoio_tick_counter dispatch, and its SPSC drain/persistence work) into
event_loop::monoio_loop::run_monoio_loop(...). In event_loop.rs keep only the
top-level setup (variable initialization, configuration, PageCache, WAL setup,
and final teardown) and call into the new run_tokio_loop/run_monoio_loop
functions under the appropriate cfgs, passing ownership/refs of the required
symbols (dispatch_tx, shard_databases, pubsub_arc, wal_append_rx, wal_writer,
wal_v3_writer, checkpoint_manager, shard_manifest, spill_thread,
spsc_notify_local, pending_migrations, pending_cdc_subscribes,
autovacuum_daemon, timers, cached_clock, etc.); ensure any moved helper logic
(e.g., persistence_tick::* interactions, uring_handler calls) remains reachable
via use/re-exports and update visibility as needed so compilation and semantics
remain unchanged.
In `@src/shard/shared_databases.rs`:
- Around line 144-149: The doc comment for set_wal_append_tx incorrectly
mentions "unsafe transmutation of the Arc" even though the implementation now
uses OnceLock::set with no unsafe; update the documentation in the
set_wal_append_tx function to remove or reword the stale unsafe-transmutation
claim and instead describe the current safe initialization behavior (e.g., that
the WAL append channel sender is set once during event loop startup using
OnceLock::set and no unsafe code is used).
In `@src/shard/spsc_handler.rs`:
- Around line 1-3439: File is over the 1500-line module-size limit; split large
logical units into smaller submodules. Move the big, self-contained units into
separate files: extract drain_spsc_shared and its tests into a new spsc_drain.rs
(keep symbol drain_spsc_shared); extract handle_shard_message_shared plus
ShardMessage dispatch arms into spsc_dispatch.rs (keep symbol
handle_shard_message_shared and ShardMessage usage); move vector-related helpers
(dispatch_vector_command, auto_index_hset, auto_index_hset_public/_txn,
find_vector_blob, handle_vector_insert, handle_vector_insert_field,
update_metadata_only, index_payload_field, parse_geo_value) into spsc_vector.rs
(preserve pub(crate) visibility where used); move wal_append_and_fanout and its
wal_append_tests into spsc_wal.rs. In the original mod file replace the moved
definitions with pub(crate) mod declarations (pub(crate) mod spsc_drain; etc.),
adjust use paths (crate::shard::spsc_* or super::spsc_*) and re-export symbols
if callers expect them at the old location; ensure private helpers remain
private or use pub(crate) as needed, update any crate:: references/imports, run
cargo build/test and fix visibility/import errors.
In `@src/shard/uring_handler.rs`:
- Around line 55-59: The code enqueues InFlightSend entries into inflight_sends
unconditionally even when driver.submit_send_fixed or driver.submit_send fails;
change both sites (where submit_send_fixed and submit_send are called) to check
the Result from driver.submit_send_fixed(...) / driver.submit_send(...) and only
call inflight_sends.entry(...).or_default().push_back(InFlightSend::Fixed(...))
or push_back(InFlightSend::Owned(...)) after the submission returns Ok; on Err,
do not enqueue and handle or log the error instead so FIFO bookkeeping stays
synchronized with CQEs and strand entries.
---
Nitpick comments:
In `@src/vector/store.rs`:
- Around line 1-2448: The file is too large (>1500 lines); split it into smaller
submodules and re-export as needed: extract core index/store types and APIs
(VectorStore, VectorIndex, IndexMeta, FieldSegments, VectorFieldMeta, related
impls like create_index, drop_index, mark_deleted_for_key,
begin_background_compactions/poll_install_compactions, resident_bytes,
enforce_mmap_budget_all, try_warm_transitions_all/try_cold_transitions_all) into
store/core.rs; move compaction/merge-related logic (compact_segments,
begin_background_compact/_due, poll_install_compaction,
begin_background_merge/_due, poll_install_merge,
force_merge_index_with_tolerance, run_vacuum_pass, needs_merge) into
store/compaction.rs; move persistence/tier transitions and registration
(try_warm_transitions, try_cold_transitions, register_warm_segments,
register_cold_segments, enforce_segment_holder_budget, warm/cold helpers and
SegmentHandle usage) into store/persistence.rs; and move the #[cfg(test)] module
into store/tests.rs; update mod declarations and any use paths to reference the
new modules and re-export public types from store/mod.rs so callers of
VectorStore, VectorIndex, IndexMeta keep the same symbols.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 141c7b13-b5b8-41d1-ba9a-5ee0b37c852e
⛔ Files ignored due to path filters (4)
.add/docs/add-competencies.pngis excluded by!**/*.png.add/docs/add-flow.pngis excluded by!**/*.png.add/docs/add-foundation.pngis excluded by!**/*.png.add/docs/add-hierarchy.pngis excluded by!**/*.png
📒 Files selected for processing (91)
.add/CONVENTIONS.md.add/GLOSSARY.md.add/MODEL_REGISTRY.md.add/PROJECT.md.add/SETUP-REVIEW.md.add/dependencies.allowlist.add/docs/00-introduction.md.add/docs/01-principles.md.add/docs/02-the-flow.md.add/docs/03-step-1-specify.md.add/docs/04-step-2-scenarios.md.add/docs/05-step-3-contract.md.add/docs/06-step-4-tests.md.add/docs/07-step-5-build.md.add/docs/08-step-6-verify.md.add/docs/09-the-loop.md.add/docs/10-setup-and-stages.md.add/docs/11-governance.md.add/docs/12-roles.md.add/docs/13-adoption.md.add/docs/14-foundation.md.add/docs/15-foundations-and-lineage.md.add/docs/README.md.add/docs/appendix-a-templates.md.add/docs/appendix-b-prompts.md.add/docs/appendix-c-glossary.md.add/docs/appendix-d-worked-example.md.add/docs/appendix-e-checklists.md.add/docs/appendix-f-requirements-matrix.md.add/docs/appendix-g-references.md.add/milestones/v1-shared-nothing/MILESTONE.md.add/state.json.add/tasks/hotpath-lock-quickwins/TASK.md.add/tasks/hotpath-lock-quickwins/bench-results.txt.add/tasks/spsc-wake-floor/TASK.md.add/tasks/spsc-wake-floor/bench-results.txt.add/tooling/add.py.add/tooling/templates/CONVENTIONS.md.tmpl.add/tooling/templates/GLOSSARY.md.tmpl.add/tooling/templates/MILESTONE.md.tmpl.add/tooling/templates/MODEL_REGISTRY.md.tmpl.add/tooling/templates/PROJECT.md.tmpl.add/tooling/templates/TASK.md.tmpl.add/tooling/templates/dependencies.allowlist.tmpl.claude/skills/add/SKILL.md.claude/skills/add/adopt.md.claude/skills/add/deltas.md.claude/skills/add/fold.md.claude/skills/add/graduate.md.claude/skills/add/intake.md.claude/skills/add/loop.md.claude/skills/add/phases/0-setup.md.claude/skills/add/phases/1-specify.md.claude/skills/add/phases/2-scenarios.md.claude/skills/add/phases/3-contract.md.claude/skills/add/phases/4-tests.md.claude/skills/add/phases/5-build.md.claude/skills/add/phases/6-verify.md.claude/skills/add/phases/7-observe.md.claude/skills/add/report-template.md.claude/skills/add/run.md.claude/skills/add/scope.md.claude/skills/add/setup-review.md.claude/skills/add/streams.md.gitignoreAGENTS.mdCLAUDE.mdsrc/admin/metrics_setup.rssrc/client_registry.rssrc/command/connection.rssrc/replication/backlog.rssrc/replication/state.rssrc/runtime/channel.rssrc/runtime/mod.rssrc/runtime/race.rssrc/server/conn/handler_monoio/dispatch.rssrc/server/conn/handler_monoio/mod.rssrc/server/conn/handler_sharded/dispatch.rssrc/server/conn/handler_sharded/mod.rssrc/server/mod.rssrc/server/socket_opts.rssrc/shard/conn_accept.rssrc/shard/event_loop.rssrc/shard/shared_databases.rssrc/shard/spsc_handler.rssrc/shard/uring_handler.rssrc/vector/store.rstests/quickwins_red.rstests/quickwins_red_api.rstests/spsc_wake_floor_red.rstests/spsc_wake_floor_red_api.rs
| > **Purpose:** rewrite each rule from the spec as a concrete, pass-or-fail scenario. | ||
| > **Produces:** `features/<name>.feature`. | ||
| > **Person's job:** decide what "correct" looks like in concrete situations. **AI's job:** draft the scenarios. | ||
|
|
There was a problem hiding this comment.
Fix markdownlint violations in blockquote and fenced code blocks.
Line 8 has a blank line inside a blockquote (MD028), and Lines 31 and 41 start fenced blocks without language tags (MD040).
Proposed patch
> **Person's job:** decide what "correct" looks like in concrete situations. **AI's job:** draft the scenarios.
-
+>
> **Part of the specification bundle (v7).** In the default flow these scenarios are drafted by the AI alongside the spec, contract, and failing tests as **one bundle**, approved by a person **once** (the one approval), at the contract freeze — not signed off step by step. This chapter is how to get the scenarios *right*; [05 Contract](./05-step-3-contract.md) is where the bundle is frozen. See [11 Governance](./11-governance.md).
-```
+```gherkin
Scenario: <short name>
Given <starting situation>
When <action>
Then <expected result>
And <what must remain unchanged> # when relevant- +gherkin
Scenario: successful transfer
Given A has 100 and B has 0, both mine
When I transfer 30 from A to B
</details>
Also applies to: 31-31, 41-41
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 8-8: Blank line inside blockquote
(MD028, no-blanks-blockquote)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.add/docs/04-step-2-scenarios.md at line 8, Remove the blank line inside the
blockquote and add missing language tags to fenced code blocks: update the
blockquote to have no empty line between its lines, ensure the first fenced
block containing "Scenario: " uses gherkin as the opening fence and a matching closing fence, and change the second fenced block that starts with "Scenario: successful transfer" to also open with gherkin and have a
proper closing fence; locate these by the scenario text ("Scenario: " and "Scenario: successful transfer") and the surrounding blockquote.
</details>
<!-- fingerprinting:phantom:poseidon:hawk -->
<!-- cr-comment:v1:30a019b592a0cef6fdc04b8a -->
_Source: Linters/SAST tools_
<!-- This is an auto-generated comment by CodeRabbit -->
| > **Purpose:** fix the external shape of the feature — interfaces, data structures, names, and error cases — and freeze it. | ||
| > **Produces:** `contracts/<name>.md` (plus a mock and contract tests). | ||
| > **Person's job:** approve and freeze the shape. **AI's job:** generate the first draft, the mock, and the contract tests. | ||
|
|
There was a problem hiding this comment.
Address markdownlint MD028/MD040 in this chapter.
Line 8 uses a blank line inside a blockquote, and Lines 30/41 start fenced code blocks without language identifiers.
Proposed patch
> **Person's job:** approve and freeze the shape. **AI's job:** generate the first draft, the mock, and the contract tests.
-
+>
> **The one approval lands here (v7).** In the default flow the AI drafts spec, scenarios, this contract, and the failing tests as **one specification bundle**, and a person gives a **single approval at this freeze**. Freezing the contract is the one human gate of the bundle, not the third of three sign-offs; reject any part and the whole bundle returns to draft (backward correction, not failure). See [11 Governance](./11-governance.md).
-```
+```text
# contracts/<name>.md
<METHOD> <path> body: { <fields> }
200 -> { <success fields> }
4xx -> { error: "<code>" | "<code>" }
Schema: <tables/fields touched, and access pattern>
Status: FROZEN @ v<n>- +text
POST /transfers body: { fromAccountId, toAccountId, amount }
200 -> { transferId, fromBalance, toBalance }
</details>
Also applies to: 30-30, 41-41
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 8-8: Blank line inside blockquote
(MD028, no-blanks-blockquote)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.add/docs/05-step-3-contract.md at line 8, Remove the blank line inside the
blockquote on Line 8 and add language identifiers to the fenced code blocks
(e.g., change totext) so markdownlint MD028/MD040 are satisfied;
specifically, edit the blockquote in .add/docs/05-step-3-contract.md to
eliminate the internal blank line and update each triple-backtick fence around
the contract examples (the blocks showing "# contracts/.md" and the "POST
/transfers ..." examples) to use a language tag like "text".
</details>
<!-- fingerprinting:phantom:poseidon:hawk -->
<!-- cr-comment:v1:1219a9901baf4541d06014a0 -->
_Source: Linters/SAST tools_
<!-- This is an auto-generated comment by CodeRabbit -->
| > **Purpose:** turn the scenarios and contract into automated tests, and confirm they fail before any code exists. | ||
| > **Produces:** a failing (red) automated test suite. | ||
| > **Person's job:** set the targets and coverage. **AI's job:** generate the tests. | ||
|
|
There was a problem hiding this comment.
Remove the blank line inside the blockquote.
Line 8 triggers markdownlint MD028 (no-blanks-blockquote).
Proposed patch
> **Person's job:** set the targets and coverage. **AI's job:** generate the tests.
-
+>
> **Part of the specification bundle (v7).** In the default flow these tests are drafted by the AI as part of the specification **bundle** (spec · scenarios · contract · tests) and approved by a person **once**, at the contract freeze — the tests are part of what that one approval covers. They still must be **red before the build**. See [11 Governance](./11-governance.md).📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| > **Person's job:** set the targets and coverage. **AI's job:** generate the tests. | |
| > | |
| > **Part of the specification bundle (v7).** In the default flow these tests are drafted by the AI as part of the specification **bundle** (spec · scenarios · contract · tests) and approved by a person **once**, at the contract freeze — the tests are part of what that one approval covers. They still must be **red before the build**. See [11 Governance](./11-governance.md). |
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 8-8: Blank line inside blockquote
(MD028, no-blanks-blockquote)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.add/docs/06-step-4-tests.md at line 8, Remove the blank line inside the
markdown blockquote that triggered MD028 by deleting the empty line between the
consecutive lines that start with '>' so the blockquote has no blank lines
inside; ensure each paragraph within the blockquote begins with '>' and re-run
markdownlint to confirm the MD028 warning is resolved.
Source: Linters/SAST tools
|
|
||
| The instruction is explicit about constraints, because the constraints are what keep the speed safe. | ||
|
|
||
| ``` |
There was a problem hiding this comment.
Add language identifiers to fenced code blocks.
Lines 21 and 43 trigger markdownlint MD040 (fenced-code-language).
Proposed patch
-```
+```text
Read SPEC.md, contracts/<name>.md, and tests/<name>_test.py.
Implement the feature so that EVERY test passes.
Constraints:
@@
Report which tests pass and exactly what you changed.- +text
AI writes code → pipeline runs the tests → some still fail
→ AI iterates → ... → all green → hand to Verify
Also applies to: 43-43
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 21-21: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.add/docs/07-step-5-build.md at line 21, The markdown in
.add/docs/07-step-5-build.md contains fenced code blocks missing language
identifiers (MD040); update the two offending blocks (the opening
triple-backticks that currently have no language tag around the blocks
containing the SPEC/test instructions and the AI pipeline snippet) to include a
language identifier (e.g., add "text" after the opening ```). Locate the two
occurrences referenced in the review (around the snippets starting with "Read
SPEC.md..." and "AI writes code → pipeline runs...") and change the opening ```
to ```text for each so markdownlint MD040 is satisfied.
Source: Linters/SAST tools
| ## Project-level (set up once) | ||
|
|
||
| ### `CONVENTIONS.md` | ||
| ``` |
There was a problem hiding this comment.
Add language identifiers to fenced blocks.
These fences trigger markdownlint MD040. Add a language tag (e.g., text) to each opening fence for lint-clean docs.
Suggested diff pattern
-```
+```textAlso applies to: 22-22, 30-30, 40-40, 55-55, 64-64, 73-73, 82-82
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 12-12: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.add/docs/appendix-a-templates.md at line 12, Update each plain fenced code
block in the appendix-a-templates markdown by adding a language identifier to
the opening triple-backtick (change ``` to ```text) so the fences pass
markdownlint MD040; search for the raw ``` openings in the file (the ones
flagged in the review) and replace them with ```text for each occurrence.
Source: Linters/SAST tools
| ``` | ||
| ARC goal: <the milestone / project goal this decision serves> | ||
| done: <proven progress — tasks done · exit-criteria met · what this gate proves> | ||
| plan: <this gate → the next step → the goal> | ||
| ``` |
There was a problem hiding this comment.
Add fenced-code language tags to satisfy markdownlint.
Line 18 and Line 53 use unlabeled fenced blocks (MD040), which will keep docs lint noisy.
Suggested patch
-```
+```text
ARC goal: <the milestone / project goal this decision serves>
done: <proven progress — tasks done · exit-criteria met · what this gate proves>
plan: <this gate → the next step → the goal>- +text
SUMMARY one line: intent + target + where we are
DECISION what you need from the human (or "none — FYI")
⚠ FLAGS lowest-confidence first, why + cost-if-wrong
EVIDENCE small table: tests · gates · parity · check — engine-sourced
NEXT the single next action + what it unlocks
Also applies to: 53-59
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 18-18: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.claude/skills/add/report-template.md around lines 18 - 22, Two unlabeled
fenced-code blocks (the one starting with "ARC goal: <the milestone / project
goal this decision serves>" and the block starting with "SUMMARY one line:
intent + target + where we are") must be labeled to satisfy markdownlint MD040;
update both opening fences from ``` to a language tag such as ```text so each
fenced block includes a language identifier, leaving the block contents
unchanged and preserving spacing/indentation.
Source: Linters/SAST tools
| ``` | ||
| autonomy: auto | conservative | ||
| ``` |
There was a problem hiding this comment.
Specify a language for the fenced block at Line 139.
This block triggers markdownlint MD040; add a language tag (e.g., text) to keep doc lint clean.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 139-139: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.claude/skills/add/run.md around lines 139 - 141, Add a language tag to the
fenced code block containing the line "autonomy: auto | conservative" (the
triple-backtick block starting at the line with that text) to satisfy
markdownlint MD040; change the opening fence from ``` to something like ```text
(or ```yaml) so the block is language-specified while keeping the block contents
unchanged.
Source: Linters/SAST tools
| ``` | ||
| add.py status ─► READY-QUEUE ──spawn workers──► builds run ──► REVIEW-QUEUE ──► done | ||
| (deps=PASS?) (machine span) (concurrent) (decision points, | ||
| ▲ strictly serial) | ||
| └──────────────── a task gating PASS unblocks its dependents ──────────────┘ | ||
| ``` |
There was a problem hiding this comment.
Add a language tag to the fenced block.
Line 31 uses an unlabeled fenced block, which triggers markdownlint MD040.
Suggested fix
-```
+```text
add.py status ─► READY-QUEUE ──spawn workers──► builds run ──► REVIEW-QUEUE ──► done
(deps=PASS?) (machine span) (concurrent) (decision points,
▲ strictly serial)
└──────────────── a task gating PASS unblocks its dependents ──────────────┘</details>
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 31-31: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.claude/skills/add/streams.md around lines 31 - 36, The fenced code block in
.claude/skills/add/streams.md (the unlabeled triple-backtick block showing the
pipeline diagram) triggers markdownlint MD040; add a language tag (e.g.,
ensuring the diagram lines remain unchanged and only the fence is updated.
Source: Linters/SAST tools
Adversarial perf + safety review of PR #172 (two specialist passes): 0 blockers, 0 introduced regressions. In-PR fix: document that spsc_notify_wakes includes drain-cap self-re-notify wakes (flume bounded(1) tokens coalesce, so producer wakes are approximately notify_wakes - drain_renotify). Deferred findings (all pre-existing or out-of-scope) recorded in TASK.md s7: AOF connection-path LSN lock (completes QW3), CLIENT LIST/INFO redundant registry lock, drain consumer round-robin, double mark_deleted_for_key, pending_wakers relay removal with task 3. author: Tin Dang
… passes) - CLIENT INFO derived its flags by reloading e.live.flags (stale bits frozen at the previous touch) instead of from current conn state; now mirrors the CLIENT LIST path in both handler_monoio and handler_sharded dispatch. - create_reuseport_listener silently dropped the TCP_NODELAY setsockopt result on the uring listener; now logs a warning (non-fatal, but QW1 latency behavior would be silently lost). author: Tin Dang
CodeRabbit flagged state.json/MILESTONE.md drift: check off spsc-wake-floor and the p99<1ms exit criterion (both gated PASS with evidence), and record the discovered consistency-suite gap on main against the 132/132 criterion. The state.json milestone stage field is add.py-owned machine state and is left to the tooling. author: Tin Dang
Review pass summary (2 specialist passes + CodeRabbit)Adversarial perf review and safety/concurrency review both returned 0 blockers, 0 regressions introduced. Key confirmations:
Fixed in-PR (5118a26, e6e95e7, 78ad88d):
Deferred to follow-up (recorded in
|
Summary
Two gated tasks from the 2026-06 architecture deep review (milestone
v1-shared-nothing), built with red/green TDD under ADD:Task 1 —
hotpath-lock-quickwins(ddcb6bc)Eliminates per-command global locks and accept-path overhead (review themes 1/5/6). Includes a latent fix found at task-2 verify:
uring_handler.rscalled.pushon aVecDeque(Linux+tokio-only path that never compiled on macOS — would have failed CI on its own).Task 2 —
spsc-wake-floor(9861407 → 4ecf285)The monoio shard loop's only await point was the 1ms periodic tick, so every cross-shard hop paid up to 1ms on the target shard plus up to 1ms on the origin's reply relay. Now:
src/runtime/race.rs(new): hand-rolled allocation-free two-arm race future (nomonoio::select!— it leaks per re-entry).race2(tick, spsc_notify.notified())— messages drain on arrival; cadence work (WAL flush, clock, sub-timers) stays timer-exclusive.OneshotReceiverraced against 100ms chunked sleeps, 30s cap; identical error strings) instead of the per-tickpending_wakerssweep.spsc_notify_wakes/spsc_drain_renotify.syncdelivers them (proven at runtime byswf0on kqueue and io_uring).Benchmarks (OrbStack Linux VM, monoio release, vs task-1 baseline)
Single-shard throughput also +63–80% (the tick-only loop was oversleeping local work). GET fast-path unchanged (within VM noise) — cross-shard reads never used SPSC.
Test plan
swf0,swf_a3), race2 semantics, INFO counters, burst completeness, drain-cap unit test, 700-conn#[ignore]drain-cap load test (green on both platforms), cadence pins (PX expiry + kill-9 WAL recovery)scripts/test-consistency.sh: branch == main exactly (15 pre-existing FAILs exist on main; zero new)-D warningson both feature sets, both platformsKnown pre-existing issues surfaced (not introduced here)
test_txn_commit_wal_crash_recoveryflakes 1/15 on baseline AND branch (BGREWRITEAOF/kill race) — follow-up candidateEvidence:
.add/tasks/{hotpath-lock-quickwins,spsc-wake-floor}/(TASK.md §6, bench-results.txt)Summary by CodeRabbit
New Features
Optimizations
Documentation