Skip to content

Refresh Linux host artifacts#125

Open
div0rce wants to merge 6 commits into
mainfrom
perf/linux-host-artifact-refresh
Open

Refresh Linux host artifacts#125
div0rce wants to merge 6 commits into
mainfrom
perf/linux-host-artifact-refresh

Conversation

@div0rce

@div0rce div0rce commented Jun 17, 2026

Copy link
Copy Markdown
Owner

Summary

  • Refresh the post-M49 Linux host artifacts from Fedora Asahi Linux for DPDK, NIC offload/timestamping, perf stat/report, NUMA/affinity, false sharing, socket profiling/load/stress, and crash recovery.
  • Add generated-artifact publishing normalization so raw tool output no longer leaves trailing horizontal whitespace or trailing blank EOF lines in committed reports.
  • Update project progress/memory docs to reflect that M49 is merged, this is a Linux artifact refresh follow-up, and issue M29 follow-up: generate full Linux hardware PMU perf artifacts #90 remains open because this host does not expose the required cache PMU counters.

Evidence Boundaries

  • nic_offload_environment.txt is a read-only capability observation for wld0 Broadcom BCM4387 via brcmfmac; no offload/RSS/timestamp/driver state was changed and no NIC latency benchmark ran.
  • dpdk_environment.txt records linux-missing-dpdk; no hugepages, device binding, or packet-path benchmark ran.
  • perf_stat_linux.txt is partial PMU evidence only: cycles/instructions/branches/branch-misses are visible, but cache-reference/cache-miss counters are unsupported on this host.
  • Socket artifacts are loopback constrained and do not exercise NIC, driver, routing, or real-network behavior.

Verification

  • git diff --check
  • find scripts -maxdepth 1 -name '*.sh' -exec bash -n {} \;
  • shellcheck scripts/qsl_common.sh scripts/dpdk_environment_check.sh scripts/nic_offload_check.sh scripts/perf_stat.sh scripts/perf_record.sh scripts/numa_affinity_study.sh scripts/profile_gateway_io.sh scripts/socket_load.sh scripts/socket_stress.sh scripts/crash_recovery_validation.sh
  • make check with host socket access: all 240 tests passed

Note: a sandboxed make check run failed to create the TCP loopback listener and was interrupted; the host-level rerun passed.

Summary by CodeRabbit

  • Documentation

    • Updated development roadmap and progress tracking; milestone M49 (NIC offload and low-latency networking study) marked complete; Linux host artifact refresh phase initiated
    • Regenerated benchmark results and research artifacts reflecting new environment
  • Chores

    • Improved artifact publishing workflow with automatic output formatting

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Currently processing new changes in this PR. This may take a few minutes, please wait...

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: efcdeef6-6fd0-4bfd-ad98-bcf4d273c130

📥 Commits

Reviewing files that changed from the base of the PR and between 5fc8d2c and 3731057.

📒 Files selected for processing (4)
  • HANDOFF.md
  • PROGRESS.md
  • results/nic_offload_environment.txt
  • scripts/qsl_common.sh
 ___________________________________________________________________
< That's not technical debt - that's technical *predatory lending*. >
 -------------------------------------------------------------------
  \
   \   (\__/)
       (•ㅅ•)
       /   づ
📝 Walkthrough

Walkthrough

Adds qsl_publish_artifact to scripts/qsl_common.sh to trim trailing whitespace and blank lines before atomically writing output files. All benchmark/study scripts are migrated from mv to this new helper. All study result artifacts are regenerated on a Fedora Asahi Linux AArch64 host, and milestone/roadmap docs record M49 as merged with a new follow-up branch active.

Changes

Linux host artifact refresh and publish helper

Layer / File(s) Summary
qsl_publish_artifact helper function
scripts/qsl_common.sh
New qsl_publish_artifact(tmp, out) function validates arguments, trims trailing whitespace per line, strips trailing blank lines, and atomically moves the cleaned temp file to the destination.
Script migrations to qsl_publish_artifact
scripts/crash_recovery_validation.sh, scripts/dpdk_environment_check.sh, scripts/nic_offload_check.sh, scripts/numa_affinity_study.sh, scripts/perf_record.sh, scripts/perf_stat.sh, scripts/profile_gateway_io.sh, scripts/socket_load.sh, scripts/socket_stress.sh
All nine scripts replace their final mv "$TMP_OUT" "$OUT" with qsl_publish_artifact "$TMP_OUT" "$OUT", including both normal and failure/unsupported-host paths.
Study docs updated for Fedora Asahi Linux
docs/dpdk_research.md, docs/nic_offload_study.md
dpdk_research.md replaces the macOS unsupported-host example with a linux-missing-dpdk Fedora Asahi classification; nic_offload_study.md replaces the macOS framing with a Linux read-only capability observation for wld0 (brcmfmac).
Regenerated results artifacts on Fedora Asahi Linux
results/dpdk_environment.txt, results/nic_offload_environment.txt, results/perf_stat_linux.txt, results/perf_report_linux.txt, results/numa_affinity_study.txt, results/socket_load_summary.txt, results/socket_profile_loopback.txt, results/socket_stress_summary.txt, results/crash_recovery_validation.txt, results/false_sharing_study.txt
All result files are regenerated with AArch64/Asahi Linux environment metadata and new measured values; DPDK and NIC offload results change classification from macOS unsupported-host to Linux-specific classes; perf stat/report now contain real PMU counter output.
Milestone and roadmap documentation
MILESTONES.md, PROGRESS.md, HANDOFF.md, AGENTS.md, CLAUDE.md
M49 marked merged in MILESTONES.md; PROGRESS.md adds a [2026-06-17] decision log entry and updates M48/M49 status rows; HANDOFF.md shifts active work to the artifact-refresh follow-up branch; AGENTS.md and CLAUDE.md record M49 completion and the new branch.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

  • M29 follow-up: generate full Linux hardware PMU perf artifacts #90: This PR regenerates Linux perf artifacts (perf_stat_linux.txt, perf_report_linux.txt) on Fedora Asahi and explicitly documents that issue #90 (missing required cache PMU counters on this host) remains unresolved, directly referencing it as a blocker in the updated PROGRESS.md.

Possibly related PRs

  • div0rce/quant-systems-lab#111: This PR adds qsl_publish_artifact to scripts/qsl_common.sh and migrates all scripts to use it — a direct extension of the pattern introduced when qsl_common.sh was first created and scripts were refactored to source it.
  • div0rce/quant-systems-lab#114: This PR modifies scripts/numa_affinity_study.sh to call qsl_publish_artifact in both its write_unsupported_artifact and main completion paths — directly touching the script introduced in PR #114.
  • div0rce/quant-systems-lab#117: The change to scripts/crash_recovery_validation.sh (switching mv to qsl_publish_artifact) is a direct follow-up to the crash-recovery validation harness first added in PR #117.

Suggested reviewers

  • codescene-delta-analysis

Poem

🐇 Hoppity-hop, a new helper appears,
Trimming the whitespace and calming our fears!
From macOS unsupported to Linux ARM land,
Fedora Asahi results now freshly in hand.
M49 merged, the milestone log gleams —
The rabbit refreshes all artifacts, it seems! 🌿

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Description check ❓ Inconclusive The PR description provides a clear summary, evidence boundaries, and verification steps, though it deviates from the template structure (missing 'Milestone', 'Definition of Done', and 'Tests' sections in the specified format). Consider restructuring to align with the template: explicitly state milestone, use checkbox format for DoD items, and format test commands as specified in the template.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Refresh Linux host artifacts' accurately captures the main purpose of this PR, which refreshes multiple generated artifacts from a Fedora Asahi Linux host.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/linux-host-artifact-refresh

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

@div0rce div0rce marked this pull request as ready for review June 17, 2026 18:53
@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown

Note

Unit test generation is a public access feature. Expect some limitations and changes as we gather feedback and continue to improve it.


Generating unit tests... This may take up to 20 minutes.

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown

✅ Created PR with unit tests: #126

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/qsl_common.sh`:
- Around line 193-208: The temporary file created with mktemp on line 193 is
placed in the system's default temporary directory, which may be on a different
filesystem than the destination. This causes the mv command on line 207 to
potentially perform a non-atomic copy+unlink operation instead of an atomic
rename, risking incomplete writes if interrupted. Modify the mktemp call to
create the temporary file in the same directory as the destination file by using
the -p flag with the directory containing the output file, ensuring the
subsequent mv operation remains atomic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 895f1234-28c3-44ba-ba5c-7addca298d43

📥 Commits

Reviewing files that changed from the base of the PR and between d8c16b2 and 5fc8d2c.

📒 Files selected for processing (27)
  • AGENTS.md
  • CLAUDE.md
  • HANDOFF.md
  • MILESTONES.md
  • PROGRESS.md
  • docs/dpdk_research.md
  • docs/nic_offload_study.md
  • results/crash_recovery_validation.txt
  • results/dpdk_environment.txt
  • results/false_sharing_study.txt
  • results/nic_offload_environment.txt
  • results/numa_affinity_study.txt
  • results/perf_report_linux.txt
  • results/perf_stat_linux.txt
  • results/socket_load_summary.txt
  • results/socket_profile_loopback.txt
  • results/socket_stress_summary.txt
  • scripts/crash_recovery_validation.sh
  • scripts/dpdk_environment_check.sh
  • scripts/nic_offload_check.sh
  • scripts/numa_affinity_study.sh
  • scripts/perf_record.sh
  • scripts/perf_stat.sh
  • scripts/profile_gateway_io.sh
  • scripts/qsl_common.sh
  • scripts/socket_load.sh
  • scripts/socket_stress.sh

Comment thread scripts/qsl_common.sh Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5fc8d2cec7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread results/nic_offload_environment.txt Outdated
Comment on lines +54 to +56
link/ether 7e:c8:91:10:5e:af brd ff:ff:ff:ff:ff:ff permaddr 74:a6:cd:93:74:cc promiscuity 0 allmulti 0 minmtu 68 maxmtu 1500 netns-immutable addrgenmode none numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536 parentbus pci parentdev 0000:01:00.0
altname wlp1s0f0
altname wlx74a6cd9374cc

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Redact host MAC addresses from artifacts

When this artifact is committed or shared publicly, the raw ip -details output exposes stable Wi-Fi hardware identifiers (permaddr and the wlx... altname), which violates the repo's no-secrets/no-host-identifiers expectation for generated evidence. Please sanitize MAC addresses before publishing the NIC capability report so future refreshes do not leak device-specific identifiers.

Useful? React with 👍 / 👎.

Comment thread PROGRESS.md
Linux host artifacts. `git diff --check`, Bash syntax checks, ShellCheck for the touched artifact
scripts, and host-level `make check` passed. Draft PR #125 is open for review:
<https://github.com/div0rce/quant-systems-lab/pull/125>.
- **Next action:** review draft PR #125; do not merge from automation.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep resume instructions in sync

This new next action points agents at draft PR #125, but the same file still ends with a Next action remains section telling them to wait on M49 PR #124, and HANDOFF.md still lists completed M47 as the top current priority. Because the repo tells agents to reconstruct state from these files on resume, these contradictory instructions can send the next session back to already-completed work instead of the artifact-refresh review.

Useful? React with 👍 / 👎.

Address review findings on PR #125:

- qsl_publish_artifact now creates its cleaned temp file in the
  destination directory, so the final mv is an atomic same-filesystem
  rename rather than a cross-filesystem copy+unlink that could leave a
  partially written artifact if interrupted.
- qsl_publish_artifact now redacts host MAC identifiers (link/ether,
  permaddr, and the wlx<mac> altname) before publishing, so generated
  evidence can no longer leak stable hardware identifiers (golden
  rule #15: no secrets in git).
- Redact the Wi-Fi MAC / permaddr / altname already present in
  results/nic_offload_environment.txt.
- Reconcile stale resume state: PROGRESS.md "Next action remains" and
  HANDOFF.md priority order now point at PR #125 instead of the merged
  M47/M49 work.

The committed nic artifact was sanitized in place to remove the
identifiers immediately; the Linux host artifacts should be regenerated
on that host to refresh source-digest provenance and apply the new
publish-time MAC sanitization.

@codescene-delta-analysis codescene-delta-analysis Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No application code in the PR — skipped Code Health checks.

See analysis details in CodeScene

Quality Gate Profile: Pay Down Tech Debt
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.

div0rce added a commit that referenced this pull request Jun 18, 2026
Point-in-time limitations audit of Quant Systems Lab for the PR #124/#125
state. Records the three PR #125 review findings (host-MAC leak,
non-atomic publish, stale resume docs), their on-branch fix
(perf/linux-host-artifact-refresh, commit 3731057), and step-by-step
instructions for the Linux agent to finish the artifact regeneration
that macOS cannot run. Host identifiers are described, not reproduced.
Handoff/working doc; not intended for merge.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 373105794f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Git commit (informational): 4d8d13b
Source digest: sha256:f05d1a186fb93fdde51c8d797cc84f56567880ef15e70fb6f4e7f96f29ffee2f
Git commit (informational): fd99ad8
Source digest: sha256:c7f574e1260e1f7bdc475423b302108e2708fe5ba8d3e115ae9d38f003a0b978

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Regenerate artifacts after publisher changes

The refreshed artifact records this source digest as clean provenance, but the current tree no longer hashes to it: recomputing the crash-recovery-validation digest from the committed inputs now yields sha256:380930b985462931bf26668fd2cea017698a961958df3e28b1eb9d0bba53033d, not the stored value, and the same stale-digest mismatch shows up across the refreshed result files. Because these Source digest fields are the authoritative artifact identity after the provenance migration, leaving them stale makes consumers treat evidence generated by an older generator state as if it matches this commit, especially after scripts/qsl_common.sh changed publishing behavior.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant