feat:(reproduce-drawer): add reproduce drawer frontend and benchmark_environment table with ingest #346

Open
rafaykhan-source wants to merge 4 commits into SemiAnalysisAI:master from rafaykhan-source:issue-270

@rafaykhan-source
Contributor

Reproduce Drawer

Issue Requirements

A drawer that opens from any benchmark point or row and shows exactly how to reproduce that number: the framework launch command, the full config JSON, and the environment it ran in (framework SHA, container tag, driver, CUDA, GPU SKU).

Requirements

  1. Triggered from scatter-point clicks and data-table row clicks. Coexists cleanly with the existing pinned-tooltip and zoom behavior on the chart, ideally a link or button on the tooltip.

The drawer opens from a Reproduce button in the tooltip, and data-table rows are now clickable and open the reproduce drawer.

  1. Three tabs: command, config JSON, environment. Each has a copy button.

There are three tabs with a shared copy button that will copy the contents of the active tab.

Reproduce Tooltip:
image

  1. Command tab is framework-aware and covers every framework in the registry. Frameworks without a well-defined launch command show a clear fallback pointing the user to the config JSON.

Launch Command:
image

Launch Command Fallback:
image

  1. Launch-command generation is a pure function from config + framework to CLI string, so it is unit-testable per framework and reusable for future diffing.

The pure function is named buildLaunchCommand.
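A minimal sketch of what a pure config-to-CLI function of this shape could look like. The `BenchmarkConfig` fields, the `LaunchCommandResult` type, the placeholder format, and the exact flags emitted are assumptions for illustration, not the code in this PR. Only two frameworks are sketched.

```typescript
// Hypothetical config shape: the real config schema in this PR may differ.
type BenchmarkConfig = {
  model?: string;
  tensorParallel?: number;
  port?: number;
};

// Hypothetical result type: either a CLI string or a fallback message.
type LaunchCommandResult =
  | { kind: "command"; command: string }
  | { kind: "fallback"; message: string };

type Generator = (config: BenchmarkConfig) => string;

// Missing fields become visible placeholders instead of being silently dropped.
const placeholder = (name: string): string => `<${name}>`;

// One generator per framework. A Map avoids accidental hits on Object.prototype keys.
const generators = new Map<string, Generator>([
  [
    "vllm",
    (c) =>
      [
        "vllm serve",
        c.model ?? placeholder("model"),
        `--tensor-parallel-size ${c.tensorParallel ?? placeholder("tensor-parallel-size")}`,
        `--port ${c.port ?? placeholder("port")}`,
      ].join(" "),
  ],
  [
    "sglang",
    (c) =>
      [
        "python -m sglang.launch_server",
        `--model-path ${c.model ?? placeholder("model")}`,
        `--tp ${c.tensorParallel ?? placeholder("tp")}`,
        `--port ${c.port ?? placeholder("port")}`,
      ].join(" "),
  ],
]);

// Pure: same inputs always give the same output, with no I/O or shared state.
function buildLaunchCommand(
  framework: string,
  config: BenchmarkConfig,
): LaunchCommandResult {
  const generate = generators.get(framework.toLowerCase());
  if (generate === undefined) {
    return {
      kind: "fallback",
      message: `No single launch command for "${framework}"; see the Config JSON tab.`,
    };
  }
  return { kind: "command", command: generate(config) };
}
```

Because the function takes only `framework` and `config`, each framework can be unit-tested by comparing the returned string against an expected command, and two configs can later be diffed by comparing their outputs.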

  1. Drawer closes on Esc and outside-click without losing chart zoom or other URL state.
  2. Link out to the run's server log.

The server log link sits alongside the shared copy button on each tab:
image

  1. Analytics: drawer-open and copy events, with the framework as a property on copy events.

Analytics were added and track open and copy events with framework as a property.
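One way such events could be typed so that a copy event cannot be sent without its framework. The event names `reproduce_drawer_opened` and `reproduce_copy` come from the commit message; the `source` and `tab` properties, the `Sink` type, and `makeTracker` are hypothetical names used only for this sketch.

```typescript
type ReproduceTab = "command" | "config" | "environment";

// Discriminated union: the compiler rejects a reproduce_copy event without a framework.
type ReproduceEvent =
  | {
      name: "reproduce_drawer_opened";
      // Hypothetical property recording which entry point opened the drawer.
      props: { source: "scatter" | "gpu_graph" | "table" };
    }
  | {
      name: "reproduce_copy";
      // framework is required by the issue; tab is a hypothetical extra.
      props: { framework: string; tab: ReproduceTab };
    };

// Hypothetical sink signature standing in for the app's analytics client.
type Sink = (name: string, props: Record<string, string>) => void;

// Returns a tracker that forwards typed events to the untyped sink.
function makeTracker(sink: Sink): (event: ReproduceEvent) => void {
  return (event) => sink(event.name, { ...event.props });
}
```

In a test, the sink can be a function that pushes into an array, which makes the emitted events easy to assert on without a real analytics backend.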

Additional Screenshots:

Config JSON Tab:
image

Environment Tab:
image

Additional Notes

  1. Only some of the fields requested in the issue (framework SHA, container tag, driver, CUDA, GPU SKU) are available in the server logs. Regexes parse the logs to populate the benchmark_environments table (the migration is accompanied by a pnpm backfill command).

The schema for benchmark_environments is as follows:
image

There are two methods for sourcing the information. The first, env_json, relies on a possible upstream CI change that would emit this information alongside the server logs. The fallback, log_parse, tries to extract the environment info from the server logs themselves.
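The two-source logic can be sketched roughly as follows. The field names, the `resolveEnvironment` and `parseEnvironmentFromLog` helpers, and the regex patterns are assumptions for illustration. The patterns match the `nvidia-smi` header format, which may or may not appear in the actual server logs.

```typescript
// Hypothetical field names: the real benchmark_environments schema may differ.
type EnvironmentFields = {
  frameworkSha?: string;
  containerTag?: string;
  driverVersion?: string;
  cudaVersion?: string;
  gpuSku?: string;
};

type EnvironmentRecord = EnvironmentFields & {
  source: "env_json" | "log_parse";
};

// Illustrative patterns only: real server logs may format these values differently.
const LOG_PATTERNS: Array<[keyof EnvironmentFields, RegExp]> = [
  ["driverVersion", /Driver Version:\s*([\d.]+)/],
  ["cudaVersion", /CUDA Version:\s*([\d.]+)/],
];

// Best-effort extraction: fields whose pattern does not match are left unset.
function parseEnvironmentFromLog(log: string): EnvironmentFields {
  const fields: EnvironmentFields = {};
  for (const [key, pattern] of LOG_PATTERNS) {
    const match = pattern.exec(log);
    if (match) fields[key] = match[1];
  }
  return fields;
}

// Prefer the structured env JSON when present and valid, else fall back to log parsing.
function resolveEnvironment(
  envJson: string | null,
  serverLog: string,
): EnvironmentRecord {
  if (envJson !== null) {
    try {
      const parsed = JSON.parse(envJson) as EnvironmentFields;
      return { ...parsed, source: "env_json" };
    } catch {
      // Malformed env JSON: fall through to log parsing.
    }
  }
  return { ...parseEnvironmentFromLog(serverLog), source: "log_parse" };
}
```

Recording the `source` on each row keeps it clear which values came from structured CI output and which were recovered by regex, so the log_parse rows can be re-derived if the patterns change.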

If the CI change is not on the roadmap or not worth doing, I can drop the fields and code that depend on it and stick with a new table that only extracts what can be recovered from server_logs.

github-actions Bot and others added 4 commits May 13, 2026 19:23
Adds a 3-tab drawer (Command / Config JSON / Environment) that opens from
scatter pinned tooltip, GPU graph tooltip, or inference table row, showing
exactly how a benchmark was produced.

- Pure `buildLaunchCommand(framework, config)` library with per-framework
  generators for vllm, sglang, trt (with the trtllm alias). Disagg configs
  emit two stitched commands (prefill / decode workers). Compound stacks
  (Dynamo, ATOM, MoRI) intentionally show a clear fallback that points at
  the Config JSON tab.
- ReproduceDrawer reuses the existing right-side dialog pattern, with per-
  tab Copy buttons, a Server log link, and Esc + outside-click close that
  doesn't perturb chart zoom or URL state.
- Drawer state lives in InferenceContext so it is reachable from scatter,
  GPU graph, and the table without prop-drilling.
- Analytics: reproduce_drawer_opened, reproduce_copy (with framework),
  reproduce_drawer_open_clicked, reproduce_server_log_clicked,
  inference_table_reproduce_clicked.
- Tests: 21 unit tests for buildLaunchCommand cover every framework,
  disagg, compound fallbacks, alias resolution, missing-field placeholders;
  4 new tooltipUtils tests assert the Reproduce button only appears when
  pinned; new E2E spec exercises the table-row entry point and the
  unofficial-run overlay path.

Closes SemiAnalysisAI#270

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@vercel

vercel Bot commented May 13, 2026

@rafaykhan-source is attempting to deploy a commit to the SemiAnalysisAI Team on Vercel.

A member of the Team first needs to authorize it.

rafaykhan-source marked this pull request as ready for review May 13, 2026 23:55
@vercel

vercel Bot commented May 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| inferencemax-app | Ready | Preview, Comment | May 14, 2026 0:05am |
