
(Replaced by #1354) Update MiniMax M2.5 FP8 H200 vLLM agg recipes #1298

Closed
anish-shanbhag wants to merge 2 commits into SemiAnalysisAI:main from anish-shanbhag:ashan/port-inferencemax-53-minimax-h200-no-slurm-shared

Conversation

Collaborator

@anish-shanbhag anish-shanbhag commented May 7, 2026

Update the MiniMax-M2.5 FP8 H200 vLLM recipe to the vllm/vllm-openai:v0.20.1-ubuntu2404 image.

Set the vLLM serving options in benchmarks/single_node/minimaxm2.5_fp8_h200.sh: a generated max-model-len for benchmark runs, the previous max-model-len handling for eval runs, FP8 KV cache, FlashInfer attention and autotuning, Triton MoE, and MiniMax QK norm fusion.
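
As a rough illustration of how a few of the serving options named in this description could be assembled in a bash script, here is a minimal sketch. It is not the contents of minimaxm2.5_fp8_h200.sh. The model ID, the tensor-parallel size, and the variable names (MODEL, TP, MAX_MODEL_LEN, VLLM_ARGS) are assumptions made for illustration. Only the widely documented vLLM flags --tensor-parallel-size, --kv-cache-dtype, and --max-model-len are used; the FlashInfer, Triton MoE, and QK norm fusion settings are left out because the PR text does not say how they are configured. The script prints the command rather than launching a server.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical defaults: none of these values are stated in the PR text.
MODEL="${MODEL:-MiniMaxAI/MiniMax-M2.5}"
TP="${TP:-8}"
MAX_MODEL_LEN="${MAX_MODEL_LEN:-}"

# Collect arguments in an array so values are never re-split by the shell.
VLLM_ARGS=(
  serve "$MODEL"
  --tensor-parallel-size "$TP"
  --kv-cache-dtype fp8
)

# Pass --max-model-len only when the caller supplied a value;
# otherwise vLLM falls back to the length from the model config.
if [[ -n "$MAX_MODEL_LEN" ]]; then
  VLLM_ARGS+=(--max-model-len "$MAX_MODEL_LEN")
fi

# Print the assembled command instead of executing it.
printf 'vllm'
printf ' %q' "${VLLM_ARGS[@]}"
printf '\n'
```

Running it with MAX_MODEL_LEN set in the environment appends the corresponding flag; leaving it unset omits the flag, which is one simple way to let benchmark and eval runs use different length handling.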

@anish-shanbhag anish-shanbhag marked this pull request as ready for review May 8, 2026 23:48
@anish-shanbhag anish-shanbhag requested a review from a team May 8, 2026 23:48
Contributor

@claude claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@anish-shanbhag anish-shanbhag force-pushed the ashan/port-inferencemax-53-minimax-h200-no-slurm-shared branch 3 times, most recently from e5688c8 to 8b72d09 Compare May 12, 2026 01:06
@anish-shanbhag anish-shanbhag force-pushed the ashan/port-inferencemax-53-minimax-h200-no-slurm-shared branch from 8b72d09 to 3dea91d Compare May 12, 2026 01:08
@anish-shanbhag anish-shanbhag force-pushed the ashan/port-inferencemax-53-minimax-h200-no-slurm-shared branch from b66321f to 3dea91d Compare May 12, 2026 17:56
Contributor

@functionstackx functionstackx left a comment


revert github.token change


- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
  with:
    token: ${{ secrets.REPO_PAT }}
Contributor


@anish-shanbhag can u revert this plz

Collaborator Author


Sorry @functionstackx, CI was failing here since the source branch is from a fork; I've opened #1354 to supersede this PR

Collaborator Author

Closing in favor of #1354, which no longer uses a fork branch.

@anish-shanbhag anish-shanbhag changed the title from "Update MiniMax M2.5 FP8 H200 vLLM agg recipes" to "(Replaced by #1354) Update MiniMax M2.5 FP8 H200 vLLM agg recipes" on May 12, 2026