
fix(test): apply aux-layer id fix to integration test (follow-up to #89) #90

Merged
yubofredwang merged 1 commit into main from
ywang/fix-vllm-integration-aux-final-layer-id
Apr 28, 2026

Conversation

@yubofredwang
Collaborator

Summary

Follow-up to #89. The integration runner tests/test_vllm_engine_integration.py (script-style, not pytest-collected) duplicated the same off-by-one in two places:

  • get_aux_layer_ids built post-layer ids [1, num_layers // 2 - 1, num_layers - 4] but appended num_layers - 1 for the final slot, never applying the +1 shift.
  • main() did the same when --aux-layers was passed.

Without this fix, any manual end-to-end run of the integration script would have captured the wrong final-layer slot too, masking the issue from spot validation.

Changes

Both call sites now translate TorchSpec post-layer ids to vllm's capture-after-layer convention via the +1 shift, and append num_hidden_layers (the pre-norm slot) for last_hidden_states. Behavior matches the fixed VllmEngine.init from #89.
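
A minimal sketch of the translation both call sites now apply (the helper name to_vllm_capture_ids is illustrative, not code from this PR):

def to_vllm_capture_ids(post_layer_ids: list[int], num_hidden_layers: int) -> list[int]:
    """Map TorchSpec post-layer ids to vllm capture-after-layer indices."""
    # vllm's hook captures hidden states at index layer_idx + 1 after each
    # decoder layer runs, so every TorchSpec post-layer id shifts by +1.
    capture_ids = [lid + 1 for lid in post_layer_ids if lid < num_hidden_layers]
    # Index num_hidden_layers is the pre-norm slot used as last_hidden_states.
    if num_hidden_layers not in capture_ids:
        capture_ids.append(num_hidden_layers)
    return capture_ids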

Test plan

Refs: #87, #89

Copilot AI review requested due to automatic review settings April 28, 2026 06:57
`tests/test_vllm_engine_integration.py` is a script-style runner (uses
`main()` with argparse, not pytest-collected) that exercises the full
extract_hidden_states + MooncakeHiddenStatesConnector path.  It
duplicated the same off-by-one fixed in #89 in two places:

- `get_aux_layer_ids` built `[1, num_layers // 2 - 1, num_layers - 4]`
  in TorchSpec post-layer semantics but appended `num_layers - 1` for
  the final-layer slot, missing the +1 shift entirely.
- `main()` did the same when the user overrode aux layers via CLI.

Without this fix, any end-to-end run of the integration script would
have captured the wrong final-layer slot too, hiding the issue from
manual validation.

This commit translates both call sites to the corrected convention:
TorchSpec post-layer ids are shifted +1 to match vllm's
capture-after-layer hook, and `num_hidden_layers` (the pre-`norm` slot)
is appended for `last_hidden_states`.  Behavior now matches the fixed
`VllmEngine.init`.

Follow-up to #89.

Signed-off-by: Yubo Wang <yubowang2019@gmail.com>
Contributor

Copilot AI left a comment


Pull request overview

Updates the integration runner to match vLLM’s capture-after-layer indexing so last_hidden_states resolves to the pre-norm slot (follow-up to #89).

Changes:

  • Shift default Eagle3 TorchSpec post-layer IDs by +1 to match vLLM capture indices.
  • When --aux-layers is provided, shift user-provided post-layer IDs by +1, filter out-of-range, and append num_hidden_layers for last_hidden_states.
  • Expand get_aux_layer_ids docstring to document the indexing convention and final-slot behavior.


Comment on lines 50 to 68
 def get_aux_layer_ids(model_path: str) -> list[int]:
-    """Replicate VllmEngine's aux layer resolution: default Eagle3 layers + final layer."""
+    """Replicate VllmEngine's aux layer resolution: default Eagle3 layers + final layer.
+
+    See ``VllmEngine.init`` for the canonical implementation. vllm's hidden
+    state hook captures index ``layer_idx + 1`` after each layer runs, so
+    valid capture indices are ``[0, num_hidden_layers]``; index
+    ``num_hidden_layers`` is the pre-``norm`` slot used as
+    ``last_hidden_states`` for target logit computation.
+    """
     cfg = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
     cfg = getattr(cfg, "text_config", cfg)
     num_layers = cfg.num_hidden_layers
-    # Default Eagle3 aux layers
-    aux_ids = [1, num_layers // 2 - 1, num_layers - 4]
-    # VllmEngine appends the final layer for last_hidden_states capture
-    final_layer = num_layers - 1
-    if final_layer not in aux_ids:
-        aux_ids.append(final_layer)
+    # TorchSpec post-layer ids for default Eagle3 aux layers, shifted +1 to
+    # match vllm's capture-after-layer convention.
+    aux_ids = [lid + 1 for lid in [1, num_layers // 2 - 1, num_layers - 4]]
+    # Append the post-last-layer / pre-norm slot for last_hidden_states capture.
+    if num_layers not in aux_ids:
+        aux_ids.append(num_layers)
     return aux_ids, cfg.hidden_size, num_layers
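
As a worked example, for a hypothetical 32-layer model the defaults resolve to post-layer ids [1, 15, 28], shifted to capture indices [2, 16, 29], with 32 (the pre-norm slot) appended, so aux_ids ends up as [2, 16, 29, 32].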

Copilot AI Apr 28, 2026


get_aux_layer_ids is annotated as returning list[int], but it returns a 3-tuple (aux_ids, hidden_size, num_layers) (and is unpacked that way in main). Update the return type annotation (and ideally the docstring summary) to match the actual return value to avoid misleading type usage.
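
A sketch of the annotation fix being suggested (illustrative, not part of this PR's diff; the tuple order follows how main() unpacks the result):

def get_aux_layer_ids(model_path: str) -> tuple[list[int], int, int]:
    """Resolve aux capture ids; returns (aux_ids, hidden_size, num_hidden_layers)."""
    ...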

Comment on lines +270 to +274
+        # User passes TorchSpec post-layer ids; shift +1 to vllm's
+        # capture-after-layer convention and append the pre-norm slot.
+        aux_layer_ids = [lid + 1 for lid in args.aux_layers if lid < num_layers]
+        if num_layers not in aux_layer_ids:
+            aux_layer_ids.append(num_layers)

Copilot AI Apr 28, 2026


The CLI help for --aux-layers still reads as generic “aux layer IDs”, but this block now explicitly interprets them as TorchSpec post-layer IDs and shifts them by +1. Please update the argparse help= text to state the expected convention (post-layer IDs) and that the script auto-appends the pre-norm slot (num_hidden_layers) so users don’t pass already-shifted vLLM capture indices by mistake.
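
One possible wording for that help text (illustrative only; the flag's type and nargs settings are assumed, not taken from this PR):

parser.add_argument(
    "--aux-layers",
    type=int,
    nargs="+",
    help=(
        "TorchSpec post-layer ids to capture (not vllm capture indices); "
        "the script shifts each id by +1 and always appends "
        "num_hidden_layers, the pre-norm slot, for last_hidden_states"
    ),
)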

@yubofredwang yubofredwang reopened this Apr 28, 2026
@yubofredwang yubofredwang merged commit 6f21829 into main Apr 28, 2026
7 checks passed
@yubofredwang yubofredwang deleted the ywang/fix-vllm-integration-aux-final-layer-id branch April 28, 2026 09:28