Pull requests: sgl-project/SpecForge
Supports eagle3 training for Gemma3 27B and Gemma4 26B.
#553
opened May 1, 2026 by
pyc96
Collaborator
6 tasks
Add transformers-like checkpoint parameters (--save-total-limit, --save-strategy, and so on)
#547
opened Apr 27, 2026 by
thechaos16
2 of 6 tasks
feat: add is_vlm param to safe_conversations_generator for multimodal data support
#545
opened Apr 24, 2026 by
sunny-infra
2 of 6 tasks
chore: regenerate_train_data accepts API key and https URL
#543
opened Apr 24, 2026 by
lianakoleva
1 of 6 tasks
add the configs for qwen3-vl-8b-instruct model
#542
opened Apr 23, 2026 by
sunny-infra
1 of 6 tasks
fix: correct vocab_size to 262144 in gemma3-1b-eagle3 configuration
#537
opened Apr 16, 2026 by
javierlimt6
1 of 6 tasks
Fix incorrect LSE gradients in cached FlashAttention for Eagle training
#536
opened Apr 16, 2026 by
uygnef
Collaborator
6 tasks
fix: EAGLE-3 training compatibility with multimodal-wrapped targets and large vocabs
#535
opened Apr 16, 2026 by
elad-inferize
4 tasks done
Preserve drafter vocab mapping when fine-tuning from a checkpoint
#534
opened Apr 15, 2026 by
luv-bansal
[Fix] preserve image data in preprocess for VLM training on multimodal data
#532
opened Apr 13, 2026 by
jamesahou
2 of 6 tasks
fix: Bump sglang version from 0.5.9 to 0.5.10
#529
opened Apr 13, 2026 by
moehanabi
Contributor
1 of 6 tasks
Reduce peak GPU memory in Eagle3 online target generation by avoiding an extra logits copy
#528
opened Apr 9, 2026 by
zijiexia
1 of 6 tasks
Fix VLM preprocessing and add mRoPE position handling in target head
#527
opened Apr 8, 2026 by
liusy58
6 tasks
Fix multimodal hidden-state preparation for Qwen3-VL models
#526
opened Apr 8, 2026 by
liusy58
6 tasks
feat: reduce Eagle3 training memory spike via all-to-all sharding
#524
opened Apr 5, 2026 by
laoconeth
2 of 6 tasks
[Feature] Train infer disaggregated
#523
opened Apr 2, 2026 by
jiapingW
Collaborator
5 tasks
fix: Make template override arg work correctly
#522
opened Apr 1, 2026 by
moehanabi
Contributor
1 of 6 tasks