# modelopt
Here are 4 public repositories matching this topic...
Guide for serving fine-tuned Qwen3.5-27B (dense, NVFP4) on DGX Spark via native vLLM. Includes critical config fixes for modelopt export_hf_checkpoint() that prevent silent FP32 dequantization.
quantization fine-tuning llama-cpp vllm llm-inference qwen speculative-decoding nvfp4 dgx-spark modelopt
Updated Apr 21, 2026 - Python
Updated Apr 14, 2026 - Python