modelopt

Here are 4 public repositories matching this topic...

AEON-7 / supergemma4-26b-abliterated-multimodal-nvfp4

NVFP4 AWQ Full quantization of SuperGemma4-26B-Abliterated-Multimodal for Blackwell GPUs — pre-built vLLM container + patches included

moe quantization multimodal blackwell awq llm vllm nvfp4 dgx-spark gemma4 modelopt

Updated Apr 15, 2026
Python

AImindPalace / dgx-spark-nvfp4-serving

Guide for serving fine-tuned Qwen3.5-27B (dense, NVFP4) on DGX Spark via native vLLM. Includes critical config fixes for modelopt export_hf_checkpoint() that prevent silent FP32 dequantization.

quantization fine-tuning llama-cpp vllm llm-inference qwen speculative-decoding nvfp4 dgx-spark modelopt

Updated Apr 21, 2026
Python

AEON-7 / Gemma-4-E4B-DECKARD-HERETIC-Uncensored-NVFP4

EAGLE E4B speculative decoding drafter for Gemma 4 31B DECKARD HERETIC Uncensored NVFP4 — optimized for NVIDIA DGX Spark

eagle drafter blackwell awq vllm speculative-decoding nvfp4 dgx-spark gemma4 modelopt

Updated Apr 13, 2026

AEON-7 / modelopt-fast-moe

nvidia calibration moe quantization gemma mixture-of-experts awq llm nvfp4 modelopt

Updated Apr 14, 2026
Python
