Insight for: Support native NVFP4 / ModelOpt checkpoints (e.g. Qwen3.5-9B-NVFP4)

OBLITERATUS support for native NVFP4 / ModelOpt checkpoints.

Analyzed: Mar 31, 2026

This issue identifies a compatibility gap in OBLITERATUS: it cannot load native NVFP4 / ModelOpt checkpoints. The format matters because it reduces VRAM usage, which the issue describes as enabling 'stronger models on consumer GPUs.' The current loader, designed for `torch.float16` weights or `bitsandbytes` quantization, fails on NVFP4 models with 'parameter shape mismatches.' This blocks local 'abliteration workflows' for the growing group of users who rely on these checkpoints. Without support for this quantization format, OBLITERATUS is less relevant and less accessible to users trying to get the most out of constrained hardware, which limits the tool's utility and adoption.
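To make the failure mode concrete, here is a minimal sketch of a pre-load check that routes a checkpoint by its declared quantization method instead of failing later on shape mismatches. It is not OBLITERATUS code. The function names `classify_checkpoint` and `packed_fp4_shape` are hypothetical, and the config keys (`quantization_config.quant_method` in `config.json`, and a `quantization.quant_algo` entry in a ModelOpt-style `hf_quant_config.json`) are assumptions about how such checkpoints are usually described. The shape helper illustrates one plausible source of the mismatch: 4-bit formats commonly pack two values per byte, so the stored tensor has half the expected input dimension.

```python
from typing import Optional, Tuple


def classify_checkpoint(config: dict, hf_quant_config: Optional[dict] = None) -> str:
    """Return 'plain', 'bitsandbytes', or 'unsupported:<method>' (hypothetical helper).

    `config` is the parsed config.json; `hf_quant_config` is the parsed
    hf_quant_config.json if the checkpoint ships one (assumed ModelOpt layout).
    """
    qcfg = config.get("quantization_config") or {}
    method = str(qcfg.get("quant_method") or "").lower()

    if method == "bitsandbytes":
        # Path the current loader is described as handling.
        return "bitsandbytes"
    if method:
        # Any other declared method (e.g. modelopt, compressed-tensors)
        # is outside the float16 / bitsandbytes paths.
        return f"unsupported:{method}"

    # Assumption: some ModelOpt exports declare the algorithm in a side file.
    algo = ((hf_quant_config or {}).get("quantization") or {}).get("quant_algo")
    if algo:
        return f"unsupported:modelopt-{str(algo).lower()}"

    # No quantization metadata: treat as ordinary float weights.
    return "plain"


def packed_fp4_shape(out_features: int, in_features: int) -> Tuple[int, int]:
    """Shape of a weight stored as two 4-bit values per byte (assumed packing)."""
    return (out_features, in_features // 2)


if __name__ == "__main__":
    cfg = {"quantization_config": {"quant_method": "modelopt", "quant_algo": "NVFP4"}}
    print(classify_checkpoint(cfg))
    print(packed_fp4_shape(4096, 4096))
```

A loader could call a check like this before `from_pretrained` and raise a clear "unsupported quantization format" error, rather than surfacing a low-level shape mismatch during weight loading.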

Keywords: NVFP4, ModelOpt checkpoints, torch_dtype=torch.float16, bitsandbytes 4-bit fallback, BitsAndBytesConfig, NF4, HF checkpoints, BnB quantization, AxionML/Qwen3.5-9B-NVFP4, consumer GPUs, VRAM, local abliteration workflows, accelerate, model load, parameter shape mismatches, checkpoint config, compressed-tensors, nvidia-modelopt, tensors

GitHub Issue

Parent Entity: Support native NVFP4 / ModelOpt checkpoints (e.g. Qwen3.5-9B-NVFP4)

State: Open