GitHub Issue

Doesn't work on Vulkan device

Discovered On Mar 28, 2026
Primary Metric open
~/Scaricati/llama-cpp-turboquant/build/bin$ sudo ./llama-server -m /media/vincenzo/Dati/models/unsloth/Qwen3.5-27B-GGUF/Qwen3.5-27B-Q6_K.gguf -ctk turbo3 -ctv turbo3
ggml_vulkan: Found 2 Vulkan devices:
ggml_vulkan: 0 = AMD Ryzen 9 9900X 12-Core Processor (RADV RAPHAEL_MENDOCINO) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 65536 | int dot: 0 | matrix cores: none
ggml_vulkan: 1 = AMD Radeon RX 7900 XTX (RADV NAVI31) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 0 | matrix cores: KHR_coopmat
main: n_parallel is set to auto, using n_parallel = 4 and kv_unified = true
build: 8621 (a52586e2a) with GNU 15.2.0 for Linux x86_64
system info: n_threads = 12, n_threads_batch = 12, total_threads = 24
system_info: n_threads = 12 (n_threads_batch = 12) / 24 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 |
init: using 23 threads for HTTP server
start: binding port with default address family
main: loading model
srv load_model: loading model '/media/vincenzo/Dati/models/unsloth/Qwen3.5-27B-GGUF/Qwen3.5-27B-Q6_K.gguf'
common_init_result: fitting params to device memory, for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on
/home/vincenzo/Scaricati/llama-cpp-turboquant/ggml/src/ggml-backend.cpp:809...