BitsMoE: Spectral Energy-Guided Bit Allocation for MoE LLM Quantization (+27.83pp over GPTQ at 2-bit)
BitsMoE is an SVD-based mixed-precision quantization framework for MoE models. It separates a shared basis from expert-specific factors and allocates bits across them via integer linear programming. Reported results: +27.83 pp accuracy over GPTQ on Qwen3-30B-A3B at 2-bit, and 12.3x decoding acceleration. The code is public.
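
To make the bit-allocation idea concrete, here is a minimal sketch of an energy-guided integer allocation under an average-bit budget. It is not the BitsMoE implementation: the function name `allocate_bits`, the candidate bit-widths, and the error proxy (energy times 4^-bits, a common model of uniform-quantization noise) are all assumptions made for illustration. The small integer program is solved exactly by dynamic programming rather than by an ILP solver.

```python
from typing import List, Sequence, Tuple


def allocate_bits(
    energies: Sequence[float],
    options: Sequence[int] = (1, 2, 3, 4),  # hypothetical candidate bit-widths
    avg_bits: float = 2.0,                  # average-bit budget per component
) -> Tuple[List[int], float]:
    """Pick one bit-width per component, minimising sum(e_i * 4**-b_i)
    subject to sum(b_i) <= avg_bits * len(energies).

    The objective is an assumed proxy, not the paper's formulation.
    """
    budget = int(avg_bits * len(energies))
    # best[used] = (lowest cost, bit choices) over the components seen so far
    # when exactly `used` bits have been spent.
    best = {0: (0.0, [])}
    for e in energies:
        nxt = {}
        for used, (cost, picks) in best.items():
            for b in options:
                u = used + b
                if u > budget:
                    continue
                c = cost + e * 4.0 ** (-b)
                if u not in nxt or c < nxt[u][0]:
                    nxt[u] = (c, picks + [b])
        best = nxt
    if not best:
        raise ValueError("budget too small for the smallest bit-width")
    cost, picks = min(best.values(), key=lambda t: t[0])
    return picks, cost


if __name__ == "__main__":
    # Components with more spectral energy receive more bits.
    picks, cost = allocate_bits([100.0, 10.0, 1.0, 0.1], avg_bits=2.0)
    print(picks, cost)
```

With a 2-bit average over four components, the high-energy component is given 4 bits and the low-energy ones 1 bit each, which is the qualitative behaviour a mixed-precision allocator is meant to produce. With equal energies the same routine falls back to a uniform 2-bit assignment.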