Tail-Aware HiFloat4: W4A4 Post-Training Quantization for Wan2.2
Tail-Aware HiFloat4 is a post-training quantization (PTQ) method for low-bit text-to-video generation. It adapts the public ViDiT-Q PTQ pipeline to Wan2.2 under the HiFloat4 numerical format, keeping sensitive areas at high precision. The design aims to reduce the impact of calibration outliers during quantization.
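To illustrate why calibration outliers matter, the following is a minimal sketch and not the authors' code. It compares a max-based clipping range with a percentile-based one on synthetic activations. The function name `calibration_range`, the 99.9th percentile, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def calibration_range(abs_samples, percentile=None):
    """Return a clipping range from calibration samples.

    With percentile=None the range is the plain maximum; otherwise it is
    the given percentile of the absolute values (hypothetical helper).
    """
    if percentile is None:
        return float(abs_samples.max())
    return float(np.percentile(abs_samples, percentile))

# Synthetic calibration activations with one extreme outlier (assumed data).
rng = np.random.default_rng(0)
acts = np.abs(rng.normal(0.0, 1.0, size=10_000))
acts[0] = 500.0

max_range = calibration_range(acts)                   # set by the single outlier
pct_range = calibration_range(acts, percentile=99.9)  # follows the bulk of the data

print(f"max-based range:        {max_range:.2f}")
print(f"percentile-based range: {pct_range:.2f}")
```

With a 4-bit format only a handful of levels are available, so a range stretched by one outlier leaves most levels unused for typical values. A percentile-based range trades some clipping error on the tail for finer resolution on the bulk.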
- Tail-Aware HiFloat4 is a submission to a low-bit text-to-video generation quantization challenge.
- The method uses W4A4 HiFloat4 fake quantization for the main linear layers in Wan2.2 transformer modules.
- An activation-tail-aware percentile calibration module is introduced to enhance channel-mask construction.
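The ideas in the list above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the submission's implementation. The excerpt does not describe the HiFloat4 encoding, so the grid below uses the E2M1 4-bit float magnitudes as a stand-in. The mask rule (flag a channel when its maximum exceeds `ratio` times its percentile), the choice to leave masked channels unquantized, and all function names are hypothetical.

```python
import numpy as np

# Stand-in 4-bit float magnitudes (E2M1). This is NOT the HiFloat4 encoding.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fake_quant(x, clip):
    """Quantize then dequantize x on the signed grid; `clip` maps to the grid max."""
    scale = max(float(clip), 1e-12) / FP4_GRID[-1]
    mag = np.minimum(np.abs(x) / scale, FP4_GRID[-1])
    # Round each magnitude to the nearest grid value.
    idx = np.abs(mag[..., None] - FP4_GRID).argmin(axis=-1)
    return np.sign(x) * FP4_GRID[idx] * scale

def tail_channel_mask(acts, percentile=99.0, ratio=4.0):
    """Flag input channels whose max far exceeds their percentile (assumed rule)."""
    a = np.abs(acts)
    p = np.percentile(a, percentile, axis=0)
    return a.max(axis=0) > ratio * np.maximum(p, 1e-8)

def w4a4_linear(x, w, mask):
    """Compute x @ w.T with 4-bit fake quantization of weights and activations.

    x: (tokens, in_features), w: (out_features, in_features),
    mask: (in_features,) booleans. Masked input channels are left
    unquantized, which is an assumption about how the mask is used.
    """
    xq = fake_quant(x, np.abs(x).max())
    wq = fake_quant(w, np.abs(w).max())
    xq[:, mask] = x[:, mask]
    wq[:, mask] = w[:, mask]
    return xq @ wq.T
```

A typical use would compute `mask = tail_channel_mask(calibration_acts)` once on calibration data and then pass it to `w4a4_linear` for each layer. Fake quantization keeps all tensors in floating point, so it measures the accuracy effect of the 4-bit format without requiring 4-bit kernels.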
Opening excerpt (first ~120 words)
arXiv:2605.26628 (Computer Science > Artificial Intelligence), submitted on 26 May 2026
Title: Tail-Aware HiFloat4: W4A4 Post-Training Quantization for Wan2.2
Authors: Zhanfeng Feng, Shuai Guo, Xin Di, Long Peng, Yang Cao, Zhengjun Zha
Abstract: This report describes Tail-Aware HiFloat4, our submission to the low-bit text-to-video generation quantization challenge. Our method adapts the public ViDiT-Q post-training quantization pipeline to Wan2.2 under the HiFloat4 numerical format.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.