5× faster fast_blur in image-rs
The article details a performance optimization of the fast_blur function in the Rust image-rs crate, achieving up to 5.9× faster execution for u8 pixel images. The improvement stems from replacing floating-point operations with integer arithmetic in the blur's hot path. This change reduces reliance on costly float conversions and rounding functions while maintaining the algorithm's visual quality.
- The fast_blur function in image-rs was optimized to be up to 5.9× faster for u8 pixel images.
- The optimization replaced floating-point arithmetic with integer arithmetic to avoid costly roundf and to_f32 operations.
- fast_blur approximates a Gaussian blur using three successive box blurs, maintaining O(1) per-pixel complexity regardless of blur radius.
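
To make the two ideas above concrete, here is a minimal one-dimensional sketch: a sliding-window box blur that keeps a running integer sum (so the cost per pixel does not depend on the radius), applied three times to approximate a Gaussian. This is not the image-rs implementation. The function names `box_blur_1d` and `approx_gaussian_1d` are hypothetical, the edge handling (clamping) is an assumption, and the mapping from σ to a box radius is left out.

```rust
/// One 1D box-blur pass over a row of u8 samples (hypothetical helper,
/// not the image-rs code). `radius` is the half-width of the box;
/// out-of-range indices are clamped to the nearest edge sample.
fn box_blur_1d(src: &[u8], radius: usize) -> Vec<u8> {
    let n = src.len();
    if n == 0 {
        return Vec::new();
    }
    let window = (2 * radius + 1) as u32;
    let r = radius as isize;

    // Read a sample, clamping the index into the valid range.
    let at = |i: isize| -> u32 { src[i.clamp(0, n as isize - 1) as usize] as u32 };

    // Integer sum of the window centred on index 0.
    let mut sum: u32 = (-r..=r).map(|i| at(i)).sum();

    let mut out = Vec::with_capacity(n);
    for x in 0..n as isize {
        // Rounded integer division: no float conversion, no rounding call.
        out.push(((sum + window / 2) / window) as u8);
        // Slide the window: add the sample entering, drop the one leaving.
        sum = sum + at(x + r + 1) - at(x - r);
    }
    out
}

/// Three successive box blurs, which approximate a Gaussian blur.
fn approx_gaussian_1d(src: &[u8], radius: usize) -> Vec<u8> {
    let mut buf = src.to_vec();
    for _ in 0..3 {
        buf = box_blur_1d(&buf, radius);
    }
    buf
}
```

A 2D blur would run such passes over rows and then columns. Each output pixel costs one addition, one subtraction, and one division, whatever the radius, which is the O(1) per-pixel property; a single pass with radius 1 over `[0, 0, 0, 90, 0, 0, 0]` spreads the impulse into `[0, 0, 30, 30, 30, 0, 0]`.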
Opening excerpt (first ~120 words)
A few weeks ago, I went looking for something to optimize, for fun. The Rust image crate jumped out: I'd used it before, it's one of the most downloaded crates, and image work is heavy by nature. I quickly found a method called fast_blur. With a name like that, it seemed worth trying to optimize. The result: up to 5.9x faster on images with u8 pixels.
Benchmark (wall-time measurement on x86_64, lower is better):
- fast_blur σ=3: 51.1 ms before, 8.6 ms after (×5.9 faster)
- fast_blur σ=7: 51.5 ms before, 9.0 ms after (×5.7 faster)
- fast_blur σ=50: 52.2 ms before, 10.2 ms after (×5.1 faster)
Before we get into the optimizations, let's take a step back and understand how blurs work, the tradeoffs between different algorithms, and where fast_blur fits in the spectrum of blur quality and…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Arthur Pastel.