Base64 encoding and decoding at almost the speed of a memory copy
Wojciech Muła and Daniel Lemire describe a method for base64 encoding and decoding at speeds close to that of a memory copy (memcpy) on recent Intel processors. The technique uses the AVX-512 SIMD instruction set and generates several times fewer instructions than previous SIMD-based codecs. The implementation can be adapted to any base64 variant by changing only constants, even at runtime.
- Base64 encoding is commonly used to embed binary data such as images in text-based formats such as HTML, JSON, and email.
- The new method achieves performance near that of memcpy, especially when the data exceeds the processor's L1 cache.
- The approach uses AVX-512 SIMD instructions and generates several times fewer instructions than previous SIMD-based codecs.
- The implementation can be adapted to any base64 variant by modifying constants, including at runtime.
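The last point, selecting a base64 variant purely through constants, can be illustrated with a minimal scalar sketch. This is not the paper's AVX-512 implementation; it only shows how an encoder and decoder can take the 64-character alphabet as a runtime parameter. The names `encode`, `decode`, `STANDARD`, and `URL_SAFE` are hypothetical and chosen for illustration.

```python
import base64
import string

# The standard and URL-safe alphabets differ only in their last two symbols.
_COMMON = string.ascii_uppercase + string.ascii_lowercase + string.digits
STANDARD = _COMMON + "+/"
URL_SAFE = _COMMON + "-_"


def encode(data: bytes, alphabet: str = STANDARD, pad: str = "=") -> str:
    """Scalar base64 encoder; the variant is selected only by `alphabet`."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack up to 3 bytes into a 24-bit group, zero-filling a short tail.
        group = int.from_bytes(chunk.ljust(3, b"\0"), "big")
        # Split the group into four 6-bit indices, most significant first.
        symbols = [alphabet[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A chunk of n bytes yields n + 1 significant symbols; the rest is padding.
        keep = len(chunk) + 1
        out.append("".join(symbols[:keep]) + pad * (4 - keep))
    return "".join(out)


def decode(text: str, alphabet: str = STANDARD, pad: str = "=") -> bytes:
    """Scalar base64 decoder driven by a lookup table built from `alphabet`."""
    lookup = {ch: idx for idx, ch in enumerate(alphabet)}
    body = text.rstrip(pad)
    out = bytearray()
    for i in range(0, len(body), 4):
        quad = body[i:i + 4]
        if len(quad) == 1:
            raise ValueError("truncated base64 input")
        group = 0
        for ch in quad:
            group = (group << 6) | lookup[ch]  # KeyError on an invalid symbol
        # Left-align a short final group to 24 bits before extracting bytes.
        group <<= 6 * (4 - len(quad))
        out += group.to_bytes(3, "big")[:len(quad) - 1]
    return bytes(out)


# Cross-check both variants against Python's standard library.
sample = bytes([0xFB, 0xFF, 0x00, 0x10])
assert encode(sample) == base64.b64encode(sample).decode("ascii")
assert encode(sample, URL_SAFE) == base64.urlsafe_b64encode(sample).decode("ascii")
assert decode(encode(sample, URL_SAFE), URL_SAFE) == sample
```

Switching variants here means passing a different 64-character string; the bit manipulation is unchanged. A SIMD codec can apply the same idea by loading its lookup constants from memory rather than fixing them in the instruction sequence.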
Opening excerpt (first ~120 words)
arXiv:1910.05109 (cs), Computer Science > Distributed, Parallel, and Cluster Computing. Submitted on 2 Oct 2019.
Title: Base64 encoding and decoding at almost the speed of a memory copy
Authors: Wojciech Muła, Daniel Lemire
Abstract: Many common document formats on the Internet are text-only such as email (MIME) and the Web (HTML, JavaScript, JSON and XML). To include images or executable code in these documents, we first encode them as text using base64. Standard base64 encoding uses 64 ASCII characters: both lower and upper case Latin letters, digits and two other symbols.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv.org.