This repository contains code for encoding and decoding base64 using SIMD instructions. Depending on the CPU architecture, vectorized encoding is 2 to 4 times faster than the scalar versions, and vectorized decoding is 2 to 2.7 times faster.
There are several versions of the procedures, utilizing the following instruction sets: SSE, BMI2, AVX2, AVX512F, AVX512BW, AVX512VBMI, AMD XOP and ARM Neon.
The vectorization approaches were described in a series of articles.
Daniel Lemire and I also wrote a paper, "Faster Base64 Encoding and Decoding Using AVX2 Instructions", which was published in ACM Transactions on the Web.
Performance results from various machines are located in the results subdirectories.
There are separate subdirectories for the two algorithms (encoding and decoding); both have the same structure. Each project contains four programs:

* verify --- performs simple validation of particular parts of the algorithms,
* check --- validates whole procedures,
* speed --- compares the speed of different variants of the procedures,
* benchmark --- similar to speed, but works on small buffers and calculates the CPU cycle rate (available only for Intel architectures).

Change to either the encode or the decode directory and then use the following make commands.

| Command | Programs | Instruction sets |
|---|---|---|
| make | verify, check, speed, benchmark | scalar, SSE, BMI2 |
| make avx2 | verify_avx2, check_avx2, speed_avx2, benchmark_avx2 | scalar, SSE, BMI2, AVX2 |
| make avx512 | verify_avx512, check_avx512, speed_avx512, benchmark_avx512 | scalar, SSE, BMI2, AVX2, AVX512F |
| make avx512bw | verify_avx512bw, check_avx512bw, speed_avx512bw, benchmark_avx512bw | scalar, SSE, BMI2, AVX2, AVX512F, AVX512BW |
| make avx512vbmi | verify_avx512vbmi, check_avx512vbmi, benchmark_avx512vbmi | scalar, SSE, BMI2, AVX2, AVX512F, AVX512BW, AVX512VBMI |
| make xop | verify_xop, check_xop, speed_xop, benchmark_xop | scalar, SSE and AMD XOP |
| make arm | verify_arm, check_arm, speed_arm | scalar, ARM Neon |

Type make run (for SSE) or make run_ARCH to run all programs for the given instruction set; ARCH can be "sse", "avx2", "avx512", "avx512bw", "avx512vbmi" or "avx512vl".
BMI2 presence is determined based on /proc/cpuinfo or a counterpart. When the AVX2 or AVX512 targets are used, BMI2 is enabled by default.
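
The idea of checking /proc/cpuinfo for BMI2 can be sketched as below. This is only an illustration, assuming the Linux layout where a line starting with "flags" lists CPU features separated by whitespace; the function names cpuinfo_has_flag and host_has_bmi2 are hypothetical, and the repository's build scripts may perform the check differently.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not taken from the repository.
// Returns true when `flag` appears as a whole word on a line that starts
// with "flags" in cpuinfo-style text (assumed Linux /proc/cpuinfo layout).
bool cpuinfo_has_flag(std::istream& cpuinfo, const std::string& flag) {
    assert(!flag.empty());
    std::string line;
    while (std::getline(cpuinfo, line)) {
        if (line.compare(0, 5, "flags") != 0) continue;   // not a flags line
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) continue;         // malformed line
        std::istringstream words(line.substr(colon + 1));
        std::string word;
        while (words >> word) {
            if (word == flag) return true;
        }
    }
    return false;
}

// Checks the running machine; reports false when the file cannot be opened.
bool host_has_bmi2() {
    std::ifstream cpuinfo("/proc/cpuinfo");
    if (!cpuinfo) return false;
    return cpuinfo_has_flag(cpuinfo, "bmi2");
}
```

Matching whole words matters here, because a plain substring search for "bmi" would also match "bmi1" and "bmi2".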
To compile the AVX512 versions of the programs, at least GCC 5.3 is required; GCC 4.9.2 doesn't have AVX512 support.
Please download the Intel Software Development Emulator in order to run the AVX512 variants via make run_avx512, run_avx512bw or run_avx512vbmi. The emulator path should be added to PATH.
Neither encoding nor decoding fully matches the base64 specification: the data tail is not processed, i.e. the encoder never produces '=' characters at the end, and the decoder doesn't handle them at all.
All these shortcomings are absent from the brilliant library by Alfred Klomp: https://github.com/aklomp/base64.
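
To make the missing tail handling concrete, here is a minimal scalar sketch of an encoder and decoder that work on whole groups only. It is not the repository's code: the names encode_full_groups and decode_full_groups are hypothetical, the standard base64 alphabet is assumed, and silently skipping an incomplete trailing group is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Standard base64 alphabet (assumed; the names below are hypothetical).
static const std::string kAlphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encodes whole 3-byte groups only. A trailing 1 or 2 bytes are skipped
// and no '=' padding is emitted.
std::string encode_full_groups(const std::string& input) {
    const std::size_t full = input.size() / 3 * 3;
    std::string out;
    out.reserve(full / 3 * 4);
    for (std::size_t i = 0; i < full; i += 3) {
        // Pack three bytes into a 24-bit word, then split it into four sextets.
        const std::uint32_t word =
            (std::uint32_t(static_cast<unsigned char>(input[i])) << 16) |
            (std::uint32_t(static_cast<unsigned char>(input[i + 1])) << 8) |
            std::uint32_t(static_cast<unsigned char>(input[i + 2]));
        out.push_back(kAlphabet[(word >> 18) & 0x3f]);
        out.push_back(kAlphabet[(word >> 12) & 0x3f]);
        out.push_back(kAlphabet[(word >> 6) & 0x3f]);
        out.push_back(kAlphabet[word & 0x3f]);
    }
    assert(out.size() == full / 3 * 4);
    return out;
}

// Decodes whole 4-character groups only. Returns false on any character
// outside the alphabet, which includes '=' because padding is not handled.
bool decode_full_groups(const std::string& input, std::string& out) {
    const std::size_t full = input.size() / 4 * 4;
    out.clear();
    for (std::size_t i = 0; i < full; i += 4) {
        std::uint32_t word = 0;
        for (std::size_t j = 0; j < 4; ++j) {
            const std::size_t value = kAlphabet.find(input[i + j]);
            if (value == std::string::npos) return false;
            word = (word << 6) | static_cast<std::uint32_t>(value);
        }
        out.push_back(static_cast<char>((word >> 16) & 0xff));
        out.push_back(static_cast<char>((word >> 8) & 0xff));
        out.push_back(static_cast<char>(word & 0xff));
    }
    return true;
}
```

For example, encode_full_groups("Many") yields "TWFu" because the trailing 'y' is dropped, whereas a specification-compliant encoder would return "TWFueQ==". Likewise, decode_full_groups rejects "TWE=" because '=' is not in the alphabet.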