Base32 Encoder & Decoder

Base32 encoding is a binary-to-text encoding scheme that represents binary data with a set of 32 different printable characters. It is particularly useful where people must read, type, or dictate encoded values, because the standard alphabet uses a single letter case and omits the easily confused digits 0, 1, and 8.

How Base32 Encoding Works

Base32 encoding converts binary data into a limited character set, making it safe for protocols that only support a subset of ASCII characters. The process involves several specific steps:

Divide the input data into 5-byte (40-bit) groups
Split each 40-bit group into eight 5-bit chunks
Convert each 5-bit value (0-31) to the corresponding Base32 character
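If the final group has fewer than 5 bytes, pad its bits with zeros up to a multiple of 5, encode the resulting chunks, then append '=' signs until the output group is 8 characters long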
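
The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name b32encode_sketch is hypothetical, and the alphabet is assumed to be the standard RFC 4648 one (A-Z followed by 2-7), which the steps themselves do not name.

```python
# Assumed alphabet: RFC 4648 standard Base32 (not stated in the steps above).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def b32encode_sketch(data: bytes) -> str:
    """Encode bytes as padded Base32, one 5-byte group at a time."""
    out = []
    for i in range(0, len(data), 5):
        group = data[i:i + 5]
        # Number of characters that carry data: ceil(bits / 5).
        n_chars = (len(group) * 8 + 4) // 5
        # Right-pad a short final group with zero bytes to fill 40 bits.
        value = int.from_bytes(group.ljust(5, b"\x00"), "big")
        # Read eight 5-bit chunks, most significant first.
        chars = [ALPHABET[(value >> shift) & 0x1F] for shift in range(35, -1, -5)]
        # Keep only the data-carrying characters, then pad with '=' to 8.
        out.append("".join(chars[:n_chars]) + "=" * (8 - n_chars))
    return "".join(out)


print(b32encode_sketch(b"hello"))
```

For the 5-byte input b"hello" this produces NBSWY3DP with no padding, which should agree with Python's built-in base64.b32encode.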

The Base32 Encoding Process

Input bytes:    | Byte 1 | Byte 2 | Byte 3 | Byte 4 | Byte 5 |
                |76543210|76543210|76543210|76543210|76543210|
                +--------+--------+--------+--------+--------+
                |              40 bits of input               |
                +--------+--------+--------+--------+--------+
                |43210|43210|43210|43210|43210|43210|43210|43210|
Output chars:   |  A  |  B  |  C  |  D  |  E  |  F  |  G  |  H  |

Encoding 5 bytes into 8 Base32 characters

Padding in Base32

When the input data's length isn't a multiple of 5 bytes, padding is added to ensure the output follows the Base32 format rules. The number of padding characters ('=') depends on how many bytes are in the final incomplete group:

Input Bytes in Final Group	Output Characters	Padding Characters
1 byte (8 bits)	2 chars	6 padding chars (======)
2 bytes (16 bits)	4 chars	4 padding chars (====)
3 bytes (24 bits)	5 chars	3 padding chars (===)
4 bytes (32 bits)	7 chars	1 padding char (=)
5 bytes (40 bits)	8 chars	No padding
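
The table can be checked against Python's standard base64 module. The sketch below is only a sanity check; final_group_layout is a hypothetical helper name, and it assumes the padded RFC 4648 form that base64.b32encode produces.

```python
import base64


def final_group_layout(n_bytes: int) -> tuple[int, int]:
    """Return (data characters, '=' characters) for a final group of n_bytes."""
    encoded = base64.b32encode(bytes(n_bytes)).decode("ascii")
    data_chars = len(encoded.rstrip("="))
    return data_chars, len(encoded) - data_chars


for n in range(1, 6):
    data_chars, pad_chars = final_group_layout(n)
    # The data character count is ceil(8 * n / 5).
    print(f"{n} byte(s): {data_chars} chars + {pad_chars} padding")
```

The data character count in each row is the number of 5-bit chunks needed to hold the bytes, rounded up; the padding fills the remainder of the 8-character group.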

Common Applications of Base32

Two-Factor Authentication (TOTP)

Google Authenticator and other TOTP apps use Base32 for encoding shared secrets. The Base32 format makes it easier for users to manually enter these keys when setting up a new device.
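
Because these secrets are typed by hand, they often arrive in lowercase, grouped with spaces, or without padding. The following is a rough sketch of how an application might normalize such input before decoding; decode_totp_secret is a hypothetical helper, and the cleanup rules are assumptions rather than the behavior of any particular authenticator app.

```python
import base64


def decode_totp_secret(text: str) -> bytes:
    """Decode a hand-typed Base32 secret into raw key bytes."""
    # Assumed cleanup: drop spaces and dashes, uppercase, remove any padding.
    cleaned = text.replace(" ", "").replace("-", "").upper().rstrip("=")
    # Restore '=' padding so the length is a multiple of 8.
    cleaned += "=" * (-len(cleaned) % 8)
    return base64.b32decode(cleaned)


key = decode_totp_secret("jbsw y3dp ehpk 3pxp")
print(key.hex())
```

Input that is not valid Base32 still raises an error from base64.b32decode, which a real application would want to catch and report to the user.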

Backup and Recovery Codes

Some services issue backup or recovery codes in Base32 format: the codes are shorter than their hex equivalents and, unlike Base64, use a single letter case and no punctuation, which makes them easier to transcribe correctly.

Tor .onion Addresses

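Tor onion service (hidden service) addresses are written in lowercase Base32 without padding. A current (v3) address encodes the service's public key, a checksum, and a version byte into the 56 characters that precede the .onion suffix.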

File Systems

Some file systems use Base32 for representing file names or identifiers that need to be case-insensitive but still readable.

Base32 Variants

Several variations of Base32 exist for specific use cases:

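Base32hex: Uses the digits 0-9 and letters A-V, extending the hexadecimal alphabet; encoded values sort in the same order as the underlying binary data
z-base-32: Uses a rearranged lowercase alphabet chosen to be easier for people to read and distinguish
Crockford's Base32: Excludes the letters I, L, O, and U to minimize transcription errors, and decodes look-alike characters (such as O for 0) to the intended value
Base32 for Geohashing: A specialized alphabet (digits 0-9 and lowercase letters except a, i, l, and o) used to encode geohash coordinates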
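
Since these variants mainly differ in which 32 characters they use, one way to illustrate them is to encode with the standard alphabet and then substitute characters. The sketch below takes that approach for Base32hex and Crockford's Base32; encode_variant is a hypothetical helper, and it does not cover variant-specific details such as Crockford's optional check symbols.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"   # RFC 4648 Base32
BASE32HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"  # RFC 4648 Base32hex
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # no I, L, O, U


def encode_variant(data: bytes, alphabet: str, pad: bool = True) -> str:
    """Encode with the standard alphabet, then map each character across."""
    standard = base64.b32encode(data).decode("ascii")
    translated = standard.translate(str.maketrans(STANDARD, alphabet))
    # Crockford's scheme does not use '=' padding, so callers can drop it.
    return translated if pad else translated.rstrip("=")


for sample in (b"\x00", b"\xff"):
    print(encode_variant(sample, STANDARD), encode_variant(sample, BASE32HEX))
```

The two sample bytes show the sort-order property of Base32hex: 0x00 and 0xff encode to 00====== and VS======, which compare in the same order as the bytes, whereas the standard alphabet gives AA====== and 74======, which compare in the opposite order.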

Base32 vs Base64 Size Comparison

Base32 encoding increases the data size by about 60% (5 bytes → 8 characters), while Base64 increases it by about 33% (3 bytes → 4 characters). The tradeoff is readability and error resistance versus compactness.
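
The overhead figures follow directly from the group sizes, and a short calculation makes the comparison concrete. This is a sketch that assumes padded output in both encodings; encoded_lengths is a hypothetical helper name.

```python
def encoded_lengths(n_bytes: int) -> tuple[int, int]:
    """Return (Base32 length, Base64 length) for n_bytes of input, with padding."""
    b32 = 8 * ((n_bytes + 4) // 5)  # 8 characters per 5-byte group, rounded up
    b64 = 4 * ((n_bytes + 2) // 3)  # 4 characters per 3-byte group, rounded up
    return b32, b64


for n in (15, 20, 32):
    b32, b64 = encoded_lengths(n)
    print(f"{n} bytes -> Base32: {b32} chars, Base64: {b64} chars")
```

For a 15-byte input, which divides evenly into both group sizes, this gives 24 Base32 characters against 20 Base64 characters, matching the 60% and 33% figures. For other lengths, padding pushes the overhead slightly higher.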

Conclusion

Base32 encoding provides a good balance between data density and human readability. While not as compact as Base64, it offers benefits in specific scenarios where manual transcription is common, or where case-insensitivity and URL safety are important. Understanding when to use Base32 instead of other encoding schemes is a valuable skill for developers working on systems where human interaction with encoded data is necessary.
