Base32 encodes binary data into a safe text format using 32 ASCII characters.
Base32 encoding is a method of encoding binary data into an ASCII string, similar to Base64 but using a 32-character alphabet. Its output is longer than Base64 (8 characters for every 5 bytes, versus 4 characters for every 3 bytes), but it does not depend on letter case. It is commonly used in applications where a case-insensitive encoding is required, such as URL shortening, file integrity checks, or encoding data for QR codes.
Base32 encoding works by dividing the input into chunks of 5 bytes (40 bits). Each chunk is split into eight 5-bit groups, and each group is represented by one Base32 character. If the input length is not a multiple of 5 bytes, the final partial chunk is zero-filled to a whole number of 5-bit groups, and "=" padding characters are appended so that the output length is a multiple of 8 characters.
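The chunking described above can be sketched in Python. This is a minimal illustration, not a reference implementation: it assumes the standard RFC 4648 alphabet and "=" padding, and the name b32encode_sketch is made up for this example.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # assumed: RFC 4648 Base32 alphabet

def b32encode_sketch(data: bytes) -> str:
    """Encode bytes as padded Base32 text (illustrative, not optimized)."""
    out = []
    for i in range(0, len(data), 5):
        chunk = data[i:i + 5]
        # Number of characters that carry data: ceil(bits / 5).
        n_chars = (len(chunk) * 8 + 4) // 5
        # Right-pad the chunk with zero bytes to a full 40-bit block.
        block = int.from_bytes(chunk.ljust(5, b"\x00"), "big")
        # Read eight 5-bit groups, most significant first.
        chars = [ALPHABET[(block >> (35 - 5 * k)) & 0x1F] for k in range(8)]
        # Keep only the data-carrying characters, then pad to 8 with "=".
        out.append("".join(chars[:n_chars]) + "=" * (8 - n_chars))
    return "".join(out)

print(b32encode_sketch(b"foobar"))  # MZXW6YTBOI======
```

In practice, Python's standard library function base64.b32encode does the same job and should be preferred over a hand-written encoder.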
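The standard Base32 alphabet (RFC 4648) consists of 32 characters. These characters are:
The uppercase letters A to Z, representing the values 0 to 25
The digits 2 to 7, representing the values 26 to 31
This results in a total of 26 + 6 = 32 characters, hence the name "Base32."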
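As a small sketch, assuming the standard RFC 4648 alphabet (A to Z followed by 2 to 7), the mapping between characters and 5-bit values can be built in Python. The names ALPHABET and VALUE_OF are chosen only for this illustration.

```python
import string

# Letters A-Z take values 0-25; digits 2-7 take values 26-31.
ALPHABET = string.ascii_uppercase + "234567"

# Reverse lookup used when decoding: character -> 5-bit value.
VALUE_OF = {ch: i for i, ch in enumerate(ALPHABET)}

print(len(ALPHABET))                 # 32
print(ALPHABET[0], ALPHABET[31])     # A 7
print(VALUE_OF["I"], VALUE_OF["2"])  # 8 26
```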
Let's take a simple example with the input string "Cat":
The ASCII values of "C", "a", and "t" are 67, 97, and 116, respectively. In binary, these values are 01000011, 01100001, and 01110100.
Combining these 3 bytes (24 bits):
01000011 01100001 01110100
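Breaking this into 5-bit groups gives five groups. The last group has only 4 bits, so one zero bit is appended to complete it:
01000 01101 10000 10111 01000
Converting these groups into decimal gives 8, 13, 16, 23, and 8, which map to the characters I, N, Q, X, and I.
Because 3 input bytes fill only 5 of the 8 characters in a full output block, three "=" padding characters are appended. Thus, the Base32-encoded string for "Cat" is "INQXI===".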
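The "Cat" example can be traced step by step in Python and checked against the standard library's base64.b32encode, which implements the RFC 4648 alphabet with "=" padding. The variable names here are only illustrative.

```python
import base64

data = b"Cat"
bits = "".join(f"{byte:08b}" for byte in data)   # 24 bits
bits += "0" * (-len(bits) % 5)                   # zero-fill to a multiple of 5
groups = [bits[i:i + 5] for i in range(0, len(bits), 5)]
values = [int(g, 2) for g in groups]

print(groups)  # ['01000', '01101', '10000', '10111', '01000']
print(values)  # [8, 13, 16, 23, 8]
print(base64.b32encode(data).decode("ascii"))  # INQXI===
```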
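When the input data is not a multiple of 5 bytes, Base32 encoding pads the output with "=" characters so that its length stays a multiple of 8. For instance, a final group of 1 byte yields 2 characters and 6 "=" signs, 2 bytes yield 4 characters and 4 "=" signs, 3 bytes yield 5 characters and 3 "=" signs, and 4 bytes yield 7 characters and 1 "=" sign.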
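The padding amounts follow from simple arithmetic, which the sketch below works out for each possible size of the final group. It assumes RFC 4648 style "=" padding, and the helper name padded_shape is hypothetical.

```python
def padded_shape(n_bytes: int) -> tuple[int, int]:
    """Return (data characters, '=' characters) for a final group of 1-5 bytes."""
    data_chars = (n_bytes * 8 + 4) // 5   # ceil(bits / 5)
    return data_chars, 8 - data_chars

for n in range(1, 6):
    print(n, padded_shape(n))
# 1 (2, 6)
# 2 (4, 4)
# 3 (5, 3)
# 4 (7, 1)
# 5 (8, 0)
```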
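Base32 encoding is used in a variety of applications, particularly when there is a need for a case-insensitive text representation of binary data. Some common uses include file integrity checks, URL shortening, QR code payloads, and the encoding of cryptographic keys or secrets.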
Decoding Base32 is the reverse process. The decoder strips the padding characters, maps each Base32 character back to its 5-bit value, and concatenates the bits. It then regroups them into 8-bit bytes to restore the original binary data, discarding any leftover fill bits that do not make up a whole byte.
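A decoder along these lines might look like the following sketch. It assumes the RFC 4648 alphabet, accepts lowercase input as a convenience, and does not validate the padding length or the fill bits. The name b32decode_sketch is made up for this example.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # assumed: RFC 4648 Base32 alphabet
VALUE_OF = {ch: i for i, ch in enumerate(ALPHABET)}

def b32decode_sketch(text: str) -> bytes:
    """Decode padded or unpadded Base32 text (illustrative, lenient)."""
    # Strip padding and accept lowercase input.
    symbols = text.rstrip("=").upper()
    # Map every character back to its 5-bit group; raises KeyError on invalid input.
    bits = "".join(f"{VALUE_OF[ch]:05b}" for ch in symbols)
    # Drop the trailing fill bits that do not form a whole byte.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

print(b32decode_sketch("MZXW6YTBOI======"))  # b'foobar'
```

For production use, base64.b32decode from the standard library is stricter about malformed input and is the safer choice.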
In summary, Base32 encoding is a useful method for transforming binary data into a case-insensitive text representation that can be safely transmitted or stored in systems designed for text-based data, at the cost of longer output than Base64. It has practical applications in areas like file integrity, URL shortening, and cryptography.