How Base64 encoding works: the algorithm that turns binary into text step by step

Base64 encoding takes arbitrary binary data and produces text using a 64-character alphabet (A-Z, a-z, 0-9, +, /). The algorithm processes input 3 bytes (24 bits) at a time, splits them into four 6-bit groups, and maps each group to a character in the alphabet. Every 3 bytes of input produce exactly 4 characters of output.
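The 3-byte-to-4-character step can be sketched in a few lines of Python. This is a minimal illustration, not a full encoder: it handles only complete 3-byte groups, and the names `ALPHABET` and `encode_group` are made up for this example.

```python
# Standard Base64 alphabet: index 0-25 = A-Z, 26-51 = a-z, 52-61 = 0-9, 62 = +, 63 = /
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (illustrative helper)."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Pack the 3 bytes into a single 24-bit integer.
    n = int.from_bytes(three_bytes, "big")
    # Split into four 6-bit groups, most significant group first.
    indexes = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Map each 6-bit value to its character in the alphabet.
    return "".join(ALPHABET[i] for i in indexes)


print(encode_group(b"Man"))  # TWFu
```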


How this is calculated

When the input length isn't a multiple of 3 bytes, Base64 pads the final group. One leftover byte (8 bits) is zero-filled to 12 bits and produces two Base64 characters plus == padding. Two leftover bytes (16 bits) are zero-filled to 18 bits and produce three Base64 characters plus = padding. The = characters tell the decoder how many bytes the final group really holds: one byte for ==, two bytes for =. Some uses, such as the unpadded URL-safe Base64 in JWTs, omit padding entirely because the length of the encoded string already implies the original length. The algorithm is deterministic and reversible: the same input always produces the same output, and decoding always recovers the original bytes.
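The padding cases are easy to see with Python's standard `base64` module. The sketch below also shows one way to decode input whose padding was stripped; `decode_unpadded` is a hypothetical helper name, and it assumes the input is otherwise valid standard Base64.

```python
import base64

# 1, 2, and 3 input bytes: two, one, and zero '=' padding characters.
for data in (b"M", b"Ma", b"Man"):
    print(data, "->", base64.b64encode(data).decode("ascii"))
# b'M' -> TQ==
# b'Ma' -> TWE=
# b'Man' -> TWFu


def decode_unpadded(text: str) -> bytes:
    """Decode Base64 text whose trailing '=' characters were removed."""
    # Restore padding so the length is a multiple of 4 again.
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded)


print(decode_unpadded("TQ"))   # b'M'
print(decode_unpadded("TWE"))  # b'Ma'
```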

Verdict

Base64 encoding is a straightforward expansion of every 3 bytes into 4 characters, with optional padding. Understanding the algorithm helps you debug encoding issues and explains why padded Base64 output is always a multiple of 4 characters long. For practical use, rely on your language's standard library rather than implementing it yourself.

Frequently asked questions

How do I convert text to Base64?
Paste your string into the Text field and the Base64 output appears instantly. The tool uses standard Base64 (RFC 4648), so the output is identical to Linux's base64 command and every major language's built-in Base64 encoder.
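For comparison, the same conversion with a standard library encoder might look like the sketch below. It assumes the text is encoded as UTF-8 first, and `text_to_base64` is an illustrative name. Keep in mind that a trailing newline in the input (for example from `echo` without `-n`) is encoded too and changes the output.

```python
import base64


def text_to_base64(text: str) -> str:
    """Encode a string as standard (RFC 4648) Base64 via its UTF-8 bytes."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


print(text_to_base64("hello"))    # aGVsbG8=
print(text_to_base64("hello\n"))  # aGVsbG8K
```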
What's the difference between Base64 and hex encoding?
Both represent binary data as text, but with different alphabets. Base64 uses 64 characters and needs roughly 4 chars per 3 bytes (33% overhead). Hex uses 16 characters and needs exactly 2 chars per byte (100% overhead). Base64 is denser, while hex is easier to read byte by byte.
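A quick way to check the overhead figures is to encode the same bytes both ways and compare lengths. This is a small sketch; `encoded_lengths` is a name chosen for this example.

```python
import base64


def encoded_lengths(data: bytes) -> tuple:
    """Return (Base64 length, hex length) in characters for the given bytes."""
    return len(base64.b64encode(data)), len(data.hex())


data = bytes(range(30))  # 30 arbitrary bytes
b64_len, hex_len = encoded_lengths(data)
print(len(data), b64_len, hex_len)  # 30 40 60
```

With 30 input bytes, Base64 needs 40 characters (about 33% more) and hex needs 60 (100% more).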
Why does my UTF-8 text break when converted to binary?
UTF-8 encodes non-ASCII characters as multibyte sequences, so a single emoji or accented letter becomes 2-4 bytes. The binary output will be longer than the character count suggests; that's correct behavior, not a bug.
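The difference between character count and byte count can be checked directly. In this sketch the sample string and the helper name `utf8_sizes` are illustrative; the string uses escapes so that the accented letter is a single precomposed code point.

```python
def utf8_sizes(text: str) -> list:
    """Return (character, UTF-8 byte count) pairs for each code point."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


sample = "h\u00e9llo \U0001F44D"  # "héllo" followed by a space and a thumbs-up emoji
for ch, size in utf8_sizes(sample):
    print(f"U+{ord(ch):04X}: {size} byte(s)")

print(len(sample), "characters,", len(sample.encode("utf-8")), "bytes")
# 7 characters, 11 bytes
```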
Is it safe to paste sensitive data into the converter?
Yes. The encoding conversion runs entirely in your browser with JavaScript; nothing is sent to our servers, logged, or stored. You can verify this with your browser's Network tab: no requests fire when you type.
What is URL-safe Base64?
A variant that replaces `+` with `-` and `/` with `_` so the result can be safely placed in URLs without percent-encoding. JWTs use URL-safe Base64 with the `=` padding stripped. Standard Base64 is fine for most other uses.
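The two alphabets differ only in the characters for values 62 and 63, which the sketch below makes visible with bytes chosen to hit exactly those values. `b64url_nopad` is a hypothetical helper showing the unpadded form that JWTs use.

```python
import base64

data = b"\xfb\xff\xbe"  # 6-bit groups: 62, 63, 62, 62

print(base64.b64encode(data).decode("ascii"))          # +/++
print(base64.urlsafe_b64encode(data).decode("ascii"))  # -_--


def b64url_nopad(raw: bytes) -> str:
    """URL-safe Base64 with trailing '=' padding removed."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


print(b64url_nopad(b"M"))  # TQ
```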
Can I decode Base64 back to the original text?
Yes, the converter is bidirectional. Paste Base64 into the Base64 field and you'll get the original UTF-8 string back. If decoding fails silently, the input isn't valid Base64 (wrong characters or bad padding). If the result looks like more Base64, the input was double-encoded and needs to be decoded a second time.
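In code, it is usually better to make decoding failures explicit rather than silent. The following is a minimal sketch using Python's standard library; `base64_to_text` is an illustrative name, and returning `None` on failure is just one possible design.

```python
import base64
import binascii
from typing import Optional


def base64_to_text(encoded: str) -> Optional[str]:
    """Decode standard Base64 to a UTF-8 string, or return None if invalid."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently skipping them.
        raw = base64.b64decode(encoded, validate=True)
        return raw.decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        # Wrong characters, bad padding, or bytes that are not valid UTF-8.
        return None


print(base64_to_text("aGVsbG8="))  # hello
print(base64_to_text("aGVsbG8"))   # None (padding missing)
print(base64_to_text("aGV$bG8="))  # None (invalid character)
```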