Why Base64 was invented
Binary data — images, audio, compiled code — contains bytes with values that collide with control characters in older text protocols. SMTP (email), HTTP/1.0 headers, and early XML parsers could all corrupt raw binary in transit.
Base64 solves this by mapping every 3 bytes of input into 4 printable ASCII characters. The output uses only letters, digits, +, and / — characters safe in every 7-bit ASCII channel. The trade-off is a 33% size increase.
Where you see Base64 every day
Data URIs embed images directly in HTML or CSS: <img src="data:image/png;base64,iVBORw0...">. Eliminates an HTTP request for small assets.
HTTP Basic Authentication encodes username:password in the Authorization: Basic header. This is encoding, not encryption — Base64 is trivially reversible.
JSON Web Tokens (JWTs) use URL-safe Base64 (Base64url) for their header and payload segments.
Email attachments (MIME) have used Base64 since the 1990s to transmit binary files as text.
SSH keys — ~/.ssh/id_rsa.pub contains a Base64-encoded public key.
Standard vs URL-safe Base64
Standard Base64 uses + and /. Both characters have special meaning in URLs: + represents a space in query strings and / is a path separator. If you embed standard Base64 in a URL without percent-encoding these characters, the URL breaks.
URL-safe Base64 (Base64url, RFC 4648 §5) substitutes + → - and / → _, and drops the = padding. It is directly embeddable in query parameters and path segments. JWTs, OAuth tokens, and signed URLs always use Base64url.
The 33% overhead explained
The math: 3 input bytes → 24 bits → 4 groups of 6 bits → 4 Base64 characters.
If the input is not a multiple of 3 bytes, padding (=) fills the last group:
- 1 leftover byte → 2 Base64 chars +
== - 2 leftover bytes → 3 Base64 chars +
=
A 100-byte input → ~133 characters. A 1 MB image encodes to ~1.33 MB. Gzip compression of Base64 output recovers much of this, since the character set is small and repetitive.
Common mistakes
Encoding an already-encoded string is the most frequent error. If you encode SGVsbG8= (which is already Base64 for "Hello"), you get U0dWc2JHOD0= — double-encoded. The tool auto-detects the likely direction to help, but the detection is heuristic.
Line breaks in base64 — some tools insert newlines every 76 characters (PEM format). The decoder here normalises these automatically, but if you paste PEM-formatted Base64 into a system expecting a single line, strip the newlines first.
Decoding binary files as text — Base64-decoding a PNG gives you raw bytes, not a printable string. The tool shows a hex representation when the output is not valid UTF-8.