Toova

Understanding Base64 Encoding — Complete Developer Guide

Base64 encoding appears in almost every area of web development — JWT tokens, data URIs, email attachments, API payloads, cryptographic signatures, and configuration files. Despite being ubiquitous, it is frequently misunderstood: developers sometimes confuse it with encryption, or use it in situations where it makes things worse. This guide explains exactly how Base64 works, when to reach for it, when to avoid it, and how to use it correctly in JavaScript.

What Base64 Encoding Is (and Is Not)

Base64 is a binary-to-text encoding scheme. Its only job is to convert arbitrary binary data into a string of printable ASCII characters. It does not compress data, it does not encrypt data, and it does not validate data. It is purely a representation transformation — taking bytes that might include null bytes, control characters, or values that would break text-based protocols, and converting them to a safe character set.

The name comes from the 64 printable characters used as the encoding alphabet: A–Z (26 characters), a–z (26 characters), 0–9 (10 characters), plus + and / (2 characters). The = character is used for padding. Because there are 64 possible values per character position and 2^6 = 64, each Base64 character encodes exactly 6 bits of the original data.

How Base64 Works — "Hello" Step by Step

The best way to understand Base64 is to trace a concrete example. Let's encode the string Hello.

Step 1: Convert to bytes

Each character in Hello maps to its ASCII code, which is then expressed in binary (8 bits per byte):

H  e  l  l  o
48 65 6C 6C 6F   (ASCII codes in hex)
01001000 01100101 01101100 01101100 01101111   (binary)

Step 2: Group into 6-bit chunks

Concatenate all the bits: 0100100001100101011011000110110001101111 (40 bits). Base64 processes 3 bytes (24 bits) at a time and maps them to 4 Base64 characters (4 × 6 = 24 bits). Group the 40 bits into 6-bit chunks, padding the last group with zeros to fill it:

010010 000110 010101 101100 011011 000110 1111
18     6      21     44     27     6      (last group padded to 6 bits: 111100 = 60)

Step 3: Map to the Base64 alphabet

Each 6-bit value maps to a character in the Base64 alphabet (A=0, B=1, ... Z=25, a=26, ... z=51, 0=52, ... 9=61, +=62, /=63). Since 5 bytes is not a multiple of 3, one = padding character is added:

Index: 18  6   21  44  27  6   60
Char:  S   G   V   s   b   G   8
Result: SGVsbG8=   (= is padding)

Result

"Hello" → "SGVsbG8="
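The three steps above can be sketched directly in JavaScript. This is a teaching aid rather than a production encoder: the function name encodeBase64Manual is hypothetical, and it builds a string of bits for readability instead of using the faster bit-shifting approach that real implementations use.

```javascript
// The standard Base64 alphabet: index 0 is "A", index 63 is "/".
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper that mirrors the manual walkthrough step by step.
function encodeBase64Manual(bytes) {
  // Step 1: write every byte as 8 binary digits.
  let bits = "";
  for (const byte of bytes) {
    bits += byte.toString(2).padStart(8, "0");
  }

  // Step 2: pad the bit string with zeros so it splits into whole 6-bit groups.
  while (bits.length % 6 !== 0) {
    bits += "0";
  }

  // Step 3: map each 6-bit group to its alphabet character.
  let output = "";
  for (let i = 0; i < bits.length; i += 6) {
    output += ALPHABET[parseInt(bits.slice(i, i + 6), 2)];
  }

  // Finally, pad the output with "=" to a multiple of 4 characters.
  while (output.length % 4 !== 0) {
    output += "=";
  }
  return output;
}

const helloBytes = [72, 101, 108, 108, 111]; // "Hello" as decimal byte values
console.log(encodeBase64Manual(helloBytes)); // "SGVsbG8="
```

The intermediate 6-bit values for Hello are 18, 6, 21, 44, 27, 6, and 60, matching the table above.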

You can verify this instantly with the Toova Base64 encoder/decoder — paste Hello, click Encode, and you get SGVsbG8=. Click Decode to reverse it.

The 33% Size Overhead

Base64 encoding increases data size by roughly one third. The math is straightforward: 3 bytes of input (24 bits) produce 4 Base64 characters (4 × 6-bit groups), and each output character occupies a full byte. That is a 4/3 ratio, or 33.3% overhead. Because the output is padded to a multiple of 4 characters, n input bytes always produce 4 × ceil(n / 3) characters, so the overhead is noticeably higher only for very short inputs (a 1-byte input becomes 4 characters). Formats that insert line breaks into the output, such as MIME email, add a few percent more.

This matters significantly for performance. A 1 MB image embedded as a Base64 data URI in HTML becomes roughly 1.33 MB. An API that encodes all binary payloads in Base64 sends about 33% more data than necessary. For small values like short tokens or checksums, the overhead is negligible. For large files, it is a real cost.
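The size arithmetic can be checked with a couple of lines of JavaScript. The helper names base64Length and base64Overhead are hypothetical, and the calculation assumes padded output with no line breaks.

```javascript
// Number of Base64 characters produced for a given number of input bytes,
// assuming "=" padding and no inserted line breaks.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Relative size increase, e.g. 0.333 for a 33.3% overhead.
function base64Overhead(byteCount) {
  return base64Length(byteCount) / byteCount - 1;
}

for (const byteCount of [1, 5, 1024, 1024 * 1024]) {
  const percent = (base64Overhead(byteCount) * 100).toFixed(1);
  console.log(`${byteCount} bytes -> ${base64Length(byteCount)} chars (+${percent}%)`);
}
// 1 bytes -> 4 chars (+300.0%)
// 5 bytes -> 8 chars (+60.0%)
// 1024 bytes -> 1368 chars (+33.6%)
// 1048576 bytes -> 1398104 chars (+33.3%)
```

The overhead converges on one third as the input grows, which is why the figure is usually quoted as 33%.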

The URL-Safe Variant

Standard Base64 uses + and / as the last two alphabet characters. Both of these are problematic in URLs:

  • + is decoded as a space character in query strings
  • / is a path separator in URLs

URL-safe Base64 (also called Base64url, defined in RFC 4648 Section 5) replaces + with - and / with _. Padding (=) is usually omitted in URL-safe contexts because it can also be misinterpreted by some URL parsers.

JWT tokens use Base64url without padding. When you decode a JWT header or payload manually, you must handle both the character substitution and the missing padding. Here is how to do it in JavaScript:

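// Convert standard Base64 to Base64url: swap + for -, / for _, and drop padding
function toBase64Url(base64) {
  return base64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Convert Base64url back to standard Base64 and decode it.
// Returns a binary string (one character per byte), just like atob().
function fromBase64Url(base64url) {
  // Restore the padding that Base64url normally omits
  const padded = base64url.padEnd(
    base64url.length + (4 - base64url.length % 4) % 4,
    '='
  );
  return atob(padded.replace(/-/g, '+').replace(/_/g, '/'));
}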
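As a usage illustration, here is a minimal sketch of reading the header and payload of a JWT by hand. The helper names decodeJwtSegment and encodeJwtSegment and the sample claims are hypothetical, the sample token is unsigned, and nothing here verifies a signature, so treat it as an inspection aid only.

```javascript
// Hypothetical helper: decode one Base64url-encoded JWT segment into an object.
function decodeJwtSegment(segment) {
  // Undo the URL-safe substitutions and restore the omitted padding.
  const base64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);

  // atob() yields one character per byte; turn that into real bytes,
  // then decode the bytes as UTF-8 so non-ASCII claims survive.
  const binary = atob(padded);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Hypothetical helper used only to build a sample token for this demo.
function encodeJwtSegment(value) {
  const bytes = new TextEncoder().encode(JSON.stringify(value));
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// An unsigned, made-up token: header.payload.(empty signature)
const sampleToken = [
  encodeJwtSegment({ alg: "none", typ: "JWT" }),
  encodeJwtSegment({ sub: "user-123", name: "Zoë" }),
  "",
].join(".");

const [headerPart, payloadPart] = sampleToken.split(".");
console.log(decodeJwtSegment(headerPart));
console.log(decodeJwtSegment(payloadPart));
```

Decoding through TextDecoder rather than using the atob() result directly matters as soon as a claim contains characters outside ASCII.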

The Toova Base64 encoder/decoder supports both standard and URL-safe variants with a single toggle.

Base64 in JavaScript

Browser: btoa() and atob()

Browsers provide two built-in functions: btoa() encodes (binary to ASCII) and atob() decodes (ASCII to binary). The names are easy to mix up, but both functions have been available in browsers for well over a decade.

// Browser: btoa / atob (strings only, Latin-1 range)
const encoded = btoa("Hello");
console.log(encoded); // "SGVsbG8="

const decoded = atob("SGVsbG8=");
console.log(decoded); // "Hello"

Important limitation: btoa() only accepts strings whose characters are in the Latin-1 range (code points 0–255). If you pass a string containing characters outside that range, such as emoji or CJK characters, it throws a DOMException. To encode arbitrary binary data held in a Uint8Array, first convert the bytes to a "binary string" with one character per byte:

// Browser: encoding arbitrary binary (Uint8Array)
const bytes = new Uint8Array([72, 101, 108, 108, 111]);
// Spreading passes every byte as a separate argument, so this form suits
// small arrays; very large arrays can exceed the engine's argument limit.
const encoded = btoa(String.fromCharCode(...bytes));
console.log(encoded); // "SGVsbG8="
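For larger arrays, one common workaround is to build the binary string in chunks so that no single call receives too many arguments. The sketch below assumes a chunk size of 0x8000 bytes, which is a conservative guess rather than a documented limit, and the function name bytesToBase64 is hypothetical.

```javascript
// Hypothetical helper: Base64-encode a Uint8Array of any size.
function bytesToBase64(bytes, chunkSize = 0x8000) {
  let binary = "";
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    // subarray() creates a view, not a copy, so this stays cheap.
    const chunk = bytes.subarray(offset, offset + chunkSize);
    binary += String.fromCharCode.apply(null, chunk);
  }
  return btoa(binary);
}

const smallInput = new Uint8Array([72, 101, 108, 108, 111]);
console.log(bytesToBase64(smallInput)); // "SGVsbG8="

// A 200,000-byte array is encoded in several chunks.
const largeInput = new Uint8Array(200000).map((_, i) => i % 256);
console.log(bytesToBase64(largeInput).length); // 266668
```

The intermediate binary string is still held in memory in full, so this approach addresses the argument limit rather than memory use.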

For encoding arbitrary strings that may contain Unicode characters, the recommended modern approach is to use TextEncoder to get a Uint8Array first, then encode as shown above.
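A minimal sketch of that approach, using the standard TextEncoder and TextDecoder APIs. The helper names encodeUtf8ToBase64 and decodeBase64ToUtf8 are hypothetical, and the strings are assumed to be short enough that building the intermediate binary string is not a concern.

```javascript
// Encode any JavaScript string as Base64 via its UTF-8 bytes.
function encodeUtf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // always UTF-8
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte); // one Latin-1 character per byte
  }
  return btoa(binary);
}

// Reverse the process: Base64 -> bytes -> UTF-8 text.
function decodeBase64ToUtf8(base64) {
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(encodeUtf8ToBase64("Hello")); // "SGVsbG8="
console.log(encodeUtf8ToBase64("✓"));     // "4pyT"
console.log(decodeBase64ToUtf8("4pyT"));  // "✓"
```

Both sides must agree on UTF-8: decoding "4pyT" with plain atob() yields three Latin-1 characters rather than the check mark.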

Node.js: Buffer

Node.js provides the Buffer class, which handles binary data correctly and supports multiple encodings including Base64 and Base64url:

// Node.js — Buffer (handles binary safely)
const encoded = Buffer.from('Hello').toString('base64');
console.log(encoded); // "SGVsbG8="

const decoded = Buffer.from('SGVsbG8=', 'base64').toString('utf8');
console.log(decoded); // "Hello"

// URL-safe variant in Node.js
const urlSafe = Buffer.from('Hello').toString('base64url');
console.log(urlSafe); // "SGVsbG8" (no padding)

The base64url encoding option (added in Node.js 15.7 and backported to 14.18) handles the character substitution and padding removal automatically, making it much easier than doing the transformation manually.
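A few more Buffer round-trips, sketched under the assumption of a Node.js version that supports the base64url option. The variable names are illustrative only.

```javascript
// Node.js: decode an unpadded Base64url string
const decodedUrlSafe = Buffer.from("SGVsbG8", "base64url").toString("utf8");
console.log(decodedUrlSafe); // "Hello"

// Buffer encodes strings as UTF-8, so Unicode text needs no extra step
const checkMark = Buffer.from("✓", "utf8").toString("base64");
console.log(checkMark); // "4pyT"

// Raw bytes survive a Base64url round-trip unchanged
const originalBytes = Buffer.from([0x00, 0xff, 0x10, 0x80]);
const roundTripped = Buffer.from(originalBytes.toString("base64url"), "base64url");
console.log(originalBytes.equals(roundTripped)); // true
```

Unlike btoa(), Buffer works on bytes rather than Latin-1 strings, so the Unicode limitation described for browsers does not apply here.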

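For browser environments that need to encode files, the FileReader.readAsDataURL method reads a file asynchronously and returns it as a Base64 data URI. The entire result is held in memory as a single string, so for very large files it is usually better to avoid Base64 altogether or to process the file in chunks.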

When to Use Base64

Embedding binary data in text-only protocols

The original use case for Base64 was encoding binary email attachments in SMTP, a protocol that only supports 7-bit ASCII text. The same principle applies anywhere you need to include binary data in a format that cannot handle raw bytes: JSON API payloads, XML documents, HTML attributes, CSS values, HTTP headers.

Data URIs for small assets

CSS and HTML allow you to embed images, fonts, and SVGs as Base64 data URIs. This eliminates an HTTP round-trip for small assets like icons and lets critical above-the-fold images render with the first paint instead of popping in later.

<!-- Inline SVG icon as Base64 data URI -->
<img src="data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0c..." alt="icon" />

/* Inline CSS background */
.icon {
  background-image: url("data:image/png;base64,iVBORw0KGgo...");
}

The trade-off: Base64 data URIs cannot be cached separately from the HTML/CSS file that contains them. If the image never changes but the surrounding HTML does, the browser downloads the image data again every time it fetches the changed HTML. Use data URIs only for small assets (ideally under 4 KB) where the elimination of a round-trip outweighs the cache penalty.
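A minimal sketch of building a data URI in JavaScript and applying the size rule of thumb above. The helper names toDataUri and shouldInline are hypothetical, and the 4 KB limit is simply the guideline from this section, not a hard browser limit.

```javascript
// Guideline from the text: only inline assets up to about 4 KB.
const INLINE_LIMIT_BYTES = 4 * 1024;

// Hypothetical helper: wrap raw bytes in a Base64 data URI.
function toDataUri(mimeType, bytes) {
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// Hypothetical helper: decide whether an asset is small enough to inline.
function shouldInline(byteLength, limit = INLINE_LIMIT_BYTES) {
  return byteLength <= limit;
}

const svgSource = '<svg xmlns="http://www.w3.org/2000/svg" width="8" height="8"/>';
const svgBytes = new TextEncoder().encode(svgSource);

if (shouldInline(svgBytes.length)) {
  console.log(toDataUri("image/svg+xml", svgBytes));
}
```

In a real build, this decision is usually made by the bundler or asset pipeline rather than by hand-written code.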

Encoding binary data for JSON or URL parameters

JSON is a text format — it cannot represent raw binary bytes directly. When an API needs to transmit binary data (image thumbnails, cryptographic signatures, compressed data), Base64 is the standard way to include it in a JSON payload. Similarly, if you need to pass binary data in a URL query parameter, Base64url uses only characters that need no percent-encoding, so the value passes through unchanged.
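As an illustration, here is a sketch of carrying a few bytes through JSON in Node.js. The field names name and data and the sample bytes are made up for the example; real APIs define their own schema.

```javascript
// Node.js: a handful of bytes that are not valid text (includes 0x00 and 0xff)
const thumbnailBytes = Uint8Array.from([0x89, 0x50, 0x4e, 0x47, 0x00, 0xff]);

// Sender: Base64-encode the bytes so they fit in a JSON string field
const jsonPayload = JSON.stringify({
  name: "thumb.png",
  data: Buffer.from(thumbnailBytes).toString("base64"),
});
console.log(jsonPayload); // {"name":"thumb.png","data":"iVBORwD/"}

// Receiver: parse the JSON, then decode the field back to bytes
const parsedPayload = JSON.parse(jsonPayload);
const restoredBytes = new Uint8Array(Buffer.from(parsedPayload.data, "base64"));
console.log(restoredBytes); // same six bytes as thumbnailBytes
```

The receiver has to know which fields are Base64, since nothing in JSON itself marks a string as encoded binary.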

JWT and other token formats

JWT tokens use Base64url to encode their header and payload sections. This makes the token a printable, URL-safe string that can be passed in HTTP headers, cookies, or URL parameters. The encoding is not for security (the payload is readable by anyone with the token) — it is purely for safe transmission.

When NOT to Use Base64

Security or confidentiality

Base64 provides zero security. It is trivially reversible in milliseconds. Do not use it to "obfuscate" passwords, API keys, or sensitive configuration values. Any developer who sees a Base64 string will decode it immediately. If you need confidentiality, use encryption.

Password storage

Storing Base64-encoded passwords is the same as storing them in plain text — the encoding is instantly reversible. Passwords must be hashed with a proper password hashing function like bcrypt, Argon2, or scrypt.

Large binary files

Encoding a 10 MB file as Base64 produces a string of about 13.3 MB. If you store that in a database column, search through it, or transmit it over an API, you pay the 33% overhead every time. For large binary data, use dedicated binary storage: database BLOB/BYTEA columns, object storage like S3 or GCS, or stream the binary directly.

Situations where you can use binary directly

If your protocol or format supports raw binary — for example, a WebSocket with binary message type, a multipart/form-data HTTP upload, or a binary file format — use binary directly. Base64 is only necessary when the transport medium genuinely cannot handle raw bytes.

Common Pitfalls

Confusing encoding with encryption

This is the most common mistake. Base64 output can be read by anyone who decodes it, and decoding requires no key. It is not a security mechanism. Code comments like "password is stored Base64-encoded for security" indicate a serious misunderstanding that should be caught in code review.

Using btoa() with Unicode strings

Calling btoa() on a string that contains characters with code points above 255 throws a DOMException: Failed to execute 'btoa': The string to be encoded contains characters outside of the Latin1 range. Always convert to Uint8Array via TextEncoder before encoding strings that may contain Unicode characters.

Forgetting padding when decoding

Padded Base64 strings always have lengths that are multiples of 4. If a Base64 string was generated without padding (common in URL-safe encoding), strict decoders reject it until you add back the correct number of = characters; some decoders are lenient and accept unpadded input, so behavior can differ between environments. A Base64 string with length n needs (4 - n % 4) % 4 padding characters, and a length where n % 4 is 1 can never be valid. Forgetting this causes decode errors that can be hard to diagnose.
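The padding rule translates into a few lines of JavaScript. The function name restorePadding is hypothetical, and the sketch assumes the input contains only Base64 characters with no whitespace.

```javascript
// Hypothetical helper: add back the "=" padding that was stripped.
function restorePadding(unpadded) {
  const remainder = unpadded.length % 4;
  if (remainder === 1) {
    // 1 leftover character carries only 6 bits, which is less than one byte.
    throw new Error(`Invalid Base64 length: ${unpadded.length}`);
  }
  return unpadded + "=".repeat((4 - remainder) % 4);
}

console.log(restorePadding("SGVsbG8")); // "SGVsbG8=" (one "=" added)
console.log(restorePadding("TQ"));      // "TQ=="     (two "=" added)
console.log(restorePadding("TWFu"));    // "TWFu"     (already a multiple of 4)
```

Normalizing input this way before decoding keeps behavior consistent across strict and lenient decoders.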

Double-encoding

A Base64 string is itself valid ASCII, so btoa(btoa(data)) works without errors but produces double-encoded output. When passing Base64 values through multiple layers of serialization (JSON inside JSON, for example), it is easy to encode the same data twice. Always decode the exact number of times you encoded.
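A short demonstration of the effect, using the btoa() and atob() functions described earlier. The variable names are illustrative.

```javascript
const encodedOnce = btoa("Hello");
const encodedTwice = btoa(encodedOnce); // no error: Base64 output is plain ASCII

console.log(encodedOnce);  // "SGVsbG8="
console.log(encodedTwice); // "U0dWc2JHOD0="

// Decoding once returns another Base64 string, not the original text
console.log(atob(encodedTwice));       // "SGVsbG8="
console.log(atob(atob(encodedTwice))); // "Hello"
```

If decoded data looks like another Base64 string, double-encoding somewhere upstream is a likely cause.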

Quick Reference: Base64 in Practice

For encoding and decoding in the browser without writing code, the Toova Base64 encoder/decoder runs entirely in your browser — no server round-trip. It supports standard and URL-safe variants, file upload for encoding binary files, and both text and hex output for decoded data.

If you are working with encoded content inside URLs, the URL encoder/decoder handles percent-encoding separately from Base64. For HTML entities, the HTML entities converter handles character escaping in HTML contexts. These are distinct encoding schemes — each has a specific use case.

The canonical reference for Base64 is RFC 4648, which defines standard Base64 (Section 4), Base64url (Section 5), and Base32 (Sections 6–7). For the btoa() and atob() browser APIs, the MDN documentation for btoa() covers browser compatibility and the Unicode limitation in detail.

Summary

Base64 encoding converts binary data to printable ASCII using a 64-character alphabet. It increases data size by 33%, is completely reversible, and provides no security. Use it when you need to embed binary data in a text-based format — JSON payloads, HTML data URIs, JWT tokens, email attachments, URL parameters. Avoid it when you need security, when the transport supports binary directly, or when the 33% overhead matters at scale.

Understanding what Base64 is — and what it is not — prevents the most common mistakes: using it for security, applying it to large files unnecessarily, and confusing it with other encoding schemes like URL encoding or HTML entities. Each encoding scheme solves a specific problem. Base64 solves exactly one: making binary data safe for text-only channels.

Ready to encode or decode? Try the Toova Base64 encoder/decoder — paste text or drop a file, toggle standard or URL-safe, and copy the result. No account, no server, no limits.