Hashing vs Encryption vs Encoding

These three words get used interchangeably in code reviews, Stack Overflow answers, and incident postmortems — and almost always wrong. Someone "encrypts" a password with MD5. Someone calls Base64 "encryption" because the output looks scrambled. Someone "decrypts" a hash, which is impossible by definition. Each mix-up is a security bug waiting to ship.

The confusion is understandable: all three transform readable data into something that looks like noise. But they exist for completely different reasons, and using one where another belongs is one of the most common (and dangerous) mistakes junior developers make. This guide draws the lines precisely, with the specs and the broken-by-now algorithms named explicitly.

TL;DR: what's the actual difference?

Encoding is reversible with no key — it changes data format so a system can transport or display it (Base64, URL-encoding, HTML entities). Encryption is reversible but requires a secret key — it provides confidentiality so only key-holders can read the data (AES, RSA). Hashing is one-way and keyless — it maps any input to a fixed-length fingerprint for integrity and verification (SHA-256), and you cannot get the original back.

The litmus test is two questions:

Can you get the original back? If no → it's a hash. If yes → it's encoding or encryption.
Do you need a secret to get it back? If no → encoding. If yes → encryption.

That's the whole map. Everything below is detail.
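
As a minimal sketch, the two questions can be written as a tiny decision function. The classify name and its two boolean fields are hypothetical, introduced only to illustrate the test.

```javascript
// Hypothetical helper: the two-question litmus test as a function.
function classify({ reversible, needsSecret }) {
  if (!reversible) return "hashing"; // no way back to the original
  return needsSecret ? "encryption" : "encoding";
}

console.log(classify({ reversible: true, needsSecret: false })); // encoding (e.g. Base64)
console.log(classify({ reversible: true, needsSecret: true }));  // encryption (e.g. AES)
console.log(classify({ reversible: false }));                    // hashing (e.g. SHA-256)
```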

Property	Encoding	Hashing	Encryption
Reversible?	Yes	No (one-way)	Yes
Needs a key/secret?	No	No (HMAC is the keyed exception)	Yes
Output length	Grows with input (~133% for Base64)	Fixed (e.g. 256 bits for SHA-256)	~Input size + overhead
Same input → same output?	Always	Always (deterministic)	No — varies with key & IV/nonce
Purpose	Data format / transport	Integrity & verification	Confidentiality
Examples	Base64, URL-encode, hex, HTML entities	SHA-256, SHA-512, bcrypt, HMAC	AES, RSA, ChaCha20, ECC

What is encoding, and why is it not security?

Encoding transforms data from one representation to another using a public, fixed scheme — there is no secret involved, so anyone can reverse it instantly. Its job is to make data safe to transport or display in a context that can't handle the raw bytes, not to hide anything.

The four you'll meet constantly:

Base64 maps every 3 bytes of binary to 4 ASCII characters, defined in RFC 4648. It's how you put binary into JSON, embed an image in CSS, or carry a key in an HTTP header. Output grows by ~33%. Encode and decode it with the Base64 encoder and Base64 decoder — and read our full Base64 encoding guide for the byte-level mechanics.
Percent-encoding (URL-encoding), defined in RFC 3986 §2.1, replaces reserved and unsafe characters with %XX hex so they survive inside a URL — a space becomes %20, an ampersand becomes %26. Use the URL encoder to see it.
HTML entities turn characters that would otherwise be parsed as markup into named or numeric references (< becomes &lt;, & becomes &amp;) so they render as text instead of breaking the DOM. The HTML entity converter handles both directions.
Hex writes each byte as two hexadecimal digits. It's how hashes, MAC addresses, and color values get displayed.
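
The other three schemes are just as keyless as Base64. The sketch below uses the built-in encodeURIComponent and decodeURIComponent for percent-encoding; escapeHtml and toHex are hypothetical helpers written for illustration. escapeHtml covers only four characters and is not meant as a complete sanitizer.

```javascript
// Percent-encoding with the built-in URL helpers: reversible, no key.
const encoded = encodeURIComponent("a b&c"); // "a%20b%26c"
const decoded = decodeURIComponent(encoded); // "a b&c"

// Hypothetical helper: replace markup characters with HTML entities.
// The ampersand must be handled first so later entities are not re-escaped.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: write each byte as two hexadecimal digits.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

console.log(encoded, decoded);
console.log(escapeHtml("<b>Tom & Jerry</b>"));
console.log(toHex(new Uint8Array([0, 15, 255]))); // "000fff"
```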

Here is the entire "security" of Base64, in two lines of browser JavaScript:

btoa("password123");        // "cGFzc3dvcmQxMjM="  — looks scrambled
atob("cGFzc3dvcmQxMjM=");   // "password123"       — anyone can reverse it

Base64 is encoding, not encryption. There is no key, no secret, and no protection — it is a public transformation that any tool reverses in microseconds.

If you ever find yourself Base64-encoding a secret and feeling safer, stop. You've changed the format of the secret, not its confidentiality. The credential is exactly as exposed as it was before — just in a form that's slightly less obvious to a human skimming a log file.
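
The 3-bytes-to-4-characters mapping also explains the size overhead. A small sketch, assuming padded standard Base64 and ASCII-only sample strings (btoa rejects characters outside Latin-1); base64Length is a hypothetical helper.

```javascript
// Hypothetical helper: predicted Base64 length (with "=" padding) for n input bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

for (const text of ["a", "ab", "abc", "abcd"]) {
  // Each sample is plain ASCII, so one character is one byte.
  console.log(text.length, btoa(text), base64Length(text.length));
}
// 1 YQ== 4
// 2 YWI= 4
// 3 YWJj 4
// 4 YWJjZA== 8
```

Every started group of 3 input bytes costs 4 output characters, which is where the roughly one-third growth comes from.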

What is hashing, and why can't you reverse it?

Hashing runs input through a one-way function that produces a fixed-length fingerprint (a "digest"), and there is no inverse function — you cannot compute the input from the output. Its purpose is verification and integrity: prove two things are identical, detect tampering, or check a password without storing it.

A cryptographic hash function has five defining properties:

Deterministic — the same input always yields the same digest.
Fixed-length output — "a" and a 2 GB video both hash to exactly 256 bits under SHA-256.
One-way (preimage resistance) — given a digest, it's computationally infeasible to find an input that produces it.
Avalanche effect — flipping a single input bit changes roughly half the output bits, unpredictably.
Collision resistance — it's infeasible to find two different inputs that hash to the same digest.

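The modern default is the SHA-2 family (SHA-256 and SHA-512), standardized in NIST FIPS 180-4. SHA-3 (FIPS 202) exists as a structurally different backup, but SHA-256 is what TLS certificates and blockchains such as Bitcoin run on today. Git is the notable holdout: it still defaults to SHA-1 object IDs, with SHA-256 repositories available as an opt-in format.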

You can compute these directly in the browser with the Web Crypto API — no library needed:

async function sha256(message) {
  const data = new TextEncoder().encode(message);
  const buffer = await crypto.subtle.digest("SHA-256", data);
  return [...new Uint8Array(buffer)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

await sha256("hello");
// "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

Watch the avalanche effect: change "hello" to "hellp" and the digest is completely different, with no resemblance to the original. The hash generator lets you try this across MD5, SHA-1, SHA-256, and SHA-512 side by side.

MD5 and SHA-1 are broken — don't use them for security

MD5 and SHA-1 are cryptographically broken: practical collisions exist, so they must not be used where collision resistance matters (signatures, certificates, integrity verification against an adversary).

MD5 fell in 2004, when Xiaoyun Wang and colleagues announced practical collisions; Wang and Yu published the method as How to Break MD5 and Other Hash Functions (EUROCRYPT 2005). Collisions can now be generated in seconds on a laptop.
SHA-1 fell in 2017, when Google and CWI Amsterdam produced two distinct PDF files with the same SHA-1 digest (the SHAttered attack). Browsers and certificate authorities, which were already phasing SHA-1 out, no longer accept SHA-1 certificates.

MD5 and SHA-1 are fine only as fast, non-adversarial checksums — detecting accidental file corruption where no attacker is trying to forge a match. For anything security-sensitive, use SHA-256 or stronger.

Hashing passwords is a special case — use a slow, salted KDF

Never hash passwords with a plain fast hash like SHA-256. Fast hashes are the problem: an attacker with a leaked database can try billions of guesses per second on a GPU. Password storage needs a deliberately slow, salted key derivation function (KDF).

Per the OWASP Password Storage Cheat Sheet, use one of, in order of preference:

Argon2id — the Password Hashing Competition winner (2015) and the current first choice; memory-hard, so GPUs and ASICs gain little.
scrypt — also memory-hard; a solid choice where Argon2 isn't available.
bcrypt — older but still acceptable; note its ~72-byte input limit.

Two ideas make these work. A salt is a unique random value mixed into each password before hashing, so identical passwords produce different stored hashes and precomputed "rainbow tables" are useless. A high work factor (iteration count / memory cost) makes each guess expensive — you tune it so one verification takes ~100–500 ms, which is invisible to a legitimate login but catastrophic for a brute-force attacker.

For the upstream problem — generating and judging passwords in the first place — use the password generator and the password strength meter.

HMAC: hashing with a key for message authentication

When you need to prove a message came from someone who holds a shared secret and wasn't tampered with, you use an HMAC (Hash-based Message Authentication Code), defined in RFC 2104. HMAC combines a hash function with a secret key — HMAC-SHA256(key, message) — so only key-holders can produce or verify the tag.

This is what verifies webhook payloads (Stripe, GitHub) and signs the most common kind of JWT. Compute one with the HMAC generator. Note the subtlety: HMAC uses a hash and uses a key, but it's still one-way — it authenticates, it doesn't conceal. The message travels in the clear alongside its tag.

What is encryption, and how do symmetric and asymmetric differ?

Encryption transforms plaintext into ciphertext using a key, and the original is recoverable only by someone holding the correct key. Its single purpose is confidentiality — keeping data unreadable to anyone without the secret. Unlike a hash, it is fully reversible; unlike encoding, that reversal is gated by a key.

There are two families:

Symmetric encryption uses one key for both encrypting and decrypting. The standard is AES (Advanced Encryption Standard), specified in NIST FIPS 197, typically run in an authenticated mode like AES-GCM. It's fast and handles large data, but both parties must already share the secret key — which is the hard part.

Asymmetric (public-key) encryption uses a key pair: a public key that anyone can use to encrypt (or verify), and a private key that only the owner holds to decrypt (or sign). RSA and elliptic-curve cryptography (ECC) are the common choices. It solves the key-distribution problem — you can publish your public key freely — but it's slow and size-limited, so it's rarely used to encrypt bulk data directly.

In practice they work together. TLS (HTTPS) uses asymmetric crypto in the handshake to authenticate the server and agree on a shared secret, then switches to fast symmetric AES for the actual traffic. JWT signing follows the same split: symmetric HMAC (HS256) when one party holds the secret, or asymmetric RSA/ECDSA (RS256/ES256) when many services need to verify tokens a single issuer signs.

The worked example that ties all three together: a JWT

A JSON Web Token is the cleanest real-world example of all three concepts living in one string — and the source of a very common misconception. A JWT has three Base64URL parts separated by dots: header.payload.signature.

eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjMiLCJuYW1lIjoiQWxpY2UifQ.dBjftJeZ4...
└────── header ──────┘ └───────────── payload ─────────────┘ └─ signature ─┘

Here is exactly which concept each part uses:

The header and payload are only Base64URL-encoded, not encrypted. Anyone can decode them — no key required. Paste a token into the JWT decoder and you'll read every claim in plaintext. A standard JWT is not confidential. Never put a password, a credit-card number, or any secret in a JWT payload, because it is trivially readable by anyone who intercepts the token.
The signature uses a keyed hash or an asymmetric signature, not encryption. With HS256 it's an HMAC (keyed hash); with RS256 it's an RSA signature made with the issuer's private key. Either way, the signature proves the token was issued by someone holding the signing key and hasn't been altered. It provides integrity and authenticity, not confidentiality.

So a single JWT demonstrates the full taxonomy: encoding makes the claims transportable as text, and a keyed hash or asymmetric signature makes them tamper-evident. What it deliberately does not do is hide the claims. Decode one yourself in the JWT decoder to see these ideas side by side.
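
To see how little stands between a reader and the claims, the sketch below decodes the header and payload of the sample token above with no key at all. decodeJwtPart is a hypothetical helper; it performs no signature verification, so it must never be used to decide whether a token is trustworthy.

```javascript
// Hypothetical helper: decode one Base64URL JWT segment into an object.
function decodeJwtPart(part) {
  // Base64URL uses "-" and "_" and drops padding; convert back to standard Base64.
  const base64 = part.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  // atob yields one character per byte; rebuild the bytes and decode as UTF-8.
  const bytes = Uint8Array.from(atob(padded), (c) => c.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Header and payload segments from the sample token (signature omitted).
const header = decodeJwtPart("eyJhbGciOiJIUzI1NiJ9");
const payload = decodeJwtPart("eyJzdWIiOiIxMjMiLCJuYW1lIjoiQWxpY2UifQ");

console.log(header);  // { alg: "HS256" }
console.log(payload); // { sub: "123", name: "Alice" }
```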

What are the most common mistakes?

Almost every cryptography bug in application code traces back to using one of these three for a job that belongs to another. The big four:

Storing passwords with encryption. If you can decrypt your users' passwords, so can an attacker who steals your key — and a database breach plus a key breach is one incident, not two. Passwords must be hashed with a salted KDF (Argon2/scrypt/bcrypt), never encrypted, never stored as plaintext you can recover.
Storing passwords with a fast hash (MD5, SHA-1, plain SHA-256). These are guessable at scale: a GPU tries billions of candidates per second, and unsalted hashes fall to rainbow tables instantly. "Hashed" is not "safe" unless the hash is slow and salted.
Thinking Base64 is encryption. Base64 has no key and protects nothing. btoa(secret) is the same secret in a different alphabet. Tokens, API keys, and config values that are "just Base64" are plaintext.
"Encrypting" or "decrypting" a hash. You cannot decrypt a hash — it's one-way by design, with information permanently discarded. "Decrypting an MD5" actually means looking it up in a precomputed table of known inputs, which is exactly why fast unsalted hashes are unsafe for secrets.

A useful sanity check before you write the code: name the property you need. Need to hide data from people without a secret? Encryption. Need to verify data without storing the original? Hashing. Just need to move bytes through a text channel? Encoding. Pick the property first; the algorithm follows.

TL;DR

Encoding, hashing, and encryption all turn readable data into noise, but they solve different problems, and confusing them ships bugs. Encoding (Base64, URL, hex, HTML entities) is reversible with no key: it's for format, not security, so btoa(password) protects nothing. Hashing (SHA-256, plus slow salted KDFs like Argon2 for passwords) is one-way with no key: it's for integrity and verification, and MD5/SHA-1 are broken for adversarial use. HMAC adds a key to a hash to authenticate a message, but it is still one-way. Encryption (AES symmetric, RSA/ECC asymmetric) is reversible with a key: it's for confidentiality. The litmus test is two questions: can you get the original back (no → hash), and do you need a secret to (no → encoding, yes → encryption)? A JWT shows the ideas side by side: its payload is Base64URL-encoded and readable by anyone, while its signature is a keyed hash or asymmetric signature for tamper-evidence.

Run the concepts hands-on with the hash generator, HMAC generator, Base64 encoder, and JWT decoder — or browse every developer utility in the tools directory.
