Is MD5 completely broken and unsafe to use?

MD5 is broken for security purposes: collision attacks have been practical since 2004, and chosen-prefix collisions have been demonstrated on real software (the Flame malware, in 2012). However, 'broken' does not mean useless for every purpose. MD5 is still acceptable for non-security uses like detecting accidental file corruption, computing cache keys, or partitioning data — contexts where an attacker cannot exploit a collision because there is no attacker.

What is a hash collision and why does it matter?

A hash collision occurs when two different inputs produce the same hash output. Collisions are inevitable in theory (any fixed-length output can only represent a finite number of values), but a secure hash function makes them computationally infeasible to find intentionally. For MD5, researchers can now generate collisions in seconds on a laptop. This means an attacker can craft two different files with the same MD5 hash (for example, a benign executable and a malicious one), which makes MD5-based integrity verification worthless against a deliberate attacker.
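
To make the "inevitable in theory" point concrete, here is a minimal sketch that shrinks the digest until collisions become easy to find by brute force. The 24-bit truncated hash and the function names are illustrative assumptions, not a real algorithm anyone should use.

```javascript
const crypto = require('crypto');

// Hypothetical toy hash: the first `bytes` bytes of SHA-256, as hex.
function truncatedHash(text, bytes) {
  return crypto.createHash('sha256').update(text).digest('hex').slice(0, bytes * 2);
}

// Birthday search: hash distinct inputs until two of them share a digest.
function findCollision(bytes) {
  const seen = new Map(); // digest -> input that produced it
  for (let i = 0; ; i++) {
    const input = `message-${i}`;
    const digest = truncatedHash(input, bytes);
    if (seen.has(digest)) {
      return { first: seen.get(digest), second: input, digest, attempts: i + 1 };
    }
    seen.set(digest, input);
  }
}

const result = findCollision(3); // 3 bytes = 24-bit digest
console.log(result);
```

With a 24-bit digest the search typically succeeds after a few thousand inputs, on the order of the square root of 2^24. The same birthday arithmetic gives about 2^64 work for a 128-bit digest and about 2^128 for a 256-bit one. MD5 counts as broken because known attacks find collisions far faster than that generic bound.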

Can I upgrade from MD5 to SHA-256 without breaking existing data?

For stored hashes (like password digests), you cannot simply recompute: you need the original input. The standard approach is a gradual migration: when a user logs in and their password is verified against the old MD5 hash, immediately rehash it and update the stored record. For passwords, rehash with bcrypt, scrypt, or Argon2 rather than SHA-256, which is too fast for password storage. Mark unconverted hashes in the database. After a reasonable period, force a password reset for any remaining unconverted accounts.

What about SHA-1? Is it also broken?

SHA-1 is also broken for collision resistance. The SHAttered attack (2017) produced the first practical SHA-1 collision — two different PDF files with the same SHA-1 hash. Google and CWI Amsterdam demonstrated this publicly. SHA-1 should be retired from all security contexts for the same reasons as MD5. Git still uses SHA-1 internally for its object addressing, but is migrating to SHA-256 (the sha256 object format). For new systems, use SHA-256 or SHA-3.

Should I use SHA-256 for password hashing?

No. SHA-256, like MD5, is a general-purpose hash function designed to be fast. Fast is exactly wrong for password hashing. An attacker with a GPU can test billions of SHA-256 hashes per second, making brute-force attacks practical against most human-chosen passwords. Use a password hashing function designed to be slow, and ideally memory-hard: bcrypt, scrypt, or Argon2. These are specifically tuned to make brute-force expensive even with dedicated hardware.

What is HMAC and when should I use it instead of a raw hash?

HMAC (Hash-based Message Authentication Code) combines a hash function with a secret key to produce a code that proves both the integrity and authenticity of a message. A raw SHA-256 hash proves integrity — the data has not changed — but anyone can compute it. HMAC proves that only someone with the secret key produced the code. Use HMAC when you need to verify that a message came from a trusted source, not just that it was not corrupted in transit.

MD5 vs SHA-256 — Why You Should Stop Using MD5

May 10, 2026 Toova

MD5 is everywhere. It ships in every programming language, every database, every cloud provider's SDK. Developers reach for it out of habit — it is fast, it produces a tidy 32-character hex string, and the API is two lines of code. The problem: MD5 has been cryptographically broken since 2004, and the attacks have only gotten faster and cheaper since then.

This is not theoretical. A 2012 cyberattack on Iranian infrastructure — the Flame malware — forged a Microsoft code-signing certificate by exploiting an MD5 collision. The same technique that academic cryptographers demonstrated in a paper became a weapon used in geopolitical espionage. MD5 is not slightly weakened; it is fundamentally compromised for any application where an adversary can choose inputs.

SHA-256, part of the SHA-2 family standardized by NIST, has no known collision attacks and remains the standard for cryptographic integrity in 2026. This article explains exactly what the difference means, when (rarely) MD5 is still acceptable, and how to migrate safely. You can compute both hashes instantly with Toova MD5 hash and Toova SHA-256 hash tools.

How Hash Functions Work

A cryptographic hash function takes an input of any length and produces a fixed-length output (the digest) with these properties:

Deterministic: the same input always produces the same output.
Avalanche effect: a single bit change in the input completely changes the output.
Preimage resistance: given a hash, it is computationally infeasible to find the input.
Collision resistance: it is computationally infeasible to find two different inputs that produce the same hash.

Both MD5 and SHA-256 exhibit the avalanche effect, which is why their output looks similar at a glance:

MD5("The quick brown fox jumps over the lazy dog")
= 9e107d9d372bb6826bd81d3542a419d6

MD5("The quick brown fox jumps over the lazy cog")
= 1055d3e698d289f2af8663725127bd4b

SHA-256("The quick brown fox jumps over the lazy dog")
= d7a8fbb307d7809469ca9abcb0082e4f8d5651e46d3cdb762d02d0bf37c9e592

SHA-256("The quick brown fox jumps over the lazy cog")
= e4c4d8f3bf76b692de791a173e05321150f7a345b46484fe427f6acc7ecc81be

One character changed in the input ("dog" → "cog"), yet the output is completely different. This property holds for both algorithms. The difference is what happens when an attacker tries to find a collision intentionally.
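
The digests above can be reproduced with Node's built-in crypto module. This is a small sketch; the helper names `digest` and `differingBits` are illustrative and do not come from any library.

```javascript
const crypto = require('crypto');

// Hash a UTF-8 string and return the raw digest bytes.
function digest(algorithm, text) {
  return crypto.createHash(algorithm).update(text, 'utf8').digest();
}

// Count the bits that differ between two equal-length buffers.
function differingBits(a, b) {
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {
      count += x & 1;
      x >>= 1;
    }
  }
  return count;
}

const dog = 'The quick brown fox jumps over the lazy dog';
const cog = 'The quick brown fox jumps over the lazy cog';

for (const algorithm of ['md5', 'sha256']) {
  const a = digest(algorithm, dog);
  const b = digest(algorithm, cog);
  console.log(algorithm, a.toString('hex'));
  console.log(algorithm, b.toString('hex'));
  console.log(`${differingBits(a, b)} of ${a.length * 8} bits differ`);
}
```

For a well-mixed hash, about half of the output bits should flip when the input changes, which is what "completely different" means in practice.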

MD5 — A Timeline of Failure (1996–2025)

MD5 was designed by Ron Rivest in 1991 as a replacement for MD4. By 1996, Hans Dobbertin had found collisions in MD5's compression function — not the full algorithm, but a warning sign that the design was fragile. The security community started recommending migration to SHA-1. Most systems ignored this.

2004 — Full collisions demonstrated

In August 2004, Xiaoyun Wang, Dengguo Feng, Xuejia Lai, and Hongbo Yu announced a practical collision attack on MD5 at the CRYPTO conference. They could generate two different 1,024-bit messages with the same MD5 hash in about an hour on a cluster. The fundamental guarantee of MD5, collision resistance, was broken.

NIST immediately began deprecating MD5 for federal use. Most industry guidance followed. The actual deployment of MD5 in production systems barely changed.

2008 — Rogue CA certificates

A group of researchers (Sotirov, Stevens, et al.) demonstrated that they could create a rogue Certificate Authority certificate that would be trusted by all major browsers. The attack exploited the fact that several certificate authorities were still signing certificates with MD5. The researchers generated a collision between a legitimate-looking certificate request and a self-crafted CA certificate — then got a real CA to sign the legitimate one, producing a signature that was also valid for the forged CA cert.

The affected certificate authorities moved off MD5 quickly, and browsers phased out trust in MD5-signed certificates over the following years. One major lesson: cryptographic weaknesses become exploitable the moment an attacker controls the input to the hash function.

2012 — Flame malware

The Flame cyberespionage malware, discovered in 2012 and attributed to state actors, forged a Microsoft Windows Update certificate using a chosen-prefix MD5 collision. The attack was more sophisticated than the 2008 demonstration: attackers could craft a malicious payload that collided with a legitimate Microsoft certificate under conditions where Microsoft's signing infrastructure would cooperate unknowingly.

The result: Flame could distribute itself through Windows Update as though it were a legitimate Microsoft update, with a valid Microsoft signature. Infections were targeted rather than widespread: security researchers estimated on the order of a thousand machines, concentrated in Iran and other countries in the region such as Lebanon, Syria, and Sudan. This was a real-world exploitation of MD5 collision attacks at nation-state scale. See the Wikipedia article on Flame malware for the full history.

2019–2025 — HashClash and instant collisions

The HashClash project (Marc Stevens, CWI Amsterdam) continued pushing MD5 collision generation to practical limits. By 2019, chosen-prefix MD5 collisions — where an attacker can choose arbitrary prefixes for both colliding messages — could be generated in days on commodity hardware. By 2022, optimized implementations reduced this to hours. In 2024, a HashClash paper demonstrated collisions in under a minute on a single modern GPU.

The trajectory is clear: MD5 collision attacks are not getting harder as hardware improves — they are getting easier. What took a cluster in 2004 takes a laptop in 2026.

SHA-256 — Why It Holds

SHA-256 is part of the SHA-2 family, designed by the NSA and standardized by NIST in 2001. It produces a 256-bit (32-byte) digest. No practical collision attacks against SHA-256 are known as of 2026. Published collision attacks reach only reduced-round variants; against the full 64-round function, nothing beats the generic birthday bound of roughly 2^128 operations, which is far beyond any computational resource that exists or is foreseeable.

SHA-256's security margin is intentionally conservative. Even if hardware improves by a factor of a billion (roughly 30 doublings in Moore's-law terms), breaking SHA-256 remains computationally infeasible. By contrast, refined MD5 collision attacks need only around 2^18 to 2^23 operations, well within reach of modest hardware.

SHA-256 is also faster than you might expect: modern CPUs include dedicated SHA instructions (Intel SHA Extensions, ARM Cryptography Extensions) that allow software to compute millions of SHA-256 hashes per second per core. It is not meaningfully slower than MD5 for typical use cases.

When MD5 Is Still Acceptable (Rarely)

"Broken" does not mean "useless for every purpose." MD5 remains acceptable in contexts where:

There is no adversary: detecting accidental file corruption in a trusted internal system — not verifying downloads from the internet, but checking whether a file copy completed successfully.
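Speed matters more than security: computing cache keys or shard identifiers from trusted inputs, where an accidental collision is vanishingly unlikely and nobody is trying to cause one. If an attacker could craft colliding keys, the cache would serve the wrong entry, so this only holds when the attacker model is absent.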
You are matching an existing external system: some legacy APIs still send MD5 ETags or checksums. You can accept and compute MD5 to interoperate, as long as you are not using it for security decisions.
Hash table partitioning: distributing data across buckets by MD5 of a key. Collisions here cause imbalance, not security failures.

The common thread: MD5 is acceptable when the application does not depend on collision resistance and there is no adversary who can craft inputs. As soon as either condition breaks — attacker present, or collision = security failure — switch to SHA-256.
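
As an illustration of the partitioning case, here is a minimal sketch that maps keys to shards using MD5. The function name `shardFor`, the key format, and the shard count of 8 are assumptions chosen for the example.

```javascript
const crypto = require('crypto');

// Map a key to one of `shardCount` shards using the first 4 bytes of its MD5 digest.
function shardFor(key, shardCount) {
  const digest = crypto.createHash('md5').update(String(key)).digest();
  return digest.readUInt32BE(0) % shardCount;
}

// Distribute 10,000 hypothetical keys across 8 shards and count each bucket.
const counts = new Array(8).fill(0);
for (let i = 0; i < 10000; i++) {
  counts[shardFor(`user:${i}`, 8)] += 1;
}
console.log(counts); // expect a roughly even spread
```

Nothing here depends on collision resistance: two keys landing in the same shard is the normal case. If the keys were attacker-controlled and a skewed distribution could hurt availability, a keyed hash would be the safer choice.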

Common Myths About MD5

"MD5 is fine if we add a salt"

Salting changes the input so that two users with the same password get different hashes, which defeats precomputed rainbow tables. But it does not fix the speed problem. An attacker who has the salt and the MD5 hash can still brute-force each password efficiently because MD5 is fast: modern GPUs compute tens of billions of MD5 hashes per second. A salt forces the attacker to attack each account separately; it does not make any single guess more expensive.

For password hashing, neither MD5 nor SHA-256 is appropriate regardless of salting. Use bcrypt, scrypt, or Argon2.
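
A rough back-of-the-envelope calculation shows why the cost per guess matters more than the salt. Both rates below are assumptions for illustration: 10 billion MD5 guesses per second is a conservative figure for one GPU, and the bcrypt rate is a placeholder that depends heavily on cost factor and hardware.

```javascript
// Worst-case time to try every password in a search space at a given guess rate.
function secondsToExhaust(alphabetSize, length, guessesPerSecond) {
  return Math.pow(alphabetSize, length) / guessesPerSecond;
}

const GPU_MD5_RATE = 10e9; // assumption: 10 billion MD5 guesses per second
const BCRYPT_RATE = 1e4;   // assumption: 10,000 bcrypt guesses per second (illustrative only)

// 8 characters drawn from lowercase letters and digits (36 symbols)
const md5Seconds = secondsToExhaust(36, 8, GPU_MD5_RATE);
const bcryptSeconds = secondsToExhaust(36, 8, BCRYPT_RATE);

console.log(`salted MD5: ~${Math.round(md5Seconds)} seconds`);              // ~282 seconds
console.log(`bcrypt:     ~${Math.round(bcryptSeconds / 86400 / 365)} years`); // ~9 years
```

The salt appears nowhere in this arithmetic: it stops an attacker from reusing work across accounts, but the time to exhaust one account's search space is set entirely by how expensive each guess is.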

"MD5 is fine because we only use it internally"

Internal systems get breached. The threat model "attacker cannot access our inputs" tends to hold until a supply-chain attack, an insider threat, or a misconfiguration exposes the system. The Flame attack happened against systems that presumably had similar confidence in their internal controls.

"SHA-256 is overkill — MD5 is faster"

In pure software, SHA-256 is approximately 2–5x slower than MD5; on CPUs with SHA acceleration instructions the gap narrows and can disappear. For most applications (file integrity checks, API signatures, cache keys) the difference is microseconds per operation, unnoticeable in practice. The performance argument for MD5 over SHA-256 only holds in extremely high-throughput scenarios where even microseconds matter, and even then there are usually better solutions than using a broken algorithm.

Migration Guide — MD5 to SHA-256

Password hashing

const crypto = require('crypto');
const bcrypt = require('bcrypt'); // third-party package: npm install bcrypt

const password = 'correct horse battery staple'; // example input

// WRONG: MD5 for password hashing
// Crack time with modern GPU: seconds to minutes
const md5Hash = crypto.createHash('md5').update(password).digest('hex');

// WRONG: SHA-256 for password hashing (still too fast)
const sha256Hash = crypto.createHash('sha256').update(password).digest('hex');

// CORRECT: use bcrypt, scrypt, or Argon2 for passwords
// bcrypt.hash returns a promise; 12 is the cost factor
bcrypt.hash(password, 12).then((bcryptHash) => {
  console.log(bcryptHash);
});

Migrating stored password hashes requires a gradual approach. On each successful login: verify the password against the existing MD5 hash, then immediately rehash with bcrypt and replace the stored value. Flag each account as migrated. After a reasonable period (90 days is typical), force a password reset for any remaining accounts still using MD5 hashes.
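
The rehash-on-login flow can be sketched as follows. To keep the example runnable without third-party packages, it uses Node's built-in scrypt in place of bcrypt. The `user` record shape, the `algo` flag, and the assumption that legacy hashes are unsalted hex MD5 are all hypothetical.

```javascript
const crypto = require('crypto');

function md5Hex(password) {
  return crypto.createHash('md5').update(password, 'utf8').digest('hex');
}

function scryptHex(password, saltHex) {
  // 64-byte derived key using Node's default scrypt cost parameters
  return crypto.scryptSync(password, Buffer.from(saltHex, 'hex'), 64).toString('hex');
}

// Constant-time comparison of two hex strings.
function safeEqualHex(a, b) {
  const x = Buffer.from(a, 'hex');
  const y = Buffer.from(b, 'hex');
  return x.length === y.length && crypto.timingSafeEqual(x, y);
}

// `user` is a hypothetical record: { algo: 'md5' | 'scrypt', hash, salt }.
// Returns true on a successful login and upgrades legacy records in place.
function loginAndUpgrade(user, password) {
  if (user.algo === 'scrypt') {
    return safeEqualHex(scryptHex(password, user.salt), user.hash);
  }
  if (!safeEqualHex(md5Hex(password), user.hash)) {
    return false;
  }
  // Verified against the legacy hash: rehash now, while the plaintext is available.
  user.salt = crypto.randomBytes(16).toString('hex');
  user.hash = scryptHex(password, user.salt);
  user.algo = 'scrypt'; // migration flag
  return true;
}
```

In a real system the updated record would be written back to the database in the same request, and the `algo` flag is what lets you find accounts that still need a forced reset later.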

File integrity / checksums

# Linux: verify a file download
sha256sum -c ubuntu-24.04-desktop-amd64.iso.sha256

# macOS
shasum -a 256 ubuntu-24.04-desktop-amd64.iso

# Windows PowerShell
Get-FileHash ubuntu-24.04-desktop-amd64.iso -Algorithm SHA256

Switching file integrity checks from MD5 to SHA-256 is usually a simple find-and-replace in the code. The two gotchas: (1) existing checksums stored in databases or files need to be recomputed and updated — there is no shortcut; (2) external APIs or storage systems that provide MD5 ETags (like some S3 configurations) require coordination to switch.
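
For checks done in application code rather than on the command line, the following sketch hashes a file in chunks and compares the result against a published digest. The function names and the 1 MiB chunk size are illustrative choices.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Compute the SHA-256 of a file without loading it into memory all at once.
function sha256File(path) {
  const hash = crypto.createHash('sha256');
  const fd = fs.openSync(path, 'r');
  const buffer = Buffer.alloc(1024 * 1024); // read in 1 MiB chunks
  try {
    let bytesRead;
    while ((bytesRead = fs.readSync(fd, buffer, 0, buffer.length, null)) > 0) {
      hash.update(buffer.subarray(0, bytesRead));
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest('hex');
}

// Compare a file's digest against an expected hex digest (case-insensitive).
function verifyDownload(path, expectedHex) {
  const actual = Buffer.from(sha256File(path), 'hex');
  const expected = Buffer.from(expectedHex.trim().toLowerCase(), 'hex');
  return actual.length === expected.length && crypto.timingSafeEqual(actual, expected);
}
```

A call such as `verifyDownload('ubuntu-24.04-desktop-amd64.iso', publishedDigest)` returns `true` only when the digests match. Note that a SHA-256 hex digest is 64 characters, so any database column sized for 32-character MD5 values needs widening.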

HMAC for API authentication

// HMAC-SHA-256 for message authentication
const crypto = require('crypto');
const hmac = crypto.createHmac('sha256', process.env.SECRET_KEY)
  .update(message)
  .digest('hex');

If you use HMAC-MD5 for API request signing, switch to HMAC-SHA-256. The HMAC construction adds a secret key and does not rely on collision resistance, so it blunts the known MD5 attacks and is not subject to length extension. Even so, the underlying primitive is compromised, the security margin is thin, and HMAC-MD5 is no longer recommended for new designs. Widely used schemes such as JWT (HS256) and AWS Signature Version 4 are built on HMAC-SHA-256. Toova's HMAC generator supports both HMAC-SHA-256 and HMAC-SHA-512 for testing.
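
On the receiving side, the signature should be recomputed and compared in constant time. This is a minimal sketch; the `sign` and `verify` helper names are illustrative, and key management is out of scope.

```javascript
const crypto = require('crypto');

// Compute an HMAC-SHA-256 tag over a message, as lowercase hex.
function sign(message, secretKey) {
  return crypto.createHmac('sha256', secretKey).update(message).digest('hex');
}

// Recompute the tag and compare it to the received one in constant time.
function verify(message, signatureHex, secretKey) {
  const expected = Buffer.from(sign(message, secretKey), 'hex');
  const received = Buffer.from(signatureHex, 'hex');
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}

const tag = sign('amount=100&to=alice', 'example-secret');
console.log(verify('amount=100&to=alice', tag, 'example-secret')); // true
console.log(verify('amount=999&to=alice', tag, 'example-secret')); // false
```

Comparing with `crypto.timingSafeEqual` rather than `===` avoids leaking how many leading characters of a forged signature were correct.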

Digital signatures and certificates

Any certificate signed with MD5 should be reissued immediately — most certificate authorities stopped issuing MD5-signed certificates after 2008, and all major browsers and operating systems reject them. For internal PKI, audit your CA configuration and ensure SHA-256 is the minimum allowed signature algorithm. RSA-SHA256 or ECDSA-SHA256 are the current standards.

MD5 vs SHA-256 — Quick Reference

Property	MD5	SHA-256
Output length	128 bits (32 hex chars)	256 bits (64 hex chars)
Collision resistance	Broken (practical attacks)	Secure (no known attacks)
Password hashing	Never	No (use bcrypt/Argon2)
Digital signatures	Never	Yes
File integrity (security)	Never	Yes
Non-security checksums	Acceptable (no attacker)	Always fine
Cache keys	Acceptable	Always fine
TLS certificate signing	Rejected by browsers	Standard
FIPS-approved hash (FIPS 180-4)	No	Yes

Conclusion

MD5 has been cryptographically broken since 2004. Chosen-prefix collisions — the technique that powered the Flame malware — are now computationally cheap. Any application that relies on MD5 for security (signatures, integrity verification, authentication) is vulnerable to attacks that cost minutes of compute time to execute.

SHA-256 has no known practical attacks, is hardware-accelerated on modern CPUs, and is the standard for cryptographic integrity across TLS, code signing, and API authentication. The performance cost compared to MD5 is negligible for nearly every use case.

The migration path is straightforward: file integrity checks are a find-and-replace. API signatures require coordinating a version bump. Password hashes need a gradual rehash strategy on login. For any new code, SHA-256 should be the default choice. Compute and compare hashes directly with Toova MD5 and Toova SHA-256 — or generate HMAC-SHA-256 authentication codes with the HMAC generator.