MD5 Hash Generator

About the MD5 Generator

The MD5 Generator computes 128-bit MD5 digests of text or files (up to 5 MiB) using a lazy-loaded spark-md5 implementation that stays out of the main bundle until you visit this page. MD5's collision resistance has been broken since 2004, when Wang and collaborators demonstrated practical collisions. Later refinements cut identical-prefix collision generation to seconds on commodity hardware, and the 2008 chosen-prefix attack showed that two inputs with different, attacker-chosen prefixes can also be made to collide. **Use MD5 only for non-adversarial integrity checks: cache deduplication, ETag generation, legacy checksum matching, and forensic comparisons against historical evidence.** Never use MD5 for password storage, digital signatures, or anywhere an attacker controls input.

Updated: May 8, 2026

How to use the MD5 generator

  • Paste text or drop a file (up to 5 MiB). Text is encoded as UTF-8 before hashing; files are read as raw bytes.
  • The digest renders in lowercase hex by default (16 bytes / 32 hex characters). Toggle to uppercase hex or Base64.
  • Click Copy to copy the digest to the clipboard. Click Compare to paste an expected hash and confirm a match (useful for legacy checksum verification against an older release page that only published MD5).
  • Performance: spark-md5 hashes a 1 MB input in roughly 30 ms on baseline hardware. The library (~3 KB gzipped) is lazy-loaded the first time you visit the page and cached for subsequent visits.
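
The encoding and output options in the steps above can be reproduced outside the browser. The following is a minimal sketch in Python using the standard `hashlib` module rather than spark-md5. The function names `md5_digest_forms` and `matches_expected` are hypothetical, and the case-insensitive, whitespace-tolerant comparison is an assumption about what a reasonable Compare step does, not a description of this tool's internals.

```python
import base64
import hashlib


def md5_digest_forms(text: str) -> dict:
    """Hash text as UTF-8 and return lowercase hex, uppercase hex, and Base64."""
    # usedforsecurity=False (Python 3.9+) marks this as a non-security use.
    digest = hashlib.md5(text.encode("utf-8"), usedforsecurity=False).digest()
    return {
        "hex_lower": digest.hex(),
        "hex_upper": digest.hex().upper(),
        "base64": base64.b64encode(digest).decode("ascii"),
    }


def matches_expected(text: str, expected_hex: str) -> bool:
    """Compare against a published hex checksum, ignoring case and whitespace."""
    actual = md5_digest_forms(text)["hex_lower"]
    return actual == expected_hex.strip().lower()


forms = md5_digest_forms("abc")
print(forms["hex_lower"])  # 900150983cd24fb0d6963f7d28e17f72
```

The value printed for `"abc"` is one of the test vectors listed in RFC 1321, which makes it a convenient sanity check for any MD5 implementation.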

Common use cases (non-adversarial only)

  • Legacy checksum matching. Some Apache mirrors, mainframe-era release pages, and older Linux distribution archives still publish only MD5. Verify the bytes match what was originally distributed.
  • Cache deduplication keys. When you need a fast, fixed-size identifier for a chunk of cache content and an attacker cannot influence the input, MD5 is fine. It is not necessarily the fastest option, since hardware-accelerated CRC implementations often outrun it, but it offers far more output bits.
  • Legacy version-control and archive checksums. Git LFS uses SHA-256, but some adjacent legacy systems (Subversion content-addressable backups, Perforce archives) still use MD5 for chunk addressing.
  • ETag generation. HTTP `ETag` headers can use any opaque string; a content-derived MD5 gives you a stable identifier that changes when the body changes. Adversarial collisions are not a concern in this context.
  • Forensic chain-of-custody for legacy evidence. Older evidence repositories often catalog files by MD5 because that is what was computed at collection time. Reproducing the same MD5 confirms the file is the same one originally cataloged.
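
Two of the use cases above, ETag generation and cache deduplication, might look like the following in Python. This is a rough sketch under stated assumptions: `make_etag` and `dedupe_chunks` are hypothetical helper names, the ETag is emitted as a quoted strong validator, and the inputs are assumed to come from a trusted producer.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Return an ETag value: the MD5 hex digest of the body in double quotes."""
    digest = hashlib.md5(body, usedforsecurity=False).hexdigest()
    return f'"{digest}"'


def dedupe_chunks(chunks: list) -> list:
    """Keep the first occurrence of each distinct chunk, keyed by its MD5."""
    seen = {}
    for chunk in chunks:
        key = hashlib.md5(chunk, usedforsecurity=False).digest()
        # setdefault stores the chunk only if this digest has not been seen yet.
        seen.setdefault(key, chunk)
    return list(seen.values())


print(make_etag(b"hello"))  # "5d41402abc4b2a76b9719d911017c592"
print(dedupe_chunks([b"a", b"b", b"a"]))  # [b'a', b'b']
```

Because the ETag is derived only from the body bytes, it stays stable across restarts and servers and changes whenever the body changes.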

Privacy and security

Hashing runs entirely in your browser via the lazy-loaded spark-md5 module. Input bytes never leave your tab; spark-md5 is a pure-JavaScript implementation with no network calls. The MD5 algorithm itself has been cryptographically broken since 2004 (Wang et al. demonstrated collisions; the 2008 chosen-prefix attack extended them to attacker-chosen prefixes), but for non-adversarial integrity use that weakness is irrelevant: MD5 still maps a given byte stream to a stable 128-bit value.

Tips and pitfalls

  • Never use MD5 for security. Digital signatures, JWT signing, certificate fingerprints, and code signing all require collision-resistant hashes; use SHA-256 (or SHA-512 if you need a wider digest) wherever an attacker can influence either input. Password storage needs a dedicated password-hashing scheme such as bcrypt or argon2, not a fast general-purpose hash.
  • Cache keys and dedup are fine. If both inputs are produced by the same trusted system, MD5 collisions are not a concern; you only need a stable, fast, fixed-size hash. CRC32 is even faster but has only 32 bits of output (high collision rate at scale).
  • MD5 is often faster than SHA-256. Software implementations are typically about 2-3× faster on commodity hardware, though CPUs with dedicated SHA instructions can narrow or reverse the gap. For non-adversarial use, speed is the only practical reason to pick MD5.
  • Replacement for legacy code. If you find MD5 in an existing security-relevant context (passwords, signatures, certificates), replace it with SHA-256 or migrate to a password-hashing scheme like bcrypt / argon2 — depending on what the MD5 was protecting.
  • Not the same as MD4. MD4 was the predecessor and was broken much earlier (1996). MD5 is Rivest's strengthened successor, published as RFC 1321 in 1992. Both are obsolete for security; MD5 is fine for non-adversarial integrity.

Frequently Asked Questions

When is MD5 acceptable to use?
Cache deduplication keys, ETag generation, legacy checksum verification, content-addressed storage where the producer is trusted, and forensic comparison against historical evidence. In all these contexts, an attacker cannot influence the inputs to produce a collision.
When should I never use MD5?
Password storage, digital signatures, certificate fingerprints, code signing, JWT signing, HMAC in new designs, or anywhere an attacker controls either input. Identical-prefix collisions can be generated in seconds, and the 2008 chosen-prefix attack lets an attacker make two inputs with different chosen prefixes share the same MD5. That is devastating for any security claim that depends on MD5 being a cryptographic hash.
What is the output length?
16 bytes — 32 hex characters or 24 Base64 characters (with padding) / 22 Base64URL characters (without padding). Half the size of SHA-256.
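
The lengths quoted above can be checked directly. This short Python sketch uses the standard `hashlib` and `base64` modules; the input `b"example"` is arbitrary, since the digest length does not depend on the input.

```python
import base64
import hashlib

digest = hashlib.md5(b"example", usedforsecurity=False).digest()

hex_form = digest.hex()
b64_form = base64.b64encode(digest).decode("ascii")
# Base64URL without padding: URL-safe alphabet, trailing "=" removed.
b64url_form = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")

lengths = (len(digest), len(hex_form), len(b64_form), len(b64url_form))
print(lengths)  # (16, 32, 24, 22)

sha256_len = hashlib.sha256(b"example").digest_size
print(sha256_len)  # 32
```
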
What is the maximum input size?
5 MiB for files in this browser tool. The MD5 algorithm itself has no practical upper bound; for larger inputs use `md5sum` on Linux, `md5` on macOS, or `Get-FileHash -Algorithm MD5` in PowerShell on Windows.
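
For files beyond the browser limit, a script can also stream the file through MD5 in fixed-size chunks so memory use stays flat regardless of file size. The following is a minimal Python sketch; `md5_of_file` and the 1 MiB default chunk size are illustrative choices, not part of this tool.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the lowercase hex MD5 of a file, reading it in fixed-size chunks."""
    hasher = hashlib.md5(usedforsecurity=False)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

The result is the same as hashing the whole file in one call, because MD5 processes its input incrementally.
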
What does "collision resistance broken" mean?
It means an attacker can construct two different inputs that produce the same MD5 hash. The 2004 Wang attack and the 2008 chosen-prefix attack made this routine. Any security property that depends on "if hashes match, the inputs are equal" is invalid for MD5 — but properties that depend on "the same input produces the same hash" still hold.
MD5 vs CRC32 for cache keys?
MD5 has 128 bits of output; CRC32 has 32 bits. CRC32 is faster but collides much more often: by the birthday bound, about 77,000 random entries give a 50% chance of at least one collision (and 65,536 entries already give about 39%). MD5 is fine for billions of keys without accidental-collision concerns. CRC32 is appropriate for small caches and short-lived deduplication; MD5 for anything larger.
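
The collision figures follow from the birthday approximation p ≈ 1 - exp(-n(n-1) / 2N), where n is the number of entries and N = 2^bits is the size of the hash space. A small Python sketch, assuming hash outputs behave like uniformly random values (the function names are illustrative):

```python
import math


def collision_probability(entries: int, bits: int) -> float:
    """Approximate chance that at least two of `entries` random hashes collide."""
    space = 2.0 ** bits
    # Birthday approximation; expm1 keeps very small probabilities accurate.
    return -math.expm1(-entries * (entries - 1) / (2.0 * space))


def entries_for_probability(p: float, bits: int) -> int:
    """Approximate number of entries at which the collision chance reaches p."""
    space = 2.0 ** bits
    return math.ceil(math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p))))


print(round(collision_probability(65_536, 32), 3))  # 0.393
print(entries_for_probability(0.5, 32))             # 77163
print(collision_probability(10**9, 128) < 1e-20)    # True
```

The last line shows why a 128-bit digest is comfortable for billions of keys when nobody is deliberately constructing collisions.
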
How do I replace MD5 in legacy code?
For security-relevant code: replace with SHA-256 (or migrate passwords to bcrypt / argon2). For non-adversarial code (cache keys, ETags): leave it alone — switching to SHA-256 just makes things slower without improving the property you actually need. The right replacement depends on what the MD5 was guarding.
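
As a rough illustration of the two security-relevant replacements, the Python sketch below swaps an MD5 checksum for SHA-256 and stores passwords with a salted, iterated hash. Note the substitution: bcrypt and argon2 require third-party packages, so this sketch uses PBKDF2-HMAC-SHA256 from the standard library as a stand-in so that it runs as written. The function names and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os


def sha256_checksum(data: bytes) -> str:
    """Replacement for an MD5 integrity checksum: returns 64 hex characters."""
    return hashlib.sha256(data).hexdigest()


def hash_password(password: str, iterations: int = 600_000) -> tuple:
    """Derive a salted password hash with PBKDF2-HMAC-SHA256 (stdlib stand-in)."""
    salt = os.urandom(16)  # a fresh random salt per password
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, iterations, derived


def verify_password(
    password: str, salt: bytes, iterations: int, expected: bytes
) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return hmac.compare_digest(derived, expected)
```

Storing the salt and iteration count alongside the derived bytes lets the cost be raised later without invalidating existing records.
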
Where is MD5 specified?
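RFC 1321 (1992, Ronald Rivest). The Wang collision attack was published at EUROCRYPT 2005 and the chosen-prefix attack by Stevens et al. at CRYPTO 2009. MD5 has never been a NIST-approved hash function, and RFC 6151 (2011) updated the IETF security considerations to advise against MD5 wherever collision resistance is required.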