Why You Should Never Roll Your Own Crypto
1. Why Should You Care?
You understand AES. You've read about RSA. You know the math works. So why not implement it yourself?
Because cryptography is the only field where being 99% correct means being 100% broken.
A single bit of timing difference. An unchecked error condition. A predictable random number. Any of these can turn your "secure" system into a welcome mat for attackers.
2. The Fundamental Problem
Cryptography Has No Partial Credit
Regular software:
- Bug → Wrong output → User complains → You fix it
- Visible, debuggable, fixable
Cryptographic software:
- Bug → Looks correct → Attacker exploits → Data breached
- Silent, invisible, catastrophic
The encryption might "work" perfectly in all your tests
while being completely broken in ways you can't see.
Why Smart People Still Fail
Cryptographic security depends on:
1. Mathematical correctness (the easy part)
2. Implementation correctness (the hard part)
3. Environmental correctness (the invisible part)
You can ace #1 and still fail completely on #2 and #3.
3. Historical Disasters
Case 1: PlayStation 3 ECDSA Failure (2010)
What Sony did:
- Used ECDSA to sign games (proper algorithm)
- ECDSA requires a random nonce k for each signature
- Sony used the SAME k for every signature
The math:
signature = (r, s) where s = k⁻¹(hash + privateKey × r)
With two signatures using same k:
s₁ = k⁻¹(hash₁ + privateKey × r)
s₂ = k⁻¹(hash₂ + privateKey × r)
Subtract:
s₁ - s₂ = k⁻¹(hash₁ - hash₂)
k = (hash₁ - hash₂) / (s₁ - s₂)
Once you have k:
privateKey = (s₁ × k - hash₁) / r
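The recovery above can be checked with toy numbers. A minimal sketch, using a small prime as a stand-in for the curve order (in real ECDSA the modulus is ~256 bits and r is derived from k·G; the values below are purely illustrative):

```python
# Toy demo of nonce-reuse key recovery. All values are hypothetical
# small numbers; the algebra is the same as in real ECDSA.
n = 7919          # stand-in for the curve order (a small prime)
d = 1234          # the "private key" we will recover
k = 57            # the nonce that gets reused
r = 2021          # fixed, because k is fixed

def sign(h):
    # s = k^-1 * (hash + privateKey * r) mod n
    return pow(k, -1, n) * (h + d * r) % n

h1, h2 = 111, 222
s1, s2 = sign(h1), sign(h2)

# k = (hash1 - hash2) / (s1 - s2) mod n
k_recovered = (h1 - h2) * pow(s1 - s2, -1, n) % n
# privateKey = (s1 * k - hash1) / r mod n
d_recovered = (s1 * k_recovered - h1) * pow(r, -1, n) % n

assert (k_recovered, d_recovered) == (k, d)
```

Two signatures and a little modular arithmetic are enough; this is exactly what the PS3 hackers did, just with 256-bit numbers.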
Result:
- PS3 master private key extracted
- Anyone could sign "official" games
- Entire security model collapsed
- Cost Sony billions
Case 2: Debian OpenSSL Disaster (2008)
What happened:
- Debian maintainer removed "uninitialized memory" warning
- Removed two lines that looked like bugs
- Actually removed the only source of entropy
The "fix":
// Before (correct but triggers warnings)
MD_Update(&m, buf, j); // Uses uninitialized memory for entropy
MD_Update(&m, buf, j);
// After (broken)
// Lines removed because Valgrind complained
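With the entropy mixing gone, the process ID was effectively the only varying input to key generation, so the whole keyspace can be enumerated. A hypothetical sketch (`weak_key` is an illustrative stand-in, not the real OpenSSL code path):

```python
# Why ~15 bits of entropy is fatal: if the only varying input is the
# PID (0..32767 on Linux at the time), every "random" key can be
# brute-forced in seconds. `weak_key` is a hypothetical stand-in.
import hashlib

def weak_key(pid: int) -> bytes:
    return hashlib.sha256(pid.to_bytes(2, "big")).digest()

victim = weak_key(12345)   # a "random" key from a victim system

# The attacker simply tries every possible PID:
cracked = next(pid for pid in range(32768) if weak_key(pid) == victim)
assert weak_key(cracked) == victim
```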
Result:
- All keys generated on Debian-based systems for 2 years
- had only ~32,768 possible values (15 bits of entropy)
- Instead of 2^128 possibilities
- Two years of SSL certificates, SSH keys compromised
- Required mass revocation and regeneration
Case 3: Cryptocat Encryption Flaw (2013)
What Cryptocat did:
- Built encrypted chat application
- Implemented their own crypto in JavaScript
- Made a mistake in the ECC implementation
The bug:
// Generating random values for elliptic curve
// Used Math.random() instead of crypto.getRandomValues()
var random = Math.floor(Math.random() * max);
Math.random() properties:
- Not cryptographically secure
- Predictable given enough samples
- Different implementations have different periods
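Python's random module has the same weakness as Math.random(): it is fully deterministic given its internal state. A simplified sketch (sharing a seed stands in for the harder but feasible attack of reconstructing the state from observed outputs):

```python
# Non-cryptographic PRNGs are deterministic: identical state yields
# identical "keys". Python's random (Mersenne Twister) mirrors the
# Math.random() problem; secrets is the CSPRNG to use instead.
import random
import secrets

victim = random.Random(1337)
attacker = random.Random(1337)   # same state => same output
assert [victim.randint(0, 255) for _ in range(32)] == \
       [attacker.randint(0, 255) for _ in range(32)]

key = secrets.token_bytes(32)    # what key generation should use
assert len(key) == 32
```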
Result:
- Private keys could be predicted
- All "encrypted" messages could be decrypted
- Activists and journalists who relied on it were exposed
Case 4: WEP Wi-Fi Encryption (1997-2004)
WEP design flaws:
1. 24-bit IV (initialization vector) too short
- Only 16 million possible IVs
- Reuse inevitable on busy networks
- Same IV + same key = same keystream
2. IV sent in plaintext
- Attacker knows the IV
- Can collect packets with same IV
- XOR them together to eliminate keystream
3. CRC32 for integrity (not cryptographic)
- Attacker can modify packets
- Recalculate CRC without knowing key
- No authentication of packet source
4. Key scheduling weakness in RC4
- Certain IVs reveal key bits
- Collect ~40,000 packets with weak IVs
- Statistically recover the key
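Flaws 1 and 2 combine into a few lines of attack. A minimal sketch (os.urandom stands in for the RC4 keystream; the point is only the XOR algebra):

```python
# Keystream reuse demo: XORing two ciphertexts that share a keystream
# cancels the keystream entirely, leaving plaintext1 XOR plaintext2.
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

keystream = os.urandom(16)       # stands in for RC4(key, IV)
p1 = b"ATTACK AT DAWN!!"
p2 = b"RETREAT AT DUSK!"
c1, c2 = xor(p1, keystream), xor(p2, keystream)

# The attacker never learns the keystream, yet:
assert xor(c1, c2) == xor(p1, p2)
# And knowing either plaintext reveals the other:
assert xor(xor(c1, c2), p1) == p2
```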
Timeline:
1997: WEP standardized
2001: First practical attacks published
2004: Full key recovery in minutes
2007: Attack takes seconds
Lessons:
- Short IVs guarantee reuse
- CRC is not a MAC
- RC4 has statistical biases
- "Good enough" security isn't
4. Implementation Attacks
Timing Attacks
# VULNERABLE: Early exit reveals password length/correctness
def check_password_bad(input_password, stored_hash):
    if len(input_password) != len(stored_hash):
        return False  # Reveals length!
    for i in range(len(input_password)):
        if input_password[i] != stored_hash[i]:
            return False  # Reveals position of first mismatch!
    return True
# Timing difference:
# Wrong length: ~100ns
# Wrong first char: ~150ns
# Wrong second char: ~200ns
# ...attacker can deduce password character by character

# SECURE: Constant-time comparison
import hmac
def check_password_good(input_password, stored_hash):
    return hmac.compare_digest(
        input_password.encode(),
        stored_hash.encode()
    )
# Always takes the same time regardless of where the mismatch occurs
Padding Oracle Attacks
CBC mode encryption with PKCS#7 padding:
┌────────────────────────────────────────────────────┐
│ Plaintext blocks get padded before encryption      │
│                                                    │
│ "HELLO" → "HELLO" + "\x0b"*11  (16-byte AES block) │
│ (11 bytes of padding with value 0x0b)              │
└────────────────────────────────────────────────────┘
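A minimal sketch of PKCS#7 itself. Note how the unpad check naturally produces a distinguishable failure, which is exactly the observable difference the attack needs:

```python
# Minimal PKCS#7 pad/unpad for a 16-byte block. The ValueError in
# unpad is the kind of distinguishable failure a padding oracle uses.
def pad(data: bytes, block: int = 16) -> bytes:
    n = block - len(data) % block        # always 1..block bytes
    return data + bytes([n]) * n

def unpad(data: bytes, block: int = 16) -> bytes:
    n = data[-1]
    if not 1 <= n <= block or data[-n:] != bytes([n]) * n:
        raise ValueError("bad padding")  # observable => exploitable
    return data[:-n]

assert pad(b"HELLO") == b"HELLO" + b"\x0b" * 11
assert unpad(pad(b"HELLO")) == b"HELLO"
```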
The attack:
1. Send modified ciphertext to server
2. Server decrypts, checks padding
3. IF server returns different error for "bad padding" vs "bad data"...
4. Attacker can decrypt the entire message byte by byte!
Vulnerable response patterns:
- "Padding error" vs "Decryption failed" (different messages)
- 400 Bad Request vs 500 Internal Error (different status codes)
- Fast response vs slow response (timing difference)
ANY observable difference enables the attack.
Famous victims:
- ASP.NET (2010): Microsoft's web framework
- Java Server Faces (2010)
- Ruby on Rails (2013)
- Many TLS implementations
Side-Channel Attacks
Side channels leak information through:
1. Timing
- How long operations take
- Cache hit/miss patterns
- Branch prediction
2. Power consumption
- Different operations use different power
- Measurable with oscilloscope
- Smart cards especially vulnerable
3. Electromagnetic emanations
- CPUs emit radio signals
- Signals vary with operations
- Can be measured from meters away
4. Sound
- Computers make different sounds for different operations
- RSA keys extracted from the acoustic noise of laptop power circuitry
- Yes, really (2013 research)
5. Error messages
- Different errors for different conditions
- Padding oracle is a side channel
- "Invalid username" vs "Invalid password"
6. Cache timing
- Memory access patterns
- Spectre/Meltdown exploited this
- Affects all modern CPUs
5. Why Even Experts Fail
The OpenSSL Heartbleed Bug (2014)
// Simplified vulnerable code
struct heartbeat_message {
    uint8_t  type;
    uint16_t payload_length;  // Attacker-controlled!
    uint8_t  payload[];
};

// The bug: trusting user-provided length
void process_heartbeat(struct heartbeat_message *msg) {
    // Allocate response buffer based on CLAIMED length
    response = malloc(msg->payload_length);
    // Copy CLAIMED number of bytes
    memcpy(response, msg->payload, msg->payload_length);
    // Send response
    send(response, msg->payload_length);
}

// The attack:
// Attacker sends: payload_length = 65535, actual payload = 1 byte
// Server copies 65535 bytes from memory (mostly not the payload)
// Server sends back 65535 bytes including:
//   - Private keys
//   - Session cookies
//   - Passwords
//   - Other users' data

// This was in production OpenSSL for 2 years
// Written by experienced security developers
// Reviewed by many eyes
// Still missed
The Lesson
OpenSSL is:
- Written by cryptography experts
- Open source (many reviewers)
- Widely deployed (battle-tested)
- Still had a critical vulnerability for 2 years
If OpenSSL experts miss buffer overflows,
what makes you think you'll catch timing attacks?
6. The Right Approach
Use Established Libraries
# Don't implement AES
# Use a library that's been audited
# Python: cryptography library
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)
nonce = os.urandom(12)       # 96-bit nonce, must be unique per message
plaintext = b"attack at dawn"
associated_data = b"header"  # authenticated but not encrypted
ciphertext = aesgcm.encrypt(nonce, plaintext, associated_data)
assert aesgcm.decrypt(nonce, ciphertext, associated_data) == plaintext
# The library handles:
# - Constant-time operations
# - Proper random number generation
# - Memory safety
# - Side-channel resistance
# - Correct implementation of the standard
Use High-Level APIs
# Don't combine primitives yourself
# BAD: DIY authenticated encryption
def encrypt_bad(key, plaintext):
    # (assumes PyCryptodome-style AES, pad, hmac, sha256 imports)
    iv = os.urandom(16)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    # Padding? HMAC key separation? Encrypt-then-MAC order? You'll get it wrong.
    ciphertext = cipher.encrypt(pad(plaintext))
    mac = hmac.new(key, ciphertext, sha256).digest()
    return iv + ciphertext + mac

# GOOD: Use AEAD that handles everything
from cryptography.fernet import Fernet
key = Fernet.generate_key()
f = Fernet(key)
token = f.encrypt(plaintext)
# Fernet handles: key handling, IV generation, encryption, authentication
When You Must Go Low-Level
# If you absolutely must use low-level primitives:
# 1. Use the hazmat module (the name is a warning!)
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

shared_secret = b"\x00" * 32  # placeholder; e.g. the output of an ECDH exchange
hkdf = HKDF(algorithm=hashes.SHA256(), length=32, salt=None, info=b"app-context")
derived_key = hkdf.derive(shared_secret)
# 2. Follow standards exactly (RFC 5869 for HKDF, NIST SP 800-38 for AES modes)
# 3. Get a security audit from professionals
# 4. Assume you made mistakes
# 5. Have incident response ready
7. Red Flags in Crypto Code
Warning Signs
# 🚨 RED FLAG: Custom encryption algorithm
def my_encrypt(data, key):
    result = []
    for i, byte in enumerate(data):
        result.append(byte ^ key[i % len(key)])
    return bytes(result)
# This is XOR with a repeating key. Broken since the 1800s.

# 🚨 RED FLAG: Using ECB mode
cipher = AES.new(key, AES.MODE_ECB)  # ECB is almost never correct

# 🚨 RED FLAG: Using MD5 or SHA1 for security
hash = hashlib.md5(password).hexdigest()   # Broken
hash = hashlib.sha1(password).hexdigest()  # Deprecated

# 🚨 RED FLAG: Using random instead of secrets
import random  # NOT cryptographically secure
key = bytes([random.randint(0, 255) for _ in range(32)])

# 🚨 RED FLAG: Comparing secrets with ==
if token == expected:  # Timing attack!
    grant_access()
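The last two flags have drop-in standard-library fixes. A sketch (scrypt shown as the stdlib option; dedicated Argon2id/bcrypt libraries are preferable when available):

```python
# Safer counterparts, using only the Python standard library.
import secrets
import hmac
import hashlib

key = secrets.token_bytes(32)       # CSPRNG, unlike `random`
token = secrets.token_bytes(16)
expected = bytes(token)

# Constant-time comparison instead of ==
assert hmac.compare_digest(token, expected)

# Password hashing with a memory-hard KDF
digest = hashlib.scrypt(b"hunter2", salt=secrets.token_bytes(16),
                        n=2**14, r=8, p=1)
assert len(digest) == 64            # default dklen
```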
# 🚨 RED FLAG: Reusing nonces/IVs
iv = b"constant_iv_1234"  # Must be unique per encryption!

# 🚨 RED FLAG: Encrypting without authenticating
ciphertext = aes_encrypt(key, plaintext)  # No integrity check!

# 🚨 RED FLAG: "I improved the algorithm"
def improved_aes(data, key):
    # Adding my own twist...
    # NO. Stop. You're making it weaker.
8. What You Should Do Instead
The Decision Tree
Do you need encryption?
│
├─ For data at rest?
│  └─ Use your platform's secure storage
│     - iOS Keychain
│     - Android Keystore
│     - Windows DPAPI
│     - Cloud KMS
│
├─ For data in transit?
│  └─ Use TLS
│     - Don't implement yourself
│     - Use your language's standard library
│     - Let infrastructure handle it
│
├─ For passwords?
│  └─ Use password hashing
│     - Argon2id
│     - bcrypt
│     - Never encrypt, always hash
│
├─ For tokens/sessions?
│  └─ Use established libraries
│     - JWT libraries (carefully)
│     - Session management frameworks
│     - OAuth/OIDC libraries
│
└─ For something custom?
   └─ Consult a cryptographer
      - Get a professional audit
      - Use established building blocks
      - Prepare for it to be wrong
Libraries to Trust
General purpose:
- libsodium (NaCl) - Easy-to-use, hard to misuse
- OpenSSL/BoringSSL - Battle-tested (despite bugs)
Python:
- cryptography - Modern, well-maintained
- PyNaCl - Python binding for libsodium
JavaScript:
- Web Crypto API - Browser built-in
- noble-* libraries - Audited, modern
Go:
- crypto/* - Standard library, excellent
- golang.org/x/crypto - Extended algorithms
Rust:
- ring - BoringSSL-derived
- RustCrypto - Pure Rust implementations
9. Summary
Three things to remember:
Cryptographic implementation is unforgiving. One timing difference, one predictable bit, one unchecked error, and your security is gone. The algorithm can be perfect while the implementation is broken.
Even experts fail regularly. OpenSSL, Sony, Debian: all had cryptographic failures despite expert review. You are not smarter than the collective security community.
Use established, audited libraries. The only winning move is not to play. Use libraries that have been reviewed by cryptographers, tested in production, and survived attacks.
10. Whatโs Next
Understanding why not to implement crypto is the first step. But even when using proper libraries, systems still fail. Why?
In the next article: Encryption ≠ Security: system-level failures where the cryptography was fine but everything else was broken.
