Password Storage: Why You Should Never Encrypt Passwords
1. Why Should You Care?
You're building an application with user accounts. Users enter passwords, and you need to store something in the database to verify them later.
If your database gets breached (and assume it will), what happens to your users' passwords?
Bad password storage has led to billions of leaked credentials. Let's understand what "good" looks like.
2. Why Not Encryption?
The Problem with Encrypting Passwords
If you encrypt passwords:
Storage: AES-GCM(key, "password123") → ciphertext
Verify: AES-GCM-decrypt(key, ciphertext) → "password123"
Problems:
1. You have the decryption key
2. Anyone who gets the key gets ALL passwords
3. You can see users' actual passwords
4. Key management becomes a critical weakness
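To make problem 2 concrete, here is a minimal sketch of what a stolen key means. It uses a toy stream cipher built from HMAC-SHA256 as a stand-in for AES-GCM, purely so the example runs on the standard library alone. The names `toy_encrypt` and `toy_decrypt` and the sample rows are illustrative assumptions, and the toy cipher must not be used for real data.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 12

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256(key, nonce || counter) blocks, concatenated."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return stream[:length]

def toy_encrypt(key: bytes, password: str) -> bytes:
    """Reversible 'storage': nonce followed by password XOR keystream."""
    nonce = secrets.token_bytes(NONCE_LEN)
    data = password.encode()
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))

def toy_decrypt(key: bytes, blob: bytes) -> str:
    """Anyone holding the key can undo the encryption."""
    nonce, ciphertext = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream)).decode()

server_key = secrets.token_bytes(32)
database = {
    "alice": toy_encrypt(server_key, "password123"),
    "bob": toy_encrypt(server_key, "correct horse battery staple"),
}

# An attacker who steals the database AND the key recovers every password at once.
recovered = {user: toy_decrypt(server_key, blob) for user, blob in database.items()}
print(recovered)
```

The same loop works against any reversible scheme, whatever the cipher: the strength of the algorithm does not matter once the key is gone.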
This is WRONG. You should never be able to recover passwords.

What We Actually Need
Requirements for password storage:
1. Verify: Can check if entered password is correct
2. One-way: Cannot recover original password from storage
3. Unique: Same password → different stored values (per user)
4. Slow: Expensive to compute (resists brute force)
5. Future-proof: Can increase difficulty over time

3. Why Not Plain Hash?
The Naive Approach (Very Broken)
import hashlib

# WRONG: Plain hash
def store_password(password):
    return hashlib.sha256(password.encode()).hexdigest()

def verify_password(password, stored):
    return hashlib.sha256(password.encode()).hexdigest() == stored

# Problem: Same password = same hash
store_password("password123")  # Always same output!

Attack 1: Rainbow Tables
Pre-compute hashes for common passwords:
Rainbow Table:
password123 → ef92b778bafe771e89245b89ecbc08a44a4e166c06659911881f383d4473e94f
123456 → 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92
qwerty → 65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5
...millions more...
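The lookup itself can be sketched in a few lines. Strictly speaking this is a plain precomputed table (real rainbow tables use hash chains to trade lookup time for storage), and the three-word list and the `crack` helper are illustrative assumptions:

```python
import hashlib
from typing import Optional

def sha256_hex(password: str) -> str:
    return hashlib.sha256(password.encode()).hexdigest()

# Precomputed once, then reusable against every leaked database of unsalted hashes.
COMMON_PASSWORDS = ["password123", "123456", "qwerty"]
LOOKUP = {sha256_hex(p): p for p in COMMON_PASSWORDS}

def crack(leaked_hash: str) -> Optional[str]:
    """Return the password behind an unsalted SHA-256 hash, if it is in the table."""
    return LOOKUP.get(leaked_hash)

leaked = sha256_hex("qwerty")  # what the attacker finds in the dump
print(crack(leaked))           # a single dictionary lookup, no brute force
```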
Attack: Look up hash in table → instant password recovery

Attack 2: Same Hash = Same Password
Database leak:
user1: ef92b778bafe771e89245b89ecbc08a44a4e166c06659911881f383d4473e94f
user2: ef92b778bafe771e89245b89ecbc08a44a4e166c06659911881f383d4473e94f
user3: 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92
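Finding shared passwords requires no cracking at all, only grouping the leaked rows by hash value. A small sketch, with a simulated leak built from the same two passwords as the listing:

```python
import hashlib
from collections import defaultdict

def sha256_hex(password: str) -> str:
    return hashlib.sha256(password.encode()).hexdigest()

# Simulated leak of unsalted hashes (user -> stored hash).
leak = {
    "user1": sha256_hex("password123"),
    "user2": sha256_hex("password123"),
    "user3": sha256_hex("123456"),
}

# Group users by stored hash; any group larger than one shares a password.
users_by_hash = defaultdict(list)
for user, stored in leak.items():
    users_by_hash[stored].append(user)

shared = [users for users in users_by_hash.values() if len(users) > 1]
print(shared)
```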
Attacker sees: user1 and user2 have the same password!
Crack one, get both.

4. Salting: Unique Per User
Adding a Salt
import hashlib
import os

def store_password(password):
    salt = os.urandom(16)  # Random per user
    hash_input = salt + password.encode()
    password_hash = hashlib.sha256(hash_input).hexdigest()
    return salt.hex() + ":" + password_hash

def verify_password(password, stored):
    salt_hex, stored_hash = stored.split(":")
    salt = bytes.fromhex(salt_hex)
    hash_input = salt + password.encode()
    computed_hash = hashlib.sha256(hash_input).hexdigest()
    return computed_hash == stored_hash

# Now same password → different hashes
print(store_password("password123"))  # Different each time!
print(store_password("password123"))  # Different again!

Salt Solves Some Problems
With salt:
✓ Rainbow tables useless (need table per salt)
✓ Same password → different stored values
✓ Can't identify users with same password
Still broken:
✗ SHA-256 is too fast!
✗ GPU can compute billions of hashes/second
✗ Brute force still practical

5. The Speed Problem
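Salting does not make any single guess slower: the salt sits next to the hash in the leaked record, so an attacker simply feeds it into each guess. A minimal sketch against the `salt_hex:hash` format from the salting example, with a hypothetical three-word list standing in for a real dictionary:

```python
import hashlib
import os
from typing import Iterable, Optional

def store_password(password: str) -> str:
    """Salted SHA-256 in the salt_hex:hash_hex format."""
    salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode()).hexdigest()
    return salt.hex() + ":" + digest

def dictionary_attack(stored: str, guesses: Iterable[str]) -> Optional[str]:
    """Try each guess against one leaked record; return the match, if any."""
    salt_hex, target = stored.split(":")
    salt = bytes.fromhex(salt_hex)
    for guess in guesses:
        # One fast SHA-256 call per guess; knowing the salt costs the attacker nothing.
        if hashlib.sha256(salt + guess.encode()).hexdigest() == target:
            return guess
    return None

leaked_record = store_password("qwerty")
print(dictionary_attack(leaked_record, ["123456", "password123", "qwerty"]))
```

The salt forces the attacker to repeat this loop per user, but each iteration is still as cheap as a bare SHA-256 call.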
Modern GPU Attack Speeds
Hashcat on RTX 4090 (approximate):
SHA-256: 22,000,000,000 H/s (22 billion/second)
MD5: 164,000,000,000 H/s
For 8-char lowercase password (26^8 = 208 billion):
SHA-256: 208B / 22B = ~10 seconds
MD5: 208B / 164B = ~1.3 seconds
For 8-char mixed case + digits (62^8 = 218 trillion):
SHA-256: 218T / 22B = ~2.7 hours
MD5: 218T / 164B = ~22 minutes
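The arithmetic is easy to reproduce. This sketch takes the approximate hash rates quoted above as given and assumes the attacker has to try the entire keyspace (on average a password falls after about half of it):

```python
RATES_HPS = {"SHA-256": 22e9, "MD5": 164e9}  # approximate rates from the list above

def exhaust_seconds(alphabet_size: int, length: int, rate_hps: float) -> float:
    """Seconds needed to try every password of the given length at the given rate."""
    return alphabet_size ** length / rate_hps

for label, alphabet in [("lowercase", 26), ("mixed case + digits", 62)]:
    for algo, rate in RATES_HPS.items():
        secs = exhaust_seconds(alphabet, 8, rate)
        print(f"8-char {label}, {algo}: {secs:,.1f} s ({secs / 3600:.2f} h)")
```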
This is why we need SLOW hash functions!

The Solution: Work Factors
Password hashing algorithms include deliberate slowness:
bcrypt: Cost factor (2^cost iterations)
scrypt: CPU cost, memory cost, parallelization
Argon2: Time cost, memory cost, parallelism
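The idea of a tunable work factor can be sketched with PBKDF2 from the standard library. PBKDF2 is not one of the three algorithms listed and is used here only because `hashlib` ships it; the iteration counts are arbitrary demo values, not recommendations, and the timings will vary by machine:

```python
import hashlib
import os
import time

def slow_hash(password: str, salt: bytes, iterations: int) -> bytes:
    """PBKDF2-HMAC-SHA256: the iteration count acts as the work factor."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)

salt = os.urandom(16)
for iterations in (1_000, 100_000, 600_000):
    start = time.perf_counter()
    slow_hash("password123", salt, iterations)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{iterations:>7} iterations: {elapsed_ms:.1f} ms per guess")
```

Every guess an attacker makes pays the same cost, so raising the work factor scales the whole attack linearly.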
Goal: Make each hash attempt take ~100 ms to 1 s
At 100 ms per guess, 1 billion attempts cost an attacker 3+ years of sequential compute

6. bcrypt
How bcrypt Works
bcrypt design:
1. Based on Blowfish cipher
2. Expensive key setup phase
3. Cost factor controls iterations (2^cost)
4. Built-in salt (22 chars)
5. Output: 60 characters
Format: $2b$cost$salt(22)hash(31)
Example: $2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/BoIYq6h.Cg0f3Fy/q
         └┬┘└┬┘ └─────────┬──────────┘└──────────────┬──────────────┘
          │  │           salt                       hash
          │  └── cost factor (12 = 2^12 = 4096 iterations)
          └── algorithm version (2b = modern bcrypt)

bcrypt in Python
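Before reaching for the library, note that the stored string is self-describing. As a rough sketch (the `parse_bcrypt` helper is hypothetical and only handles the layout shown above), it can be taken apart with plain string operations:

```python
from typing import NamedTuple

class BcryptParts(NamedTuple):
    version: str
    cost: int
    salt: str
    digest: str

def parse_bcrypt(stored: str) -> BcryptParts:
    """Split a 60-character bcrypt string into version, cost, salt and hash."""
    _, version, cost, rest = stored.split("$")
    if len(rest) != 53:
        raise ValueError("expected 22 salt characters followed by 31 hash characters")
    return BcryptParts(version, int(cost), rest[:22], rest[22:])

parts = parse_bcrypt("$2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/BoIYq6h.Cg0f3Fy/q")
print(parts.version, parts.cost, 2 ** parts.cost)
```

Because the cost and the salt travel with the hash, verification needs no extra database columns, and hashes created with different cost factors can coexist in one table.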
import bcrypt

def hash_password(password: str) -> str:
    """Hash a password for storage"""
    # Generate salt and hash (cost factor 12 is good default)
    password_bytes = password.encode('utf-8')
    salt = bcrypt.gensalt(rounds=12)  # 2^12 = 4096 iterations
    hashed = bcrypt.hashpw(password_bytes, salt)
    return hashed.decode('utf-8')

def verify_password(password: str, hashed: str) -> bool:
    """Verify a password against stored hash"""
    password_bytes = password.encode('utf-8')
    hashed_bytes = hashed.encode('utf-8')
    return bcrypt.checkpw(password_bytes, hashed_bytes)

# Usage
stored = hash_password("my_secure_password")
print(f"Stored: {stored}")
# e.g. $2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/BoIYq6h.Cg0f3Fy/q (salt differs on every run)

# Verify
print(verify_password("my_secure_password", stored))  # True
print(verify_password("wrong_password", stored))      # False

bcrypt Limitations
bcrypt issues:
- 72-byte password limit (truncates longer passwords)
- Fixed memory usage (not memory-hard)
- Can be accelerated with specialized hardware
Workaround for long passwords:
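import base64
import hashlib

def hash_long_password(password: str) -> str:
    # Pre-hash so input of any length fits within bcrypt's 72-byte limit
    pre_hash = hashlib.sha256(password.encode()).digest()
    # Base64-encode the digest so no raw NUL bytes reach bcrypt
    shortened = base64.b64encode(pre_hash)[:72]  # 44 bytes, already under the limit
    return hash_password(shortened.decode())

7. Argon2 (Recommended)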
Why Argon2?
Argon2 won the Password Hashing Competition (2015):
Three variants:
- Argon2d: Maximum GPU resistance, vulnerable to side-channels
- Argon2i: Side-channel resistant, for password hashing
- Argon2id: Hybrid (recommended), best of both
Features:
- Memory-hard (configurable memory usage)
- Time-configurable (iterations)
- Parallelism-configurable (CPU threads)
- No password length limit

Argon2 in Python
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

# Create hasher with recommended parameters
ph = PasswordHasher(
    time_cost=3,        # Number of iterations
    memory_cost=65536,  # 64 MiB of memory (value is in KiB)
    parallelism=4,      # 4 parallel threads
    hash_len=32,        # Output length
    salt_len=16         # Salt length
)

def hash_password(password: str) -> str:
    """Hash a password using Argon2id"""
    return ph.hash(password)

def verify_password(password: str, hashed: str) -> bool:
    """Verify a password against stored hash"""
    try:
        ph.verify(hashed, password)
        return True
    except VerifyMismatchError:
        return False

def needs_rehash(hashed: str) -> bool:
    """Check if hash needs to be updated with new parameters"""
    return ph.check_needs_rehash(hashed)

# Usage
stored = hash_password("my_secure_password")
print(f"Stored: {stored}")
# $argon2id$v=19$m=65536,t=3,p=4$c2FsdHNhbHRzYWx0$hash...
print(verify_password("my_secure_password", stored))  # True

# Upgrade parameters over time
if verify_password("my_secure_password", stored) and needs_rehash(stored):
    new_hash = hash_password("my_secure_password")
    # Store new_hash in database

Choosing Argon2 Parameters
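Common parameter profiles (m is in KiB):
Minimum (from the OWASP Password Storage Cheat Sheet):
- Argon2id
- m=19456 (19 MiB), t=2, p=1
Recommended (the RFC 9106 low-memory profile, also the argon2-cffi default):
- Argon2id
- m=65536 (64 MiB), t=3, p=4
High security (a heavier setting for servers with memory to spare):
- Argon2id
- m=262144 (256 MiB), t=4, p=8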
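The m values are in KiB, and every in-flight hash holds that much RAM, so concurrency matters when picking a profile. A back-of-the-envelope sketch, where the figure of 20 simultaneous logins is an arbitrary assumption for illustration:

```python
PROFILES_KIB = {"minimum": 19456, "recommended": 65536, "high security": 262144}
CONCURRENT_LOGINS = 20  # assumption for illustration only

def peak_ram_mib(memory_cost_kib: int, concurrent_hashes: int) -> float:
    """Worst-case RAM in MiB if this many hashes run at the same moment."""
    return memory_cost_kib / 1024 * concurrent_hashes

for name, m in PROFILES_KIB.items():
    print(f"{name}: {m / 1024:.0f} MiB per hash, "
          f"{peak_ram_mib(m, CONCURRENT_LOGINS):,.0f} MiB at {CONCURRENT_LOGINS} concurrent logins")
```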
Tuning approach:
1. Set memory to maximum your server can spare
2. Increase time_cost until hash takes ~0.5-1 second
3. Set parallelism to number of available cores

8. scrypt
When to Use scrypt
scrypt advantages:
- Memory-hard (like Argon2)
- Well-studied since 2009
- Used in some cryptocurrencies
When to use:
- When Argon2 isn't available
- For password-based key derivation (PBKDF2-like use cases)
- Compatibility with existing systems

scrypt in Python
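scrypt is also available in the standard library as `hashlib.scrypt`, provided Python was built against an OpenSSL version that supports it. The sketch below uses n=2**14, a lighter cost than a production setting, chosen so the demo stays within OpenSSL's default memory cap; larger values of n need the `maxmem` argument raised. The `salt_hex:key_hex` storage format and the function names are assumptions for illustration:

```python
import hashlib
import hmac
import os

N, R, P = 2**14, 8, 1  # demo cost: roughly 16 MiB (128 * N * R bytes)

def scrypt_hash(password: str) -> str:
    """Derive a 32-byte key with a fresh random salt; return salt_hex:key_hex."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode(), salt=salt, n=N, r=R, p=P, dklen=32)
    return salt.hex() + ":" + key.hex()

def scrypt_verify(password: str, stored: str) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    salt_hex, key_hex = stored.split(":")
    key = hashlib.scrypt(password.encode(), salt=bytes.fromhex(salt_hex),
                         n=N, r=R, p=P, dklen=32)
    return hmac.compare_digest(key, bytes.fromhex(key_hex))

stored = scrypt_hash("my_password")
print(scrypt_verify("my_password", stored))
print(scrypt_verify("wrong_password", stored))
```

Unlike bcrypt and Argon2 strings, this format does not record n, r and p, so changing them later means storing the parameters alongside the hash.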
import os

from cryptography.exceptions import InvalidKey
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt

def hash_password_scrypt(password: str) -> tuple[bytes, bytes]:
    """Hash password with scrypt; returns (salt, derived key)"""
    salt = os.urandom(16)
    kdf = Scrypt(
        salt=salt,
        length=32,
        n=2**17,  # CPU/memory cost (must be power of 2)
        r=8,      # Block size
        p=1       # Parallelization
    )
    key = kdf.derive(password.encode())
    return salt, key

def verify_password_scrypt(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Verify password with scrypt"""
    kdf = Scrypt(
        salt=salt,
        length=32,
        n=2**17,
        r=8,
        p=1
    )
    try:
        kdf.verify(password.encode(), stored_key)
        return True
    except InvalidKey:  # Raised when the derived key does not match
        return False

# Usage
salt, key = hash_password_scrypt("my_password")
print(verify_password_scrypt("my_password", salt, key))  # True

9. Comparison
┌───────────┬─────────────┬───────────────────────┬─────────────┬────────────────┐
│ Algorithm │ Memory-hard │ Parallelism-resistant │ Recommended │ Notes          │
├───────────┼─────────────┼───────────────────────┼─────────────┼────────────────┤
│ Argon2id  │ ✓           │ ✓                     │ ✓✓✓         │ Best choice    │
│ scrypt    │ ✓           │ Partial               │ ✓✓          │ Good fallback  │
│ bcrypt    │ ✗           │ Partial               │ ✓           │ Still OK       │
│ PBKDF2    │ ✗           │ ✗                     │ Legacy only │ Use 600k iters │
│ SHA-256   │ ✗           │ ✗                     │ ✗           │ Never use      │
│ MD5       │ ✗           │ ✗                     │ ✗✗✗         │ Never use      │
└───────────┴─────────────┴───────────────────────┴─────────────┴────────────────┘
Memory-hard: Requires significant RAM, harder to parallelize on GPUs
Parallelism-resistant: Difficult to speed up with multiple cores/GPUs

10. Complete Implementation
"""
Production-ready password hashing module
"""
from argon2 import PasswordHasher, Type
from argon2.exceptions import VerifyMismatchError, InvalidHashError
import secrets
import hmac


class PasswordManager:
    """Secure password hashing with Argon2id"""

    def __init__(
        self,
        time_cost: int = 3,
        memory_cost: int = 65536,  # 64 MiB (value is in KiB)
        parallelism: int = 4,
        pepper: bytes | None = None  # Server-side secret
    ):
        self.hasher = PasswordHasher(
            time_cost=time_cost,
            memory_cost=memory_cost,
            parallelism=parallelism,
            hash_len=32,
            salt_len=16,
            type=Type.ID  # Argon2id
        )
        self.pepper = pepper

    def _apply_pepper(self, password: str) -> str:
        """Add pepper to password before hashing"""
        if self.pepper:
            # HMAC prevents length extension attacks
            peppered = hmac.new(
                self.pepper,
                password.encode(),
                'sha256'
            ).hexdigest()
            return peppered
        return password

    def hash(self, password: str) -> str:
        """Hash a password for storage"""
        if not password:
            raise ValueError("Password cannot be empty")
        peppered = self._apply_pepper(password)
        return self.hasher.hash(peppered)

    def verify(self, password: str, hash: str) -> bool:
        """Verify a password against a hash"""
        if not password or not hash:
            return False
        peppered = self._apply_pepper(password)
        try:
            self.hasher.verify(hash, peppered)
            return True
        except (VerifyMismatchError, InvalidHashError):
            return False

    def needs_rehash(self, hash: str) -> bool:
        """Check if hash needs updating with new parameters"""
        try:
            return self.hasher.check_needs_rehash(hash)
        except InvalidHashError:
            return True

    def verify_and_rehash(self, password: str, hash: str) -> tuple[bool, str | None]:
        """Verify password and return new hash if parameters changed"""
        if not self.verify(password, hash):
            return False, None
        if self.needs_rehash(hash):
            return True, self.hash(password)
        return True, None


# Usage example
def example_usage():
    # Initialize with optional pepper (store in env var, not code!)
    pepper = secrets.token_bytes(32)  # In production: from environment
    pm = PasswordManager(pepper=pepper)

    # Registration
    password = "user_password_123"
    hashed = pm.hash(password)
    print(f"Stored hash: {hashed[:50]}...")

    # Login
    is_valid = pm.verify(password, hashed)
    print(f"Password valid: {is_valid}")

    # Check for rehash (after upgrading parameters)
    is_valid, new_hash = pm.verify_and_rehash(password, hashed)
    if new_hash:
        print("Hash upgraded, store new_hash in database")


if __name__ == "__main__":
    example_usage()

11. Common Mistakes
Mistake 1: Comparing Hashes Insecurely
# WRONG: Timing attack vulnerability
# (compute_hash is a hypothetical helper that re-derives the hash using the stored salt;
# a salting hash_password would produce a new value on every call and never match)
def verify_bad(password, salt, stored_hash):
    computed = compute_hash(password, salt)
    return computed == stored_hash  # String comparison leaks timing

# RIGHT: Use constant-time comparison
import hmac

def verify_good(password, salt, stored_hash):
    computed = compute_hash(password, salt)
    return hmac.compare_digest(computed, stored_hash)

# BEST: Use library's built-in verify function
# (bcrypt.checkpw and argon2's PasswordHasher.verify already handle this)

Mistake 2: Hardcoding Parameters
# WRONG: Parameters in code
from argon2 import PasswordHasher

def hash_password(pwd):
    return PasswordHasher(time_cost=2, memory_cost=32768).hash(pwd)

# RIGHT: Configurable, allows upgrades
import os

class PasswordConfig:
    TIME_COST = int(os.environ.get('ARGON2_TIME_COST', 3))
    MEMORY_COST = int(os.environ.get('ARGON2_MEMORY_KB', 65536))
    PARALLELISM = int(os.environ.get('ARGON2_PARALLELISM', 4))

Mistake 3: Not Handling Upgrades
# Always check if rehashing is needed after successful login
# (get_user and save_user stand in for your data-access layer)
def login(username, password):
    user = get_user(username)
    if user is None or not verify_password(password, user.password_hash):
        return False
    # Upgrade hash if using old parameters
    if needs_rehash(user.password_hash):
        user.password_hash = hash_password(password)
        save_user(user)
    return True

12. Summary
Three things to remember:
1. Never encrypt passwords, never use plain hashes. Encryption is reversible, and plain hashes are too fast. Use purpose-built password hashing algorithms.
2. Argon2id is the best choice. It's memory-hard, configurable, and won the Password Hashing Competition. Use scrypt or bcrypt if Argon2 isn't available.
3. Tune parameters for ~0.5-1 second hash time. This makes brute force impractical while keeping login acceptable. Increase parameters over time as hardware improves.
13. What's Next
We can hash passwords securely. But where do we store the pepper? How do we manage encryption keys? What happens when keys need to be rotated?
In the next article: Key Management, covering how to generate, store, rotate, and destroy cryptographic keys safely.
