Symmetric Encryption in Real-World Systems
1. Why Should You Care?
You already know AES-GCM is a good choice and ECB is a disaster. But real engineering problems raise different questions:
- "Where should I store the IV?"
- "How do I generate and store keys?"
- "How many times do I need to encrypt?"
- "Will performance be an issue?"
These questions are rarely covered in textbooks, but in production they determine whether your system is secure or vulnerable.
2. Symmetric Encryption in HTTPS/TLS
Why HTTPS Uses Symmetric Encryption
The TLS handshake uses asymmetric cryptography for key exchange and authentication. Once the handshake completes, all data transfer uses symmetric encryption. Why? Rough order-of-magnitude throughput figures tell the story:
Asymmetric encryption (RSA-2048): ~1 MB/s
Symmetric encryption (AES-256-GCM): ~1 GB/s
That is roughly a 1000x difference.
TLS 1.3 Symmetric Encryption Phase
┌─────────────────────────────────────────────────────────────┐
│                   After TLS 1.3 Handshake                   │
├─────────────────────────────────────────────────────────────┤
│  Client → Server:                                           │
│    Application Data                                         │
│    encrypted with client_application_traffic_secret         │
│    using AES-256-GCM or ChaCha20-Poly1305                   │
├─────────────────────────────────────────────────────────────┤
│  Server → Client:                                           │
│    Application Data                                         │
│    encrypted with server_application_traffic_secret         │
│    using AES-256-GCM or ChaCha20-Poly1305                   │
└─────────────────────────────────────────────────────────────┘
TLS Key Derivation
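TLS doesn't use the exchanged key directly. It uses HKDF (HMAC-based Key Derivation Function) to derive multiple keys. In TLS 1.3 the handshake keys come from the Handshake Secret and the application keys from the Master Secret, which is itself derived from the Handshake Secret:
Handshake Secret
    │
    ├──► client_handshake_traffic_secret
    ├──► server_handshake_traffic_secret
    │
    ▼
Master Secret
    │
    ├──► client_application_traffic_secret
    └──► server_application_traffic_secret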
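To make the derivation concrete, here is a minimal sketch of HKDF (RFC 5869) built from Python's standard `hmac` and `hashlib` modules, deriving one independent secret per label from a single shared input. The label strings and the `derive_traffic_secrets` helper are illustrative assumptions: real TLS 1.3 uses a structured HKDF-Expand-Label encoding and chains through several intermediate secrets, so this shows the idea rather than the wire-exact key schedule.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: condense input keying material into a pseudorandom key."""
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869 default: a string of zero bytes
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: stretch a pseudorandom key into `length` bytes bound to `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_traffic_secrets(shared_secret: bytes) -> dict:
    """Derive one secret per direction and phase (labels are hypothetical)."""
    prk = hkdf_extract(b"", shared_secret)
    labels = [
        "client handshake traffic",
        "server handshake traffic",
        "client application traffic",
        "server application traffic",
    ]
    return {label: hkdf_expand(prk, label.encode(), 32) for label in labels}


# Usage: the same input always yields the same four distinct secrets
for label, secret in derive_traffic_secrets(b"\x0b" * 32).items():
    print(f"{label}: {secret.hex()}")
```

Because each output is bound to its label, learning one derived secret reveals nothing useful about the others.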
Using separate keys for each direction prevents reflection attacks, where an attacker replays a party's own records back to it.
TLS Nonce Management
TLS 1.3 uses an implicit nonce that is never sent on the wire:
nonce = static_IV XOR record_sequence_number
record_sequence_number starts at 0 and increments by one for every record
Each connection and direction has a different static_IV
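The per-record nonce construction can be sketched in a few lines. `record_nonce` is a hypothetical helper name, and the 12-byte IV length is an assumption matching the AEAD ciphers named earlier; the sequence number is left-padded with zeros to the IV length before the XOR, as RFC 8446 describes.

```python
IV_LENGTH = 12  # bytes; nonce size assumed for AES-GCM and ChaCha20-Poly1305


def record_nonce(static_iv: bytes, sequence_number: int) -> bytes:
    """XOR the zero-padded 64-bit record sequence number into the static IV."""
    if len(static_iv) != IV_LENGTH:
        raise ValueError("static IV must be 12 bytes")
    if not 0 <= sequence_number < 2**64:
        raise ValueError("sequence number must fit in 64 bits")
    padded = sequence_number.to_bytes(IV_LENGTH, "big")
    return bytes(a ^ b for a, b in zip(static_iv, padded))


# Usage: consecutive records get distinct nonces without sending them
static_iv = bytes(range(12))  # placeholder value for illustration only
for seq in range(3):
    print(seq, record_nonce(static_iv, seq).hex())
```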
Together, these guarantee that a nonce never repeats under the same key
3. File Encryption Best Practices
Basic Architecture
Original File
      │
      ▼
┌───────────────────────────────────────────────┐
│ 1. Generate random DEK (Data Encryption Key)  │
├───────────────────────────────────────────────┤
│ 2. Encrypt file with DEK (AES-GCM)            │
├───────────────────────────────────────────────┤
│ 3. Encrypt DEK with KEK (Key Encryption Key)  │
├───────────────────────────────────────────────┤
│ 4. Store: encrypted DEK + IV + encrypted file │
└───────────────────────────────────────────────┘
Why Two-Layer Keys
Single key only:
- Changing key requires re-encrypting all files
- Key leak = all data leaked
Two-layer keys (DEK + KEK):
- Each file has its own DEK, so one leaked DEK exposes only one file
- Rotating the KEK only requires re-encrypting each DEK (a few dozen bytes)
- Keys can be rotated without touching the encrypted data
Implementation Example
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt
from typing import Optional
import os
import base64


class FileEncryptor:
    def __init__(self, password: str, salt: Optional[bytes] = None):
        """Derive KEK from password.

        To decrypt, pass the salt stored with the file so the same KEK
        is derived; omit it to start with a fresh random salt.
        """
        self.salt = salt if salt is not None else os.urandom(16)
        self.kek = self._derive_kek(password, self.salt)

    def _derive_kek(self, password: str, salt: bytes) -> bytes:
        """Use scrypt to derive key from password"""
        kdf = Scrypt(
            salt=salt,
            length=32,
            n=2**20,  # CPU/memory cost (about 1 GiB of RAM with r=8)
            r=8,
            p=1,
        )
        return kdf.derive(password.encode())

    def encrypt_file(self, plaintext: bytes) -> dict:
        """Encrypt a file"""
        # 1. Generate random DEK
        dek = AESGCM.generate_key(bit_length=256)
        # 2. Encrypt data with DEK
        data_nonce = os.urandom(12)
        data_cipher = AESGCM(dek)
        encrypted_data = data_cipher.encrypt(data_nonce, plaintext, None)
        # 3. Encrypt DEK with KEK
        key_nonce = os.urandom(12)
        key_cipher = AESGCM(self.kek)
        encrypted_dek = key_cipher.encrypt(key_nonce, dek, None)
        # 4. Package result
        return {
            'version': 1,
            'salt': base64.b64encode(self.salt).decode(),
            'key_nonce': base64.b64encode(key_nonce).decode(),
            'encrypted_dek': base64.b64encode(encrypted_dek).decode(),
            'data_nonce': base64.b64encode(data_nonce).decode(),
            'encrypted_data': base64.b64encode(encrypted_data).decode(),
        }

    def decrypt_file(self, encrypted: dict) -> bytes:
        """Decrypt a file"""
        # Decode
        salt = base64.b64decode(encrypted['salt'])
        key_nonce = base64.b64decode(encrypted['key_nonce'])
        encrypted_dek = base64.b64decode(encrypted['encrypted_dek'])
        data_nonce = base64.b64decode(encrypted['data_nonce'])
        encrypted_data = base64.b64decode(encrypted['encrypted_data'])
        # The KEK only matches files encrypted under the same salt
        if salt != self.salt:
            raise ValueError("salt mismatch: create FileEncryptor with the stored salt")
        # 1. Decrypt DEK
        key_cipher = AESGCM(self.kek)
        dek = key_cipher.decrypt(key_nonce, encrypted_dek, None)
        # 2. Decrypt data
        data_cipher = AESGCM(dek)
        plaintext = data_cipher.decrypt(data_nonce, encrypted_data, None)
        return plaintext

Large File Handling
For GB-scale files, you can't read the entire file into memory:
def encrypt_large_file(input_path: str, output_path: str, key: bytes):
    """Stream encrypt large files"""
    CHUNK_SIZE = 64 * 1024  # 64KB chunks
    # Note: One-shot AES-GCM isn't suitable for streaming because the tag
    # can only be produced and verified after all the data is processed
    # Workable designs: encrypt each chunk as its own AEAD message with a
    # unique nonce, or use AES-CTR + HMAC (encrypt-then-MAC)
    # Better option: Use purpose-built file encryption formats
    # like age (https://age-encryption.org/)
    raise NotImplementedError("use a purpose-built tool such as age")

Recommendation: For large files, use mature tools like age or gpg rather than implementing this yourself.
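To show what such formats do internally, here is a minimal sketch of chunked authenticated encryption, loosely modeled on the STREAM-style construction age uses: each chunk is its own AEAD message, and the nonce is a chunk counter plus a last-chunk flag so that reordering and truncation are detected. The function names, the use of AES-GCM, and the bare chunk framing are illustrative assumptions, not the age format. The key is assumed to be a fresh per-file DEK, which is what makes it safe for counters to start at zero.

```python
import io
from typing import BinaryIO

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

CHUNK_SIZE = 64 * 1024  # plaintext bytes per chunk
TAG_SIZE = 16           # AES-GCM appends a 16-byte tag to every chunk


def _chunk_nonce(counter: int, last: bool) -> bytes:
    """11-byte big-endian chunk counter followed by a 1-byte last-chunk flag."""
    return counter.to_bytes(11, "big") + (b"\x01" if last else b"\x00")


def encrypt_stream(src: BinaryIO, dst: BinaryIO, dek: bytes) -> None:
    """Encrypt src into dst chunk by chunk; dek must be unique to this stream.

    Assumes src.read(n) returns n bytes except at end of input.
    """
    cipher = AESGCM(dek)
    counter = 0
    chunk = src.read(CHUNK_SIZE)
    while True:
        next_chunk = src.read(CHUNK_SIZE)  # look ahead to spot the final chunk
        last = not next_chunk
        dst.write(cipher.encrypt(_chunk_nonce(counter, last), chunk, None))
        if last:
            return
        chunk, counter = next_chunk, counter + 1


def decrypt_stream(src: BinaryIO, dst: BinaryIO, dek: bytes) -> None:
    """Decrypt chunk by chunk; raises InvalidTag on tampering or truncation."""
    cipher = AESGCM(dek)
    counter = 0
    chunk = src.read(CHUNK_SIZE + TAG_SIZE)
    while True:
        next_chunk = src.read(CHUNK_SIZE + TAG_SIZE)
        last = not next_chunk
        dst.write(cipher.decrypt(_chunk_nonce(counter, last), chunk, None))
        if last:
            return
        chunk, counter = next_chunk, counter + 1


# Usage with in-memory streams; real code would pass files opened in binary mode
dek = AESGCM.generate_key(bit_length=256)
ciphertext, recovered = io.BytesIO(), io.BytesIO()
encrypt_stream(io.BytesIO(b"example data" * 20000), ciphertext, dek)
ciphertext.seek(0)
decrypt_stream(ciphertext, recovered, dek)
```

Memory use stays bounded by the chunk size. Note that decrypted chunks are written before later chunks have been verified, so a caller should discard the output if an error is raised.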
4. Database Encryption
Encryption Layers
┌─────────────────────────────────────────────────────────────┐
│ 1. Transport Encryption (TLS)                               │
│   - Encrypts communication between client and database      │
│   - Prevents network eavesdropping                          │
├─────────────────────────────────────────────────────────────┤
│ 2. Transparent Data Encryption (TDE)                        │
│   - Database file-level encryption                          │
│   - Protects against disk theft                             │
│   - Transparent to applications                             │
├─────────────────────────────────────────────────────────────┤
│ 3. Field-Level Encryption                                   │
│   - Application-level encryption                            │
│   - Only encrypts sensitive fields                          │
│   - Even database admins can't see plaintext                │
└─────────────────────────────────────────────────────────────┘
Field-Level Encryption Example
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
import os
import base64
class EncryptedField:
    def __init__(self, key: bytes):
        self.cipher = AESGCM(key)

    def encrypt(self, value: str) -> str:
        """Encrypt field value"""
        nonce = os.urandom(12)
        ciphertext = self.cipher.encrypt(nonce, value.encode(), None)
        # Format: nonce + ciphertext, base64 encoded
        return base64.b64encode(nonce + ciphertext).decode()

    def decrypt(self, encrypted_value: str) -> str:
        """Decrypt field value"""
        data = base64.b64decode(encrypted_value)
        nonce = data[:12]
        ciphertext = data[12:]
        plaintext = self.cipher.decrypt(nonce, ciphertext, None)
        return plaintext.decode()

# Usage
field_key = os.urandom(32)
encrypted_field = EncryptedField(field_key)
# Store to database
ssn = "123-45-6789"
encrypted_ssn = encrypted_field.encrypt(ssn)
# INSERT INTO users (encrypted_ssn) VALUES ('...')
# Read from database
decrypted_ssn = encrypted_field.decrypt(encrypted_ssn)

Querying Encrypted Fields Problem
-- This doesn't work!
SELECT * FROM users WHERE encrypted_ssn = ?
-- Because the same plaintext produces a different ciphertext every time (fresh random nonce)
Solutions:
1. Blind Index
   - Compute HMAC of plaintext
   - Store HMAC as searchable index
   - Compute HMAC during query for matching
2. Deterministic Encryption
   - Use an SIV mode (AES-SIV, AES-GCM-SIV); never a fixed nonce with plain AES-GCM, which breaks its security
   - Same plaintext produces same ciphertext
   - Can do exact matching
   - Leaks equality information
3. Homomorphic Encryption
   - Can compute on ciphertext
   - Very high performance overhead
   - Still largely at the research stage for general-purpose use
Blind Index Implementation
import base64
import hashlib
import hmac
import os

def create_blind_index(value: str, key: bytes) -> str:
    """Create searchable blind index"""
    h = hmac.new(key, value.encode(), hashlib.sha256)
    # Only take the first 16 bytes: reduces storage, adds some fuzziness
    return base64.b64encode(h.digest()[:16]).decode()

# Usage
index_key = os.urandom(32)  # Different from encryption key!
ssn = "123-45-6789"
ssn_index = create_blind_index(ssn, index_key)
# Store
# INSERT INTO users (encrypted_ssn, ssn_index) VALUES (?, ?)
# Query
search_index = create_blind_index("123-45-6789", index_key)
# SELECT * FROM users WHERE ssn_index = ?

5. Key Management
Key Lifecycle
Generate → Distribute → Use → Rotate → Revoke → Destroy
Security considerations at each stage:
- Generate: Must use a CSPRNG
- Distribute: Must transport securely
- Use: Must restrict access
- Rotate: Must support multiple versions
- Revoke: Must take effect quickly
- Destroy: Must be unrecoverable
Key Storage Options
┌─────────────────────────────────────────────────────────────┐
│ Development Environment                                     │
│   - Environment variables                                   │
│   - Config files (don't commit to Git!)                     │
├─────────────────────────────────────────────────────────────┤
│ Production Environment                                      │
│   - Cloud key management services (AWS KMS, GCP KMS, Azure) │
│   - HashiCorp Vault                                         │
│   - Hardware Security Modules (HSM)                         │
└─────────────────────────────────────────────────────────────┘
Cloud KMS Example
import boto3
class KMSKeyManager:
    def __init__(self, key_id: str):
        self.kms = boto3.client('kms')
        self.key_id = key_id

    def generate_data_key(self) -> tuple:
        """Generate data key"""
        response = self.kms.generate_data_key(
            KeyId=self.key_id,
            KeySpec='AES_256'
        )
        return (
            response['Plaintext'],       # Use for encryption
            response['CiphertextBlob']   # Store this
        )

    def decrypt_data_key(self, encrypted_key: bytes) -> bytes:
        """Decrypt data key"""
        response = self.kms.decrypt(
            KeyId=self.key_id,
            CiphertextBlob=encrypted_key
        )
        return response['Plaintext']

# Usage
km = KMSKeyManager('alias/my-key')
# During encryption
plaintext_key, encrypted_key = km.generate_data_key()
# Use plaintext_key to encrypt data
# Store encrypted_key and encrypted data
# During decryption
plaintext_key = km.decrypt_data_key(encrypted_key)
# Use plaintext_key to decrypt data

6. Performance Considerations
Hardware Acceleration
# Check if CPU supports AES-NI (Linux only: reads /proc/cpuinfo)
import subprocess
result = subprocess.run(['grep', '-m1', '-w', 'aes', '/proc/cpuinfo'], capture_output=True)
has_aesni = result.returncode == 0  # grep exits with 0 only when a match is found
# Almost all modern CPUs support it
# AES-NI can make AES encryption 10x+ faster than a software implementation

Encryption Performance Impact
| Operation | No Encryption (baseline) | AES-256-GCM (relative cost) |
|---|---|---|
| File I/O | 1.0x | ~1.1x |
| Network Transfer | 1.0x | ~1.05x |
| Database Query | 1.0x | 1.0x (TDE) |
| Field Encrypt/Decrypt | 1.0x | ~1.5-2x |

Conclusion: For most applications, encryption overhead is negligible.
When Performance Becomes an Issue
1. Many Small Files
   - Each encryption requires initialization
   - Consider batch processing
2. Real-time Data Streams
   - Latency sensitive
   - Consider ChaCha20-Poly1305 (faster without AES-NI)
3. Database Field Encryption + High Query Volume
   - Each access requires encrypt/decrypt
   - Consider caching decrypted values in memory for a short time
7. Common Mistakes Summary
| Mistake | Consequence | Correct Approach |
|---|---|---|
| Hardcoding keys in code | Keys leak with code | Use env vars or key management services |
| Using password directly as key | Key space too small | Use KDF (PBKDF2, scrypt, Argon2) |
| Not storing IV/nonce | Cannot decrypt | IV can be stored with ciphertext |
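| Treating IV as secret | Unnecessary, adds complexity | IV doesn't need secrecy, just uniqueness (plus unpredictability for CBC) |
| Encrypting too much data with one key | With random 96-bit nonces, collision risk grows with every message; NIST caps GCM at 2^32 encryptions per key | Rotate keys periodically |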
| Implementing encryption logic yourself | Almost certainly has vulnerabilities | Use mature libraries |
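To put numbers on the GCM data-limit row: with random 96-bit nonces, the danger is two messages drawing the same nonce under one key, and that follows the birthday bound. The sketch below estimates the probability; `nonce_collision_probability` is a hypothetical helper, and the 2^32 figure it is checked against is the per-key invocation limit that NIST SP 800-38D sets for randomly generated IVs.

```python
from fractions import Fraction

NONCE_BITS = 96  # size of a standard AES-GCM nonce


def nonce_collision_probability(messages: int, nonce_bits: int = NONCE_BITS) -> Fraction:
    """Birthday upper bound n(n-1)/2 / 2^bits for random nonces under one key."""
    pairs = messages * (messages - 1) // 2
    return min(Fraction(pairs, 2**nonce_bits), Fraction(1))


# How the risk grows with the number of messages encrypted under one key
for exponent in (20, 32, 40, 48):
    p = nonce_collision_probability(2**exponent)
    print(f"2^{exponent} messages: collision probability <= {float(p):.3e}")
```

At 2^32 messages the bound is about 2^-33, which is why rotating the key before that count is the usual advice. A counter-based nonce avoids collisions entirely but requires state that survives restarts.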
8. Summary
Three things to remember:
HTTPS demonstrates symmetric encryption best practices. Key derivation, nonce management, and authenticated encryption: the TLS design is worth studying.
File encryption uses two-layer keys (DEK + KEK). This makes key rotation simple and provides better security isolation.
Database encryption has multiple layers. Transport encryption, transparent data encryption, and field encryption each have their uses. Field encryption must consider query problems.
9. What's Next
We've completed our deep dive into symmetric encryption. But symmetric encryption has a fundamental problem: both parties need to share a key in advance.
In the next section, we'll enter the world of asymmetric encryption: RSA's core idea, why "factoring large numbers" is so hard, and where public and private keys come from.
