HMAC and Data Integrity: Detecting Tampering with Shared Secrets
1. Why Should You Care?
Youโre building an API. Clients send requests like:
{"action": "transfer", "amount": 1000, "to": "attacker"}How do you know this request wasnโt modified in transit? How do you know it came from an authorized client?
If you already share a secret key with the client (like an API key), you can use a Message Authentication Code (MAC) to verify both authenticity and integrityโfaster than digital signatures.
2. Definition
A Message Authentication Code (MAC) is a short piece of information used to authenticate a message and verify its integrity.
HMAC (Hash-based MAC) constructs a MAC using a cryptographic hash function and a secret key.
MAC Properties:
- Anyone with the key can generate
- Anyone with the key can verify
- Without the key, cannot forge
Digital Signature vs MAC:
- Signature: Only signer can create, anyone can verify
- MAC: Anyone with key can create AND verify
- Signature: Non-repudiation
- MAC: No non-repudiation (both parties can create)3. Why Not Just Hash?
The Naive Approach (Broken)
# WRONG: Simple hash is not authentication
import hashlib
def naive_integrity(message):
return hashlib.sha256(message).hexdigest()
# Attacker can compute hash of any message!
# This provides NO authenticationThe Problem
Hash alone proves:
โ Nothing about who created the message
โ Nothing about authorization
Because:
- Hash functions are public
- Anyone can compute SHA256(any_message)
- Attacker can forge: message' + SHA256(message')Why HMAC Works
HMAC includes a secret key:
HMAC(key, message) = hash(key || hash(key || message))
Only those who know the key can:
- Compute valid MACs
- Verify MACs
Without the key, attacker cannot:
- Compute MAC for modified message
- Find a different message with same MAC4. HMAC Construction
The HMAC Formula
HMAC(K, m) = H((K' โ opad) || H((K' โ ipad) || m))
Where:
K = secret key
K' = key padded to block size
H = hash function (SHA-256, etc.)
โ = XOR
opad = outer padding (0x5c repeated)
ipad = inner padding (0x36 repeated)
m = messageWhy This Structure?
Simple approaches have flaws:
H(key || message): Length extension attacks
H(message || key): Collision issues
H(key || message || key): Still vulnerable
HMAC's nested structure:
- Prevents length extension
- Provides security proof
- Standard since RFC 2104 (1997)5. Using HMAC in Python
Basic HMAC
import hmac
import hashlib
key = b"super-secret-api-key"
message = b"action=transfer&amount=1000&to=alice"
# Compute HMAC
mac = hmac.new(key, message, hashlib.sha256).hexdigest()
print(f"HMAC: {mac}")
# Verify HMAC (timing-safe comparison)
def verify_hmac(key, message, received_mac):
expected_mac = hmac.new(key, message, hashlib.sha256).hexdigest()
return hmac.compare_digest(expected_mac, received_mac)
# Usage
is_valid = verify_hmac(key, message, mac)
print(f"Valid: {is_valid}")Common Mistake: Timing Attacks
# WRONG: Vulnerable to timing attack
def insecure_verify(expected, received):
return expected == received # Leaks length info!
# RIGHT: Constant-time comparison
import hmac
def secure_verify(expected, received):
return hmac.compare_digest(expected, received)
# Why timing matters:
# == operator returns early on first mismatch
# Attacker can measure time to guess MAC byte by byte
# compare_digest always takes same time6. Real-World Applications
API Request Signing
import hmac
import hashlib
import time
import base64
class APIClient:
def __init__(self, api_key: str, api_secret: str):
self.api_key = api_key
self.api_secret = api_secret.encode()
def sign_request(self, method: str, path: str, body: str = "") -> dict:
timestamp = str(int(time.time()))
# Create string to sign
string_to_sign = f"{method}\n{path}\n{timestamp}\n{body}"
# Compute HMAC
signature = hmac.new(
self.api_secret,
string_to_sign.encode(),
hashlib.sha256
).hexdigest()
return {
"X-API-Key": self.api_key,
"X-Timestamp": timestamp,
"X-Signature": signature
}
class APIServer:
def __init__(self, secrets: dict):
self.secrets = secrets # api_key -> api_secret
def verify_request(self, method: str, path: str, body: str,
api_key: str, timestamp: str, signature: str) -> bool:
# Check timestamp (prevent replay attacks)
if abs(time.time() - int(timestamp)) > 300: # 5 minute window
return False
# Get secret for this key
api_secret = self.secrets.get(api_key)
if not api_secret:
return False
# Recompute signature
string_to_sign = f"{method}\n{path}\n{timestamp}\n{body}"
expected = hmac.new(
api_secret.encode(),
string_to_sign.encode(),
hashlib.sha256
).hexdigest()
return hmac.compare_digest(expected, signature)Cookie Authentication
import hmac
import hashlib
import base64
import json
import time
class SecureCookie:
def __init__(self, secret_key: bytes):
self.secret_key = secret_key
def create(self, data: dict, max_age: int = 3600) -> str:
"""Create a signed cookie value"""
payload = {
"data": data,
"exp": int(time.time()) + max_age
}
payload_json = json.dumps(payload, sort_keys=True)
payload_b64 = base64.b64encode(payload_json.encode()).decode()
# Sign the payload
signature = hmac.new(
self.secret_key,
payload_b64.encode(),
hashlib.sha256
).hexdigest()
return f"{payload_b64}.{signature}"
def verify(self, cookie_value: str) -> dict | None:
"""Verify and decode a signed cookie"""
try:
payload_b64, signature = cookie_value.rsplit(".", 1)
# Verify signature
expected = hmac.new(
self.secret_key,
payload_b64.encode(),
hashlib.sha256
).hexdigest()
if not hmac.compare_digest(expected, signature):
return None
# Decode and check expiration
payload = json.loads(base64.b64decode(payload_b64))
if time.time() > payload["exp"]:
return None
return payload["data"]
except Exception:
return None
# Usage
cookie = SecureCookie(b"my-super-secret-key")
value = cookie.create({"user_id": 123, "role": "admin"})
print(f"Cookie: {value}")
data = cookie.verify(value)
print(f"Data: {data}")Webhook Verification
import hmac
import hashlib
def verify_github_webhook(payload: bytes, signature: str, secret: str) -> bool:
"""Verify GitHub webhook signature"""
# GitHub sends: sha256=<hex_digest>
if not signature.startswith("sha256="):
return False
expected = "sha256=" + hmac.new(
secret.encode(),
payload,
hashlib.sha256
).hexdigest()
return hmac.compare_digest(expected, signature)
def verify_stripe_webhook(payload: bytes, signature: str, secret: str) -> bool:
"""Verify Stripe webhook signature"""
# Stripe format: t=timestamp,v1=signature
parts = dict(p.split("=") for p in signature.split(","))
# Stripe signs: timestamp.payload
signed_payload = f"{parts['t']}.{payload.decode()}"
expected = hmac.new(
secret.encode(),
signed_payload.encode(),
hashlib.sha256
).hexdigest()
return hmac.compare_digest(expected, parts["v1"])7. HMAC vs Other MACs
Comparison
HMAC:
- Based on hash function (SHA-256, etc.)
- Well-studied, conservative choice
- Slightly slower than some alternatives
Poly1305:
- Designed for speed
- Used with ChaCha20 (ChaCha20-Poly1305)
- One-time key per message
CMAC/OMAC:
- Based on block cipher (AES)
- Used in some standards
- Similar security to HMAC
GMAC:
- MAC part of GCM mode
- Very fast with AES-NI
- Requires unique nonceWhen to Use What
Use HMAC-SHA256 when:
- Need standalone MAC
- Maximum compatibility
- Conservative security choice
Use Poly1305 when:
- Using ChaCha20 for encryption
- Need maximum speed
- Part of authenticated encryption
Use GCM/GMAC when:
- Using AES for encryption
- Need authenticated encryption
- Have hardware AES support8. Security Considerations
Key Management
HMAC key requirements:
- Must be secret (obviously)
- Should be random, not derived from passwords
- Minimum 128 bits, prefer 256 bits
- Different keys for different purposes
Key derivation if needed:
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes
def derive_hmac_key(master_key: bytes, purpose: str) -> bytes:
return HKDF(
algorithm=hashes.SHA256(),
length=32,
salt=None,
info=purpose.encode()
).derive(master_key)What HMAC Doesnโt Provide
HMAC does NOT provide:
โ Confidentiality (message is plaintext)
โ Non-repudiation (both parties can create MAC)
โ Replay protection (need timestamp/nonce)
For confidentiality + integrity:
โ Use authenticated encryption (AES-GCM, ChaCha20-Poly1305)
For non-repudiation:
โ Use digital signatures
For replay protection:
โ Include timestamp or sequence number in messageCommon Mistakes
# WRONG: Using password as key
hmac.new(b"password123", message, hashlib.sha256)
# RIGHT: Derive key from password
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt
kdf = Scrypt(salt=salt, length=32, n=2**20, r=8, p=1)
key = kdf.derive(b"password123")
hmac.new(key, message, hashlib.sha256)
# WRONG: Same key for encryption and MAC
aes_key = os.urandom(32)
encrypted = aes_encrypt(aes_key, plaintext)
mac = hmac.new(aes_key, encrypted, hashlib.sha256)
# RIGHT: Separate keys
aes_key = os.urandom(32)
mac_key = os.urandom(32)
encrypted = aes_encrypt(aes_key, plaintext)
mac = hmac.new(mac_key, encrypted, hashlib.sha256)
# BEST: Use authenticated encryption
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)
ciphertext = aesgcm.encrypt(nonce, plaintext, associated_data)9. HMAC in Protocols
TLS
TLS uses HMAC (or AEAD) for record authentication:
TLS Record:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Content Type โ Version โ Length โ Payload โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Encrypted + MAC โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
TLS 1.2: MAC-then-encrypt or AEAD
TLS 1.3: AEAD only (GCM or ChaCha20-Poly1305)JWT
JWT structure:
header.payload.signature
For HS256 (HMAC-SHA256):
signature = HMAC-SHA256(
secret,
base64url(header) + "." + base64url(payload)
)
Verification:
1. Split token into parts
2. Recompute HMAC
3. Compare signatures (constant-time!)AWS Signature V4
AWS request signing uses HMAC chains:
DateKey = HMAC-SHA256("AWS4" + SecretKey, Date)
RegionKey = HMAC-SHA256(DateKey, Region)
ServiceKey = HMAC-SHA256(RegionKey, Service)
SigningKey = HMAC-SHA256(ServiceKey, "aws4_request")
Signature = HMAC-SHA256(SigningKey, StringToSign)10. Complete Example: Signed Messages
import hmac
import hashlib
import json
import time
import os
import base64
class SignedMessageProtocol:
"""Complete protocol for authenticated messages"""
def __init__(self, shared_secret: bytes):
self.key = shared_secret
def create_message(self, payload: dict) -> str:
"""Create authenticated message"""
# Add metadata
message = {
"payload": payload,
"timestamp": int(time.time()),
"nonce": base64.b64encode(os.urandom(16)).decode()
}
# Serialize
message_json = json.dumps(message, sort_keys=True)
message_b64 = base64.b64encode(message_json.encode()).decode()
# Create MAC
mac = hmac.new(
self.key,
message_b64.encode(),
hashlib.sha256
).hexdigest()
return f"{message_b64}.{mac}"
def verify_message(self, signed_message: str,
max_age: int = 300) -> dict | None:
"""Verify and extract message"""
try:
# Split
message_b64, received_mac = signed_message.rsplit(".", 1)
# Verify MAC
expected_mac = hmac.new(
self.key,
message_b64.encode(),
hashlib.sha256
).hexdigest()
if not hmac.compare_digest(expected_mac, received_mac):
return None
# Decode
message_json = base64.b64decode(message_b64)
message = json.loads(message_json)
# Check timestamp
age = time.time() - message["timestamp"]
if age < 0 or age > max_age:
return None
return message["payload"]
except Exception:
return None
# Usage
secret = os.urandom(32)
protocol = SignedMessageProtocol(secret)
# Sender
msg = protocol.create_message({
"action": "transfer",
"amount": 100,
"to": "[email protected]"
})
print(f"Signed message: {msg[:50]}...")
# Receiver
payload = protocol.verify_message(msg)
if payload:
print(f"Verified payload: {payload}")
else:
print("Verification failed!")11. Summary
Three things to remember:
HMAC provides authentication AND integrity. Unlike plain hashing, HMAC requires knowledge of the secret key. Without the key, attackers cannot forge valid MACs.
Always use constant-time comparison. Use
hmac.compare_digest()to prevent timing attacks. Regular string comparison leaks information through timing.HMAC doesnโt replace encryption. It verifies integrity but doesnโt hide content. For confidentiality + integrity, use authenticated encryption (AES-GCM).
12. Whatโs Next
Weโve now covered the fundamental cryptographic primitives: symmetric encryption, asymmetric encryption, digital signatures, certificates, and MACs.
In the next section, weโll see how these pieces come together in real protocols: TLS Deep Diveโhow HTTPS actually works, from handshake to secure data transfer.
