ARC Cache (Adaptive Replacement)
ARC Cache: The Grand Strategist
The Story: The Two Advisors
Imagine a King (the Cache) who has limited space in his castle. He has two advisors:
- The Pragmatist (LRU): âKeep the people who visited recently! They are active!â
- The Historian (LFU): âKeep the people who visit often! They are loyal!â
For years, Kings had to choose one advisor and ignore the other. But in 2003, researchers at IBM proposed a new system: Listen to both, and punish the one who is wrong.
The King maintains two âGhost Listsâ outside the castle walls. These lists track the names of people who were recently evicted.
- If a person on the LRU Ghost List comes back, the King realizes: âI shouldnât have listened to the Historian! I need more space for recent items.â -> He expands the LRU section.
- If a person on the LFU Ghost List comes back, the King realizes: âI shouldnât have listened to the Pragmatist! I need more space for frequent items.â -> He expands the LFU section.
This is ARC (Adaptive Replacement Cache). It self-tunes in real-time.
Why do we need it?
- Database Servers: Workloads change. During the day, users might scan recent records (LRU). At night, a backup job might hit the same old blocks repeatedly (LFU). ARC adapts automatically without a restart.
- ZFS Filesystem: ARC is famous for powering ZFS, one of the most advanced file systems in the world. It provides high hit rates for complex, mixed workloads.
How the Algorithm âThinksâ
ARC divides the cache into two dynamic regions:
- T1 (Recency): Items seen only once recently.
- T2 (Frequency): Items seen at least twice.
And two ghost regions (history of evictions):
- B1 (Ghost Recency): Items evicted from T1.
- B2 (Ghost Frequency): Items evicted from T2.
The Magic Logic:
- Let
pbe the target size of T1. - If we get a hit in B1 (we shouldnât have evicted it from Recency), we increase
p(give more space to Recency). - If we get a hit in B2 (we shouldnât have evicted it from Frequency), we decrease
p(give more space to Frequency).
Itâs a feedback loop. The cache literally learns from its own eviction mistakes.
Engineering Context: The Patent Wall
For a long time, ARC was patented by IBM, which prevented many open-source projects (like Linux Kernel or PostgreSQL) from using it.
- The Alternative: This led to the invention of LIRS and Clock-Pro, which try to approximate ARCâs genius without infringing on the patent.
- Status: The patent expired around 2020, so ARC is now free for everyone to use!
Implementation (Conceptual Python)
Implementing a full ARC is complex (4 linked lists!). Here is a conceptual sketch of the adaptation logic.
class ARCCache:
def __init__(self, capacity):
self.c = capacity
self.p = 0 # Target size of T1 (Recency)
self.t1 = [] # Recency List (Real Data)
self.t2 = [] # Frequency List (Real Data)
self.b1 = [] # Ghost Recency (Metadata only)
self.b2 = [] # Ghost Frequency (Metadata only)
def access(self, key):
# Case 1: Cache Hit in T1 or T2
if key in self.t1:
self.t1.remove(key)
self.t2.append(key) # Promote to Frequency
return "Hit T1"
if key in self.t2:
self.t2.remove(key)
self.t2.append(key) # Update position
return "Hit T2"
# Case 2: Ghost Hit (The Learning Moment)
if key in self.b1:
# We evicted it from T1 too soon! Increase T1 size.
delta = 1
if len(self.b1) < len(self.b2):
delta = len(self.b2) / len(self.b1)
self.p = min(self.c, self.p + delta)
self.replace(key)
# Move from Ghost to Real Frequency
self.b1.remove(key)
self.t2.append(key)
return "Ghost Hit B1 - Adapted towards Recency"
if key in self.b2:
# We evicted it from T2 too soon! Decrease T1 size (favors T2).
delta = 1
if len(self.b2) < len(self.b1):
delta = len(self.b1) / len(self.b2)
self.p = max(0, self.p - delta)
self.replace(key)
# Move from Ghost to Real Frequency
self.b2.remove(key)
self.t2.append(key)
return "Ghost Hit B2 - Adapted towards Frequency"
# Case 3: Total Miss
# Add to T1 (New item)
if len(self.t1) + len(self.b1) == self.c:
if len(self.t1) < self.c:
self.b1.pop(0)
self.replace(key)
else:
self.t1.pop(0)
self.t1.append(key)
return "Miss"
def replace(self, key):
# Decide who to kill based on parameter 'p'
if len(self.t1) > 0 and (len(self.t1) > self.p or (key in self.b2 and len(self.t1) == self.p)):
old = self.t1.pop(0)
self.b1.append(old) # Move to Ghost Recency
else:
old = self.t2.pop(0)
self.b2.append(old) # Move to Ghost FrequencySummary
ARC teaches us that rigid rules fail in dynamic worlds. By admitting that it might be wrong and keeping track of its âregretsâ (Ghost Lists), ARC achieves a balance that neither pure LRU nor pure LFU can match. It reminds us that the mark of true intelligence is not just knowing, but learning.
