Segment Tree
Segment Tree: The Calculator of Ranges
The Story: The Dynamic Warehouse
Imagine a warehouse with 100 shelves. Each shelf has a certain number of boxes. The manager asks two types of questions:
- Update: “Shelf 5 just got 3 more boxes.”
- Query: “What is the total number of boxes from Shelf 10 to Shelf 50?”
- If you use a simple Array, Update is fast (), but Query is slow ( - you have to walk 40 shelves).
- If you use a Pre-computed Sum Array, Query is fast (), but Update is slow ( - you have to re-calculate all sums after Shelf 5).
The Segment Tree is the middle ground. It makes both operations fast (). It divides the shelves into groups (1-50, 51-100), and subgroups (1-25, 26-50), and remembers the sum for each group.
Why do we need it?
- Computational Geometry: Finding overlapping rectangles.
- Analytics: “What was the peak stock price between 10:00 AM and 2:00 PM?” (Range Max Query).
- Game Development: “Is there any wall in this range of X coordinates?”
How the Algorithm “Thinks”
The algorithm is a Hierarchical Aggregator.
- The Leaves: The original array elements are the leaves of the tree.
- The Internal Nodes: Each parent stores the Sum (or Min/Max) of its two children.
- The Query: To find the sum of range , we don’t look at individual leaves. We look for the biggest pre-calculated “blocks” that fit inside and sum them up.
- The Update: When a leaf changes, we walk up to the root, updating only the affected parents.
Engineering Context: Lazy Propagation
The advanced version uses Lazy Propagation.
- If you want to “Add 5 to every shelf from 1 to 50,” doing it one by one is slow.
- Instead, you put a “Sticky Note” (Lazy Tag) on the node representing [1-50] saying “+5”. You don’t actually update the children until someone asks for them. This makes range updates too.
Implementation (Python)
class SegmentTree:
def __init__(self, data):
self.n = len(data)
# Tree size is roughly 4*n
self.tree = [0] * (4 * self.n)
self._build(data, 1, 0, self.n - 1)
def _build(self, data, node, start, end):
if start == end:
self.tree[node] = data[start]
else:
mid = (start + end) // 2
self._build(data, 2 * node, start, mid)
self._build(data, 2 * node + 1, mid + 1, end)
# Example: Sum Query
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
def update(self, idx, val):
self._update(1, 0, self.n - 1, idx, val)
def _update(self, node, start, end, idx, val):
if start == end:
self.tree[node] = val
else:
mid = (start + end) // 2
if start <= idx <= mid:
self._update(2 * node, start, mid, idx, val)
else:
self._update(2 * node + 1, mid + 1, end, idx, val)
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
def query(self, l, r):
return self._query(1, 0, self.n - 1, l, r)
def _query(self, node, start, end, l, r):
if r < start or end < l:
return 0 # Out of range
if l <= start and end <= r:
return self.tree[node] # Fully inside range
mid = (start + end) // 2
p1 = self._query(2 * node, start, mid, l, r)
p2 = self._query(2 * node + 1, mid + 1, end, l, r)
return p1 + p2
# Usage
data = [1, 3, 5, 7, 9, 11]
st = SegmentTree(data)
print(f"Sum of index 1 to 3: {st.query(1, 3)}") # 3+5+7 = 15
st.update(1, 10) # Change 3 to 10
print(f"New sum: {st.query(1, 3)}") # 10+5+7 = 22Summary
The Segment Tree teaches us that aggregation saves time. By pre-calculating summaries of groups, we can answer questions about the “whole” without inspecting every “part.” It reminds us that in analytics, the ability to zoom out and see the big picture is just as important as the raw data itself.
