K-D Tree: The Dimensional Cutter
The Story: Slicing the Cake
Imagine you have a block of cheese (3D space) filled with raisins (data points). You want to organize it so you can quickly find raisins.
Instead of cutting it into tiny cubes (like a grid), you decide to make one slice at a time.
- Cut 1 (X-axis): Slice the cheese in half vertically. Left is “West”, Right is “East”.
- Cut 2 (Y-axis): Now, for each half, slice it horizontally. Top is “North”, Bottom is “South”.
- Cut 3 (Z-axis): Now, for each quarter, slice it by depth. Front vs Back.
- Repeat: Go back to X-axis.
This is the K-D Tree (K-Dimensional Tree). It partitions space by cycling through the dimensions, one cut at a time, and it is the classic structure for “Nearest Neighbor Search.”
Why do we need it?
- Astronomy: “Find the 5 stars closest to Earth in 3D space.”
- Machine Learning (KNN): “Find the 10 users most similar to Luke.” (If a user is a vector of [Age, Income, Location], this is a 3D search).
- Color Matching: “Find the closest match to this shade of Red in our palette” (RGB is 3D space).
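As a quick, concrete illustration of the color-matching case, here is a minimal sketch using SciPy's scipy.spatial.KDTree (assuming SciPy is installed; the palette and names are invented for this example):
from scipy.spatial import KDTree

# A tiny palette: each color is a point in 3-D RGB space.
palette = [(255, 0, 0), (200, 30, 30), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
names = ["red", "brick", "green", "blue", "white"]

tree = KDTree(palette)
dist, idx = tree.query((220, 20, 20))  # "find the closest match to this shade of red"
print(names[idx], palette[idx])        # brick (200, 30, 30)
SciPy builds and queries the tree for you; the rest of this article shows what such a structure looks like under the hood.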
How the Algorithm “Thinks”
The algorithm is essentially a Binary Search Tree that alternates its comparison axis at each level.
- The Pivot: At each level depth, we pick a dimension axis = depth % k (e.g., if k=2, the axes cycle 0->X, 1->Y, 0->X…).
- The Median: We sort the points along that axis and pick the median point as the node.
- The Recursion:
- Points with a smaller coordinate on that axis go Left.
- Points with a larger coordinate go Right.
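To make the pivot and median rules concrete, here is the first cut for the six 2-D points used in the implementation below (a hand-worked sketch, not part of the final code):
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
axis = 0                                               # depth 0, k = 2 -> split on X
points_sorted = sorted(points, key=lambda p: p[axis])
median = len(points_sorted) // 2
print(points_sorted[median])                           # (7, 2) becomes the root
# Depth 1 splits on Y: the left half [(2, 3), (4, 7), (5, 4)] yields (5, 4),
# the right half [(8, 1), (9, 6)] yields (9, 6), and so on until every point is a node.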
Nearest Neighbor Search Logic:
- It’s like finding a leaf in a BST.
- But as we walk back up (unwinding the recursion), we check: “Could there be a closer point on the other side of the cut?”
- If the distance to the “cutting plane” is less than the distance to our “current best neighbor,” we must look on the other side. Otherwise, we prune that entire branch.
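To see the pruning rule with numbers (using the target and points from the implementation below): suppose the search has already found (8, 1) as the best candidate for target (9, 2) and is unwinding back to the root (7, 2), which cuts on X.
target, best, root_point = (9, 2), (8, 1), (7, 2)
best_dist_sq = (target[0] - best[0]) ** 2 + (target[1] - best[1]) ** 2  # 2
plane_dist_sq = (target[0] - root_point[0]) ** 2                        # 4: distance to the X = 7 plane, squared
print(plane_dist_sq < best_dist_sq)  # False -> nothing beyond that plane can beat (8, 1), so prune
Squared distances are used throughout so we never need a square root.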
Engineering Context: The Curse of High Dimensions
K-D Trees are amazing for low dimensions (2D, 3D). But they break down when the number of dimensions gets high (in practice, roughly beyond 20 or so).
- The Problem: In 100-dimensional space, points end up roughly equidistant (“everything is far”), so the pruning test almost never eliminates a branch and the search degenerates into checking nearly everything.
- The Solution: For high dimensions (e.g., Face Recognition vectors with 128 dims), engineers use LSH (Locality Sensitive Hashing) or HNSW (Hierarchical Navigable Small World graphs) instead of exact trees.
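You can see the underlying effect with a small experiment (a rough sketch, assuming Python 3.8+ for math.dist): for random points, there is a clearly “nearest” neighbor in 2-D, but in 100-D the nearest and farthest points are almost the same distance away, so the pruning test rarely helps.
import math
import random

def nearest_to_farthest_ratio(dim, n=1000):
    # Ratio of nearest to farthest distance from a random query point to n random points.
    points = [[random.random() for _ in range(dim)] for _ in range(n)]
    query = [random.random() for _ in range(dim)]
    dists = [math.dist(query, p) for p in points]
    return min(dists) / max(dists)

print(nearest_to_farthest_ratio(2))    # typically small: a clear nearest neighbor exists
print(nearest_to_farthest_ratio(100))  # typically close to 1: everything is roughly equally far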
Implementation (Python)
class Node:
    def __init__(self, point, left=None, right=None):
        self.point = point
        self.left = left
        self.right = right

def build_kdtree(points, depth=0):
    if not points:
        return None

    k = len(points[0])  # Number of dimensions
    axis = depth % k    # Alternating axis: 0, 1, 2...

    # Sort points by the current axis and pick the median
    points.sort(key=lambda x: x[axis])
    median = len(points) // 2

    return Node(
        point=points[median],
        left=build_kdtree(points[:median], depth + 1),
        right=build_kdtree(points[median + 1:], depth + 1)
    )
def distance_squared(p1, p2):
    return sum((a - b) ** 2 for a, b in zip(p1, p2))

def closest_point(root, target, depth=0, best=None):
    if root is None:
        return best

    k = len(target)
    axis = depth % k

    # Update best if the current node is closer
    if best is None or distance_squared(root.point, target) < distance_squared(best, target):
        next_best = root.point
    else:
        next_best = best

    # Decide which side of the cutting plane the target falls on
    if target[axis] < root.point[axis]:
        next_branch, opposite_branch = root.left, root.right
    else:
        next_branch, opposite_branch = root.right, root.left

    # Recurse down the "good" side first
    next_best = closest_point(next_branch, target, depth + 1, next_best)

    # Cross the boundary only if the cutting plane is closer than the
    # current best neighbor; otherwise prune that entire branch
    if (target[axis] - root.point[axis]) ** 2 < distance_squared(next_best, target):
        next_best = closest_point(opposite_branch, target, depth + 1, next_best)

    return next_best
# Usage
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
target = (9, 2)
print(f"Closest to {target}: {closest_point(tree, target)}")  # Closest to (9, 2): (8, 1)
Summary
The K-D Tree teaches us that complexity can be managed by taking turns. By simplifying a multi-dimensional problem into a sequence of 1-dimensional decisions, we can navigate complex spaces efficiently. It reminds us that no matter how complex the world is, we can usually slice it one angle at a time.
