Load Balancing Algorithms
Introduction: The “Single Point of Failure”
Imagine a restaurant with 100 customers but only 1 waiter: the waiter is overwhelmed while customers wait forever. If you hire 5 waiters, you then need a way to decide which waiter serves which table.
Load Balancing is the process of distributing incoming network traffic across a group of backend servers (a Server Pool). It ensures that no single server bears too much demand, improving responsiveness and availability.
What Problem does it solve?
- Input: A stream of incoming requests.
- Output: The optimal backend server to handle each request.
- The Promise: Scalability (add more servers to handle more traffic) and Reliability (if one server fails, others pick up the slack).
Common Algorithms
1. Round Robin
- Logic: Simply go down the list: Server 1, then 2, then 3, then back to 1.
- Best for: When all servers have identical hardware and the requests are similar in size.
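Round Robin can be sketched in a few lines — a simple rotation over the pool. The server names below are hypothetical placeholders:

```python
from itertools import cycle

# Hypothetical server pool; a real balancer would hold addresses or sockets.
servers = ["server-1", "server-2", "server-3"]
rotation = cycle(servers)

def next_server():
    """Return the next server in the rotation, wrapping back to the start."""
    return next(rotation)

# The first four requests cycle through the pool and wrap around.
print([next_server() for _ in range(4)])
# → ['server-1', 'server-2', 'server-3', 'server-1']
```

Because the rotation ignores server load entirely, this only works well when, as noted above, all servers and requests are roughly equal.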
2. Weighted Round Robin (WRR)
- Logic: Assign a “Weight” to each server based on its capacity. A server with weight 10 gets twice as many requests as weight 5.
- Best for: Heterogeneous clusters (e.g., some servers have 64GB RAM, others have 16GB).
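One naive way to sketch WRR is to repeat each server in the rotation in proportion to its weight (names and weights below are made up for illustration). Note that production balancers such as Nginx use a "smooth" variant that interleaves servers rather than sending bursts:

```python
from itertools import cycle

# Hypothetical weights: the big server should receive twice the traffic.
weights = {"big-server": 10, "small-server": 5}

# Expand the pool so each server appears once per unit of weight.
expanded = [name for name, w in weights.items() for _ in range(w)]
rotation = cycle(expanded)

def next_server():
    return next(rotation)

# Over 30 requests the split matches the 10:5 weight ratio exactly.
counts = {}
for _ in range(30):
    s = next_server()
    counts[s] = counts.get(s, 0) + 1
print(counts)
# → {'big-server': 20, 'small-server': 10}
```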
3. Least Connections
- Logic: Send the next request to the server with the fewest active connections.
- Best for: Long-lived requests (like video streaming or persistent WebSocket connections) where some requests take much longer than others.
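A minimal sketch of Least Connections, assuming the balancer tracks active connection counts itself (here in a plain dict with hypothetical server names):

```python
# Active connection count per server, maintained by the balancer.
active = {"server-1": 0, "server-2": 0, "server-3": 0}

def assign_request():
    """Route to the server with the fewest active connections."""
    server = min(active, key=active.get)
    active[server] += 1
    return server

def finish_request(server):
    """Call when a connection closes so the count stays accurate."""
    active[server] -= 1

# A long-lived request occupies server-1, so the next request avoids it.
assign_request()          # server-1 now holds 1 connection
print(assign_request())   # → server-2 (tied for fewest connections)
```

This is exactly why the algorithm suits long-lived connections: a server stuck serving a slow video stream naturally stops receiving new work.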
4. IP Hash (Source Hash)
- Logic: Use the client’s IP address to determine the server (e.g., hash(Client_IP) % N).
- Best for: “Sticky Sessions.” Ensuring a user stays on the same server to preserve local cache or session data.
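The hash(Client_IP) % N idea can be sketched as follows. A stable hash (md5 here) is used instead of Python's built-in hash(), which is randomized per process and would break stickiness across restarts:

```python
import hashlib

# Hypothetical pool of N = 3 backends.
servers = ["server-1", "server-2", "server-3"]

def pick_server(client_ip: str) -> str:
    """Map a client IP to a fixed server via hash(Client_IP) % N."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The same IP always lands on the same server — the "sticky" property.
assert pick_server("203.0.113.7") == pick_server("203.0.113.7")
```

One caveat worth knowing: if N changes (a server is added or removed), the modulo remaps most clients to new servers; consistent hashing is the usual fix when that matters.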
Typical Business Scenarios
✅ Web Servers: Nginx or HAProxy distributing HTTP requests to Node.js/Python backends.
✅ Database Read Replicas: Spreading “SELECT” queries across 5 read-only database nodes.
✅ API Gateways: Routing traffic to different microservices based on the path.
❌ State Management: If you use Round Robin, you cannot store user sessions in a server’s memory. You must use a shared Redis session store or use IP Hash.
Performance & Complexity
- Efficiency: Very high. Most of these algorithms run in O(1) or O(N) time per request, where N is the number of servers.
- Overhead: Minimal compared to the benefit of horizontal scaling.
Summary
“Load Balancing is the ‘Traffic Cop’ of the internet. It ensures your system can grow by adding more workers and protects you from the death of any single server.”
