I continue to read System Design Interview — An Insider's Guide. Today I learned about rate limiting algorithms:

  • Token Bucket
  • Leaking Bucket
  • Fixed Window Counter
  • Sliding Window Log
  • Sliding Window Counter

Below is a short description of each of them.

Token Bucket

In the Token Bucket algorithm, we have a bucket that starts out full and has a capacity of X tokens. When a request arrives, we check whether there is a token in the bucket. If there is, we take the token out and process the request; if no token is available, we drop the request. Additionally, a separate process called the refiller adds tokens back to the bucket at a fixed rate, e.g. once every N seconds. This way we can control the rate at which requests are processed.
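A minimal Python sketch of the idea (the class, names and numbers are my own, not from the book; instead of a separate refiller process, this version refills lazily whenever a request is checked):

    import time

    class TokenBucket:
        def __init__(self, capacity, refill_rate_per_sec):
            self.capacity = capacity              # maximum number of tokens (X)
            self.tokens = capacity                # the bucket starts full
            self.refill_rate = refill_rate_per_sec
            self.last_refill = time.monotonic()

        def _refill(self):
            # add the tokens that would have accumulated since the last check,
            # never exceeding the bucket capacity
            now = time.monotonic()
            elapsed = now - self.last_refill
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
            self.last_refill = now

        def allow(self):
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1                  # take a token and process the request
                return True
            return False                          # no token available, drop the request

    # usage: allow roughly 5 requests per second with bursts of up to 10
    limiter = TokenBucket(capacity=10, refill_rate_per_sec=5)
    print("processed" if limiter.allow() else "dropped")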

Pros

  • easy to implement
  • low memory consumption

Cons

  • it may be hard to tune the bucket size and refill rate to meet the desired rate limit

Leaking Bucket

In the Leaking Bucket algorithm, the idea is very similar to the Token Bucket. We have a bucket of known capacity, and it leaks requests at a constant rate. We implement it using a FIFO queue. Every time a new request arrives, we check if the queue is full; if it is, we drop the request, otherwise we add the request to the queue. At constant intervals, we pick requests from the queue and process them.
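A rough Python sketch, assuming an in-memory FIFO queue and some scheduler that calls leak() at a constant interval (the names and limits are illustrative, not taken from the book):

    from collections import deque

    class LeakingBucket:
        def __init__(self, capacity, leak_per_interval):
            self.queue = deque()                  # FIFO queue of pending requests
            self.capacity = capacity              # bucket (queue) size
            self.leak_per_interval = leak_per_interval

        def add(self, request):
            if len(self.queue) >= self.capacity:
                return False                      # queue is full, drop the request
            self.queue.append(request)
            return True

        def leak(self):
            # meant to be called at a constant interval (e.g. by a timer);
            # processes up to leak_per_interval requests from the front of the queue
            processed = []
            for _ in range(min(self.leak_per_interval, len(self.queue))):
                processed.append(self.queue.popleft())
            return processed

    bucket = LeakingBucket(capacity=100, leak_per_interval=10)
    bucket.add({"path": "/api"})   # enqueue an incoming request
    print(bucket.leak())           # run once per interval to drain the bucket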

Pros

  • easy to implement
  • low memory consumption, assuming a bounded queue size

Cons

  • during traffic bursts we can end up with a queue full of old requests, while newer requests are rate limited or served late
  • it may be hard to tune the bucket capacity and the leak interval to meet our requirements

Fixed Window Counter

In the Fixed Window Counter algorithm, we divide the timeline into fixed time windows and keep a counter per window. The counter is incremented with every request, and once it reaches the allowed limit, further requests in that window are dropped. When the time window ends, the counter is reset to 0 and the cycle starts again.
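A simple in-memory Python sketch (the limit and window size are arbitrary choices; a real deployment would typically keep the counter in a shared store such as Redis):

    import time

    class FixedWindowCounter:
        def __init__(self, limit, window_seconds):
            self.limit = limit
            self.window_seconds = window_seconds
            self.window_start = time.monotonic()
            self.count = 0

        def allow(self):
            now = time.monotonic()
            if now - self.window_start >= self.window_seconds:
                # the window ended: reset the counter and start a new cycle
                self.window_start = now
                self.count = 0
            if self.count < self.limit:
                self.count += 1
                return True
            return False              # limit reached for this window, drop the request

    limiter = FixedWindowCounter(limit=100, window_seconds=60)
    print(limiter.allow())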

Pros

  • easy to implement and understand
  • low memory consumption

Cons

  • during traffic bursts at the window edges we can let through more requests than expected (up to twice the limit across two adjacent windows)

Sliding Window Log

In this algorithm we keep a log of incoming requests. Every time a request arrives, we save its timestamp in the log. We then remove outdated entries from the log, i.e. timestamps that fall outside the current window. Finally, we check whether the log still holds fewer requests than the allowed number; if it does, we process the request, otherwise we drop it.
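A minimal Python sketch (in this version only accepted requests are logged; some variants also keep the timestamps of rejected ones):

    import time
    from collections import deque

    class SlidingWindowLog:
        def __init__(self, limit, window_seconds):
            self.limit = limit
            self.window_seconds = window_seconds
            self.log = deque()                    # timestamps of accepted requests

        def allow(self):
            now = time.monotonic()
            # drop outdated timestamps that fell out of the sliding window
            while self.log and now - self.log[0] >= self.window_seconds:
                self.log.popleft()
            if len(self.log) < self.limit:
                self.log.append(now)
                return True
            return False                          # window already holds the allowed number of requests

    limiter = SlidingWindowLog(limit=100, window_seconds=60)
    print(limiter.allow())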

Pros

  • very accurate: the limit can't be exceeded, even during traffic bursts at window edges

Cons

  • high memory usage, since a timestamp is stored for every request

Sliding Window Counter

This algorithm is a combination of the Fixed Window Counter and the Sliding Window Log. We keep a counter per fixed window, but the window we rate limit against slides. When a new request arrives, we estimate the number of requests in the sliding window using the following formula: requests in the current window + requests in the previous window * percentage of the previous window covered by the sliding window.

For example, suppose the limit is 7 requests per window, the previous window had 4 requests, the current window already has 3 requests, and the sliding window still covers 70% of the previous window (so we are 30% into the current one). The estimate is then 3 + 4 * 0.7 = 5.8. We can round it depending on the needs; rounding down gives 5, and since 5 < 7, our request passes.
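A Python sketch of the weighted-count check described above (the choice of rounding down follows the example; the names are mine):

    import math
    import time

    class SlidingWindowCounter:
        def __init__(self, limit, window_seconds):
            self.limit = limit
            self.window_seconds = window_seconds
            self.current_window = 0               # index of the current fixed window
            self.current_count = 0
            self.previous_count = 0

        def allow(self):
            now = time.monotonic()
            window = int(now // self.window_seconds)
            if window != self.current_window:
                # a new fixed window started; keep the count of the immediately
                # preceding window, otherwise there is no overlap to weigh
                self.previous_count = self.current_count if window == self.current_window + 1 else 0
                self.current_count = 0
                self.current_window = window
            # fraction of the previous window still covered by the sliding window
            overlap = 1.0 - (now % self.window_seconds) / self.window_seconds
            estimated = self.current_count + self.previous_count * overlap
            if math.floor(estimated) < self.limit:    # round down, as in the example
                self.current_count += 1
                return True
            return False

    limiter = SlidingWindowCounter(limit=7, window_seconds=60)
    print(limiter.allow())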

Pros

  • low memory consumption
  • smooths out traffic spikes at window boundaries, avoiding the overflow problem of the Fixed Window Counter

Cons

  • it is an approximation: it assumes requests in the previous window were evenly distributed, so it may not be perfectly accurate