I continue to read System Design Interview — An Insider's Guide. Today I learned about rate limiting algorithms:

  • Token Bucket
  • Leaking Bucket
  • Fixed Window Counter
  • Sliding Window Log
  • Sliding Window Counter

Below is a short description of each of them.

Token Bucket

In the Token Bucket algorithm, we have a bucket that starts out full and has a capacity of X tokens. When a request arrives, we check whether there is a token in the bucket. If there is, we take the token out and process the request; if no token is available, we drop the request. Additionally, a separate process called the refiller adds tokens back to the bucket at a fixed rate, e.g. once every N seconds. This way we can control the rate at which requests are processed.
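A minimal Python sketch of the idea (the class, names and numbers are my own, not from the book; instead of a separate refiller process, this version refills lazily whenever a request is checked):

    import time

    class TokenBucket:
        def __init__(self, capacity, refill_rate_per_sec):
            self.capacity = capacity              # maximum number of tokens (X)
            self.tokens = capacity                # the bucket starts full
            self.refill_rate = refill_rate_per_sec
            self.last_refill = time.monotonic()

        def _refill(self):
            # add the tokens that would have accumulated since the last check,
            # never exceeding the bucket capacity
            now = time.monotonic()
            elapsed = now - self.last_refill
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
            self.last_refill = now

        def allow(self):
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1                  # take a token and process the request
                return True
            return False                          # no token available, drop the request

    # usage: allow roughly 5 requests per second with bursts of up to 10
    limiter = TokenBucket(capacity=10, refill_rate_per_sec=5)
    print("processed" if limiter.allow() else "dropped")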

Pros

  • easy to implement
  • low memory consumption

Cons

  • it may be hard to tune the bucket size and refill rate to meet the desired rate limit

Leaking Bucket

In the Leaking Bucket algorithm, the idea is very similar to the Token Bucket. We have a bucket of known capacity, and it leaks requests at a constant rate. We implement it using a FIFO queue. Every time a new request arrives, we check if the queue is full; if it is, we drop the request, otherwise we add the request to the queue. At constant intervals, we pick requests from the queue and process them.
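A rough Python sketch, assuming an in-memory FIFO queue and some scheduler that calls leak() at a constant interval (the names and limits are illustrative, not taken from the book):

    from collections import deque

    class LeakingBucket:
        def __init__(self, capacity, leak_per_interval):
            self.queue = deque()                  # FIFO queue of pending requests
            self.capacity = capacity              # bucket (queue) size
            self.leak_per_interval = leak_per_interval

        def add(self, request):
            if len(self.queue) >= self.capacity:
                return False                      # queue is full, drop the request
            self.queue.append(request)
            return True

        def leak(self):
            # meant to be called at a constant interval (e.g. by a timer);
            # processes up to leak_per_interval requests from the front of the queue
            processed = []
            for _ in range(min(self.leak_per_interval, len(self.queue))):
                processed.append(self.queue.popleft())
            return processed

    bucket = LeakingBucket(capacity=100, leak_per_interval=10)
    bucket.add({"path": "/api"})   # enqueue an incoming request
    print(bucket.leak())           # run once per interval to drain the bucket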

Pros

  • easy to implement
  • low memory consumption, assuming a bounded queue size

Cons

  • during traffic bursts we can end up with a queue full of old requests, while newer requests are rate limited or served late
  • it may be hard to tune the bucket capacity and the leak interval to meet our requirements

Fixed Window Counter

In the Fixed Window Counter algorithm, we divide the timeline into fixed time windows and keep a counter per window. The counter is incremented with every request, and once it reaches the allowed limit, further requests in that window are dropped. When the time window ends, the counter is reset to 0 and the cycle starts again.
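A simple in-memory Python sketch (the limit and window size are arbitrary choices; a real deployment would typically keep the counter in a shared store such as Redis):

    import time

    class FixedWindowCounter:
        def __init__(self, limit, window_seconds):
            self.limit = limit
            self.window_seconds = window_seconds
            self.window_start = time.monotonic()
            self.count = 0

        def allow(self):
            now = time.monotonic()
            if now - self.window_start >= self.window_seconds:
                # the window ended: reset the counter and start a new cycle
                self.window_start = now
                self.count = 0
            if self.count < self.limit:
                self.count += 1
                return True
            return False              # limit reached for this window, drop the request

    limiter = FixedWindowCounter(limit=100, window_seconds=60)
    print(limiter.allow())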

Pros

  • easy to implement and understand
  • low memory consumption

Cons

  • during traffic bursts at the window edges we can let through more requests than expected (up to twice the limit across two adjacent windows)

Sliding Window Log

In this algorithm we keep a log of incoming requests. Every time a request arrives, we save its timestamp in the log. We then remove outdated entries from the log, i.e. timestamps that fall outside the current window. Finally, we check whether the log still holds fewer requests than the allowed number; if it does, we process the request, otherwise we drop it.
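A minimal Python sketch (in this version only accepted requests are logged; some variants also keep the timestamps of rejected ones):

    import time
    from collections import deque

    class SlidingWindowLog:
        def __init__(self, limit, window_seconds):
            self.limit = limit
            self.window_seconds = window_seconds
            self.log = deque()                    # timestamps of accepted requests

        def allow(self):
            now = time.monotonic()
            # drop outdated timestamps that fell out of the sliding window
            while self.log and now - self.log[0] >= self.window_seconds:
                self.log.popleft()
            if len(self.log) < self.limit:
                self.log.append(now)
                return True
            return False                          # window already holds the allowed number of requests

    limiter = SlidingWindowLog(limit=100, window_seconds=60)
    print(limiter.allow())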

Pros

  • very accurate: the limit can't be exceeded, even during traffic bursts at window edges

Cons

  • high memory usage, since a timestamp is stored for every request

Sliding Window Counter

This algorithm is a combination of the Fixed Window Counter and the Sliding Window Log. We keep a counter per fixed window, but the window we rate limit against slides. When a new request arrives, we estimate the number of requests in the sliding window using the following formula: requests in the current window + requests in the previous window * percentage of the previous window covered by the sliding window.

For example, suppose the limit is 7 requests per window, the previous window had 4 requests, the current window already has 3 requests, and the sliding window still covers 70% of the previous window (so we are 30% into the current one). The estimate is then 3 + 4 * 0.7 = 5.8. We can round it depending on the needs; rounding down gives 5, and since 5 < 7, our request passes.
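A Python sketch of the weighted-count check described above (the choice of rounding down follows the example; the names are mine):

    import math
    import time

    class SlidingWindowCounter:
        def __init__(self, limit, window_seconds):
            self.limit = limit
            self.window_seconds = window_seconds
            self.current_window = 0               # index of the current fixed window
            self.current_count = 0
            self.previous_count = 0

        def allow(self):
            now = time.monotonic()
            window = int(now // self.window_seconds)
            if window != self.current_window:
                # a new fixed window started; keep the count of the immediately
                # preceding window, otherwise there is no overlap to weigh
                self.previous_count = self.current_count if window == self.current_window + 1 else 0
                self.current_count = 0
                self.current_window = window
            # fraction of the previous window still covered by the sliding window
            overlap = 1.0 - (now % self.window_seconds) / self.window_seconds
            estimated = self.current_count + self.previous_count * overlap
            if math.floor(estimated) < self.limit:    # round down, as in the example
                self.current_count += 1
                return True
            return False

    limiter = SlidingWindowCounter(limit=7, window_seconds=60)
    print(limiter.allow())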

Pros

  • low memory consumption
  • smooths out traffic spikes at window boundaries, avoiding the overflow problem of the Fixed Window Counter

Cons

  • it is an approximation: it assumes requests in the previous window were evenly distributed, so it may not be perfectly accurate