Token Bucket
How it works
Section titled “How it works”Token bucket maintains a virtual bucket of tokens that refills at a steady rate up to a maximum capacity. Each request consumes one or more tokens. If the bucket doesn’t have enough tokens, the request is denied and no tokens are consumed - only successful requests deduct from the bucket.
Max capacity: 20 tokensRefill rate: 5 tokens every 10 seconds
Time 0:00 - Bucket: 20/20 (full) 5 requests → Bucket: 15/20Time 0:10 - Refill +5 → Bucket: 20/20 (capped at max)Time 0:15 - 18 requests → Bucket: 2/20Time 0:20 - Refill +5 → Bucket: 7/20 (not full yet)There’s no actual bucket - the implementation stores a tokens count and last_refill timestamp, then calculates refilled tokens on each request.
SQL strategy
Section titled “SQL strategy”Token bucket requires reading the current state, calculating refilled tokens, and updating - a multi-step operation. It uses SELECT ... FOR UPDATE row-level locking within a transaction.
BEGIN; -- Ensure row exists so FOR UPDATE can lock it (concurrent-safe) INSERT INTO rate_limit_ephemeral (prefix, key, tokens, last_refill, expires_at) VALUES ($prefix, $key, $max_tokens, $now, $now + $ttl) ON CONFLICT (prefix, key) DO NOTHING;
SELECT tokens, last_refill, expires_at FROM rate_limit_ephemeral WHERE prefix = $prefix AND key = $key FOR UPDATE;
-- In application code: -- elapsed = now - last_refill -- refilled = floor(elapsed / interval) * refill_rate -- current_tokens = min(tokens + refilled, max_tokens) -- new_tokens = current_tokens - cost -- success = new_tokens >= 0 -- ttl = (max_tokens / refill_rate) * interval
-- Only write on success - denied requests are a no-op on DB state. -- Skipping the write avoids resetting last_refill, which would delay -- token refill for subsequent requests. IF success THEN INSERT INTO rate_limit_ephemeral (prefix, key, tokens, last_refill, expires_at) VALUES ($prefix, $key, $new_tokens, $now, $now + $ttl) ON CONFLICT (prefix, key) DO UPDATE SET tokens = $new_tokens, last_refill = $now, expires_at = $now + $ttl; END IF;COMMIT;The initial INSERT ... ON CONFLICT DO NOTHING ensures the row exists before the SELECT ... FOR UPDATE. New buckets start full at $max_tokens. Without this, concurrent first requests would all see “no row” and bypass the row lock.
Denied requests are a no-op on DB state - no write is issued. This avoids resetting last_refill, which would delay token refill for subsequent requests.
On denial, reset is the time until enough tokens refill to satisfy the request: ceil((cost - current_tokens) / refill_rate) * interval. On success, reset is now + ttl (the row’s expiry time).
The FOR UPDATE lock only affects the specific key being checked.
import { Pool } from "pg";import { Ratelimit } from "pg-ratelimit";
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const ratelimit = new Ratelimit({ pool, limiter: Ratelimit.tokenBucket(5, "10s", 20), prefix: "api",});
// Each request consumes 1 token by defaultconst result = await ratelimit.limit("user:123");
// Consume multiple tokens for expensive operationsconst result2 = await ratelimit.limit("user:123", { rate: 5 });When to use it
Section titled “When to use it”- Quota-style rate limiting (e.g., API credits, upload limits)
- When you want to allow short bursts while enforcing an average rate
- Scenarios where different operations have different costs (using weighted
rate)