🚦 Rate Limiting Your Express API: Because Not Everyone Deserves Unlimited Access
Picture this: it's 2 AM. You're asleep. Your API is not. Some bot — or worse, Kevin from accounting who discovered the /export endpoint — is making 10,000 requests per minute. Your server is gasping. Your database is on its knees. Your phone is blowing up with alerts.
This is the story rate limiting prevents. Let's talk about how to stop it.
Why Rate Limiting Isn't Optional
Most developers treat rate limiting like flossing — they know they should do it, they just... don't. Until something bad happens.
Here's what unthrottled APIs attract:
- Bots scraping your data for free
- Credential stuffing attacks hammering your login endpoint
- Accidental DDoS from a misconfigured client in a retry loop
- Kevin — always Kevin
Rate limiting puts a velvet rope outside your API. VIPs (legitimate users) get in fine. The mob gets turned away politely — or not so politely, depending on your mood.
The Quick Win: express-rate-limit
The fastest way to add rate limiting to an Express app is the express-rate-limit package. It takes about three minutes and makes you feel unreasonably accomplished.
npm install express-rate-limit
Now add it to your app:
import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per window
  standardHeaders: true, // send `RateLimit-*` headers
  legacyHeaders: false, // disable `X-RateLimit-*` headers
  message: {
    status: 429,
    error: 'Too many requests, slow down there champ.',
  },
});

// Apply globally
app.use(limiter);

// Or surgically, on specific routes
app.use('/api/auth/login', rateLimit({
  windowMs: 10 * 60 * 1000,
  max: 10,
  message: { error: 'Too many login attempts. Take a breath.' },
}));
The standardHeaders: true option is great because it tells clients exactly where they stand — how many requests they have left and when the window resets. Good APIs are transparent about their limits.
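To make that concrete, here's roughly what a client sees with standardHeaders: true (these are the draft-6 style header names the library sends; the endpoint and values below are invented for illustration):

// Hypothetical client reading the standard rate limit headers
const res = await fetch('https://api.example.com/api/widgets');

console.log(res.headers.get('RateLimit-Limit'));     // "100" (requests allowed per window)
console.log(res.headers.get('RateLimit-Remaining')); // "97"  (requests you have left)
console.log(res.headers.get('RateLimit-Reset'));     // "423" (seconds until the window resets)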
Different Routes, Different Rules
Not all endpoints are created equal. Your public health-check route (/ping) should probably be wide open. Your /export-all-user-data endpoint? Lock it down harder than a bank vault.
// Generous limit for read-heavy public endpoints
const publicReadLimit = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 60, // 1 request per second on average
});

// Strict limit for expensive or sensitive operations
const sensitiveLimit = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 5, // 5 attempts per hour, period
  skipSuccessfulRequests: true, // only failed requests count toward the limit
});

// Password reset — abuse magnet
app.use('/api/auth/forgot-password', sensitiveLimit);

// Search endpoint — can get expensive fast
app.use('/api/search', publicReadLimit);
The skipSuccessfulRequests: true trick is clever for auth flows: legitimate users who log in successfully don't eat into their quota. Only failed attempts count. That way you punish brute-forcers without annoying real users.
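You can push the same idea further with the keyGenerator option and throttle failed logins per account instead of per IP, so an attacker spraying one account from many addresses still gets caught. A minimal sketch, assuming express.json() is mounted before the limiter and the login body carries an email field (both assumptions on my part):

// Hypothetical: count failed attempts per account, not per IP
app.use('/api/auth/login', rateLimit({
  windowMs: 10 * 60 * 1000,
  max: 10,
  skipSuccessfulRequests: true, // successful logins don't count
  keyGenerator: (req) => req.body?.email ?? req.ip, // fall back to IP when no email is sent
}));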
Going Distributed: Redis-Backed Rate Limiting
Here's the dirty secret of in-memory rate limiting: it doesn't work when you have multiple server instances.
If you're running three Node.js processes behind a load balancer, each one has its own memory. Kevin can make 100 requests to server A, then 100 to server B, then 100 to server C — 300 requests total, zero limits triggered. Whoops.
The fix is to store rate limit state in a shared store like Redis:
import rateLimit from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis';
import { createClient } from 'redis';

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true,
  legacyHeaders: false,
  store: new RedisStore({
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
});

app.use('/api/', limiter);
Now all your server instances share one rate limit counter per IP. Kevin gets 100 requests total — not 100 per pod. This is the production-ready approach, and if you're already using Redis for caching (you should be), the marginal cost is nearly zero.
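One wrinkle worth knowing: if you create several limiters backed by the same Redis instance, give each its own key prefix so their counters don't collide. A sketch (the prefix string is arbitrary, and redisClient is the client from above):

// A second limiter sharing the same Redis client, isolated by prefix
const loginLimiter = rateLimit({
  windowMs: 10 * 60 * 1000,
  max: 10,
  store: new RedisStore({
    prefix: 'rl:login:', // unique per limiter; the default is 'rl:'
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
});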
The UX Details That Matter
Rate limiting done badly makes users feel punished. Rate limiting done well feels invisible — until someone actually abuses your system.
A few things to get right:
Return 429, not 500. The 429 Too Many Requests status code (RFC 6585) exists for exactly this situation. Use it. Clients can detect it and back off gracefully.
Tell clients when to retry. The Retry-After header (included automatically with standardHeaders: true) tells clients exactly how many seconds to wait. Good API clients will respect it.
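On the client side, honoring that header takes only a few lines. A sketch (fetchWithBackoff and the retry cap are my own invention):

// Hypothetical client helper: wait out 429s using Retry-After
async function fetchWithBackoff(url, attempts = 3) {
  for (let i = 0; i < attempts; i++) {
    const res = await fetch(url);
    if (res.status !== 429) return res;
    // express-rate-limit sends Retry-After as a number of seconds
    const waitSeconds = Number(res.headers.get('Retry-After')) || 1;
    await new Promise((resolve) => setTimeout(resolve, waitSeconds * 1000));
  }
  throw new Error('Still rate limited after retries');
}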
Don't rate-limit your own monitoring. Healthcheck endpoints pinged by load balancers or uptime monitors should bypass rate limiting. Use skip to whitelist them:
const limiter = rateLimit({
  windowMs: 60 * 1000,
  max: 100,
  skip: (req) => req.path === '/health' || req.ip === process.env.MONITOR_IP,
});
Log the hits. When the rate limiter fires, log it. An IP hitting the limit repeatedly is a signal worth investigating — it might be a bug in a client library, a misconfigured integration, or an actual attack.
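express-rate-limit gives you a handler option that runs on every blocked request, which is a convenient hook for this (the console.warn here stands in for whatever logger you actually use):

const limiter = rateLimit({
  windowMs: 60 * 1000,
  max: 100,
  handler: (req, res, next, options) => {
    // Log the offender, then send the usual 429 response
    console.warn(`Rate limit exceeded: ${req.ip} on ${req.method} ${req.originalUrl}`);
    res.status(options.statusCode).json({ error: 'Too many requests.' });
  },
});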
What You Should Take Away
Rate limiting is one of those things that takes 20 minutes to add and saves you from a category of pain that's otherwise very hard to recover from. It's not a silver bullet — a sophisticated attacker will rotate IPs — but it handles 95% of real-world abuse scenarios effortlessly.
Start with express-rate-limit, apply stricter limits to your sensitive endpoints, and when you go to production with multiple instances, throw Redis in the mix. Your future self (and your database) will thank you.
Now go set up your rate limits. Kevin is already writing his script.
If this saved your API from becoming someone's personal data mine, share it with a teammate who still thinks "security" means just using HTTPS. They need this.