🪩 Your API is a Nightclub — And Rate Limiting is the Bouncer
Picture this: you've just launched your shiny new API. You're sipping coffee, basking in the glow of successful deploys, when suddenly your server starts groaning like a haunted house. CPU spikes. Memory evaporates. Your database is on its knees begging for mercy.
You check the logs. One user — one — is hammering your /search endpoint 10,000 times per minute.
Congratulations. You just learned why every API needs a bouncer.
🚪 What is Rate Limiting, Actually?
Rate limiting is the practice of restricting how many requests a client can make to your API within a given time window. Think of it as the velvet rope outside an exclusive club:
- Normal users: "Welcome in, enjoy the open bar."
- Scripts gone rogue: "Sorry buddy, you've had enough. Come back in 60 seconds."
- Actual DDoS attacks: bouncer just stares at them until they leave
Without it, your API is an all-you-can-eat buffet with no closing time. And we've all seen what happens to those places on a Saturday night.
🧱 The Bare Minimum: express-rate-limit
The fastest way to get a bouncer on the door in Express is the express-rate-limit package. It's lightweight, easy to configure, and doesn't require a PhD in distributed systems.
npm install express-rate-limit
import express from 'express';
import rateLimit from 'express-rate-limit';
const app = express();
// Global limiter — applies to every route
const globalLimiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 100, // max 100 requests per window
standardHeaders: true, // sends RateLimit-* headers (RFC 6585)
legacyHeaders: false,
message: {
status: 429,
error: "Too many requests — slow your roll and try again shortly.",
},
});
app.use(globalLimiter);
// Stricter limiter for auth routes
const authLimiter = rateLimit({
windowMs: 60 * 1000, // 1 minute
max: 5, // only 5 login attempts per minute
message: {
status: 429,
error: "Too many login attempts. Are you a robot? 🤖",
},
});
app.post('/auth/login', authLimiter, (req, res) => {
res.json({ message: 'You made it in!' });
});
app.listen(3000);
That's it. Seriously. Three imports and two function calls and you've got basic protection running. Ship it? Not quite — let's talk about why the default in-memory store will eventually betray you.
🧠 The Scaling Problem: Memory Isn't Forever
The default express-rate-limit store keeps request counts in memory. This works great on your laptop. It works terribly the moment you have more than one server instance.
Why? Because Instance A doesn't know that User X already hammered Instance B 99 times. So User X just walks to the other door of the nightclub like it's nothing.
The fix is a shared store — something all your instances agree on. Redis is the classic choice:
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import { createClient } from 'redis';
const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();
const limiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 100,
standardHeaders: true,
legacyHeaders: false,
store: new RedisStore({
sendCommand: (...args) => redisClient.sendCommand(args),
}),
});
Now all your instances check the same Redis counter. One bouncer, one list, no VIP sneaking through the back door.
🎯 Smart Limiting: Not All Requests Are Equal
Here's where most tutorials stop and where real-world APIs get interesting. A flat limit of "100 requests per 15 minutes" is blunt. Effective rate limiting is nuanced:
By endpoint criticality:
GET /products— generous limits, it's just reading dataPOST /checkout— stricter, you don't want bots buying out your inventoryPOST /auth/forgot-password— very strict, abuse here causes real harm
By user tier:
- Free users: 60 requests/minute
- Pro users: 600 requests/minute
- Enterprise: basically unlimited (their SLA is your problem now)
You can achieve this with a custom keyGenerator function. By default, express-rate-limit keys on IP address. But authenticated users are better identified by their user ID:
const apiLimiter = rateLimit({
windowMs: 60 * 1000,
max: (req) => {
// Pull tier from JWT claims or session
if (req.user?.tier === 'enterprise') return 10000;
if (req.user?.tier === 'pro') return 600;
return 60; // free tier or unauthenticated
},
keyGenerator: (req) => {
// Authenticated users get their own bucket
// Anonymous users share IP-based buckets
return req.user?.id ?? req.ip;
},
});
Now your rate limiter is smart enough to let your paying customers breathe while keeping the freeloaders in check. This is the bouncer who actually knows the regulars by name.
📬 Tell Your Clients What's Happening
One thing that separates good APIs from frustrating ones: proper 429 responses. Don't just say "no." Say when they can try again.
The standardHeaders: true option in express-rate-limit automatically sends:
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 1710456000
Retry-After: 847
These headers let well-behaved clients (and SDKs) back off gracefully and retry at the right time, instead of hammering you harder in a panic. Implement them. Your users will thank you, and your on-call rotation will quietly weep tears of joy.
🚨 When Rate Limiting Isn't Enough
Rate limiting slows down abuse — it doesn't stop a determined attacker. For the full bouncer experience, pair it with:
- IP allowlists/blocklists for known bad actors
- CAPTCHA challenges on sensitive endpoints after X failures
- Exponential backoff signals in your responses (hint to clients to slow down)
- Request signatures or API keys so you can revoke access instantly
Rate limiting is your first line of defense, not your last.
🎉 Go Add a Bouncer Today
If your API doesn't have rate limiting yet, today's the day. Start with express-rate-limit and the defaults — imperfect protection deployed now beats a perfect solution sitting in a Jira backlog.
Then layer in Redis for multi-instance support, per-tier limits for fairness, and proper response headers for developer experience.
Your API is a nightclub. Make it a good one — great music, strong drinks, and a bouncer who knows when to say when.
Have a war story about rate limiting saving your app (or not having it causing chaos)? Drop it in the comments. Misery loves company, especially at 3am during an incident.