
🪩 Your API is a Nightclub — And Rate Limiting is the Bouncer

6 min read

Picture this: you've just launched your shiny new API. You're sipping coffee, basking in the glow of successful deploys, when suddenly your server starts groaning like a haunted house. CPU spikes. Memory evaporates. Your database is on its knees begging for mercy.

You check the logs. One user — one — is hammering your /search endpoint 10,000 times per minute.

Congratulations. You just learned why every API needs a bouncer.

🚪 What is Rate Limiting, Actually?

Rate limiting is the practice of restricting how many requests a client can make to your API within a given time window. Think of it as the velvet rope outside an exclusive club:

  • Normal users: "Welcome in, enjoy the open bar."
  • Scripts gone rogue: "Sorry buddy, you've had enough. Come back in 60 seconds."
  • Actual DDoS attacks: bouncer just stares at them until they leave

Without it, your API is an all-you-can-eat buffet with no closing time. And we've all seen what happens to those places on a Saturday night.
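To make the idea concrete, here's a minimal sketch of the simplest scheme, a fixed-window counter: count requests per key, and refuse once the count exceeds the cap. This is a toy for illustration (the `createLimiter` helper and its numbers are made up); real libraries also handle window-boundary bursts, cleanup, and shared storage.

```javascript
// Toy fixed-window rate limiter, for illustration only.
// Each key gets a counter that resets when its window expires.
function createLimiter({ windowMs, max }) {
  const hits = new Map(); // key -> { count, windowStart }
  return function isAllowed(key, now = Date.now()) {
    const entry = hits.get(key);
    if (!entry || now - entry.windowStart >= windowMs) {
      // First request in a fresh window: start counting at 1
      hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= max; // over the cap? bounce them
  };
}

const allow = createLimiter({ windowMs: 60_000, max: 3 });
console.log(allow('user-1')); // true
console.log(allow('user-1')); // true
console.log(allow('user-1')); // true
console.log(allow('user-1')); // false, over the limit
```

One caveat worth knowing: fixed windows allow up to 2x the limit in a burst straddling a window boundary, which is why libraries also offer sliding-window and token-bucket variants.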

🧱 The Bare Minimum: express-rate-limit

The fastest way to get a bouncer on the door in Express is the express-rate-limit package. It's lightweight, easy to configure, and doesn't require a PhD in distributed systems.

npm install express-rate-limit

import express from 'express';
import rateLimit from 'express-rate-limit';

const app = express();

// Global limiter — applies to every route
const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,                  // max 100 requests per window
  standardHeaders: true,     // sends draft-standard RateLimit-* headers
  legacyHeaders: false,
  message: {
    status: 429,
    error: "Too many requests — slow your roll and try again shortly.",
  },
});

app.use(globalLimiter);

// Stricter limiter for auth routes
const authLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 5,              // only 5 login attempts per minute
  message: {
    status: 429,
    error: "Too many login attempts. Are you a robot? 🤖",
  },
});

app.post('/auth/login', authLimiter, (req, res) => {
  res.json({ message: 'You made it in!' });
});

app.listen(3000);

That's it. Seriously. Two imports and a couple of rateLimit() calls and you've got basic protection running. Ship it? Not quite — let's talk about why the default in-memory store will eventually betray you.

🧠 The Scaling Problem: Memory Isn't Forever

The default express-rate-limit store keeps request counts in memory. This works great on your laptop. It works terribly the moment you have more than one server instance.

Why? Because Instance A doesn't know that User X already hammered Instance B 99 times. So User X just walks to the other door of the nightclub like it's nothing.

The fix is a shared store — something all your instances agree on. Redis is the classic choice:

import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import { createClient } from 'redis';

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true,
  legacyHeaders: false,
  store: new RedisStore({
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
});

Now all your instances check the same Redis counter. One bouncer, one list, no VIP sneaking through the back door.

🎯 Smart Limiting: Not All Requests Are Equal

Here's where most tutorials stop and where real-world APIs get interesting. A flat limit of "100 requests per 15 minutes" is blunt. Effective rate limiting is nuanced:

By endpoint criticality:

  • GET /products — generous limits, it's just reading data
  • POST /checkout — stricter, you don't want bots buying out your inventory
  • POST /auth/forgot-password — very strict, abuse here causes real harm

By user tier:

  • Free users: 60 requests/minute
  • Pro users: 600 requests/minute
  • Enterprise: basically unlimited (their SLA is your problem now)

You can achieve this with a dynamic max plus a custom keyGenerator function. By default, express-rate-limit keys on IP address, but authenticated users are better identified by their user ID:

const apiLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: (req) => {
    // Pull tier from JWT claims or session
    if (req.user?.tier === 'enterprise') return 10000;
    if (req.user?.tier === 'pro') return 600;
    return 60; // free tier or unauthenticated
  },
  keyGenerator: (req) => {
    // Authenticated users get their own bucket
    // Anonymous users share IP-based buckets
    return req.user?.id ?? req.ip;
  },
});

Now your rate limiter is smart enough to let your paying customers breathe while keeping the freeloaders in check. This is the bouncer who actually knows the regulars by name.

📬 Tell Your Clients What's Happening

One thing that separates good APIs from frustrating ones: proper 429 responses. Don't just say "no." Say when they can try again.

The standardHeaders: true option in express-rate-limit automatically sends:

RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 847
Retry-After: 847

These headers let well-behaved clients (and SDKs) back off gracefully and retry at the right time, instead of hammering you harder in a panic. Implement them. Your users will thank you, and your on-call rotation will quietly weep tears of joy.
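On the client side, honoring those headers can be as simple as a retry wrapper. Here's a hedged sketch (the `fetchWithBackoff` name, retry cap, and fallback delays are all assumptions, and it presumes a global fetch as in Node 18+):

```javascript
// Sketch of a fetch wrapper that backs off on 429 responses.
// Retry-After is read as seconds-to-wait; if it's missing, we fall
// back to exponential backoff (1s, 2s, 4s, ...).
async function fetchWithBackoff(url, { maxRetries = 3 } = {}) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(url);
    if (res.status !== 429) return res; // not rate limited, we're done
    const retryAfter = Number(res.headers.get('retry-after'));
    const delayMs = Number.isFinite(retryAfter) && retryAfter > 0
      ? retryAfter * 1000        // server told us exactly how long to wait
      : 2 ** attempt * 1000;     // otherwise guess, doubling each time
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`Still rate limited after ${maxRetries + 1} attempts: ${url}`);
}
```

Waiting the server-specified amount instead of retrying in a tight loop is exactly the "back off gracefully" behavior those headers make possible.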

🚨 When Rate Limiting Isn't Enough

Rate limiting slows down abuse — it doesn't stop a determined attacker. For the full bouncer experience, pair it with:

  • IP allowlists/blocklists for known bad actors
  • CAPTCHA challenges on sensitive endpoints after X failures
  • Exponential backoff signals in your responses (hint to clients to slow down)
  • Request signatures or API keys so you can revoke access instantly

Rate limiting is your first line of defense, not your last.
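For instance, a blocklist check can sit in front of the limiter as its own middleware. A minimal sketch (the Set, the example IP, and the error message are all illustrative; a real deployment would back this with Redis, a database, or a WAF rule):

```javascript
// Hypothetical blocklist middleware, mounted before any rate limiter.
const blockedIps = new Set(['203.0.113.9']); // example/documentation IP

function blocklist(req, res, next) {
  if (blockedIps.has(req.ip)) {
    // Blocked clients get a hard 403 and never reach the limiter
    return res.status(403).json({ error: 'Access revoked.' });
  }
  next();
}

// Usage in the app from earlier:
// app.use(blocklist);      // runs first
// app.use(globalLimiter);  // then normal rate limiting
```

Ordering matters here: running the blocklist first means known bad actors don't even consume rate-limit bookkeeping, let alone application resources.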

🎉 Go Add a Bouncer Today

If your API doesn't have rate limiting yet, today's the day. Start with express-rate-limit and the defaults — imperfect protection deployed now beats a perfect solution sitting in a Jira backlog.

Then layer in Redis for multi-instance support, per-tier limits for fairness, and proper response headers for developer experience.

Your API is a nightclub. Make it a good one — great music, strong drinks, and a bouncer who knows when to say when.


Have a war story about rate limiting saving your app (or not having it causing chaos)? Drop it in the comments. Misery loves company, especially at 3am during an incident.

Thanks for reading!
