0x55aa

🪩 Your API is a Nightclub — And Rate Limiting is the Bouncer

Picture this: you've just launched your shiny new API. You're sipping coffee, basking in the glow of successful deploys, when suddenly your server starts groaning like a haunted house. CPU spikes. Memory evaporates. Your database is on its knees begging for mercy.

You check the logs. One user — one — is hammering your /search endpoint 10,000 times per minute.

Congratulations. You just learned why every API needs a bouncer.

🚪 What is Rate Limiting, Actually?

Rate limiting is the practice of restricting how many requests a client can make to your API within a given time window. Think of it as the velvet rope outside an exclusive club:

  • Normal users: "Welcome in, enjoy the open bar."
  • Scripts gone rogue: "Sorry buddy, you've had enough. Come back in 60 seconds."
  • Actual DDoS attacks: bouncer just stares at them until they leave

Without it, your API is an all-you-can-eat buffet with no closing time. And we've all seen what happens to those places on a Saturday night.
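
To make the idea concrete, here is a minimal sketch of a fixed-window counter in plain JavaScript. It is not how any particular library is implemented; the function name createFixedWindowLimiter and the shape of its return value are made up for illustration.

```javascript
// Minimal fixed-window limiter. All names here are illustrative.
function createFixedWindowLimiter({ windowMs, max }) {
  const hits = new Map(); // key -> { count, resetAt }

  return function check(key, now = Date.now()) {
    let entry = hits.get(key);
    // Start a fresh window if this key is new or its window has expired.
    if (!entry || now >= entry.resetAt) {
      entry = { count: 0, resetAt: now + windowMs };
      hits.set(key, entry);
    }
    entry.count += 1;
    return {
      allowed: entry.count <= max,
      remaining: Math.max(0, max - entry.count),
      retryAfterSeconds: Math.ceil((entry.resetAt - now) / 1000),
    };
  };
}

// Four requests from the same client inside one 60-second window, limit of 3.
const check = createFixedWindowLimiter({ windowMs: 60_000, max: 3 });
const results = [0, 1, 2, 3].map((i) => check('203.0.113.7', 1_000 + i));
console.log(results.map((r) => r.allowed)); // [ true, true, true, false ]
```

The first three requests get in and the fourth is turned away until the window resets. Everything later in this post is a refinement of that counter: where it lives, what it keys on, and what it tells the client.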

🧱 The Bare Minimum: express-rate-limit

The fastest way to get a bouncer on the door in Express is the express-rate-limit package. It's lightweight, easy to configure, and doesn't require a PhD in distributed systems.

npm install express-rate-limit

import express from 'express';
import rateLimit from 'express-rate-limit';

const app = express();

// Global limiter — applies to every route
const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,                  // max 100 requests per window
  standardHeaders: true,     // sends RateLimit-* headers (IETF draft; RFC 6585 only defines the 429 status)
  legacyHeaders: false,
  message: {
    status: 429,
    error: "Too many requests — slow your roll and try again shortly.",
  },
});

app.use(globalLimiter);

// Stricter limiter for auth routes
const authLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 5,              // only 5 login attempts per minute
  message: {
    status: 429,
    error: "Too many login attempts. Are you a robot? 🤖",
  },
});

app.post('/auth/login', authLimiter, (req, res) => {
  res.json({ message: 'You made it in!' });
});

app.listen(3000);

That's it. Seriously. Two imports and two rateLimit() calls and you've got basic protection running. Ship it? Not quite. Let's talk about why the default in-memory store will eventually betray you.

🧠 The Scaling Problem: Memory Isn't Forever

The default express-rate-limit store keeps request counts in memory. This works great on your laptop. It works terribly the moment you have more than one server instance.

Why? Because Instance A doesn't know that User X already hammered Instance B 99 times. So User X just walks to the other door of the nightclub like it's nothing.
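
A tiny simulation shows the leak. This is a sketch in plain JavaScript, not the library's store: createInstance is a made-up stand-in for one server with its own private counter, and the round-robin load balancer is an assumption.

```javascript
// Each "instance" keeps its own private counter, like a per-process memory store.
function createInstance(max) {
  const counts = new Map();
  return (key) => {
    const count = (counts.get(key) ?? 0) + 1;
    counts.set(key, count);
    return count <= max; // true = request allowed
  };
}

const max = 100;
const instances = [createInstance(max), createInstance(max)];

// A hypothetical round-robin load balancer alternates between the two.
let allowed = 0;
for (let i = 0; i < 300; i++) {
  if (instances[i % 2]('user-x')) allowed += 1;
}
console.log(allowed); // 200: double the intended limit of 100
```

With N instances behind the balancer, the effective limit is roughly N times what you configured.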

The fix is a shared store — something all your instances agree on. Redis is the classic choice:

import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import { createClient } from 'redis';

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true,
  legacyHeaders: false,
  store: new RedisStore({
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
});

Now all your instances check the same Redis counter. One bouncer, one list, no VIP sneaking through the back door.
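
Conceptually, a shared store boils down to "increment one counter that everyone can see, and let it expire when the window ends." The sketch below shows that idea only; it is not the code rate-limit-redis runs. hitSharedCounter and createFakeClient are hypothetical names, and the client is assumed to expose async incr and pExpire methods the way node-redis v4 does.

```javascript
// Sketch of a shared fixed-window counter. `client` is any object with async
// incr(key) and pExpire(key, ms) methods (assumed here to match node-redis v4).
async function hitSharedCounter(client, key, { windowMs, max }) {
  const count = await client.incr(key); // one counter, shared by every instance
  if (count === 1) {
    // First hit of a new window: start the expiry countdown.
    await client.pExpire(key, windowMs);
  }
  return { count, allowed: count <= max };
}

// In-memory stand-in so the sketch runs without a Redis server (illustration only).
function createFakeClient() {
  const counts = new Map();
  return {
    async incr(key) {
      const next = (counts.get(key) ?? 0) + 1;
      counts.set(key, next);
      return next;
    },
    async pExpire(key, ms) {
      // Forget the key once the window is over; unref() keeps Node from waiting on it.
      setTimeout(() => counts.delete(key), ms).unref?.();
      return true;
    },
  };
}
```

One caveat with this naive version: the increment and the expiry are two separate round trips, so a crash between them could leave a counter that never expires. A production store would want to make the pair atomic, for example with a server-side script.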

🎯 Smart Limiting: Not All Requests Are Equal

Here's where most tutorials stop and where real-world APIs get interesting. A flat limit of "100 requests per 15 minutes" is blunt. Effective rate limiting is nuanced:

By endpoint criticality:

  • GET /products — generous limits, it's just reading data
  • POST /checkout — stricter, you don't want bots buying out your inventory
  • POST /auth/forgot-password — very strict, abuse here causes real harm

By user tier:

  • Free users: 60 requests/minute
  • Pro users: 600 requests/minute
  • Enterprise: basically unlimited (their SLA is your problem now)

You can achieve this by making max a function of the request and adding a custom keyGenerator. By default, express-rate-limit keys on IP address. But authenticated users are better identified by their user ID:

const apiLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: (req) => {
    // Pull tier from JWT claims or session (assumes auth middleware has already set req.user)
    if (req.user?.tier === 'enterprise') return 10000;
    if (req.user?.tier === 'pro') return 600;
    return 60; // free tier or unauthenticated
  },
  keyGenerator: (req) => {
    // Authenticated users get their own bucket
    // Anonymous users share IP-based buckets
    return req.user?.id ?? req.ip;
  },
});

Now your rate limiter is smart enough to let your paying customers breathe while keeping the freeloaders in check. This is the bouncer who actually knows the regulars by name.
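
The per-endpoint side can be handled the same way the authLimiter was earlier: one limiter per sensitive route. A rough sketch of the bookkeeping is below. The routePolicies table, the policyFor helper, and every number in it are placeholders invented for this example, not recommendations.

```javascript
// Hypothetical per-route policies; the numbers are placeholders, not recommendations.
const routePolicies = [
  { method: 'POST', path: '/auth/forgot-password', windowMs: 60 * 60 * 1000, max: 3 },
  { method: 'POST', path: '/checkout', windowMs: 60 * 1000, max: 10 },
  { method: 'GET', path: '/products', windowMs: 60 * 1000, max: 300 },
];
const defaultPolicy = { windowMs: 60 * 1000, max: 60 };

// Look up the policy for a route, falling back to the default.
function policyFor(method, path) {
  const match = routePolicies.find((p) => p.method === method && p.path === path);
  return match ? { windowMs: match.windowMs, max: match.max } : defaultPolicy;
}

console.log(policyFor('POST', '/checkout')); // { windowMs: 60000, max: 10 }
```

Each policy is just a windowMs/max pair, so it can be handed to rateLimit() and the resulting limiter attached to its route as middleware. Keeping the numbers in one table makes them easier to review than limits scattered across route files.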

📬 Tell Your Clients What's Happening

One thing that separates good APIs from frustrating ones: proper 429 responses. Don't just say "no." Say when they can try again.

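The standardHeaders: true option in express-rate-limit sends the RateLimit-* headers on every response, and a 429 response also carries Retry-After. Both RateLimit-Reset and Retry-After count the seconds left until the window resets:

RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 847
Retry-After: 847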

These headers let well-behaved clients (and SDKs) back off gracefully and retry at the right time, instead of hammering you harder in a panic. Implement them. Your users will thank you, and your on-call rotation will quietly weep tears of joy.
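
On the client side, honoring those headers might look like the sketch below. retryDelayMs is a hypothetical helper, the one-second fallback is arbitrary, and the code assumes RateLimit-Reset is expressed as seconds until reset; check what your server actually sends before relying on that.

```javascript
// Works out how long a client should wait after a 429, in milliseconds.
// `headers` is a standard Headers object (global in Node 18+ and browsers).
function retryDelayMs(headers, now = Date.now(), fallbackMs = 1000) {
  const retryAfter = headers.get('retry-after');
  if (retryAfter !== null) {
    // Retry-After is either a number of seconds or an HTTP date.
    if (/^\d+$/.test(retryAfter.trim())) return Number(retryAfter) * 1000;
    const date = Date.parse(retryAfter);
    if (!Number.isNaN(date)) return Math.max(0, date - now);
  }
  // Fall back to RateLimit-Reset, assumed here to be seconds until reset.
  const reset = headers.get('ratelimit-reset');
  if (reset !== null && /^\d+$/.test(reset.trim())) return Number(reset) * 1000;
  return fallbackMs;
}

const headers = new Headers({ 'Retry-After': '847', 'RateLimit-Remaining': '0' });
console.log(retryDelayMs(headers)); // 847000
```

A client would sleep for that long before retrying, ideally with a little random jitter so a fleet of workers does not all come back at the same instant.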

🚨 When Rate Limiting Isn't Enough

Rate limiting slows down abuse — it doesn't stop a determined attacker. For the full bouncer experience, pair it with:

  • IP allowlists/blocklists for known bad actors
  • CAPTCHA challenges on sensitive endpoints after X failures
  • Exponential backoff signals in your responses (a Retry-After that grows with each repeat violation tells clients to slow down)
  • Request signatures or API keys so you can revoke access instantly
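
As one way to picture the backoff idea from the list above, here is a sketch of an escalating penalty: each repeat violation doubles how long the client is told to wait, up to a cap. The blockSeconds function and its default values are invented for this example.

```javascript
// Hypothetical escalation policy: each repeat violation doubles the block time.
function blockSeconds(violations, { baseSeconds = 60, capSeconds = 3600 } = {}) {
  if (violations <= 0) return 0;
  return Math.min(capSeconds, baseSeconds * 2 ** (violations - 1));
}

console.log([1, 2, 3, 4, 10].map((v) => blockSeconds(v))); // [ 60, 120, 240, 480, 3600 ]
```

The result could be sent back as the Retry-After value on a 429, with the violation count tracked per key in the same store as the request counters.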

Rate limiting is your first line of defense, not your last.

🎉 Go Add a Bouncer Today

If your API doesn't have rate limiting yet, today's the day. Start with express-rate-limit and the defaults — imperfect protection deployed now beats a perfect solution sitting in a Jira backlog.

Then layer in Redis for multi-instance support, per-tier limits for fairness, and proper response headers for developer experience.

Your API is a nightclub. Make it a good one — great music, strong drinks, and a bouncer who knows when to say when.


Have a war story about rate limiting saving your app (or not having it causing chaos)? Drop it in the comments. Misery loves company, especially at 3am during an incident.