0x55aa
← Back to Blog

🚦 Rate Limiting in Express: Stop the Stampede Before It Crushes Your Server

6 min read

🚦 Rate Limiting in Express: Stop the Stampede Before It Crushes Your Server

Picture this: You ship a new API endpoint on a Friday afternoon (first mistake, we know). By Monday morning, your server is gasping for air, your database is on its knees, and your on-call alert has been blowing up all weekend. The culprit? Someone found your /search endpoint and decided to hammer it 10,000 times a minute.

This is exactly the scenario rate limiting exists to prevent. And the good news? Adding it to an Express app takes about five minutes.

What Even Is Rate Limiting?

Rate limiting is a simple idea: a client can only make N requests in a given time window. Exceed that, and you get a polite (or not-so-polite) 429 Too Many Requests response.

Think of your API as a popular coffee shop. Without rate limiting, one guy can waltz in and order 500 lattes back-to-back, clogging up the entire counter while everyone else waits. Rate limiting is the barista saying, "Hey buddy — one at a time. Come back in a minute."

It protects against:

  • Brute force attacks (someone hammering your login endpoint)
  • Scrapers (bots hoovering up your entire product catalog)
  • Accidental DDoS (a client app with a bug making infinite retry loops)
  • The thundering herd (all your users refreshing simultaneously after a cache miss)

Setting Up with express-rate-limit

The most popular solution in the Express ecosystem is the aptly named express-rate-limit. It's mature, flexible, and installs without drama.

npm install express-rate-limit

Here's a basic setup that covers most use cases right out of the box:

import express from 'express';
import rateLimit from 'express-rate-limit';

const app = express();

// Global limiter — applies to every route
const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,                  // max 100 requests per window per IP
  standardHeaders: true,     // sends `RateLimit-*` headers (RFC 6585)
  legacyHeaders: false,      // drops the old `X-RateLimit-*` headers

  message: {
    error: 'Too many requests, please slow down.',
    retryAfter: 'Check the Retry-After header.',
  },
});

app.use(globalLimiter);

// Stricter limiter for sensitive endpoints
const authLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 5,              // only 5 login attempts per minute
  message: { error: 'Too many login attempts. Take a breath.' },
});

app.post('/auth/login', authLimiter, (req, res) => {
  // your login logic
  res.json({ token: 'your-jwt-here' });
});

app.listen(3000, () => console.log('Running on port 3000'));

Two limiters, total of about 20 lines. Not bad for meaningful protection.

Notice we're applying globalLimiter to every route via app.use(), then layering a stricter authLimiter on the login endpoint specifically. This is the layered defense approach — a light global limit to catch runaway clients, and tight limits on anything that could be brute-forced.

The Problem with In-Memory Limits

The default store for express-rate-limit keeps counters in memory. That's perfectly fine for a single-server setup, but the moment you scale horizontally (multiple Node processes, multiple servers behind a load balancer), you've got a problem.

Each instance has its own memory. User A hits server 1 five times, then server 2 five times — from the load balancer's perspective, that's 10 requests, but neither server has seen more than 5. The limit is useless.

The fix: use a shared store. Redis is the classic choice.

npm install @express-rate-limit/redis ioredis
import rateLimit from 'express-rate-limit';
import { RedisStore } from '@express-rate-limit/redis';
import Redis from 'ioredis';

const redis = new Redis({ host: 'localhost', port: 6379 });

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true,
  legacyHeaders: false,

  // All counters now live in Redis — shared across every instance
  store: new RedisStore({
    sendCommand: (...args) => redis.call(...args),
  }),
});

export default limiter;

Now your counters are centralized. Scale to ten servers, run PM2 with eight workers — it doesn't matter. Every request gets counted in the same place.

Practical Tips That Save You Pain Later

1. Identify clients properly. By default, rate limiting uses req.ip. If you're behind a proxy (nginx, AWS ALB, Cloudflare), the IP will always look like your proxy's internal address. Fix this by setting app.set('trust proxy', 1) so Express reads the real IP from the X-Forwarded-For header.

2. Differentiate by user, not just IP. IPs can be shared (university networks, corporate NAT, VPNs). For authenticated routes, key your limits on the user ID instead:

const userLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 60,
  keyGenerator: (req) => req.user?.id ?? req.ip, // fall back to IP if not authed
});

3. Return useful headers. When you set standardHeaders: true, clients get RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset headers. Well-behaved API clients use these to self-throttle instead of hitting the wall.

4. Don't rate limit health checks. Your load balancer's health probes will spam /health every few seconds. Exclude them from limits to avoid false positives: apply your limiter after the health route, or use skip to whitelist it.

What Happens When You Get It Wrong?

Skip rate limiting entirely and you're one bad day away from a server funeral. But over-aggressive limits are their own problem — you'll end up blocking legitimate users and generating support tickets faster than your API handles requests.

The sweet spot: start permissive, monitor your logs, and tighten up once you know your real traffic patterns. A 100 req/15min global limit with 5 req/min on auth is a solid starting point for most apps. Adjust from there based on data, not vibes.

Wrapping Up

Rate limiting is one of those "boring" backend features that nobody thinks about until it's 2am and everything's on fire. Five minutes of setup now saves you from a very bad weekend later.

Here's your action plan:

  1. Install express-rate-limit
  2. Add a global limiter to app.use()
  3. Add tighter limits to sensitive endpoints (login, signup, password reset)
  4. If you're running multiple instances, wire up Redis
  5. Set trust proxy if you're behind a load balancer

Your server will thank you. Your on-call rotation will thank you even more.


What's your rate limiting setup look like? Got a creative keyGenerator or a horror story about skipping this step? Drop it in the comments — I want to hear the carnage.