🚦 Node.js Rate Limiting: Stop Letting Users Wreck Your API

Imagine opening a restaurant where the menu says "eat as much as you want." Sounds generous, right? Now imagine one table — one single table — orders 10,000 plates of pasta in 60 seconds. Your kitchen collapses. Every other customer leaves hungry. The chef quits. The building might be on fire.

That's your API without rate limiting.

Rate limiting is the velvet rope outside the club. It doesn't stop legitimate users from having a great time — it just ensures that one overzealous bot (or a buggy client in a while(true) loop) doesn't bring the whole place down. Let's talk about how to add it to your Node.js/Express app without losing your mind.


Why Rate Limiting Isn't Optional

Before we get to code, let's get real about why this matters:

  • Brute-force attacks — attackers hammering your /login endpoint with 50,000 password guesses per minute
  • Scraping — someone pulling your entire product catalog while your database weeps
  • Runaway clients — a misconfigured frontend calling your API in an infinite loop (we've all been there)
  • Cost explosions — if you're paying per DB query or per API call downstream, one bad actor can destroy your budget overnight

Rate limiting is both a security measure and an act of kindness toward your future self.


The Dead-Simple Way: express-rate-limit

The most popular solution for Express apps is express-rate-limit. It's battle-tested, configurable, and you can have it running in about 5 minutes.

npm install express-rate-limit

Here's a basic setup that covers 90% of use cases:

import express from 'express';
import rateLimit from 'express-rate-limit';

const app = express();

// Global limiter: 100 requests per 15 minutes per IP
const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,
  standardHeaders: true,   // Return rate limit info in `RateLimit-*` headers
  legacyHeaders: false,
  message: {
    status: 429,
    error: 'Too many requests — slow down, friend.',
  },
});

// Stricter limiter for auth endpoints
const authLimiter = rateLimit({
  windowMs: 10 * 60 * 1000, // 10 minutes
  max: 10,                   // Only 10 login attempts per window
  message: {
    status: 429,
    error: 'Too many login attempts. Take a breath.',
  },
});

app.use(globalLimiter);
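
// loginHandler / forgotPasswordHandler stand in for your existing route handlers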
app.post('/auth/login', authLimiter, loginHandler);
app.post('/auth/forgot-password', authLimiter, forgotPasswordHandler);

app.listen(3000);

Notice the two-tier approach: a lenient global cap for all routes, and a much tighter cap on the endpoints attackers actually care about. Your public /products endpoint and your /login endpoint should not have the same limits.


Level Up: Redis-Backed Limiting for Production

The default express-rate-limit stores request counts in memory. That works great for a single server. The moment you scale horizontally — two servers, three servers, a whole fleet — each instance has its own counter, and suddenly your "10 requests per minute" limit becomes "10 requests per minute per server instance." Oops.

For production with multiple instances, you need a shared store. Redis is the standard answer:

import rateLimit from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis';
import { createClient } from 'redis';

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true,
  legacyHeaders: false,
  store: new RedisStore({
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
});

app.use(limiter);

Now all your server instances share the same counters. Scale to 50 instances and the rate limit still holds. This is the pattern you want in any serious deployment.

One gotcha: make sure your Redis client handles reconnection gracefully. If Redis goes down, your limiter will start throwing, and depending on how you handle that error you'll either block all traffic or silently stop limiting entirely — neither is good. ioredis has solid reconnection logic built in if you want a more resilient client.
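
Here's a sketch of that more resilient variant — same store, but backed by ioredis (the retryStrategy values are illustrative, not gospel):

import rateLimit from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis';
import Redis from 'ioredis';

// ioredis reconnects on its own; retryStrategy controls the backoff between attempts
const redisClient = new Redis(process.env.REDIS_URL, {
  retryStrategy: (attempt) => Math.min(attempt * 100, 3000), // wait up to 3s
});

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  store: new RedisStore({
    // ioredis sends raw commands via call() rather than sendCommand()
    sendCommand: (...args) => redisClient.call(...args),
  }),
});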


Smart Limiting: Rate Limit by User, Not Just IP

IP-based limiting is the right default, but it has blind spots:

  • Corporate networks where thousands of employees share one NAT IP
  • Mobile carriers where millions of phones share a handful of addresses
  • Authenticated users who should get higher limits than anonymous callers

For authenticated routes, rate limit by user ID instead:

const userLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 60,
  keyGenerator: (req) => {
    // Use user ID if authenticated, fall back to IP
    return req.user?.id ?? req.ip;
  },
  skip: (req) => {
    // Premium users bypass the limit entirely
    return req.user?.plan === 'premium';
  },
});

app.use('/api', authenticate, userLimiter);

This is a much fairer system. Your free-tier user gets 60 requests/minute. A large corporation's employees don't accidentally block each other because they share an IP. And your premium customers? Smooth sailing.


Practical Insights You'll Learn the Hard Way

Tell clients what's happening. The standardHeaders: true option adds RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset headers automatically. Well-behaved clients can read these and back off gracefully instead of hammering you with retries. Enable this. Always.
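
On the client side, honoring those headers takes only a few lines. A hedged sketch, assuming a fetch-capable environment (the retry cap is arbitrary):

// Hypothetical client helper: back off when the server says 429
async function politeFetch(url, attempt = 0) {
  const res = await fetch(url);
  if (res.status === 429 && attempt < 3) {
    // express-rate-limit also sets Retry-After (in seconds) by default
    const waitSeconds = Number(res.headers.get('retry-after') ?? 1);
    await new Promise((resolve) => setTimeout(resolve, waitSeconds * 1000));
    return politeFetch(url, attempt + 1);
  }
  return res;
}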

Don't rate-limit health checks. Your load balancer pings /health every few seconds. If that endpoint is rate-limited, you'll get false alarms and your server will get cycled out of rotation. Exclude it explicitly.
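
The skip option handles this cleanly — a minimal sketch, assuming your load balancer probes /health:

const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  // Never count or block load-balancer probes
  skip: (req) => req.path === '/health',
});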

Log when limits are hit. A rate limit being triggered once is normal. The same IP hitting the limit 500 times in an hour is an incident. Ship those events to your logging system and set up an alert.
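
The handler option fires on every rejected request, which makes it a natural logging hook. A sketch — swap console.warn for whatever structured logger you use:

const limiter = rateLimit({
  windowMs: 60 * 1000,
  max: 100,
  handler: (req, res, next, options) => {
    // One log line per rejection; alert when the same IP shows up constantly
    console.warn({ event: 'rate_limit_exceeded', ip: req.ip, path: req.path });
    res.status(options.statusCode).send(options.message);
  },
});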

Graduated responses beat hard bans. Consider slowing responses instead of rejecting them outright for borderline cases. A 2-second delay hurts scrapers far more than legitimate users.
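
express-slow-down, a companion package to express-rate-limit, implements exactly this pattern. A minimal sketch — the thresholds here are illustrative:

import slowDown from 'express-slow-down';

const speedLimiter = slowDown({
  windowMs: 15 * 60 * 1000,
  delayAfter: 50,                        // first 50 requests per window run at full speed
  delayMs: (used) => (used - 50) * 250,  // then each extra request waits 250ms longer
});

app.use('/api', speedLimiter);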


The Payoff

Rate limiting is one of those things that feels like overhead until the day it saves you. When a client goes haywire at 3am and your API shrugs it off instead of falling over, you'll be glad you spent the 30 minutes setting this up.

Your API isn't a buffet. Put up the velvet rope.


What's Next?

Add rate limiting to your next Express app and pair it with structured logging so you can actually see who's hitting the limits and why. Once you've got that visibility, you'll find yourself tuning the thresholds based on real data rather than guessing — and that's when your API starts feeling genuinely robust.

Got a thorny rate-limiting scenario — dynamic limits based on load, per-endpoint granularity, or dealing with shared IPs? Drop a comment or reach out. I love talking about this stuff.

Happy shipping. 🚀
