✈️ Node.js Graceful Shutdown: Stop Killing Requests Mid-Flight
Picture this: a user clicks "Place Order." Your Express server receives the request, begins charging the card, writes to the database, and then — right in the middle of all that — your deployment pipeline restarts the process. The charge went through. The database write didn't. The user's order is money gone and food un-ordered. They're going to file a chargeback and leave a one-star review.
This is what happens when you deploy without graceful shutdown.
Most Node.js tutorials end with app.listen(3000) and call it a day. They don't tell you what happens when your server needs to stop — during deployments, scaling events, crashes, or a Ctrl+C in production (don't do this). The default behavior is brutal: Node.js gets a signal, the process dies, and every in-flight request gets dropped like a hot plate.
Let's fix that.
🤔 What "Graceful" Actually Means
Graceful shutdown is the restaurant equivalent of a waiter who, when told the kitchen is closing, finishes serving every table that already ordered before turning off the lights. New customers are turned away at the door, but nobody's meal gets yanked mid-bite.
In server terms, this means:
- Stop accepting new connections the moment a shutdown signal arrives
- Let all in-flight requests finish naturally
- Close database connections, message queue consumers, and other resources cleanly
- Exit the process only when everything is done
The difference between a server that does this and one that doesn't is the difference between zero-downtime deployments and "we're going to put the site in maintenance mode for five minutes."
📡 The Signals You Need to Handle
Your operating system talks to Node.js processes through signals. The two you care about:
- SIGTERM — the polite signal. Kubernetes, Docker, systemd, and PM2 all send this when they want your process to stop. You have a window to clean up before they lose patience.
- SIGINT — what happens when someone presses
Ctrl+C. Useful for local dev.
There's also SIGKILL, which you cannot catch. It's the OS saying "I'm done being polite" and immediately terminating your process. Kubernetes sends SIGTERM first, waits (default 30 seconds), then fires SIGKILL. That 30-second window is your entire graceful shutdown budget.
Here's what handling these looks like at its simplest:
// server.js
import express from 'express';
import { createServer } from 'http';
const app = express();
const server = createServer(app);
app.get('/health', (req, res) => res.json({ status: 'ok' }));
app.get('/slow', async (req, res) => {
// Simulate a slow operation — DB query, payment processing, etc.
await new Promise(resolve => setTimeout(resolve, 5000));
res.json({ message: 'done' });
});
server.listen(3000, () => console.log('Listening on :3000'));
// Handle shutdown signals
async function shutdown(signal) {
console.log(`Received ${signal}. Shutting down gracefully...`);
server.close(async () => {
console.log('HTTP server closed. No new connections accepted.');
// Clean up resources here
process.exit(0);
});
// Force exit if cleanup takes too long
setTimeout(() => {
console.error('Shutdown timeout hit. Forcing exit.');
process.exit(1);
}, 25_000); // 5 seconds less than Kubernetes' 30s grace period
}
process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));
server.close() does the right thing: it stops the server from accepting new connections, then fires its callback once all existing connections have drained. The timeout is critical — if something goes wrong during cleanup (a hung database connection, a deadlocked promise), you still want the process to exit so your orchestrator can restart it.
🗄️ Closing What You Opened
Stopping the HTTP server is only half the job. Your app has probably opened other things that need cleaning up: database connection pools, Redis clients, message queue consumers. Closing these in the wrong order creates a different class of problem.
The right order is:
- Stop accepting HTTP requests first
- Then close dependencies (DB, Redis, queues)
If you close your database connection before in-flight requests finish, those requests will throw and return 500s — which is only slightly better than the original problem.
import express from 'express';
import { createServer } from 'http';
import mysql from 'mysql2/promise';
import { createClient } from 'redis';
const app = express();
const server = createServer(app);
const db = await mysql.createPool({
host: process.env.DB_HOST,
user: process.env.DB_USER,
password: process.env.DB_PASS,
database: process.env.DB_NAME,
waitForConnections: true,
connectionLimit: 10,
});
const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();
// ... your routes ...
async function shutdown(signal) {
console.log(`${signal} received. Starting graceful shutdown.`);
let exitCode = 0;
// Step 1: Stop the HTTP server — no new requests
await new Promise((resolve) => server.close(resolve));
console.log('HTTP server closed.');
// Step 2: Close dependencies in order
try {
await db.end();
console.log('Database pool closed.');
} catch (err) {
console.error('Error closing DB pool:', err);
exitCode = 1;
}
try {
await redis.quit();
console.log('Redis connection closed.');
} catch (err) {
console.error('Error closing Redis:', err);
exitCode = 1;
}
console.log('Shutdown complete.');
process.exit(exitCode);
}
// Timeout safety net
const SHUTDOWN_TIMEOUT = 25_000;
async function timedShutdown(signal) {
const timer = setTimeout(() => {
console.error(`Shutdown exceeded ${SHUTDOWN_TIMEOUT}ms. Forcing exit.`);
process.exit(1);
}, SHUTDOWN_TIMEOUT);
timer.unref(); // Don't let this timer keep the process alive
await shutdown(signal);
}
process.on('SIGTERM', () => timedShutdown('SIGTERM'));
process.on('SIGINT', () => timedShutdown('SIGINT'));
server.listen(3000, () => console.log('Listening on :3000'));
timer.unref() is a detail worth noting: it tells Node.js "this timer shouldn't prevent the process from exiting naturally." Without it, if everything shuts down in 2 seconds but your timeout is 25 seconds, your process would just sit there for 23 more seconds doing nothing. unref() fixes that.
☸️ Kubernetes Loves This
If you run on Kubernetes (or any container orchestrator), graceful shutdown isn't a nice-to-have — it's a correctness requirement. When Kubernetes decides to terminate a pod (rolling update, node drain, HPA scale-down), it:
- Removes the pod from the Service endpoints so load balancers stop sending traffic
- Sends SIGTERM to your container
- Waits up to
terminationGracePeriodSeconds(default 30s) - Sends SIGKILL if the process is still running
There's a subtle race here: steps 1 and 2 happen concurrently. A request can arrive right after SIGTERM but before the load balancer has updated. This is why you should add a small delay before stopping the HTTP server on Kubernetes:
async function shutdown(signal) {
console.log(`${signal} received.`);
// Give load balancers 5 seconds to catch up and stop sending traffic
if (process.env.KUBERNETES_SERVICE_HOST) {
console.log('Kubernetes environment detected. Sleeping 5s for LB drain...');
await new Promise(resolve => setTimeout(resolve, 5_000));
}
await new Promise((resolve) => server.close(resolve));
// ... rest of cleanup
}
Pair this with a proper terminationGracePeriodSeconds in your deployment manifest that's larger than your SHUTDOWN_TIMEOUT, and you'll have genuinely zero-downtime rolling updates.
🚨 The Mistakes That'll Bite You
Not handling unhandledRejection. An unhandled promise rejection in recent Node.js versions emits a warning but won't crash the process by default in some configs. During shutdown especially, you want to know about these:
process.on('unhandledRejection', (reason) => {
console.error('Unhandled rejection during shutdown:', reason);
});
Calling process.exit() directly in routes. Some developers do this when they hit an unrecoverable error. Don't. Throw an error, let your error handler decide, and keep process exit centralized in your shutdown handler.
Ignoring long-lived connections. WebSocket connections and Server-Sent Event streams don't close when server.close() is called — they're still "in use." You need to track these explicitly and close them during shutdown, otherwise your server will wait indefinitely.
📋 TL;DR
- Handle
SIGTERMandSIGINTexplicitly — never rely on the default "just die" behavior - Call
server.close()first to stop new requests, then close DB/Redis/queues - Always add a timeout that forces exit — broken cleanup shouldn't stall deployments forever
- On Kubernetes, add a short sleep before
server.close()to handle the load balancer race condition - Use
timer.unref()so your timeout doesn't artificially keep the process alive
Graceful shutdown takes maybe two hours to implement properly. The alternative is spending that time debugging why a deployment left half your database writes in an inconsistent state and why one very unhappy user is disputing a charge.
Already handling graceful shutdown in your Node.js apps? I'm curious how you handle long-lived connections like WebSockets — that's the next level of this problem and it gets interesting fast. Drop a comment below, or if you've got a pattern that's saved you in production, share it. And if you're currently just Ctrl+C-ing your staging server and hoping for the best — no judgment, but maybe fix that before the next deploy.