🛑 Node.js Graceful Shutdown: Don't Just Kill It

Picture this: your CI pipeline just pushed a new version of your app. Kubernetes sends a SIGTERM to the old pod. Your app — like a toddler who doesn't understand "we're leaving now" — immediately drops everything and shuts down. Three users get 503 errors. One of them was mid-checkout. You get a Slack message at 2am.

This is the ungraceful shutdown experience. And it's shockingly common.

Today we fix that.

What Even Is a Graceful Shutdown?

A graceful shutdown is the difference between a surgeon finishing the stitch before leaving the OR, and just... walking out mid-operation because their shift technically ended.

When your server receives a shutdown signal (SIGTERM, SIGINT), the graceful approach is:

Stop accepting new connections — no new patients in the waiting room
Finish in-flight requests — complete what's already started
Close external connections — database, cache, message queues
Exit cleanly — with code 0, like a professional

Sounds obvious. Most apps don't do it.

The Naive App (aka The Problem)

Here's what most Express apps look like when they die:

const express = require('express');
const app = express();

app.get('/checkout', async (req, res) => {
  await processPayment(req.body);   // 🔥 This might get cut off
  await sendConfirmationEmail();     // 🔥 So might this
  res.json({ success: true });
});

app.listen(3000, () => console.log('Server running on port 3000'));
// No shutdown handling. Just vibes.

When SIGTERM hits, Node exits. If processPayment was halfway done, it's just... abandoned. The charge might go through. The email might not. The customer is confused. Your refund queue grows.

The Graceful Version

Here's how to actually handle this:

const express = require('express');
const app = express();

app.get('/checkout', async (req, res) => {
  await processPayment(req.body);
  await sendConfirmationEmail();
  res.json({ success: true });
});

const server = app.listen(3000, () => {
  console.log('Server running on port 3000');
});

// ---- The important bit ----

let isShuttingDown = false;

// Reject new requests during shutdown
app.use((req, res, next) => {
  if (isShuttingDown) {
    res.setHeader('Connection', 'close');
    return res.status(503).json({ error: 'Server is shutting down' });
  }
  next();
});

async function gracefulShutdown(signal) {
  console.log(`\nReceived ${signal}. Starting graceful shutdown...`);
  isShuttingDown = true;

  // Stop accepting new connections
  server.close(async () => {
    console.log('HTTP server closed. Cleaning up...');

    try {
      await db.end();           // close DB pool
      await redisClient.quit(); // close Redis
      console.log('All connections closed. Exiting.');
      process.exit(0);
    } catch (err) {
      console.error('Error during cleanup:', err);
      process.exit(1);
    }
  });

  // Force-exit if cleanup takes too long (safety net)
  setTimeout(() => {
    console.error('Graceful shutdown timed out. Force exiting.');
    process.exit(1);
  }, 10_000); // 10 seconds
}

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT',  () => gracefulShutdown('SIGINT'));

The key moves here:

isShuttingDown flag — a bouncer at the door who stops letting new people in
server.close() — waits for active connections to finish before calling the callback
Timeout fallback — if something is truly stuck, we still exit rather than hanging forever
Cleanup order matters — close your app server before your database, not the other way around

The Kubernetes Angle

In Kubernetes, the lifecycle looks like this:

Pod gets SIGTERM
Kubernetes waits terminationGracePeriodSeconds (default: 30s)
If the pod is still alive after 30s, it gets SIGKILL (the nuclear option)

This means your app has up to 30 seconds to finish in-flight work. Use it wisely. Your timeout in the shutdown handler should be slightly less than terminationGracePeriodSeconds so Node exits cleanly before Kubernetes kills it forcefully.

# In your Kubernetes Deployment
spec:
  terminationGracePeriodSeconds: 30

And in your Node.js app:

setTimeout(() => {
  process.exit(1);
}, 25_000); // 5 seconds less than K8s grace period

That 5-second buffer is your insurance policy.

Common Pitfalls

Forgetting keep-alive connections — server.close() stops new connections but won't forcibly close existing keep-alive HTTP connections. Long-polling clients or SSE streams can hold the process open. You may need to track open sockets and destroy them manually, or use a library like http-terminator which handles this for you.

Not closing your DB pool — If you skip db.end(), the process might hang or leave dangling connections on your database server. Postgres especially gets cranky about this over time.

Logging after process.exit() — process.exit() is synchronous and immediate. Any async logging (like shipping to Datadog) after that line will be silently dropped. Flush your logs before calling exit.

Why This Actually Matters

Graceful shutdown isn't just about being polite to your users (though it is that). It's about:

Zero-downtime deploys — rolling updates only work if the old pod finishes its work before dying
Data integrity — half-written database transactions are a nightmare to debug
Cost — fewer failed transactions means fewer support tickets, refunds, and incidents
Sleep quality — yours, specifically, at 2am

The code to do this right is maybe 30 lines. The cost of not doing it is... higher.

Wrapping Up

Your app receives SIGTERM dozens of times a week in a normal production environment. Every deploy, every scale-down event, every node rotation sends one. Most apps silently drop requests every single time and nobody notices until something important breaks.

Add graceful shutdown. It takes 20 minutes, and it's the kind of boring infrastructure work that makes you a hero when it matters.

Now go update your server.js — your future on-call self will thank you.

Is your production app handling shutdowns correctly? Drop your setup in the comments or ping me on GitHub — always happy to compare war stories.