
🌊 Node.js Streams: Stop Loading Gigabytes Into RAM Like It's the 90s


Let me paint you a picture.

You have a perfectly fine Express app. It works great in development. Users love it. Then someone uploads a 2GB CSV file and — boom — your server runs out of memory, everything crashes, your on-call phone starts buzzing, and you're debugging at 2am wondering where your life went wrong.

The culprit? You loaded the whole file into memory before doing anything with it.

The cure? Node.js Streams.

What Even Is a Stream?

A stream is the programming equivalent of a garden hose instead of a swimming pool.

Without streams, you wait for all the water to fill the pool before you can drink any of it. With streams, water flows through the hose continuously — you can start drinking immediately, and you never need to hold more than what's in the hose at any moment.

Node.js has four types of streams:

  • Readable — data comes out (like reading a file)
  • Writable — data goes in (like writing a file)
  • Duplex — both directions (like a network socket)
  • Transform — modify data as it flows through (like compression)

The magic is in .pipe() — you chain these together like Lego bricks, and data flows from one to the next without ever accumulating in one giant blob.
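
To make that concrete, here's about the smallest useful chain I can think of (a standalone sketch with hypothetical file names, not one of the examples later in the post): a Readable from disk, a Transform that compresses, and a Writable back to disk. Three Lego bricks, one line of plumbing:

const fs = require('fs');
const zlib = require('zlib');

// Readable (file) → Transform (gzip) → Writable (file), chunk by chunk
fs.createReadStream('./access.log')                  // data comes out
  .pipe(zlib.createGzip())                           // modified as it flows through
  .pipe(fs.createWriteStream('./access.log.gz'));    // data goes in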

The "Uh Oh" Code vs The Stream Code

Here's the classic mistake — downloading a large file and serving it to a client:

// ❌ The "I have 32GB of RAM and I'm not afraid to use it" approach
const fs = require('fs');

app.get('/download/:filename', async (req, res) => {
  const filename = req.params.filename;
  
  // readFile loads EVERYTHING into memory before doing anything
  const fileContent = await fs.promises.readFile(`./files/${filename}`);
  
  res.setHeader('Content-Type', 'application/octet-stream');
  res.send(fileContent); // Now you're holding a 2GB Buffer. Congratulations.
});

Your server just became a hostage. Every concurrent download multiplies that memory hit. Ten users downloading simultaneously? That's 20GB sitting in RAM.

Here's the stream version:

// ✅ The "I respect my server's feelings" approach
const fs = require('fs');
const path = require('path');

app.get('/download/:filename', (req, res) => {
  const filename = path.basename(req.params.filename); // sanitize the path!
  const filePath = path.join(__dirname, 'files', filename);

  const readStream = fs.createReadStream(filePath);

  readStream.on('error', () => {
    // Guard headersSent: the stream can also error after bytes have already gone out
    if (!res.headersSent) {
      res.status(404).json({ error: 'File not found' });
    } else {
      res.destroy();
    }
  });

  res.setHeader('Content-Disposition', `attachment; filename="${filename}"`);
  res.setHeader('Content-Type', 'application/octet-stream');
  
  // Data flows: file → response, chunk by chunk
  readStream.pipe(res);
});

The file data flows directly from disk to the client in small chunks. Memory usage? Flat. Doesn't matter if the file is 1MB or 10GB. Your server barely notices.
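
If you want to see why, here's a quick sketch (hypothetical file path): fs.createReadStream hands you chunks of at most 64 KiB by default, and you can tune that with highWaterMark. One chunk at a time is all that's ever held for a given download:

const fs = require('fs');

const readStream = fs.createReadStream('./files/huge-video.mp4', {
  highWaterMark: 256 * 1024, // optional: 256 KiB chunks instead of the 64 KiB default
});

readStream.on('data', (chunk) => {
  console.log(`got ${chunk.length} bytes`); // a small Buffer, never the whole file
});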

Real-World Win: Processing a CSV Without Dying

Here's where streams really shine — transforming data on the fly. Say you receive a large CSV upload and need to parse it, filter rows, and return results to the client. In real production systems this comes up constantly.

const { Transform } = require('stream');
const csv = require('csv-parse');
const { stringify } = require('csv-stringify');

app.post('/process-csv', (req, res) => {
  res.setHeader('Content-Type', 'text/csv');
  res.setHeader('Content-Disposition', 'attachment; filename="processed.csv"');

  // Transform stream: filter only rows where age > 18
  const filterAdults = new Transform({
    objectMode: true, // work with JS objects, not raw buffers
    transform(row, encoding, callback) {
      if (parseInt(row.age, 10) > 18) {
        this.push(row); // pass this row downstream
      }
      callback(); // done with this chunk, ready for next
    }
  });

  // Assumes the request body is the raw CSV (e.g. Content-Type: text/csv), not a multipart form
  req
    .pipe(csv.parse({ columns: true, skip_empty_lines: true })) // parse CSV rows
    .pipe(filterAdults)                                          // filter rows
    .pipe(stringify({ header: true }))                          // back to CSV text
    .pipe(res);                                                  // send to client
});

Read that pipeline left to right: HTTP request body → parse CSV → filter rows → stringify back to CSV → HTTP response. The data never fully materializes in memory at once. You could process a 50GB CSV file on a $5 VPS and it would barely break a sweat.
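
If you want to poke at that claim, here's a rough local smoke test (hypothetical file name, assuming the server above is listening on port 3000). Both sides stream, so neither the client nor the server ever holds the whole file:

const fs = require('fs');
const http = require('http');

const upload = http.request(
  { host: 'localhost', port: 3000, path: '/process-csv', method: 'POST' },
  (response) => response.pipe(fs.createWriteStream('./processed.csv')) // stream the result to disk
);

// Stream the CSV from disk into the request body, chunk by chunk
fs.createReadStream('./big.csv').pipe(upload);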

The Backpressure Thing (Don't Skip This)

Here's the part nobody talks about: what happens when the source produces data faster than the destination consumes it?

Say you're reading from a blazing-fast SSD and writing to a slow network connection. Without backpressure handling, you'd buffer unbounded amounts of data in memory — exactly what you were trying to avoid.

.pipe() handles this automatically. When the writable stream says "slow down," the readable stream pauses. When the writable catches up, reading resumes. It's flow control built into the plumbing.
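
If you ever move chunks around by hand instead of piping, that flow control becomes your problem. Here's roughly what .pipe() does for you under the hood; a minimal sketch with hypothetical file paths that checks write()'s return value and pauses until 'drain' fires:

const fs = require('fs');

// Hypothetical fast source and slower destination
const source = fs.createReadStream('./big-file.json');
const dest = fs.createWriteStream('./copy.json');

source.on('data', (chunk) => {
  const canTakeMore = dest.write(chunk);
  if (!canTakeMore) {
    source.pause();                              // destination buffer is full: stop reading
    dest.once('drain', () => source.resume());   // buffer flushed: start reading again
  }
});

source.on('end', () => dest.end());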

You almost never need to write that by hand. But note that even .pipe() has a gap: when one stage in the chain errors, it doesn't destroy the other streams or surface the error anywhere useful. pipeline() from stream/promises handles both the backpressure and the cleanup, and it rejects on failure so you can actually catch it:

const { pipeline } = require('stream/promises');
const { createReadStream } = require('fs');
const { createGzip } = require('zlib');

app.get('/export', async (req, res) => {
  try {
    res.setHeader('Content-Type', 'application/gzip');

    await pipeline(
      createReadStream('./big-file.json'),
      createGzip(),   // compress on the fly
      res
    );
  } catch (err) {
    // If streaming already started, the headers are gone; only report when we still can
    if (!res.headersSent) {
      res.status(500).json({ error: 'Export failed' });
    }
  }
});

That last example compresses a file on-the-fly while streaming it to the client. No temp files. No full file in memory. Just data flowing through a chain.

When Should You Actually Use Streams?

Streams aren't always the answer — they add complexity and objectMode pipelines can feel like you're spelunking through unfamiliar caves. Use them when:

  • Processing large files (uploads, downloads, exports)
  • Real-time data pipelines (logs, analytics events)
  • Data transformation at scale (CSV parsing, JSON line processing)
  • Proxying requests between services (see the sketch below)
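
That last bullet is simpler than it sounds. A rough sketch (hypothetical upstream URL) of proxying a file through your service without ever buffering it:

const https = require('https');

app.get('/reports/latest', (req, res) => {
  // Hypothetical upstream service; swap in whatever you're actually fronting
  https.get('https://reports.internal.example.com/latest.csv', (upstream) => {
    res.writeHead(upstream.statusCode, upstream.headers);
    upstream.pipe(res); // bytes flow straight through, never piling up here
  }).on('error', () => res.status(502).json({ error: 'Upstream unavailable' }));
});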

For small payloads (a 50KB JSON response from a database query), just use res.json(). No need to reach for the industrial equipment to make a cup of coffee.

Wrapping Up

Node.js streams are one of the platform's genuine superpowers. They're the reason Node can handle file processing and data pipelines with minimal memory, and why .pipe() is one of the most elegant APIs in the entire ecosystem.

The mental model is simple: think in flows, not buffers. Data is a river, not a lake. Let it flow through your code without damming it up in memory.

Your challenge this week: Find one place in your codebase where you're using readFile or Buffer.concat on user-uploaded content, and rewrite it with streams. Your server will thank you. Your 2am on-call rotation definitely will.

Have questions about streams or hit a weird backpressure bug? Drop it in the comments — the weird ones are my favorite.

Thanks for reading!
