Picture this: a client asks you to build an endpoint that exports all their users as a CSV. "Sure," you say, "easy." Three months later that table has 500,000 rows and your Node process is eating 2GB of RAM, your server is on its knees, and your Slack is full of angry messages.
The culprit? You loaded everything into memory at once.
Node.js Streams are the cure — and they're one of the most underused tools in a backend developer's tool belt.
What Even Is a Stream?
A stream is just data that arrives over time instead of all at once. Think of it like a garden hose vs. a swimming pool. You don't fill a swimming pool before you water the plants — you turn on the hose and let water flow.
Node.js has four types of streams:
- Readable — you read data from it (e.g., a file, an HTTP request)
- Writable — you write data to it (e.g., a file, an HTTP response)
- Duplex — both read and write (e.g., a TCP socket)
- Transform — reads, transforms, writes (e.g., gzip compression)
The magic word is pipe. You connect streams like garden hoses — readable flows into writable, possibly through a transform in the middle.
The Classic Mistake: Loading Everything First
Here's how most developers naively write a "download big file" endpoint:
```js
// ❌ The hoarder approach — loads entire file into memory
const fs = require('fs');

app.get('/download', async (req, res) => {
  const fileContent = await fs.promises.readFile('./huge-dataset.csv');
  res.send(fileContent); // 💀 2GB in RAM, server sweating bullets
});
```
For a small file, fine. For a 500MB CSV? Your process memory spikes, garbage collection goes haywire, and other requests start timing out because the event loop is choking.
The Stream-Powered Fix
```js
// ✅ The stream approach — data flows, memory stays chill
const fs = require('fs');

app.get('/download', (req, res) => {
  res.setHeader('Content-Type', 'text/csv');
  res.setHeader('Content-Disposition', 'attachment; filename="export.csv"');

  const readStream = fs.createReadStream('./huge-dataset.csv');

  readStream.on('error', (err) => {
    console.error('Stream error:', err);
    // If chunks already went out, it's too late for a status code —
    // all we can do is kill the connection
    if (res.headersSent) {
      res.destroy(err);
    } else {
      res.status(500).end('Something went wrong');
    }
  });

  readStream.pipe(res); // 🌊 data flows chunk by chunk directly to the client
});
```
That's it. The file never fully lives in RAM. Node reads a chunk, sends it to the client, reads another chunk, sends it — like a conveyor belt. Memory usage stays flat no matter how big the file is.
Real Power: Transform Streams (Compress on the Fly)
Here's where it gets genuinely cool. You can chain streams together. Want to gzip a huge CSV before sending it? Add a transform in the middle:
```js
const fs = require('fs');
const zlib = require('zlib');

app.get('/download-compressed', (req, res) => {
  res.setHeader('Content-Type', 'text/csv');
  res.setHeader('Content-Encoding', 'gzip');
  res.setHeader('Content-Disposition', 'attachment; filename="export.csv.gz"');

  const readStream = fs.createReadStream('./huge-dataset.csv');
  const gzip = zlib.createGzip();

  readStream.on('error', (err) => res.destroy(err));
  gzip.on('error', (err) => res.destroy(err));

  readStream
    .pipe(gzip) // compress each chunk on the fly
    .pipe(res); // send compressed chunks to the client
});
```
Three streams chained together, and the full file never sits in memory. The data flows from disk → compression → network as a continuous river of chunks. Your server barely flinches.
This is also how video streaming works. Netflix isn't sending you the entire movie file upfront — it's piping chunks at whatever rate your connection can handle.
Backpressure: The Thing Everyone Forgets
Here's a subtle gotcha. What happens when your readable stream produces data faster than the writable stream can consume it?
Without .pipe(), you'd have to handle this yourself — buffers would pile up and you'd run out of memory anyway, just in a different place. The beauty of .pipe() is that it handles backpressure automatically. When the writable stream's buffer fills up, pipe pauses the readable stream until there's room. It's like a traffic light for data.
If you ever write to a stream by hand, respect backpressure: check the return value of write(), and when it returns false, stop writing until the 'drain' event fires. But for most use cases, pipe() has your back.
When Should You Use Streams?
Streams shine when:
- Large file uploads or downloads — CSVs, images, videos, backups
- Log processing — reading a 10GB log file line by line
- Real-time data — piping database query results directly to an HTTP response
- Compression — gzip/brotli on the fly
- Encryption — encrypt while writing, decrypt while reading
They're overkill when:
- Data is small (under a few MB) — the overhead isn't worth it
- You need random access — streams are sequential
The Takeaway
Streams are Node.js's superpower for handling large or continuous data. They let you process data as it arrives rather than waiting for all of it, which means lower memory usage, faster time-to-first-byte, and a server that doesn't collapse under load.
The .pipe() method is your best friend. Connect a readable to a writable and let the data flow. Add transforms in the middle for compression, encryption, or any per-chunk processing you need.
Next time you find yourself doing readFile on something bigger than a config file, ask yourself: could this be a stream instead? Nine times out of ten, the answer is yes — and your server will thank you.
Try it out: Take an existing endpoint in your Express app that reads and sends a file, and refactor it to use createReadStream().pipe(res). Watch your memory profile stay flat even under load. Once you see it, you can't unsee it.
Got questions or a stream-related war story? I'd love to hear it. Drop a comment or reach out on GitHub — let's talk data pipelines.