Database Connection Pooling: Stop Crashing Production with 10,000 Connections 🏊‍♂️🔥
Real confession: My first production outage as a tech lead happened at 2:47 AM on a Saturday. The on-call alert woke me up: "Database connection limit exceeded." I logged in to find 8,472 active database connections. Our Postgres max was 100. The math didn't math. Users couldn't log in. The API was returning 500 errors. I had no idea what connection pooling was. That was about to change. 😱
CEO in Slack: "Why is the site down?"
Me: "Uh... too many database connections?"
CTO: "Did you configure connection pooling?"
Me: "The what now?"
Welcome to the day I learned that every open database connection costs memory, and databases eventually say "NOPE!" 😅
What's Database Connection Pooling Anyway? 🤔
Think of database connections like phone lines at a call center:
Without connection pooling (Chaos Mode):
// Every API request opens a NEW connection
// (assuming the promise API from the mysql2 package)
const mysql = require('mysql2/promise');

app.get('/users/:id', async (req, res) => {
  const connection = await mysql.createConnection({
    host: 'db.production.com',
    user: 'api_user',
    password: 'secret'
  }); // Opens connection #1
  const [rows] = await connection.query('SELECT * FROM users WHERE id = ?', [req.params.id]);
  // Oops, forgot to close it! Connection stays open! 😅
  res.json(rows[0]);
});
// What happens:
// Request 1 → Connection 1 (never closed)
// Request 2 → Connection 2 (never closed)
// Request 3 → Connection 3 (never closed)
// ...
// Request 101 → ERROR: Too many connections! 💥
With connection pooling (Pro Mode):
// Create a pool of reusable connections
// (again assuming mysql2/promise)
const mysql = require('mysql2/promise');

const pool = mysql.createPool({
  host: 'db.production.com',
  user: 'api_user',
  password: 'secret',
  connectionLimit: 10, // Max 10 connections at a time
  waitForConnections: true,
  queueLimit: 0
});

app.get('/users/:id', async (req, res) => {
  // Borrow a connection from the pool
  const connection = await pool.getConnection();
  try {
    const [rows] = await connection.query('SELECT * FROM users WHERE id = ?', [req.params.id]);
    res.json(rows[0]);
  } finally {
    // Return connection to the pool (reuse it!) -- even if the query throws
    connection.release();
  }
});
// What happens:
// Request 1 → Borrow connection #1 → Use → Return to pool
// Request 2 → Borrow connection #1 (reused!) → Use → Return
// Request 3 → Borrow connection #2 → Use → Return
// ...
// Request 1000 → Still using the same 10 connections! ✅
Translation: Connection pooling = Reuse connections instead of creating thousands! 🎉
The Production Meltdown That Taught Me Everything 🔥
After deploying our Node.js e-commerce API to AWS, here's what went catastrophically wrong:
Friday, Black Friday Sale Launch, 8 AM:
# Everything looks good
8:00 AM - Traffic: 100 requests/sec, DB connections: 20 ✅
8:15 AM - Traffic: 500 requests/sec, DB connections: 45 ✅
8:30 AM - Traffic: 1,200 requests/sec, DB connections: 87 ⚠️
8:32 AM - Traffic: 1,500 requests/sec, DB connections: 99 🚨
8:33 AM - ALERT: Database refusing new connections! 🔥
8:34 AM - API returning 500 errors
8:35 AM - Shopping carts timing out
8:36 AM - Revenue: $0/minute (was $850/minute)
8:37 AM - My phone: RING RING RING 📱
What I discovered:
// My Node.js code (THE PROBLEM)
const { Client } = require('pg');

async function getProduct(productId) {
  // Create a NEW connection for EVERY request! 😱
  const db = new Client({ // (no await -- the constructor isn't async)
    host: process.env.DB_HOST,
    database: process.env.DB_NAME,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD
  });
  await db.connect(); // Opens connection
  const result = await db.query('SELECT * FROM products WHERE id = $1', [productId]);
  // I NEVER CALLED db.end()! 😭
  // Connection stays open FOREVER!
  return result.rows[0];
}
// With 1,500 req/sec:
// 1,500 connections/second × 10 seconds = 15,000 connections!
// Postgres max_connections: 100
// Math: 💥
The fix that saved our Black Friday:
// Create ONE pool for the entire app
const { Pool } = require('pg');
const pool = new Pool({
host: process.env.DB_HOST,
database: process.env.DB_NAME,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
max: 20, // Max 20 connections in pool
min: 5, // Keep 5 connections always ready
idleTimeoutMillis: 30000, // Close idle connections after 30s
connectionTimeoutMillis: 2000, // Wait max 2s for available connection
});
async function getProduct(productId) {
// Borrow from pool (reuses existing connection!)
const result = await pool.query('SELECT * FROM products WHERE id = $1', [productId]);
  // Connection automatically returned to pool! ✅
return result.rows[0];
}
// Result:
// 1,500 req/sec using only 20 connections!
// Database: Happy ✅
// Revenue: Flowing 💰
// My sleep: Restored 😴
Impact:
- Before: 1,500 requests → 1,500+ connections → Database crash
- After: 1,500 requests → 20 connections → Smooth sailing
- Lost revenue during 6-minute outage: ~$5,100 💸
After countless production deployments, I learned: Connection pooling isn't optional - it's survival! 🙌
Connection Pooling 101: The Right Way 🎓
Example #1: Node.js with PostgreSQL (pg)
The WRONG way (Don't do this!):
const { Client } = require('pg');
// BAD: Creating clients on demand
app.get('/api/users', async (req, res) => {
const client = new Client({
connectionString: process.env.DATABASE_URL
});
await client.connect(); // Opens connection
const result = await client.query('SELECT * FROM users');
await client.end(); // Closes connection
res.json(result.rows);
});
// Why this is bad:
// - Opens/closes connection EVERY request (slow!)
// - Connection handshake takes 50-200ms
// - SSL negotiation adds another 50-100ms
// - At 100 req/sec, that's 6,000 connections opened per minute!
// - Database connection limit: 100
// - You're toast! 💥
The RIGHT way (Use a pool!):
const { Pool } = require('pg');
// Create pool ONCE at app startup
const pool = new Pool({
host: process.env.DB_HOST,
port: process.env.DB_PORT || 5432,
database: process.env.DB_NAME,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
// Pool configuration
max: 20, // Maximum connections in pool
min: 5, // Minimum connections always ready
// Timing
idleTimeoutMillis: 30000, // Close idle connections after 30s
connectionTimeoutMillis: 2000, // Throw error if can't get connection in 2s
// Monitoring
allowExitOnIdle: false, // Keep pool alive even when idle
});
// Handle pool errors
pool.on('error', (err, client) => {
console.error('Unexpected error on idle client', err);
process.exit(-1);
});
// Use pool in your routes
app.get('/api/users', async (req, res) => {
try {
// Simple query (pool handles connection automatically)
const result = await pool.query('SELECT * FROM users LIMIT 100');
res.json(result.rows);
} catch (err) {
console.error('Database query failed:', err);
res.status(500).json({ error: 'Database error' });
}
});
// For transactions (need explicit connection)
app.post('/api/transfer', async (req, res) => {
const client = await pool.connect(); // Borrow connection
try {
await client.query('BEGIN');
await client.query('UPDATE accounts SET balance = balance - $1 WHERE id = $2', [100, req.body.fromAccount]);
await client.query('UPDATE accounts SET balance = balance + $1 WHERE id = $2', [100, req.body.toAccount]);
await client.query('COMMIT');
res.json({ success: true });
} catch (err) {
await client.query('ROLLBACK');
res.status(500).json({ error: 'Transfer failed' });
} finally {
client.release(); // Return connection to pool! CRITICAL!
}
});
// Graceful shutdown
process.on('SIGTERM', async () => {
console.log('Closing database pool...');
await pool.end();
process.exit(0);
});
Why this works:
- ✅ Pool created ONCE at startup
- ✅ Connections reused across thousands of requests
- ✅ Max 20 connections (configurable, database doesn't explode)
- ✅ Idle connections cleaned up (saves memory)
- ✅ Queries wait if all connections busy (graceful degradation)
- ✅ Automatic reconnection on connection failure
Example #2: Laravel with Database Connection Pooling
Here's the catch: stock Laravel under PHP-FPM doesn't pool connections across requests - each request opens its own connection and closes it when the request ends, and there's no 'pool' key in config/database.php. So configure sane timeouts, then put a real pooler in front of the database:
config/database.php:
// config/database.php
'pgsql' => [
    'driver' => 'pgsql',
    'host' => env('DB_HOST', '127.0.0.1'),
    'port' => env('DB_PORT', '5432'),
    'database' => env('DB_DATABASE', 'forge'),
    'username' => env('DB_USERNAME', 'forge'),
    'password' => env('DB_PASSWORD', ''),
    'charset' => 'utf8',
    'prefix' => '',
    'prefix_indexes' => true,
    'schema' => 'public',
    'sslmode' => 'prefer',
    // Connection options
    'options' => [
        PDO::ATTR_TIMEOUT => 2, // 2 second connection timeout
        PDO::ATTR_PERSISTENT => false, // Don't use persistent connections (usually!)
    ],
],
In production with Laravel and AWS RDS, I learned: Use PgBouncer as a connection pooler in front of your database! 🎯 (One caveat: in transaction mode, PgBouncer doesn't preserve server-side prepared statements between transactions, so set PDO::ATTR_EMULATE_PREPARES => true in the options above.)
docker-compose.yml with PgBouncer:
version: '3.8'
services:
app:
build: .
environment:
# App connects to PgBouncer, not directly to DB!
DB_HOST: pgbouncer
DB_PORT: 6432
depends_on:
- pgbouncer
pgbouncer:
    image: pgbouncer/pgbouncer:latest
    # NOTE: env var names vary between PgBouncer images -- double-check your image's docs
    environment:
DATABASES_HOST: postgres
DATABASES_PORT: 5432
DATABASES_USER: app_user
DATABASES_PASSWORD: app_password
DATABASES_DBNAME: myapp_production
PGBOUNCER_POOL_MODE: transaction # or 'session' or 'statement'
PGBOUNCER_MAX_CLIENT_CONN: 1000 # Accept 1000 client connections
PGBOUNCER_DEFAULT_POOL_SIZE: 25 # But use only 25 DB connections!
ports:
- "6432:6432"
depends_on:
- postgres
postgres:
image: postgres:14-alpine
environment:
POSTGRES_USER: app_user
POSTGRES_PASSWORD: app_password
POSTGRES_DB: myapp_production
# Postgres max_connections can be lower now!
# PgBouncer multiplexes connections!
command: postgres -c max_connections=50
volumes:
- postgres-data:/var/lib/postgresql/data
volumes:
postgres-data:
Why PgBouncer is a game-changer:
- ✅ 1,000 app connections → 25 database connections
- ✅ Database sees max 25 connections (never overwhelmed)
- ✅ Apps can have generous pool sizes
- ✅ Connection multiplexing (reuses DB connections efficiently)
- ✅ Works transparently (app doesn't know it exists)
In production with multiple Laravel workers, PgBouncer saved us: Each worker can have 20 connections in its pool, but PgBouncer ensures only 25 actually hit the database! 🚀
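Prefer configuring PgBouncer directly instead of through env vars? The same knobs live in pgbouncer.ini - a minimal sketch (hostnames, database name, and auth file path are placeholders):
; pgbouncer.ini -- minimal sketch
[databases]
myapp_production = host=postgres port=5432 dbname=myapp_production

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction    ; one server connection per transaction
max_client_conn = 1000     ; accept up to 1000 client connections
default_pool_size = 25     ; but open at most 25 per user/database pair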
Example #3: Python with SQLAlchemy
# database.py
import os

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import QueuePool

# Pull connection details from the environment
DB_USER = os.environ["DB_USER"]
DB_PASSWORD = os.environ["DB_PASSWORD"]
DB_HOST = os.environ["DB_HOST"]
DB_PORT = os.environ.get("DB_PORT", "5432")
DB_NAME = os.environ["DB_NAME"]

# Create engine with connection pool
engine = create_engine(
    f"postgresql://{DB_USER}:{DB_PASSWORD}@{DB_HOST}:{DB_PORT}/{DB_NAME}",
# Pool configuration
poolclass=QueuePool,
pool_size=10, # Keep 10 connections in pool
max_overflow=20, # Allow 20 more if needed (total max: 30)
pool_timeout=30, # Wait max 30s for connection
pool_recycle=3600, # Recycle connections after 1 hour
pool_pre_ping=True, # Test connection before using (catches dead connections)
# Connection options
connect_args={
"connect_timeout": 5,
"options": "-c statement_timeout=30000", # 30s query timeout
},
echo=False, # Set True for SQL debugging
)
# Create session factory
SessionLocal = sessionmaker(
autocommit=False,
autoflush=False,
bind=engine
)
# Dependency for FastAPI
def get_db():
db = SessionLocal()
try:
yield db
finally:
db.close() # Return connection to pool
Using it in FastAPI:
from fastapi import FastAPI, Depends
from sqlalchemy.orm import Session

from database import engine, get_db  # the module defined above
from models import User  # your SQLAlchemy model (assumed to live in models.py)

app = FastAPI()
@app.get("/users/{user_id}")
async def get_user(user_id: int, db: Session = Depends(get_db)):
# db connection borrowed from pool
user = db.query(User).filter(User.id == user_id).first()
# connection returned to pool when request completes
return user
@app.on_event("shutdown")
async def shutdown():
# Close all pool connections on app shutdown
engine.dispose()
Advanced Pool Configuration (The Good Stuff) 🎯
Pattern #1: Dynamic Pool Sizing Based on Environment
// config/database.js
const { Pool } = require('pg');
const poolConfig = {
// Production: High concurrency
production: {
max: 20,
min: 5,
idleTimeoutMillis: 30000,
},
// Development: Fewer connections needed
development: {
max: 5,
min: 1,
idleTimeoutMillis: 10000,
},
// Testing: Minimal pool (tests run sequentially)
test: {
max: 2,
min: 1,
idleTimeoutMillis: 1000,
},
};
const pool = new Pool({
host: process.env.DB_HOST,
...poolConfig[process.env.NODE_ENV || 'development'],
});
Pattern #2: Connection Pool Monitoring
// Monitor pool health
pool.on('connect', (client) => {
console.log('New client connected to pool');
});
pool.on('acquire', (client) => {
console.log('Client acquired from pool');
});
pool.on('remove', (client) => {
console.log('Client removed from pool');
});
// Expose pool metrics endpoint
app.get('/metrics/pool', (req, res) => {
res.json({
totalCount: pool.totalCount, // Total clients
idleCount: pool.idleCount, // Idle clients
waitingCount: pool.waitingCount, // Waiting queries
});
});
// Alert if pool is saturated
setInterval(() => {
  if (pool.totalCount === 0) return; // Avoid divide-by-zero before the first checkout
  const utilization = (pool.totalCount - pool.idleCount) / pool.totalCount;
  if (utilization > 0.8) {
    console.warn(`⚠️ Pool 80% utilized! Consider increasing max connections!`);
  }
  if (pool.waitingCount > 10) {
    console.error(`🚨 ${pool.waitingCount} queries waiting for connection!`);
  }
}, 10000); // Check every 10 seconds
Pattern #3: Graceful Degradation on Pool Exhaustion
// NOTE: assumes a connected `redis` client is in scope (e.g. from the redis package)
async function queryWithFallback(sql, params) {
  const startTime = Date.now();
  try {
    // Try to get a connection from the pool
    const result = await pool.query(sql, params);
    return result;
  } catch (err) {
    const duration = Date.now() - startTime;
    // If we spent our whole connectionTimeoutMillis budget waiting for a connection
    if (duration >= 2000) {
      console.error('Pool exhausted! Falling back to cache or error');
      // Option 1: Return cached result (key must include the params, not just the SQL!)
      const cached = await redis.get(`cache:${sql}:${JSON.stringify(params)}`);
      if (cached) {
        console.log('Returned cached result due to pool exhaustion');
        return JSON.parse(cached);
      }
      // Option 2: Return friendly error
      throw new Error('Database temporarily unavailable. Please try again.');
    }
    throw err;
  }
}
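Usage looks the same as a normal query; here's a hypothetical route using it (the /api/products path is just an example):
app.get('/api/products', async (req, res) => {
  try {
    const result = await queryWithFallback('SELECT * FROM products LIMIT 50', []);
    // Pool results carry .rows; cached fallbacks come back pre-parsed
    res.json(result.rows ?? result);
  } catch (err) {
    res.status(503).json({ error: err.message });
  }
});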
Common Pool Configuration Mistakes (I Made All of These) 🚨
Mistake #1: Pool Size Too Large
Bad:
const pool = new Pool({
  max: 500, // Each API instance has 500 connections! 😱
});
// With 10 API instances:
// 10 × 500 = 5,000 database connections!
// Postgres max_connections: 100
// Math: 💥
Good:
// Calculate pool size based on:
// - Database max_connections
// - Number of app instances
// - Reserved connections for admin/monitoring
// Postgres max_connections: 100
// Reserved for admin: 10
// Available for apps: 90
// Number of app instances: 6
// Pool size per instance: 90 / 6 = 15
const pool = new Pool({
max: 15, // Leaves room for all instances + admin!
});
The formula I use:
Pool size per instance = (DB max_connections - reserved) / number of instances
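To make the arithmetic concrete, here's that formula as a tiny helper (the numbers below are this article's example values, not universal recommendations):
// Pool size per instance = (max_connections - reserved) / instances
function poolSizePerInstance(maxConnections, reserved, instances) {
  return Math.floor((maxConnections - reserved) / instances);
}

console.log(poolSizePerInstance(100, 10, 6)); // => 15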
Mistake #2: Not Setting Connection Timeout
Bad:
const pool = new Pool({
  max: 20,
  // No timeout! Queries wait forever if pool exhausted! 😱
});
// Request waits 5 minutes for connection → User gives up
Good:
const pool = new Pool({
max: 20,
connectionTimeoutMillis: 2000, // Fail fast after 2 seconds!
});
// If no connection available in 2s → Error thrown
// App can return 503 Service Unavailable
// User sees friendly error instead of hanging!
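What "fail fast" looks like at the route level - a sketch, noting that pg signals pool timeouts via the error message ("timeout exceeded when trying to connect") rather than a dedicated error code:
app.get('/api/orders', async (req, res) => {
  try {
    const result = await pool.query('SELECT * FROM orders LIMIT 100');
    res.json(result.rows);
  } catch (err) {
    if (err.message.includes('timeout exceeded when trying to connect')) {
      // Pool exhausted: shed load instead of hanging
      return res.status(503).json({ error: 'Service busy, please retry shortly' });
    }
    res.status(500).json({ error: 'Database error' });
  }
});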
Mistake #3: Never Releasing Connections
Bad:
app.post('/api/checkout', async (req, res) => {
const client = await pool.connect();
await client.query('BEGIN');
await client.query('INSERT INTO orders ...');
await client.query('UPDATE inventory ...');
await client.query('COMMIT');
res.json({ success: true });
  // FORGOT client.release()! 😭
  // Connection never returned to pool!
  // After 20 requests, pool exhausted!
});
Good:
app.post('/api/checkout', async (req, res) => {
const client = await pool.connect();
try {
await client.query('BEGIN');
await client.query('INSERT INTO orders ...');
await client.query('UPDATE inventory ...');
await client.query('COMMIT');
res.json({ success: true });
} catch (err) {
await client.query('ROLLBACK');
throw err;
} finally {
    client.release(); // ALWAYS release! ✅
}
});
Even better - Use a helper:
async function withTransaction(callback) {
const client = await pool.connect();
try {
await client.query('BEGIN');
const result = await callback(client);
await client.query('COMMIT');
return result;
} catch (err) {
await client.query('ROLLBACK');
throw err;
} finally {
client.release(); // Guaranteed release!
}
}
// Usage
app.post('/api/checkout', async (req, res) => {
const result = await withTransaction(async (client) => {
await client.query('INSERT INTO orders ...');
await client.query('UPDATE inventory ...');
return { orderId: 123 };
});
res.json(result);
});
Mistake #4: Using Persistent Connections Wrong
Bad:
// PHP with persistent connections
$pdo = new PDO(
"pgsql:host=$host;dbname=$db",
$user,
$pass,
[PDO::ATTR_PERSISTENT => true] // Persistent connection
);
// Problem: PHP-FPM has 50 worker processes
// Each holds a persistent connection
// 50 connections always open!
// Even when idle! 💸
Good:
// Use connection pooling middleware (PgBouncer) instead
$pdo = new PDO(
"pgsql:host=pgbouncer;port=6432;dbname=$db",
$user,
$pass,
[PDO::ATTR_PERSISTENT => false] // No persistent connections
);
// PgBouncer handles connection reuse
// PHP workers can come and go
// Database sees consistent connection count! ✅
The Bottom Line 💡
Connection pooling isn't about fancy architecture - it's about basic math!
The essentials:
- One pool per application instance - Not per request!
- Size pool based on database limits - Not "whatever feels right"
- Always release connections - Use try/finally or helpers
- Set timeouts - Fail fast, don't hang forever
- Monitor pool utilization - Know when to scale
- Consider PgBouncer for production - Multiplexes connections like a boss
The truth about database connections:
It's not "Can we handle the load?" - it's "Can the DATABASE handle our connection count?"
In my 7 years deploying production applications, I learned this: Your fancy Kubernetes cluster with autoscaling doesn't matter if all 100 pods try to open 20 database connections each. That's 2,000 connections trying to connect to a database that maxes out at 100. Math wins. Every time. 😅
You don't need a massive database from day one - you need proper connection pooling! A small RDS instance with good pooling beats a huge instance with connection leaks! 🎯
Your Action Plan 🚀
Right now:
- Check your database max_connections: SHOW max_connections;
- Count active connections: SELECT count(*) FROM pg_stat_activity;
- Add connection pooling if you don't have it
- Add pool monitoring endpoints (see the sketch below)
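For that last item, a minimal DB-side companion to the pool metrics endpoint from earlier (route name is just an example):
app.get('/metrics/db-connections', async (req, res) => {
  // Breaks down server-side connections by state (active, idle, etc.)
  const result = await pool.query(
    'SELECT state, count(*) AS connections FROM pg_stat_activity GROUP BY state'
  );
  res.json(result.rows);
});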
This week:
- Calculate proper pool size per instance
- Set connection timeouts
- Audit code for connection leaks
- Add graceful connection release (try/finally)
This month:
- Set up PgBouncer for production
- Implement pool monitoring/alerting
- Load test with pool exhaustion scenarios
- Document pool configuration in runbook
- Sleep peacefully knowing your database won't crash! 😴
Resources Worth Your Time 📚
Tools:
- PgBouncer - Lightweight connection pooler for PostgreSQL
- ProxySQL - Connection pooler for MySQL
- pg (node-postgres) - PostgreSQL client with built-in pooling
Reading:
- Postgres Connection Pooling
- HikariCP (Java) - Lessons from the best connection pool
Real talk: The best pool configuration is one that never exhausts and never wastes resources!
Still opening a new connection for every request? Connect with me on LinkedIn and let's talk database performance!
Want to see my production configs? Check out my GitHub - Real pool configurations from real production systems!
Now go forth and pool those connections! 🏊‍♂️✨
P.S. If you've never checked your production database's active connections, do it right now. I'll wait. Run: SELECT count(*) FROM pg_stat_activity; and prepare to be horrified! 😱
P.P.S. I once debugged a "database is slow" issue for 3 days. Turns out we had 4,000 idle connections sitting open. The fix? Connection pooling with proper timeouts. Performance improved 10x. Learn from my pain! 💡