Database Caching: Stop Querying for the Same Damn Data 🗄️
Real talk: The first time I saw our production database at 90% CPU, I panicked. Then I looked at the queries. We were fetching the same user profile... 10,000 times a minute. The SAME profile. The database was basically screaming "I ALREADY TOLD YOU THIS!" 😱
Welcome to the world of caching - where we learn that sometimes the fastest database query is the one you never make!
What's Caching Anyway? 🤔
Think of caching like your brain remembering where you left your keys instead of searching the entire house every time:
Without caching:
User: "What's my profile?"
App: *runs database query* "Here!"
User: *refreshes page*
App: *runs THE EXACT SAME QUERY AGAIN* "Here!"
Database: "Dude, seriously?"
With caching:
User: "What's my profile?"
App: *checks cache* "Got it right here!" (5ms)
Database: *sips coffee peacefully* ☕
Translation: Cache = Fast temporary storage for stuff you use a lot!
The Wake-Up Call That Taught Me Caching ⏰
When I was architecting our e-commerce backend at my previous company, we had a "product catalog" page. Simple enough, right?
The naive approach I deployed:
// Every. Single. Request. Hit. The. Database.
app.get('/api/products', async (req, res) => {
const products = await db.query(`
SELECT * FROM products
WHERE status = 'active'
ORDER BY featured DESC, created_at DESC
LIMIT 50
`);
res.json(products);
// Query time: 200ms
// Database connections: SCREAMING
});
What happened:
- 500 concurrent users = 500 database queries/second
- Same products returned EVERY TIME (catalog doesn't change that often!)
- Database CPU: 85%
- My boss: "Why is our AWS bill $2,000 this month?"
- Me: "I learned a valuable lesson today..." š
The cached approach:
// Using ioredis here: its methods return promises and the client connects automatically
const Redis = require('ioredis');
const client = new Redis(); // defaults to localhost:6379
app.get('/api/products', async (req, res) => {
// Check cache first
const cached = await client.get('products:active');
if (cached) {
return res.json(JSON.parse(cached));
// Response time: 5ms
// Database: chillin'
}
// Cache miss - hit database
const products = await db.query(`
SELECT * FROM products
WHERE status = 'active'
ORDER BY featured DESC, created_at DESC
LIMIT 50
`);
// Store in cache for 5 minutes
await client.setex('products:active', 300, JSON.stringify(products));
res.json(products);
});
Results:
- Response time: 200ms → 5ms (97.5% faster!)
- Database queries: 500/sec → 0.2/sec (99.96% reduction!)
- AWS bill: $2,000 → $400
- My boss: "Promote this person!"
- Me: 😎
Caching Strategies (When to Use What) 🎯
Strategy #1: Cache-Aside (Lazy Loading)
How it works: Check cache first, query database on miss, then cache the result
async function getUser(userId) {
// 1. Try cache
const cached = await cache.get(`user:${userId}`);
if (cached) return JSON.parse(cached);
// 2. Cache miss - query database
const user = await db.users.findById(userId);
// 3. Store in cache for next time
await cache.setex(`user:${userId}`, 3600, JSON.stringify(user));
return user;
}
Pros:
- ✅ Only cache what you actually use
- ✅ Cache failures don't break your app
- ✅ Easy to implement
Cons:
- ❌ First request is always slow (cache miss)
- ❌ Cache can get stale
When I use this: User profiles, product details, settings - stuff that's read-heavy!
Strategy #2: Write-Through Cache
How it works: Update cache AND database at the same time
async function updateUser(userId, data) {
// 1. Update database
const user = await db.users.update(userId, data);
// 2. Immediately update cache
await cache.setex(`user:${userId}`, 3600, JSON.stringify(user));
return user;
}
Pros:
- ✅ Cache is always fresh
- ✅ Read operations always hit cache
Cons:
- ❌ Write operations are slower (two operations)
- ❌ Cache could have data that's never read (wasted memory)
When I use this: When stale data is UNACCEPTABLE - pricing, inventory counts, user balances!
Strategy #3: Write-Behind Cache (My Favorite for Specific Cases!)
How it works: Update cache immediately, queue database update for later
async function updateUserProfile(userId, data) {
// 1. Update cache immediately (fast!)
const newData = { ...data, updatedAt: Date.now() };
await cache.setex(`user:${userId}`, 3600, JSON.stringify(newData));
// 2. Queue database update for background processing
await queue.add('update-user', { userId, data: newData });
// 3. User gets instant response!
return newData;
}
// Background worker processes the queue
queue.process('update-user', async (job) => {
await db.users.update(job.data.userId, job.data.data);
});
Pros:
- ✅ Super fast writes
- ✅ Handles write spikes gracefully
- ✅ Database can batch updates
Cons:
- ❌ Complex to implement
- ❌ Data loss risk if cache crashes before database update
- ❌ Consistency can be tricky
When I use this: Analytics, user activity tracking, non-critical updates - where eventual consistency is fine!
Strategy #4: Refresh-Ahead Cache
How it works: Proactively refresh cache before it expires
async function getPopularProducts() {
const cached = await cache.get('products:popular');
const ttl = await cache.ttl('products:popular');
// If cache expires in less than 1 minute, refresh it in background
if (ttl < 60 && ttl > 0) {
// Don't await - refresh in background
refreshPopularProducts().catch(err => console.error(err));
}
if (cached) return JSON.parse(cached);
// Cache miss - fetch and cache
return await refreshPopularProducts();
}
async function refreshPopularProducts() {
const products = await db.query(`
SELECT * FROM products
ORDER BY sales_count DESC
LIMIT 20
`);
await cache.setex('products:popular', 600, JSON.stringify(products));
return products;
}
Pros:
- ✅ No cache misses for frequently accessed data
- ✅ Consistent performance
Cons:
- ❌ More complex
- ❌ Can waste resources refreshing unused data
When I use this: Homepage hero sections, navigation menus, "trending" lists - stuff EVERYONE sees!
The Cache Invalidation Problem (The Hard Part) 😤
Phil Karlton famously said: "There are only two hard things in Computer Science: cache invalidation and naming things."
He was RIGHT! Here's why:
Problem: Stale Data
// User updates their profile
await db.users.update(userId, { name: 'New Name' });
// But cache still has old data!
const cached = await cache.get(`user:${userId}`);
// Returns: { name: 'Old Name' } 😱
Solution #1: Time-Based Expiration
// Set TTL (Time To Live)
await cache.setex(`user:${userId}`, 300, JSON.stringify(user));
// Expires after 5 minutes automatically
Pros: Simple, automatic
Cons: Data can be stale for up to 5 minutes
My rule: Short TTL (1-5 min) for frequently changing data, long TTL (1 hour+) for static data!
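To make that rule concrete, here's a minimal sketch of a TTL policy table - the resource names and values are my own illustrative assumptions, so tune them against how often your data actually changes:
// Hypothetical TTL policy - illustrative values, not gospel
const TTL_SECONDS = {
  'user:profile': 300,    // changes occasionally: short TTL (5 min)
  'product:detail': 300,  // short TTL
  'config:site': 3600,    // rarely changes: long TTL (1 hour)
  'nav:menu': 3600        // long TTL
};
// Usage: look up the TTL instead of hardcoding it everywhere
await cache.setex(`user:${userId}`, TTL_SECONDS['user:profile'], JSON.stringify(user));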
Solution #2: Explicit Invalidation
async function updateUser(userId, data) {
const user = await db.users.update(userId, data);
// Explicitly delete cache entry
await cache.del(`user:${userId}`);
return user;
}
Pros: Always fresh on updates
Cons: Next read is slow (cache miss)
Solution #3: Update Cache on Write
async function updateUser(userId, data) {
const user = await db.users.update(userId, data);
// Update cache with new data
await cache.setex(`user:${userId}`, 3600, JSON.stringify(user));
return user;
}
Pros: Always fresh, no cache miss
Cons: More write overhead
In production, I've learned: Use Solution #3 for critical data, Solution #1 for everything else!
Multi-Level Caching (How I Scaled to 100k Users) 🏗️
The secret: Don't just use ONE cache - use LAYERS!
Request Flow:
1. Application Memory (<1ms) - Fastest
↓ (miss)
2. Redis Cache (5ms) - Fast
↓ (miss)
3. Database (200ms) - Slow
Implementation:
const NodeCache = require('node-cache');
const localCache = new NodeCache({ stdTTL: 60 }); // 1 min local cache
async function getUser(userId) {
// Layer 1: Application memory (super fast!)
const local = localCache.get(`user:${userId}`);
if (local) return local;
// Layer 2: Redis (fast)
const cached = await redis.get(`user:${userId}`);
if (cached) {
const user = JSON.parse(cached);
localCache.set(`user:${userId}`, user); // Populate Layer 1
return user;
}
// Layer 3: Database (slow)
const user = await db.users.findById(userId);
// Populate all cache layers
await redis.setex(`user:${userId}`, 3600, JSON.stringify(user));
localCache.set(`user:${userId}`, user);
return user;
}
Why this works:
- 80% of requests: Served from application memory (instant!)
- 15% of requests: Served from Redis (very fast)
- 5% of requests: Hit database (acceptable)
Real results from our e-commerce backend:
- Average response time: 200ms → 8ms
- Database load: 95% reduction
- Handled 100k concurrent users on 4 servers
The catch: Cache invalidation becomes HARDER with multiple layers!
async function updateUser(userId, data) {
const user = await db.users.update(userId, data);
// Invalidate ALL cache layers!
localCache.del(`user:${userId}`);
await redis.del(`user:${userId}`);
return user;
}
What Should You Cache? (And What You Shouldn't!) 📋
CACHE THESE:
- ✅ User profiles (read 1000x, updated 1x)
- ✅ Product catalogs (same for everyone)
- ✅ API responses from external services
- ✅ Expensive computed data (aggregations, reports)
- ✅ Configuration settings
- ✅ Session data
- ✅ Navigation menus, footer data
- ✅ "Trending" or "Popular" lists
DON'T CACHE THESE:
- ❌ Real-time data (stock prices, live scores)
- ❌ User-specific sensitive data (payment info, passwords)
- ❌ Data that changes constantly
- ❌ One-time queries (no repeat benefit)
- ❌ Tiny queries that are already fast (<10ms)
My golden rule: If it's read more than 10x for every 1 write, cache it!
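Want that rule as code? A tiny sketch, assuming you already track read/write counts per resource (the counter values below are made up):
// Golden rule as code: cache when reads outnumber writes 10:1 or more
function shouldCache(reads, writes) {
  if (writes === 0) return reads > 0; // never written? definitely cache it
  return reads / writes >= 10;
}
shouldCache(1000, 1); // true - user profiles
shouldCache(50, 40);  // false - churny data, skip the cache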
Common Caching Mistakes (I Made All of These) 🪤
Mistake #1: Caching Everything
// I actually did this once 😅
app.use(async (req, res, next) => {
  const cacheKey = req.url;
  const cached = await redis.get(cacheKey);
  if (cached) {
    return res.send(cached);
  }
  next();
  // BAD: this caches POST requests, user-specific data, everything!
});
The problem:
- Memory bloated with useless cache entries
- Served stale data for user-specific pages
- Served cached POST responses (OOPS!)
The lesson: Be selective! Cache strategically, not blindly!
Mistake #2: Not Setting Expiration
// Cache without TTL = memory leak!
await redis.set(`user:${userId}`, JSON.stringify(user));
// This stays in cache FOREVER
What happened: Redis memory hit 100%, started evicting random keys, chaos!
The fix: ALWAYS set TTL!
await redis.setex(`user:${userId}`, 3600, JSON.stringify(user));
Mistake #3: Cache Stampede
The scenario:
1. Cache expires for popular item
2. 1000 concurrent requests all hit database
3. Database dies
4. All requests fail
5. Panic
The solution - Lock while refreshing:
// Promise-based sleep helper used in the retry path below
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
async function getPopularProduct(productId) {
const cached = await redis.get(`product:${productId}`);
if (cached) return JSON.parse(cached);
// Try to acquire a lock (NX = set only if absent, EX = auto-expire after 10 seconds)
const lock = await redis.set(
`lock:product:${productId}`,
'1',
'NX',
'EX',
10
);
if (lock) {
// I got the lock - I'll refresh the cache
const product = await db.products.findById(productId);
await redis.setex(`product:${productId}`, 600, JSON.stringify(product));
await redis.del(`lock:product:${productId}`);
return product;
} else {
// Someone else is refreshing - wait a bit and retry
await sleep(100);
return getPopularProduct(productId);
}
}
A scalability lesson that cost us: this exact bug took down our database during Black Friday. After implementing locks, we handled 50x the traffic!
Mistake #4: Wrong Cache Key Strategy
// BAD: Too generic
await redis.set('products', JSON.stringify(products));
// What products? All? Active? Category?
// GOOD: Specific, descriptive keys
await redis.setex('products:active:featured:limit:50', 300, JSON.stringify(products));
await redis.setex(`product:${productId}`, 600, JSON.stringify(product));
await redis.setex(`user:${userId}:profile`, 3600, JSON.stringify(profile));
Key naming pattern I use:
{resource}:{id}:{attribute}:{filter}
Examples:
user:123:profile
products:category:electronics:page:1
stats:daily:2026-02-01
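To keep that pattern consistent, I like a tiny key-builder helper - this is a sketch of my own, not a standard library function:
// Build keys as {resource}:{id}:{attribute}:{filter}, skipping empty parts
function cacheKey(...parts) {
  return parts.filter((p) => p !== undefined && p !== null && p !== '').join(':');
}
cacheKey('user', 123, 'profile'); // "user:123:profile"
cacheKey('products', 'category', 'electronics', 'page', 1); // "products:category:electronics:page:1"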
Monitoring Your Cache (Because You Can't Improve What You Don't Measure) 📊
Critical metrics to track:
const stats = {
hits: 0,
misses: 0,
errors: 0
};
async function getFromCache(key) {
try {
const value = await redis.get(key);
if (value) {
stats.hits++;
return value;
} else {
stats.misses++;
return null;
}
} catch (error) {
stats.errors++;
console.error('Cache error:', error);
return null; // Fail gracefully!
}
}
// Expose metrics
app.get('/metrics/cache', (req, res) => {
const total = stats.hits + stats.misses;
const hitRate = total > 0 ? (stats.hits / total * 100).toFixed(2) : 0;
res.json({
hits: stats.hits,
misses: stats.misses,
errors: stats.errors,
hitRate: `${hitRate}%`
});
});
What to watch:
- Hit Rate: Should be >80% for well-cached data
- Miss Rate: High = wrong TTL or cache not populated
- Error Rate: Should be near 0%
- Memory Usage: Don't let Redis fill up!
My alerting thresholds (turned into a code sketch after this list):
- Hit rate <70%: Investigate TTL settings
- Error rate >1%: Redis connection issues
- Memory >85%: Increase Redis memory or evict less important data
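As a sketch, those thresholds can live in a periodic check like this - the alert transport is stubbed out with console.warn, and memoryUsedPct is assumed to come from parsing Redis INFO:
// Hypothetical alert check - wire the warnings to Slack/PagerDuty in real life
function checkCacheAlerts({ hits, misses, errors, memoryUsedPct }) {
  const total = hits + misses;
  const hitRate = total > 0 ? (hits / total) * 100 : 100;
  const errorRate = total > 0 ? (errors / total) * 100 : 0;
  if (hitRate < 70) console.warn('Hit rate below 70% - investigate TTL settings');
  if (errorRate > 1) console.warn('Error rate above 1% - check Redis connection');
  if (memoryUsedPct > 85) console.warn('Redis memory above 85% - add memory or tune eviction');
}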
The Tech Stack for Caching 🛠️
In-Memory Caches (Fast but Volatile):
Redis - My go-to choice
Why I love it:
- Super fast (single-digit millisecond reads)
- Rich data structures (strings, hashes, sets, sorted sets)
- Built-in TTL support
- Pub/Sub for cache invalidation (see the sketch below)
- Persistence options
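One concrete reason Pub/Sub matters for caching: broadcasting invalidations to every server's local (layer 1) cache. A minimal sketch, reusing the ioredis import and localCache from earlier (the channel name is my own invention):
// Subscribers need a dedicated connection - a subscribed client can't run other commands
const sub = new Redis();
sub.subscribe('cache-invalidate').catch(console.error);
sub.on('message', (channel, key) => {
  localCache.del(key); // evict from this server's in-memory layer
});
// Any server that writes data publishes the key it changed:
// await redis.publish('cache-invalidate', `user:${userId}`);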
Memcached - Simpler alternative
Use when:
- You just need key-value storage
- Don't need persistence
- Want slightly faster raw performance
Application-Level Caches:
Node-cache (Node.js)
const NodeCache = require('node-cache');
const cache = new NodeCache({ stdTTL: 600 });
cache.set('key', 'value');
const value = cache.get('key');
Laravel Cache (PHP)
Cache::put('key', 'value', 600);
$value = Cache::get('key');
CDN Caching (For Static Assets & API Responses):
- CloudFront (AWS)
- Cloudflare
- Fastly
When architecting on AWS, I learned: Use CloudFront for static assets, Redis for application data, local cache for ultra-hot paths!
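CDNs respect standard HTTP caching headers, so making an API response CDN-cacheable can be as simple as this sketch (getProducts and the header values are illustrative assumptions):
// s-maxage tells shared caches (the CDN) to keep this for 5 min; browsers get 1 min
app.get('/api/products', async (req, res) => {
  res.set('Cache-Control', 'public, max-age=60, s-maxage=300');
  res.json(await getProducts()); // hypothetical helper wrapping the cached lookup
});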
The Decision Tree: What Caching Strategy to Use? 🌳
Use Cache-Aside when:
- ✅ Read-heavy workload
- ✅ Stale data is acceptable (for TTL duration)
- ✅ Simple implementation preferred
Use Write-Through when:
- ✅ Data must always be fresh
- ✅ Read performance is critical
- ✅ Write volume is manageable
Use Write-Behind when:
- ✅ Write performance is critical
- ✅ Eventual consistency is acceptable
- ✅ You can handle complex error scenarios
Use Refresh-Ahead when:
- ✅ Data access patterns are predictable
- ✅ Zero cache misses are important
- ✅ You have resources for proactive refreshing
My production setup:
- User profiles: Cache-Aside (5 min TTL)
- Product catalog: Refresh-Ahead (10 min TTL)
- Inventory counts: Write-Through (always fresh!)
- Analytics events: Write-Behind (eventual consistency is fine)
Quick Start: Your Caching Implementation Checklist ✅
Ready to add caching? Start here:
1. Identify slow queries:
-- Find your slowest queries (on Postgres 13+ the column is mean_exec_time)
SELECT query, mean_time, calls
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;
2. Add Redis to your stack:
docker run -d -p 6379:6379 redis:alpine
npm install ioredis
3. Implement cache-aside for your top 3 queries:
// Start simple!
const cached = await redis.get(key);
if (cached) return JSON.parse(cached);
const data = await db.query(...);
await redis.setex(key, 300, JSON.stringify(data));
return data;
4. Monitor hit rates:
// Track hits vs misses
console.log(`Cache hit rate: ${(hits / (hits + misses) * 100).toFixed(1)}%`);
5. Iterate and optimize! 🚀
The Bottom Line 💡
Caching isn't about making everything fast - it's about making COMMON things fast!
The essentials:
- Cache read-heavy data (10:1 read:write ratio or higher)
- Always set TTL (prevent memory leaks)
- Invalidate on writes (keep data fresh)
- Monitor hit rates (improve what you measure)
- Start simple (cache-aside with Redis is 90% of use cases)
The truth about caching:
It's not "use Redis and everything is magically fast!" - it's understanding your data access patterns, choosing the right strategy, and handling cache invalidation properly!
When designing our e-commerce backend, I learned this: Don't cache because it's cool. Cache because your database is crying. Monitor. Measure. Iterate. And for the love of all that is holy, SET YOUR TTLs! ⏰
You don't need perfect caching from day one - you need good enough caching that evolves with your traffic! 📈
Your Action Plan 🎯
This week:
- Profile your database (find slow queries)
- Set up Redis locally
- Cache your top 3 most-queried data
- Add basic hit rate monitoring
This month:
- Implement multi-level caching for hot paths
- Add cache invalidation on writes
- Set up Redis in production (with persistence!)
- Create alerting for cache metrics
This quarter:
- Implement refresh-ahead for critical data
- Fine-tune TTL values based on real usage
- Add distributed caching across servers
- Become the caching guru on your team! 🏆
Resources Worth Your Time 📚
Tools I use daily:
- RedisInsight - GUI for Redis
- redis-cli monitor - Watch cache operations in real-time
- redis-rdb-tools - Analyze Redis memory usage
Real talk: The best cache strategy is the one you'll actually maintain! Start simple, measure, iterate!
Still hitting your database for the same data? Connect with me on LinkedIn and share your caching war stories!
Want to see my caching implementations? Check out my GitHub - I've got examples from e-commerce to real-time analytics!
Now go forth and cache responsibly! 🗄️💨
P.S. If you're not caching yet, your database is probably crying right now. Go give it a hug (in the form of a Redis instance)! 🫂
P.P.S. I once cached API responses with user auth tokens in the key. Guess who accidentally served User A's data to User B? Don't be like 2019 me - sanitize your cache keys! 🙈