Web Cache Poisoning: Your CDN Is Serving Malware (And You Have No Idea)
I once spent three hours convinced our production app was intermittently XSS-infected: random users were getting JavaScript pop-ups, logs showed nothing suspicious, our servers were clean. Turned out a junior dev had misconfigured our CloudFront distribution to cache responses based on a header that an attacker was already exploiting. We were essentially paying our CDN to distribute the attack for us.
As someone who's architected serverless e-commerce backends on AWS, cache poisoning is the vulnerability that keeps me up at night. It's elegant, persistent, and scales beautifully (for the attacker).
So What Actually Is Web Cache Poisoning?
Imagine a caching layer sitting between your users and your server. Its job is simple: save responses, serve them to everyone who asks for the same URL. Efficient! Wonderful! Until an attacker figures out how to trick the cache into storing a malicious response and then serving it to innocent victims.
The attacker sends one crafted request. The cache stores the poisoned response. Now every user who visits that URL gets the malicious content, without the attacker needing to do anything else.
It's the laziest effective attack in the hacker playbook.
The Anatomy of a Cache Poisoning Attack
Caches typically use the URL as the "cache key": the identifier for what to store and retrieve. But your application often reads additional inputs to build responses: headers, query params, cookies. If those inputs affect the response but aren't part of the cache key, you've got a problem.
The dangerous pattern:
// Express backend reads X-Forwarded-Host to build absolute URLs
app.get('/api/config', (req, res) => {
  const host = req.headers['x-forwarded-host'] || req.headers['host'];
  res.json({
    apiEndpoint: `https://${host}/api`,
    cdnUrl: `https://${host}/static`
  });
});
# Cache key: /api/config (URL only)
# But response changes based on X-Forwarded-Host header!
# Attacker sends:
GET /api/config HTTP/1.1
Host: yourapp.com
X-Forwarded-Host: evil.com
# Cache stores response with: { "apiEndpoint": "https://evil.com/api" }
# Next 10,000 users get this poisoned config
Your CDN cached it. Your server never sees the follow-up requests. Game over.
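To make the mechanics concrete, here is a minimal simulation of the attack above. It is a sketch, not a real CDN: `origin` and `createUrlKeyedCache` are hypothetical names, and the "cache" is just a `Map` keyed on the URL path, which is the assumption that makes the header an unkeyed input.

```javascript
// Toy origin: reflects X-Forwarded-Host into the response, like the handler above.
function origin(request) {
  const host = request.headers['x-forwarded-host'] || request.headers['host'];
  return JSON.stringify({ apiEndpoint: `https://${host}/api` });
}

// Toy cache (hypothetical): keyed on the URL path only, so headers are unkeyed inputs.
function createUrlKeyedCache(originFn) {
  const store = new Map();
  return function handle(request) {
    const key = request.path;
    if (!store.has(key)) {
      store.set(key, originFn(request)); // cache miss: ask the origin and store the result
    }
    return store.get(key); // cache hit: the origin is never consulted
  };
}

const cdn = createUrlKeyedCache(origin);

// 1. The attacker primes the cache with a forged header.
cdn({
  path: '/api/config',
  headers: { host: 'yourapp.com', 'x-forwarded-host': 'evil.com' },
});

// 2. A normal user requests the same URL and receives the attacker's entry.
const victimResponse = cdn({
  path: '/api/config',
  headers: { host: 'yourapp.com' },
});
console.log(victimResponse); // {"apiEndpoint":"https://evil.com/api"}
```

The victim's request is completely clean, yet the response still points at `evil.com`, because the lookup never considered the header that shaped the stored entry.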
The Safe Pattern
// Trust only the host you control
const TRUSTED_HOST = process.env.APP_URL || 'https://yourapp.com';
app.get('/api/config', (req, res) => {
  // Never use user-controlled headers to build URLs
  res.json({
    apiEndpoint: `${TRUSTED_HOST}/api`,
    cdnUrl: `${TRUSTED_HOST}/static`
  });
});
Simple. Boring. Unattackable.
Real Talk: The Headers That Bite You
In security communities, we often discuss the "unkeyed inputs" problem. Here are the headers most commonly exploited in cache poisoning attacks I've seen reported:
| Header | Why It's Dangerous |
|---|---|
| `X-Forwarded-Host` | Apps use it to generate absolute URLs |
| `X-Forwarded-Scheme` | Apps switch between HTTP/HTTPS |
| `X-Original-URL` | Some frameworks override routing with this |
| `X-Rewrite-URL` | Same problem, different name |
| `X-Forwarded-For` | Used in IP-based logic or geolocation |
In my experience building production systems, the worst offenders are frameworks that automatically trust these headers without any configuration. Laravel, Express, and most others have settings to explicitly define which headers to trust. Use them.
// Laravel: define trusted proxies explicitly
// In App\Http\Middleware\TrustProxies
protected $proxies = [
    '10.0.0.0/8',     // Your actual load balancer IPs only
    '172.16.0.0/12',
];
// NOT this:
protected $proxies = '*'; // Trusts EVERYONE - please no
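The same "trusted proxies" idea can be sketched in plain Node without any framework. This is an illustration under assumptions: `effectiveHost` is a hypothetical helper, and the subnets simply mirror the Laravel example rather than any real deployment. It uses `net.BlockList` from Node's standard library as a convenient CIDR matcher.

```javascript
const net = require('node:net');

// Hypothetical trusted ranges; replace with your real load balancer subnets.
const trustedProxies = new net.BlockList();
trustedProxies.addSubnet('10.0.0.0', 8, 'ipv4');
trustedProxies.addSubnet('172.16.0.0', 12, 'ipv4');

// Returns the host to use: the forwarded one only when the direct peer
// (the socket's remote address) is one of our own proxies.
function effectiveHost(peerAddress, headers) {
  const fromTrustedProxy =
    net.isIPv4(peerAddress) && trustedProxies.check(peerAddress, 'ipv4');
  if (fromTrustedProxy && headers['x-forwarded-host']) {
    return headers['x-forwarded-host'];
  }
  return headers['host'];
}

// A request arriving straight from the internet: the forged header is ignored.
console.log(
  effectiveHost('203.0.113.9', { host: 'yourapp.com', 'x-forwarded-host': 'evil.com' })
);
```

Note the limit of this check: it only stops clients that reach the app directly. If the trusted proxy itself passes a client-supplied `X-Forwarded-Host` through untouched, the app will still believe it, so the proxy has to strip or overwrite the header as well.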
The CloudFront Misconfiguration That Almost Killed Us
Back to my production incident. The actual problem was this CloudFront cache behavior:
# What we had (WRONG):
Cache based on: URL + Host header
# What attacker did:
Sent request with custom Host header pointing to their server
CloudFront cached the response under the original URL
All subsequent users got redirected to attacker's server
# What we fixed it to:
Cache based on: URL only (no headers)
Strip all X-Forwarded-* headers before passing to origin
# CloudFront Cache Policy: Origin request only
# Forward headers to origin: none (unless explicitly needed)
The fix was three clicks in the AWS console. The cleanup took three days.
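The "strip all X-Forwarded-* headers" step can also be expressed as a small function, which is roughly what an edge function or reverse proxy rule would do before the request reaches the origin. This is a generic sketch: `stripForwardedHeaders` is a hypothetical name, headers are assumed to be a plain object, and how you wire it into your CDN depends on the provider.

```javascript
// Returns a copy of the headers without any X-Forwarded-* entries.
// Matching is case-insensitive because header names are case-insensitive.
function stripForwardedHeaders(headers) {
  const cleaned = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!name.toLowerCase().startsWith('x-forwarded-')) {
      cleaned[name] = value;
    }
  }
  return cleaned;
}

const forwarded = stripForwardedHeaders({
  Host: 'yourapp.com',
  'X-Forwarded-Host': 'evil.com',
  'x-forwarded-scheme': 'http',
  Accept: 'application/json',
});
console.log(forwarded); // only Host and Accept remain
```

If a specific forwarded header is genuinely needed, the safer move is for the edge to set it to a known value itself rather than pass the client's version along.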
Pro Tip: Test Yourself Before Attackers Do
The excellent Param Miner Burp Suite extension was built specifically to discover unkeyed inputs. James Kettle's research at PortSwigger basically wrote the book on this attack class.
Quick manual test:
# Send a request with a canary header and unique value
curl -H "X-Custom-Test: CACHE-POISON-TEST-123" https://yourapp.com/somepage
# Check if the value appears in the response
# If it does, and the page gets cached, you have a poisoning vector
If your unique value shows up in the cached response that other users see? You've found a bug. Time to responsibly disclose (or fix it if it's your own app).
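If you want to script that check, the decision logic might look like the sketch below. It only classifies a response you have already captured; it makes no network calls. `assessResponse` is a hypothetical helper, and the cache-hit heuristics (an `Age` header, or an `X-Cache` value containing "hit") are assumptions that will not hold for every CDN.

```javascript
// Hypothetical helper: classifies one response captured after sending a canary header.
// `headers` is a plain object with lowercase header names.
function assessResponse(canary, body, headers) {
  const reflected = body.includes(canary);
  const cacheControl = (headers['cache-control'] || '').toLowerCase();
  const markedUncacheable =
    cacheControl.includes('no-store') || cacheControl.includes('private');
  // An Age header, or an X-Cache value containing "hit", suggests a shared cache answered.
  const servedFromCache =
    'age' in headers || (headers['x-cache'] || '').toLowerCase().includes('hit');
  return {
    reflected,
    servedFromCache,
    // Reflection on a response that a shared cache may store is worth investigating.
    likelyVector: reflected && !markedUncacheable,
  };
}

const result = assessResponse(
  'CACHE-POISON-TEST-123',
  '<p>debug: CACHE-POISON-TEST-123</p>',
  { 'cache-control': 'public, max-age=600', 'x-cache': 'Hit from cloudfront' }
);
console.log(result);
```

A stronger confirmation is to repeat the request without the canary header and see whether the value still appears, ideally against a URL with a unique throwaway query string so real users never receive your test entry.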
The Fat GET Body Variant
Here's one that catches people off guard. Some caches only key on the URL, ignoring GET request bodies. But some backends actually read GET bodies. The result? Classic unkeyed input.
# Cache sees: GET /api/data (no body in key)
# Server reads: GET /api/data with body {"callback":"evil()"}
# If server uses that callback in response: poisoned for everyone
This is why GraphQL over GET requests can be particularly dangerous if you're not careful with your caching setup.
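One blunt mitigation is to refuse GET requests that declare a body at all, so the backend can never read an input the cache ignored. The sketch below assumes a Connect/Express-style `(req, res, next)` middleware signature and lowercase header names, as Node provides them; `isFatGet` and `rejectFatGet` are hypothetical names.

```javascript
// Returns true when a GET or HEAD request declares a body.
function isFatGet(method, headers) {
  const verb = method.toUpperCase();
  if (verb !== 'GET' && verb !== 'HEAD') return false;
  const declaredLength = Number(headers['content-length'] || 0);
  return declaredLength > 0 || 'transfer-encoding' in headers;
}

// Middleware sketch built on Node's core response API (statusCode, end),
// so it does not depend on any framework-specific helpers.
function rejectFatGet(req, res, next) {
  if (isFatGet(req.method, req.headers)) {
    res.statusCode = 400;
    res.end('GET requests must not carry a body');
    return;
  }
  next();
}

console.log(isFatGet('GET', { 'content-length': '27' })); // true
console.log(isFatGet('GET', {})); // false
```

Rejecting is a design choice; silently ignoring the body also works, as long as nothing downstream reads it.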
Your Cache Poisoning Defense Checklist
Before you ship anything behind a CDN:
- Never use `X-Forwarded-Host` or similar headers to generate URLs; use hardcoded/environment config
- Explicitly configure which headers your app trusts (trusted proxies list)
- In CloudFront/Cloudflare, disable forwarding headers you don't need
- Set `Vary` headers correctly so caches know what inputs affect your responses
- Add cache-busting for responses that must be user-specific
- Use `Cache-Control: no-store` for anything containing user data or dynamic content
- Regularly scan with Param Miner or similar tools
- Never cache error responses (they often reflect user input)
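Two of the checklist items (no caching for user data, no caching for errors) can be centralized in one small policy function instead of being scattered across handlers. This is a sketch: `cacheControlFor` is a hypothetical name, treating any status of 400 or above as an error is an assumption, and the one-hour `max-age` is only an illustrative value.

```javascript
// Picks a Cache-Control value from the response status and whether the
// response contains per-user data.
function cacheControlFor({ statusCode, userSpecific }) {
  if (statusCode >= 400) return 'no-store'; // errors often reflect user input
  if (userSpecific) return 'no-store'; // never share per-user content
  return 'public, max-age=3600'; // shared, static-enough content
}

console.log(cacheControlFor({ statusCode: 200, userSpecific: false })); // public, max-age=3600
console.log(cacheControlFor({ statusCode: 404, userSpecific: false })); // no-store
console.log(cacheControlFor({ statusCode: 200, userSpecific: true })); // no-store
```

Defaulting to `no-store` and opting individual routes into caching is the more conservative variant if you are unsure which responses are safe to share.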
The Vary Header: Your Secret Weapon
If your response legitimately varies by a header, tell the cache:
// If language changes the response, declare it
res.setHeader('Vary', 'Accept-Language');
res.setHeader('Cache-Control', 'public, max-age=3600');
Now the cache will store separate versions per Accept-Language value. The trade-off is cache efficiency vs. security, but at least your cache is behaving predictably.
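Conceptually, `Vary` moves the named request headers into the cache key. A rough model of that, assuming lowercase request header names and a made-up key format (`buildCacheKey` is hypothetical, and real caches normalize keys in their own ways):

```javascript
// Builds a cache key from the URL plus the value of every request header
// named in the response's Vary header.
function buildCacheKey(url, varyHeader, requestHeaders) {
  const names = (varyHeader || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean)
    .sort(); // stable order so "A, B" and "B, A" give the same key
  const parts = names.map((name) => `${name}=${requestHeaders[name] || ''}`);
  return [url, ...parts].join('|');
}

console.log(buildCacheKey('/home', 'Accept-Language', { 'accept-language': 'en' }));
// /home|accept-language=en
console.log(buildCacheKey('/home', 'Accept-Language', { 'accept-language': 'de' }));
// /home|accept-language=de
console.log(buildCacheKey('/home', '', { 'accept-language': 'de' }));
// /home
```

The third call shows the failure mode from earlier in the post: without `Vary`, every language collapses into a single entry, and whoever populates it first decides what everyone else sees.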
Why This Matters More in Serverless
As someone who's built serverless e-commerce backends on AWS, I see cache poisoning risks amplified:
- API Gateway + CloudFront: Easy to accidentally cache API responses that should be dynamic
- Lambda cold starts: Teams add aggressive caching to compensate, often misconfigured
- Shared CDN edge nodes: One poisoned cache entry can affect users across entire regions
The performance pressure to cache everything creates security pressure to cache nothing. Find the middle ground with precise, intentional cache policies.
TL;DR
Web cache poisoning turns your CDN into an attack distribution network. The fix is conceptually simple: don't let uncontrolled user inputs affect cached responses. In practice, audit every header your app reads, lock down your CDN cache policies, and test with Param Miner before someone else does.
Your cache is supposed to serve your content faster. Don't let it serve someone else's malware instead.
Fascinated by attack surfaces like this one? I hang out in security communities like YAS and InitCrew where we discuss these things obsessively. Connect with me on LinkedIn or check out my experiments on GitHub.
Now go audit your CloudFront cache behaviors. I'll wait.