AWS S3 Lifecycle Policies: Stop Paying for Data You Forgot Existed ð°âïļ
AWS S3 Lifecycle Policies: Stop Paying for Data You Forgot Existed ð°âïļ
Real talk: I opened our AWS bill one morning, choked on my coffee, and discovered we were paying S3 Standard prices for 847 GB of access logs dating back to 2021.
Nobody was reading those logs. Nobody was going to read those logs. But AWS was happily charging us $19/month to keep them warm and ready, like a butler standing outside a room that's been empty for three years.
That ended the day I discovered S3 Lifecycle Policies. Our storage bill dropped 73% in 30 days. I wish I was exaggerating. ð
The S3 Storage Tier Secret AWS Doesn't Advertise Loudly ðĪŦ
Here's what AWS doesn't put on the homepage: S3 Standard is expensive relative to what you actually need for most data.
The storage tiers, from "I use this constantly" to "please just don't delete it":
S3 Standard: $0.023/GB/month â You're paying this for EVERYTHING
S3 Standard-IA: $0.0125/GB/month â Infrequent Access (30-day min)
S3 One Zone-IA: $0.01/GB/month â Cheaper, one AZ only
S3 Glacier Instant: $0.004/GB/month â Archives, retrieval in ms
S3 Glacier Flexible: $0.0036/GB/month â Archives, retrieval in minutes
S3 Glacier Deep: $0.00099/GB/month â Cold storage, retrieval in hours
Translation: You could be paying 23x less for data you haven't touched in 6 months. By not configuring lifecycle policies, you're choosing to overpay. Every. Single. Month.
In production, I've deployed e-commerce backends that generate gigabytes of logs, order exports, invoice PDFs, and product images daily. Without lifecycle rules, you're building a storage debt that compounds forever.
The Lifecycle Policy That Saved Us $400/Month ð
Here's the exact policy I applied to our application logs bucket:
{
"Rules": [
{
"ID": "logs-tiering-rule",
"Status": "Enabled",
"Filter": { "Prefix": "logs/" },
"Transitions": [
{
"Days": 30,
"StorageClass": "STANDARD_IA"
},
{
"Days": 90,
"StorageClass": "GLACIER_IR"
},
{
"Days": 365,
"StorageClass": "DEEP_ARCHIVE"
}
],
"Expiration": {
"Days": 730
}
}
]
}
What this does:
- Day 0â30: Standard (hot, fast - in case you need to debug recent stuff)
- Day 30â90: Standard-IA (45% cheaper, still accessible quickly)
- Day 90â365: Glacier Instant (82% cheaper, millisecond retrieval)
- Day 365â730: Deep Archive (95% cheaper, for compliance "just in case")
- Day 730: Deleted forever (and that's fine, 2-year-old logs are garbage)
Result: 847 GB of logs went from costing ~$19.50/month â $3.20/month. Set it once, save forever. ðŊ
Intelligent Tiering: The Lazy Person's Lifecycle Policy âĄ
Can't predict access patterns? Don't want to think about it? AWS has a mode for that.
S3 Intelligent-Tiering monitors object access and automatically moves data between tiers:
# Enable Intelligent-Tiering on upload via AWS CLI
aws s3 cp myfile.pdf s3://my-bucket/ \
--storage-class INTELLIGENT_TIERING
Or set it as the default for an entire bucket:
{
"Rules": [
{
"ID": "auto-tier-everything",
"Status": "Enabled",
"Filter": {},
"Transitions": [
{
"Days": 0,
"StorageClass": "INTELLIGENT_TIERING"
}
]
}
]
}
How it works:
- Objects accessed frequently â stays in Standard tier
- Not accessed for 30 days â moves to IA tier automatically
- Not accessed for 90 days â Archive Instant tier
- Object accessed again â immediately moves back to Standard
The catch: There's a $0.0025/1,000 objects monitoring fee. For large files this is irrelevant. For millions of tiny files (thumbnails, JSON fragments), the monitoring cost can exceed the savings.
When I use it: Product images and user-uploaded assets where I genuinely don't know what'll be popular. Some images get 10,000 views, some get 3. Intelligent Tiering figures it out.
When I don't: Logs, backups, exports - anything with a predictable access pattern. Manual lifecycle rules win there.
The Gotchas That Will Bite You ðŠĪ
Gotcha #1: The Minimum Storage Duration
S3 Standard-IA and Glacier have minimum storage duration charges. Move an object to IA and delete it 5 days later? You're still charged for 30 days. Move to Glacier and delete after a week? Charged for 90 days.
Standard-IA: 30-day minimum
One Zone-IA: 30-day minimum
Glacier Instant: 90-day minimum
Glacier Flexible: 90-day minimum
Deep Archive: 180-day minimum
Real scenario: We had a feature that generated temporary export files, stored them in Standard-IA, and deleted them after 7 days. Cost us 4x more than just leaving them in Standard. ðĪĶââïļ
Rule: Only tier data that will actually stay there for the minimum duration.
Gotcha #2: Retrieval Fees Are Real
Glacier isn't magic free storage. Reading data back has a cost:
Glacier Instant: $0.03/GB retrieval
Glacier Flexible: $0.01/GB (standard), up to 5h wait
Deep Archive: $0.02/GB, up to 12h wait
The trap: Storing 1TB in Deep Archive costs $1/month. But if you retrieve that 1TB once, you pay $20 just for retrieval. Archive data you genuinely never expect to touch.
A serverless pattern that saved us: We use Glacier for compliance archives - stuff auditors might need once a year. The retrieval cost for one annual audit is trivial compared to 12 months of Standard pricing.
Gotcha #3: Versioned Buckets Need Separate Rules
If you have versioning enabled (you should!), lifecycle rules work differently. Old versions don't transition automatically unless you add a rule specifically for them:
{
"Rules": [
{
"ID": "expire-old-versions",
"Status": "Enabled",
"NoncurrentVersionTransitions": [
{
"NoncurrentDays": 30,
"StorageClass": "STANDARD_IA"
}
],
"NoncurrentVersionExpiration": {
"NoncurrentDays": 90
}
}
]
}
Without this, every deleted/overwritten file version stays in Standard storage. Forever. Silently. Expensively.
When architecting on AWS, I learned this one by noticing that a "200GB bucket" was actually consuming 600GB because of three layers of old versions we forgot existed.
The Bucket Audit That Pays for Itself ðļ
Before you set any policies, run this to find where your S3 money is actually going:
# See storage breakdown by tier per bucket
aws s3api list-buckets --query 'Buckets[].Name' --output text | \
tr '\t' '\n' | while read bucket; do
echo "=== $bucket ==="
aws cloudwatch get-metric-statistics \
--namespace AWS/S3 \
--metric-name BucketSizeBytes \
--dimensions Name=BucketName,Value="$bucket" \
Name=StorageType,Value=StandardStorage \
--statistics Average \
--start-time $(date -u -d '1 day ago' +%Y-%m-%dT%H:%M:%S) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
--period 86400 \
--query 'Datapoints[0].Average' \
--output text 2>/dev/null || echo "No data"
done
Or honestly: just check the S3 Storage Lens dashboard in the console. It shows storage class breakdown per bucket in a nice graph that will immediately show you the problem areas.
My Production Bucket Strategy ðïļ
After managing multiple e-commerce backends, here's how I categorize every bucket:
Hot buckets (Standard only, no lifecycle):
- Product images served via CloudFront
- Active user avatars
- Current session data
Warm buckets (Standard â IA after 30d):
- Recent order invoices (customers re-download within a month)
- Recent database backups (restore scenarios are time-sensitive)
Cold buckets (Full lifecycle cascade):
- Application logs
- Older database backups
- Analytics exports
- Audit trails
Archive buckets (Glacier Direct):
- Legal/compliance documents
- Historical data migrations
- Old media assets
One rule per bucket type. Don't mix hot and cold data in the same bucket unless you're using prefixes to separate them with different rules.
The Cost Comparison You Should See Before Sleeping ðī
| Scenario | Without Lifecycle | With Lifecycle |
|---|---|---|
| 1TB of logs/year | $275/year | $38/year |
| 500GB DB backups | $138/year | $21/year |
| 200GB export files | $55/year | $8/year |
| Total | $468/year | $67/year |
That's an $401/year saving on just 1.7TB of data. At scale, this becomes thousands of dollars.
And this isn't even using the most aggressive tiers. That's just Standard â IA â Glacier with expiration.
Quick Setup: Apply in 10 Minutes âąïļ
# 1. Create a lifecycle configuration file
cat > lifecycle.json << 'EOF'
{
"Rules": [{
"ID": "standard-tiering",
"Status": "Enabled",
"Filter": {},
"Transitions": [
{ "Days": 30, "StorageClass": "STANDARD_IA" },
{ "Days": 90, "StorageClass": "GLACIER_IR" },
{ "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
],
"Expiration": { "Days": 730 },
"NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
}]
}
EOF
# 2. Apply it to your logs bucket
aws s3api put-bucket-lifecycle-configuration \
--bucket my-app-logs-bucket \
--lifecycle-configuration file://lifecycle.json
# 3. Verify it applied
aws s3api get-bucket-lifecycle-configuration \
--bucket my-app-logs-bucket
Done. That bucket is now on autopilot. Cost goes down every month automatically.
The TL;DR for Your Monday Morning ðŊ
S3 Lifecycle Policies are one of the highest ROI 30-minute tasks in all of AWS:
- Logs older than 30 days? Move to Standard-IA. Half the price, nobody notices.
- Backups older than 90 days? Glacier Instant. 80% cheaper, still fast to restore.
- Compliance archives? Deep Archive. 95% cheaper. You'll thank yourself during audits.
- Don't know your access pattern? Intelligent Tiering and let AWS figure it out.
- Enable expiration on anything that isn't required forever. Storage you delete is storage you don't pay for.
The most expensive S3 data isn't what you're actively using â it's the stuff you forgot to clean up two years ago.
Go check your S3 bills right now. I'll wait. ð
Found a bucket graveyard in your AWS account? Hit me up on LinkedIn â I love a good AWS cost horror story.
Want to see lifecycle policies in a real Terraform setup? Check out my GitHub where I have IaC templates for all of this.
Now go configure those lifecycle policies before your next bill arrives! âïļð°
P.S. After I published our cost savings internally, three other teams audited their buckets and found an additional $600/month. Turns out nobody had ever configured lifecycle policies on anything. We'd been paying AWS Standard prices on 5 years of data. Don't be us.