DynamoDB: The NoSQL Database That Will Make You Rethink Everything
Real talk: I spent the first three months of my career confidently explaining to everyone that "DynamoDB is just a fast key-value store, how hard can it be?"
Then our AWS bill hit $4,200 for a database serving 20,000 users.
Then a hot-partition alert fired at 3 AM.
Then I spent four hours debugging why a single-table design query returned zero results, even though the data was definitely there.
DynamoDB is a masterpiece of cloud engineering. It scales to virtually any size, it's serverless, and it integrates beautifully with Lambda. But it will also absolutely destroy your career confidence if you don't understand what you're signing up for! Let's fix that.
What Even IS DynamoDB?
DynamoDB = AWS's fully managed NoSQL database, designed for internet-scale applications.
Think of it like this:
Traditional SQL (PostgreSQL/MySQL):
Tables → Rows → Columns
JOIN anything with anything
Query however you want
Scales... eventually
DynamoDB:
Tables → Items → Attributes
Access patterns defined UP FRONT
Query only by keys (mostly)
Scales to virtually any size with single-digit-millisecond latency
Why I use it on serverless backends:
- No connections to manage (Lambda and DynamoDB pair perfectly)
- Single-digit millisecond responses at any scale
- Pay per request OR per capacity unit
- No patching, no maintenance, no sizing
- AWS handles replication, durability, backups
The catch: You have to design your data model around how you'll ACCESS it ā not how it looks logically. This breaks every SQL developer's brain!
The $4,200 Bill: My DynamoDB Horror Story
When architecting our e-commerce backend, I naively created a DynamoDB table like this:
// The table
TableName: "Products"
PartitionKey: "productId" // UUID
// The queries I kept running...
// "Get all products in category X" - ran as SCAN
// "Get products by price range" - ran as SCAN
// "Get featured products" - ran as SCAN
What I didn't realize about scans:
Table: 5 million products
Scan reads: ALL 5 million items every time
Read Capacity Units consumed: 5 million per scan
Cost per scan: ~$2.50
Scans per minute: 200 (from our product listing API)
Cost per minute: $500
Cost per hour: $30,000
Me checking email at 9 AM: pure panic.
AWS Support called US. That's when you know it's bad.
The lesson that cost $4,200: In DynamoDB, Scan is the enemy. Query is your friend. And your access patterns need to be designed BEFORE you build, not discovered after launch!
Designing Access Patterns First
In production, I've deployed e-commerce backends where I now religiously document access patterns before writing a single line of code:
Access Patterns for Products Table:
1. Get product by ID → primary key lookup
2. Get all products in category → GSI on category
3. Get featured products → GSI on featured flag
4. Get products by price range → GSI with price sort key
5. Get user's order history → sort key with userId prefix
Rule: If a pattern would require a Scan, redesign the table!
Then design keys around those patterns:
// Product table design
{
TableName: "Products",
KeySchema: [
{ AttributeName: "PK", KeyType: "HASH" }, // "PROD#<id>"
{ AttributeName: "SK", KeyType: "RANGE" } // "META#<id>"
],
GlobalSecondaryIndexes: [
{
IndexName: "category-price-index",
KeySchema: [
{ AttributeName: "category", KeyType: "HASH" },
{ AttributeName: "price", KeyType: "RANGE" }
]
}
]
}
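To see how that GSI serves pattern #4, here's a sketch of the query parameters it enables. The helper name and the `:min`/`:max` placeholders are my own; the table and index names come from the definition above.

```javascript
// Build Query params for "products in a category within a price range",
// served by the category-price-index GSI defined above.
function buildCategoryPriceQuery(category, minPrice, maxPrice) {
  return {
    TableName: "Products",
    IndexName: "category-price-index",
    KeyConditionExpression: "category = :cat AND price BETWEEN :min AND :max",
    ExpressionAttributeValues: {
      ":cat": category,
      ":min": minPrice,
      ":max": maxPrice
    }
  };
}

// Usage: docClient.query(buildCategoryPriceQuery("shoes", 20, 50)).promise()
```

A Query against the GSI reads only the matching items, instead of scanning all 5 million.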
A serverless pattern that saved us: Single-table design. All entities in ONE table with a composite key strategy. Counterintuitive but INSANELY efficient!
Single-Table Design: The Mind-Bending Trick
Old (multiple tables, SQL-brained approach):
UsersTable
OrdersTable
ProductsTable
CartTable
New (single-table design):
Table: "EcommerceApp"
User: PK="USER#123" SK="PROFILE"
User's Orders: PK="USER#123" SK="ORDER#2026-03-20#abc"
Order Items: PK="ORDER#abc" SK="ITEM#product-xyz"
Product: PK="PROD#xyz" SK="META#xyz"
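Those composite keys are easy to mistype by hand, so I like to centralize them in tiny builder functions. This is a sketch; the helper names are mine, the key shapes are the ones listed above.

```javascript
// Key builders for the single-table layout above: one typo-proof
// place to define each entity's PK/SK shape.
const userKey      = (userId)             => ({ PK: `USER#${userId}`, SK: "PROFILE" });
const userOrderKey = (userId, date, oid)  => ({ PK: `USER#${userId}`, SK: `ORDER#${date}#${oid}` });
const orderItemKey = (orderId, productId) => ({ PK: `ORDER#${orderId}`, SK: `ITEM#${productId}` });
const productKey   = (productId)          => ({ PK: `PROD#${productId}`, SK: `META#${productId}` });

// Example:
// userOrderKey("123", "2026-03-20", "abc")
// → { PK: "USER#123", SK: "ORDER#2026-03-20#abc" }
```

Every read and write then goes through the same four functions, so a key-format change touches one file.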
Why this is brilliant:
// Get the user's profile + ALL their orders in ONE request
const result = await docClient.query({
  TableName: "EcommerceApp",
  KeyConditionExpression: "PK = :pk",
  ExpressionAttributeValues: {
    ":pk": "USER#123"
  }
}).promise();
// Returns the PROFILE item AND every ORDER# item - a single DynamoDB call!
// (Add "AND begins_with(SK, :prefix)" with ":prefix": "ORDER#" to fetch
// only the orders.)
vs. SQL approach:
SELECT * FROM users
JOIN orders ON users.id = orders.user_id
WHERE users.id = 123;
-- Two tables and a JOIN the database must compute on every read;
-- DynamoDB precomputes that layout at write time instead
When architecting on AWS, I learned: DynamoDB rewards you for thinking about HOW you access data, not HOW it looks in a spreadsheet. Embrace the weirdness!
Capacity Modes: The Choice That Defines Your Bill
DynamoDB has two billing modes and picking the wrong one is like choosing between "pay as you go" and "pay for a private jet whether you use it or not."
On-Demand Mode (Pay Per Request)
// Create table with on-demand pricing
{
TableName: "Orders",
BillingMode: "PAY_PER_REQUEST" // Pay for actual reads/writes
}
Cost:
- Read: $0.25 per million Read Request Units
- Write: $1.25 per million Write Request Units
When to use it:
- Unpredictable traffic (startup, new feature launch)
- Low to moderate traffic
- Peace of mind > cost optimization
When NOT to use it:
- Predictable high traffic (you're paying a 4-7× premium vs. provisioned!)
Provisioned Mode (Pay for Capacity)
{
TableName: "Products",
BillingMode: "PROVISIONED",
ProvisionedThroughput: {
ReadCapacityUnits: 100, // Pay for this capacity 24/7
WriteCapacityUnits: 50
}
}
Cost:
- 100 RCU × $0.00013/hour × 24 × 30 = $9.36/month
- vs. On-Demand at same traffic: $40/month
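A back-of-the-envelope calculator makes that comparison concrete. The rates below are the illustrative figures quoted above (they vary by region), and the 160M-requests/month volume is my inference from the ~$40 on-demand number:

```javascript
// Rough monthly read cost: provisioned vs. on-demand.
// Rates are illustrative; always check current regional pricing.
const HOURS_PER_MONTH = 24 * 30;

function provisionedReadMonthly(rcu, pricePerRcuHour = 0.00013) {
  // You pay for provisioned capacity every hour, used or not.
  return rcu * pricePerRcuHour * HOURS_PER_MONTH;
}

function onDemandReadMonthly(readRequests, pricePerMillion = 0.25) {
  // You pay only for requests actually made.
  return (readRequests / 1_000_000) * pricePerMillion;
}

provisionedReadMonthly(100);      // ≈ 9.36  (the $9.36/month above)
onDemandReadMonthly(160_000_000); // → 40    (the ~$40/month comparison)
```

Running both functions over your real traffic history is the fastest way to decide when to switch modes.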
My production setup for e-commerce:
Orders table: On-Demand (traffic spikes during sales events!)
Products table: Provisioned + Auto Scaling (predictable read patterns)
Sessions table: On-Demand (can't predict when users log in)
Analytics table: Provisioned (controlled batch writes)
Auto Scaling is your best friend for provisioned mode:
// Set up auto scaling (via CDK/Terraform)
// Scales between 5 and 1000 RCU based on utilization
// Target: keep utilization at 70%
A serverless pattern that saved us: Start on-demand, switch to provisioned once you understand your traffic patterns. I saved 65% on our product catalog table after the switch!
The Hot Partition Problem: DynamoDB's Dirty Secret
This one will ruin your Sunday if you're not careful.
The problem:
DynamoDB distributes data across partitions using your partition key. If every request hits the same partition key ā that partition becomes a "hot" partition and gets throttled.
Our real incident:
// We built a "flash sale" feature
// Every user checking the sale hit this key:
PK = "FLASH_SALE_ACTIVE"
SK = "CONFIG"
// 50,000 users → 50,000 requests/second to ONE partition
// DynamoDB throttled us → checkout down → boss not happy
CloudWatch showed:
ConsumedReadCapacityUnits: 50,000/second
ProvisionedReadCapacityUnits: 100/second
ThrottledRequests: 49,900/second
The fix - partition sharding:
// Instead of one config item, shard across N partitions
const SHARD_COUNT = 10;
function getFlashSaleKey() {
const shard = Math.floor(Math.random() * SHARD_COUNT);
return { PK: `FLASH_SALE#${shard}`, SK: "CONFIG" };
}
// Read: try random shard, cache aggressively
// Writes: write to ALL shards (keeps them in sync)
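The write side of that scheme has to touch every shard to keep them in sync. Here's a sketch of generating those per-shard keys (the helper name is mine):

```javascript
const SHARD_COUNT = 10;

// Keys for ALL shards - used when updating the config so that
// whichever shard a reader randomly picks, it sees the same data.
function allFlashSaleKeys() {
  return Array.from({ length: SHARD_COUNT }, (_, shard) => ({
    PK: `FLASH_SALE#${shard}`,
    SK: "CONFIG"
  }));
}

// Writing the config then becomes one put per shard:
// await Promise.all(allFlashSaleKeys().map((key) =>
//   docClient.put({ TableName: "EcommerceApp", Item: { ...key, ...config } }).promise()
// ));
```

Writes get 10× more expensive, but config writes are rare while reads are the flood - a trade that saved our checkout.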
Or better: use DAX (DynamoDB Accelerator) for caching:
// DAX caches DynamoDB reads in memory
// Cache hit: <1ms response (vs 5ms DynamoDB)
// Reduces DynamoDB reads by 90%+ for hot items
// Cost: ~$0.268/hour for smallest cluster (~$190/month)
// Worth it when you're spending >$200/month on reads of the same items!
When I architected our flash sale system, I learned: any "global config" item in DynamoDB is a hot partition bomb waiting to go off. Either shard it or cache it!
Common DynamoDB Gotchas I Hit (So You Don't Have To)
Gotcha #1: Eventual Consistency vs. Strong Consistency
// Default read: eventually consistent (cheaper, but possibly stale)
const item = await docClient.get({
  TableName: "Orders",
  Key: { PK: "ORDER#123" }
}).promise();
// Strongly consistent read (always current, costs 2× as much!)
const freshItem = await docClient.get({
  TableName: "Orders",
  Key: { PK: "ORDER#123" },
  ConsistentRead: true
}).promise();
My e-commerce rule:
- Show user their cart: ConsistentRead: true (must be accurate!)
- Show product reviews: eventually consistent (2-second staleness = fine)
- Process payment: ConsistentRead: true (don't double-charge anyone!)
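That rule is easier to enforce if the consistency decision lives in one helper instead of being sprinkled across call sites. A sketch (the function name is mine):

```javascript
// One place to decide read consistency per use case.
function readParams(tableName, key, { strong = false } = {}) {
  return {
    TableName: tableName,
    Key: key,
    ConsistentRead: strong // true = current data, at 2x the read cost
  };
}

// Cart and payment reads pay for accuracy; reviews tolerate staleness:
// docClient.get(readParams("Orders", { PK: "ORDER#123" }, { strong: true })).promise()
// docClient.get(readParams("Reviews", { PK: "PROD#xyz" })).promise()
```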
Gotcha #2: Item Size Limit Is 400KB
// This will FAIL at write time
const item = {
PK: "PRODUCT#123",
description: "A".repeat(500000) // 500KB - too big!
};
// DynamoDB throws: "Item size exceeds maximum allowed size"
The fix: Store large blobs in S3, put the S3 key in DynamoDB:
// Upload large content to S3
const s3Key = `products/${productId}/description.html`;
await s3.putObject({ Bucket: "content-bucket", Key: s3Key, Body: bigContent }).promise();
// Store reference in DynamoDB
const item = {
PK: `PRODUCT#${productId}`,
descriptionS3Key: s3Key // DynamoDB stays lean!
};
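A rough pre-write size check can route big items to S3 automatically. Note the caveat: DynamoDB counts attribute names plus values toward the 400KB limit, so JSON length is only an approximation - hence the headroom. The helper name is mine:

```javascript
const MAX_ITEM_BYTES = 400 * 1024; // DynamoDB's hard per-item limit

// Approximate the stored size; DynamoDB's real accounting differs
// slightly, so keep some headroom below the hard limit.
function needsS3Offload(item, headroom = 0.9) {
  const approxBytes = Buffer.byteLength(JSON.stringify(item), "utf8");
  return approxBytes > MAX_ITEM_BYTES * headroom;
}

needsS3Offload({ PK: "PRODUCT#123", description: "A".repeat(500_000) }); // → true
needsS3Offload({ PK: "PRODUCT#123", description: "short" });             // → false
```

Gate your writes with this check and the 400KB error becomes something you design for rather than discover in production.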
Gotcha #3: Transactions Cost 2× and Have a 100-Item Limit
// DynamoDB transactions (ACID across multiple items!)
await docClient.transactWrite({
TransactItems: [
{
Update: {
TableName: "Inventory",
Key: { PK: "PRODUCT#456" },
UpdateExpression: "SET stock = stock - :qty",
ConditionExpression: "stock >= :qty", // Prevent negative stock!
ExpressionAttributeValues: { ":qty": 1 }
}
},
{
Put: {
TableName: "EcommerceApp",
Item: { PK: `ORDER#${orderId}`, SK: "STATUS", status: "CONFIRMED" }
}
}
]
}).promise();
// Either BOTH succeed or BOTH fail!
But: Transactions consume 2× the capacity units. Use them only where you need atomicity!
In production, I've deployed an order processing system where I use transactions ONLY for inventory deduction + order confirmation. Everything else is eventually consistent. Saves ~30% on write costs!
Cost Optimization Tricks That Actually Work
Trick #1: Use TTL to Auto-Delete Old Items (Free!)
// Set TTL attribute on session items
const session = {
PK: `SESSION#${token}`,
userId: "123",
expiresAt: Math.floor(Date.now() / 1000) + 86400 // 24 hours from now
};
Then enable TTL on the table:
aws dynamodb update-time-to-live \
--table-name EcommerceApp \
--time-to-live-specification "Enabled=true, AttributeName=expiresAt"
DynamoDB then deletes expired items automatically, at NO extra cost!
Savings: Our sessions table was growing 5GB/month. TTL reduced it to a steady-state 800MB. Saved $80/month in storage!
Trick #2: Project Only What You Need on GSIs
// BAD: Copy all attributes to GSI (wasteful!)
GlobalSecondaryIndex: {
Projection: { ProjectionType: "ALL" }
}
// GOOD: Only copy what the query needs
GlobalSecondaryIndex: {
Projection: {
ProjectionType: "INCLUDE",
NonKeyAttributes: ["productName", "price", "thumbnail"]
}
}
Savings: GSI storage costs DROP when you're not copying 40 attributes just to display a product card!
Trick #3: Batch Operations Instead of Single Writes
// Expensive: 100 individual puts
for (const item of items) {
await docClient.put({ TableName: "Products", Item: item }).promise();
}
// Efficient: Batch write (up to 25 items per request)
const batches = chunk(items, 25);
for (const batch of batches) {
await docClient.batchWrite({
RequestItems: {
Products: batch.map(item => ({ PutRequest: { Item: item } }))
}
}).promise();
}
// 25 writes, 1 API call!
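One catch: the chunk helper used above isn't built into JavaScript (it comes from lodash, or you can write it yourself in a few lines):

```javascript
// Split an array into fixed-size pieces - needed because
// batchWrite accepts at most 25 items per request.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

chunk([1, 2, 3, 4, 5], 2); // → [[1, 2], [3, 4], [5]]
```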
When architecting on AWS, I learned: Batch operations don't reduce capacity units consumed, but they dramatically reduce Lambda execution time and API overhead!
The DynamoDB Decision Tree
Use DynamoDB when:
- You know your access patterns (or can design for them)
- You need massive scale with no ops overhead
- You're building serverless on AWS (Lambda + DynamoDB is the natural pairing)
- Low latency at scale is non-negotiable
- Simple read/write patterns dominate
Don't use DynamoDB when:
- You need complex ad-hoc queries (just use PostgreSQL!)
- Your data is highly relational (many JOINs → use a relational DB)
- Your team doesn't understand NoSQL design (expensive mistakes incoming)
- You need complex aggregations (Aurora or Redshift instead)
My production stack for e-commerce:
- DynamoDB: User sessions, orders, cart, product catalog, inventory
- Aurora Serverless: Financial reporting, complex analytics queries
- ElastiCache: Real-time stock counts, frequently-read config
Don't go all-in on DynamoDB when a bit of PostgreSQL would be simpler!
Your DynamoDB Launch Checklist
Before you deploy anything to production:
- Document all access patterns (write them down, seriously)
  - GET /product/:id → Query by PK
  - GET /category/:name → GSI query
  - GET /user/:id/orders → begins_with SK pattern
  - GET /products?search=keyword → full-text search (use Elasticsearch!)
- Design keys around access patterns, not data shape
  - PK="USER#123" SK="PROFILE" → user profile
  - PK="USER#123" SK="ORDER#2026-03-20" → user's orders (sorted by date!)
- Enable Point-in-Time Recovery (PITR)
  aws dynamodb update-continuous-backups \
  --table-name EcommerceApp \
  --point-in-time-recovery-specification PointInTimeRecoveryEnabled=true
  It's $0.20/GB/month insurance against "I accidentally deleted everything."
- Set up CloudWatch alarms
  - ThrottledRequests > 0 → alert! (hot partitions or under-provisioning)
  - ConsumedReadCapacity > 80% of provisioned → scale up
  - SystemErrors > 0 → alert! (AWS-side issues)
- Enable DynamoDB Streams if you need event-driven processing
  Every change fires a Lambda trigger. Great for order notifications, audit logs, and search index updates.
The Bottom Line
DynamoDB is not a drop-in replacement for MySQL. It's a completely different mental model ā one that rewards you MASSIVELY when you get it right, and punishes you BRUTALLY when you don't.
The essentials:
- Design access patterns FIRST, schema SECOND
- Single-table design is weird but powerful ā learn it!
- On-demand for unpredictable traffic, provisioned for steady load
- TTL for automatic cleanup (free!)
- Never run Scan in production. Ever.
The honest truth:
When I finally wrapped my head around DynamoDB, our e-commerce backend became almost maintenance-free. No database servers. No capacity planning headaches. Lambda + DynamoDB scales from 10 users to 10 million users without a single config change. That's the dream ā and it's real when you do it right!
The $4,200 bill? Worst and best thing that happened to me on AWS. I never ran an unindexed Scan again.
Your Action Plan
This week:
- Pick one small service and list its access patterns
- Design a DynamoDB table with those patterns
- Switch to On-Demand pricing if you're not sure about traffic
- Enable PITR on any table holding real data
This month:
- Migrate your first Lambda ā DynamoDB integration
- Try single-table design on a greenfield feature
- Set up CloudWatch alarms on your tables
- Calculate whether switching to provisioned saves money
This quarter:
- Audit all your DynamoDB scans (replace with queries + GSIs)
- Add TTL to any table with time-bounded data
- Benchmark with DAX if you have high-read, low-change data
- Become the DynamoDB design authority on your team
Built something cool with DynamoDB? Connect with me on LinkedIn ā I love seeing creative single-table designs!
Want to see my DynamoDB schemas for e-commerce? Check out my GitHub ā I've open-sourced several of the patterns I use in production!
Now go design those access patterns before you touch a single line of code!
P.S. The DynamoDB single-table design community is divided: some people love it, some hate it. I've landed firmly in the "love it for serverless, hate it for reporting" camp. Use it where it shines, use SQL where SQL shines. Being dogmatic about NoSQL is just as bad as being dogmatic about SQL!
P.P.S. If you want to see your DynamoDB costs itemized, check AWS Cost Explorer → DynamoDB → filter by operation type. Nothing motivates you to switch from Scan to Query like seeing "Scan operations: $847.00" staring back at you from your monthly bill!