Why Storage Choice Matters More Than You Think

Most founders focus on compute costs when optimizing cloud spend. But storage is the silent budget killer: it's always on, scales linearly with data growth, and teams often default to "premium" options "just to be safe."

The reality? Choosing the wrong disk type can cost you 5-10x more than necessary—without delivering any performance benefit to your users. I've seen startups pay $1,200 per month for database storage that could run perfectly on disks costing $240 per month. That's $11,520 per year wasted on over-engineering.

Let's fix that.

The Storage Decision Framework: 3 Questions Your Team Must Answer

Before selecting any disk type, answer these three questions:

1. What's your IOPS requirement? (Input/Output Operations Per Second)
   Transactional databases need high IOPS (thousands+). Batch processing needs low IOPS.
2. What's your throughput requirement? (MB/s of sequential data transfer)
   Video processing, log aggregation, and big data analytics need high throughput. Web servers need modest throughput.
3. How frequently is data accessed?
   Hot data (daily access) versus warm (weekly) versus cold (monthly/yearly) dictates storage tier.

Pro tip: Measure first. Run iostat -x 1 on Linux or use CloudWatch/GCP Monitoring for 48 hours before making decisions. Guessing equals overpaying.
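
Once the measurements are in, size for a high percentile rather than the average or the single worst second. A minimal sketch, assuming you have exported per-second IOPS readings (reads plus writes per second) into a list; the sample values below are made up for illustration:

```python
from statistics import quantiles


def summarize_iops(samples):
    """Return the average, 95th percentile, and peak of IOPS samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = quantiles(samples, n=100, method="inclusive")[94]
    return {
        "avg": sum(samples) / len(samples),
        "p95": p95,
        "peak": max(samples),
    }


# Hypothetical readings, not real measurements.
samples = [800, 950, 1200, 3100, 900, 870, 1020, 2900, 940, 910]
print(summarize_iops(samples))
```

Provisioning near the 95th percentile is one common rule of thumb, not a universal answer; a workload with short but business-critical bursts may need to size closer to the peak.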

AWS Storage Deep Dive: EBS Volume Types Decoded

General Purpose SSD: gp3 (The New Default)

Pricing: $0.08 per GB-month (us-east-1)

Why it matters: gp3 decouples storage size from performance. Unlike gp2 (where IOPS scaled with volume size), gp3 gives you:

- 3,000 baseline IOPS (free) regardless of volume size
- 125 MB/s baseline throughput (free)
- Scale IOPS independently ($0.005 per provisioned IOPS beyond 3,000)
- Scale throughput independently ($0.04 per MB/s beyond 125)

Real-world example:
Your PostgreSQL database needs 4,000 IOPS but only 200GB storage.
- gp2 approach: Must provision 1,334GB to get 4,000 IOPS → $133.40 per month
- gp3 approach: Provision 200GB + 1,000 extra IOPS → $16 (storage) + $5 (IOPS) = $21 per month
→ 84% savings with identical performance
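
The arithmetic behind that comparison can be sketched as follows. The gp2 rate of $0.10 per GB-month and the 3 IOPS per GB ratio are assumptions taken from us-east-1 list pricing (they are implied by the $133.40 figure above); gp2's minimum and maximum IOPS limits are ignored for simplicity.

```python
import math

GP2_PER_GB = 0.10           # USD per GB-month (assumed us-east-1 list price)
GP2_IOPS_PER_GB = 3         # gp2 baseline: 3 IOPS per provisioned GB
GP3_PER_GB = 0.08           # USD per GB-month
GP3_FREE_IOPS = 3000        # included with every gp3 volume
GP3_PER_EXTRA_IOPS = 0.005  # USD per provisioned IOPS-month beyond 3,000


def gp2_monthly_cost(size_gb, iops):
    """Return (provisioned GB, monthly USD) for gp2.

    gp2 couples IOPS to size, so the volume may need padding.
    """
    needed_gb = max(size_gb, math.ceil(iops / GP2_IOPS_PER_GB))
    return needed_gb, needed_gb * GP2_PER_GB


def gp3_monthly_cost(size_gb, iops):
    """Monthly USD for gp3: storage plus any IOPS above the free baseline."""
    extra_iops = max(0, iops - GP3_FREE_IOPS)
    return size_gb * GP3_PER_GB + extra_iops * GP3_PER_EXTRA_IOPS


gb, gp2 = gp2_monthly_cost(200, 4000)
gp3 = gp3_monthly_cost(200, 4000)
print(f"gp2: {gb} GB provisioned, ${gp2:.2f}/month")
print(f"gp3: ${gp3:.2f}/month, saving {1 - gp3 / gp2:.0%}")
```

Swap in your own size and IOPS numbers to see whether a given gp2 volume is padded purely to buy performance.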

Action item: Migrate ALL gp2 volumes to gp3. It's non-disruptive (via Modify Volume) and saves 20% immediately—even before tuning performance.

Provisioned IOPS SSD: io2/io2 Block Express (For Mission-Critical Databases)

Pricing: $0.125 per GB-month + $0.065 per provisioned IOPS-month (Block Express)

When to use: Only when you need:
- Predictable sub-millisecond latency under heavy load
- More than 16,000 IOPS consistently
- 99.999% durability requirement (io2 offers this; gp3 is designed for 99.8% to 99.9%)

Pitfall alert: Most startups don't need io2. Unless you're running high-frequency trading or real-time ad auctions, gp3 with provisioned IOPS handles 95% of production workloads.

Throughput-Optimized HDD: st1 ($0.045 per GB-month)

Perfect for:
- Kafka/Zookeeper logs
- Hadoop/Spark scratch space
- Media processing buffers
- Log aggregation before shipping to S3

Performance: Up to 500 MB/s throughput, but only ~500 IOPS. Sequential workloads only—terrible for random access.

Cold HDD: sc1 ($0.015 per GB-month)

Perfect for:
- Archived database backups (older than 90 days)
- Compliance data with 12+ month retention
- Infrequently accessed analytics datasets

Warning: 250 IOPS max. Access latency can hit seconds, so sc1 is not for production workloads.

GCP Storage Deep Dive: Persistent Disk Types Decoded

Balanced Persistent Disk (pd-balanced) — The Sweet Spot

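Pricing: Approximately $0.10 per GB-month (us-central1; varies by region)

Why it matters: Google's answer to AWS gp3. At roughly 60% of pd-ssd's per-GB price, it delivers about 60% of pd-ssd's per-GB throughput (0.28 versus 0.48 MB/s per GB), though only a fifth of its per-GB IOPS (6 versus 30). Ideal for: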
- General-purpose databases (PostgreSQL, MySQL)
- Application servers
- Development environments

Performance:
- 6 IOPS per GB (the ceiling depends on machine type and vCPU count)
- 0.28 MB/s per GB throughput (up to 1,200 MB/s)
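
Because throughput scales with provisioned size, you can estimate how large a pd-balanced disk must be to reach a target. A minimal sketch using only the 0.28 MB/s per GB and 1,200 MB/s figures above; the function names are hypothetical, and real limits also depend on the VM's machine type, which this ignores:

```python
import math

PD_BALANCED_MBPS_PER_GB = 0.28  # throughput earned per provisioned GB
PD_BALANCED_MAX_MBPS = 1200     # per-disk ceiling quoted above


def pd_balanced_throughput(size_gb):
    """Estimated throughput ceiling (MB/s) from disk size alone."""
    return min(size_gb * PD_BALANCED_MBPS_PER_GB, PD_BALANCED_MAX_MBPS)


def size_for_throughput(target_mbps):
    """Smallest size (GB) whose size-based ceiling reaches target_mbps."""
    if target_mbps > PD_BALANCED_MAX_MBPS:
        raise ValueError("target exceeds the pd-balanced maximum")
    return math.ceil(target_mbps / PD_BALANCED_MBPS_PER_GB)


print(pd_balanced_throughput(500))  # ceiling for a 500GB disk
print(size_for_throughput(250))     # GB needed for 250 MB/s
```

If the size needed for your throughput target is far larger than your data, that is a signal to benchmark pd-ssd or Hyperdisk rather than pad the disk.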

SSD Persistent Disk (pd-ssd)

Pricing: Approximately $0.17 per GB-month

When to use: Only when you need:
- Sub-1ms latency consistently
- High random I/O workloads (e.g., Redis, Cassandra)
- OLTP databases with more than 10,000 IOPS requirements

Pitfall: Most web applications run perfectly fine on pd-balanced. pd-ssd costs nearly twice as much per GB for marginal gains in non-I/O-bound apps.

Standard Persistent Disk (pd-standard)

Pricing: Approximately $0.04 per GB-month

Perfect for:
- Boot disks for non-critical VMs
- Batch processing workloads
- Infrequently accessed data stores

Performance: HDD-backed. 0.75 read IOPS and 1.5 write IOPS per GB (up to 7,500 read and 15,000 write). Avoid for databases.

Hyperdisk (GCP's New Tiered Offering)

Google now offers Hyperdisk variants:
- Hyperdisk Balanced: Replaces pd-balanced with better price/performance
- Hyperdisk Throughput: Optimized for sequential workloads (like st1)
- Hyperdisk Extreme: For ultra-high IOPS workloads

Founder note: Hyperdisk pricing is complex (separate capacity/throughput billing). Only adopt after thorough benchmarking—most startups don't need it yet.

The Object Storage Trap: When NOT to Use S3/Cloud Storage

Founders often hear "use S3 for everything!" That's dangerously wrong.

Use object storage for:
- Static assets (images, CSS, JS)
- User uploads (profile pics, documents)
- Log archival (more than 30 days old)
- Backup repositories

NEVER use object storage for:
- Database storage (PostgreSQL/MySQL data directories)
- Application runtime state
- Anything requiring POSIX filesystem semantics
- High-frequency random reads/writes

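Warning: S3 has offered strong read-after-write consistency since December 2020, so consistency is no longer the problem. Latency and cost are: requests typically take tens to hundreds of milliseconds, and every API call is billed. A database sustaining 10,000 operations per second would issue about 26 billion requests per month, which is more than $10,000 in S3 request fees even if every operation were a GET at $0.0004 per 1,000 requests, and far more for writes. Keep database transaction logs and temp files on block storage.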
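
To see how quickly per-request billing adds up, here is a back-of-the-envelope sketch. The rates are assumptions based on S3 Standard list prices in us-east-1 ($0.0004 per 1,000 GET requests, $0.005 per 1,000 PUT requests), and it treats every database I/O operation as one S3 request, which is a simplification:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month
GET_PER_1000 = 0.0004               # USD, assumed S3 Standard rate
PUT_PER_1000 = 0.005                # USD, assumed S3 Standard rate


def monthly_request_cost(ops_per_second, write_fraction=0.0):
    """Monthly request fees (USD) if each operation were one S3 request."""
    total = ops_per_second * SECONDS_PER_MONTH
    writes = total * write_fraction
    reads = total - writes
    return reads / 1000 * GET_PER_1000 + writes / 1000 * PUT_PER_1000


print(f"read-only:  ${monthly_request_cost(10_000):,.0f}/month")
print(f"30% writes: ${monthly_request_cost(10_000, 0.3):,.0f}/month")
```

Even the read-only case costs far more than any block storage volume sized for the same workload.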

Real Cost Comparison: Production Database Scenario

Scenario: 500GB PostgreSQL database requiring 3,500 IOPS, moderate write load

Provider	Disk Type	Monthly Cost	Notes
AWS	gp2 (500GB)	$50.00	Only 1,500 baseline IOPS, a performance bottleneck
AWS	gp3 (500GB + 500 IOPS)	$42.50	3,500 IOPS provisioned, 125 MB/s baseline throughput
AWS	io2 (500GB + 3,500 IOPS)	$290.00	Massive overkill for this workload
GCP	pd-balanced (500GB)	$50.00	At 6 IOPS per GB, capacity contributes 3,000 IOPS; check the per-VM baseline and machine-type limits against the 3,500 target
GCP	pd-ssd (500GB)	$85.00	1.7x the cost for marginal latency gains

Savings opportunity: On AWS, choosing gp3 over io2 saves about $247 per month (roughly $2,970 per year) with zero user impact. On GCP, pd-balanced over pd-ssd saves $35 per month.
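
The AWS rows can be recomputed from the per-GB and per-IOPS rates quoted earlier in this article. A small sketch, with a hypothetical rate table; it covers only gp3 and io2 in their first pricing tier and omits throughput charges:

```python
# USD rates as quoted earlier in this article (us-east-1).
RATES = {
    "gp3": {"per_gb": 0.08, "free_iops": 3000, "per_iops": 0.005},
    "io2": {"per_gb": 0.125, "free_iops": 0, "per_iops": 0.065},
}


def monthly_cost(volume_type, size_gb, iops):
    """Storage charge plus the charge for IOPS above any free baseline."""
    rate = RATES[volume_type]
    billable_iops = max(0, iops - rate["free_iops"])
    return size_gb * rate["per_gb"] + billable_iops * rate["per_iops"]


for volume_type in RATES:
    print(volume_type, f"${monthly_cost(volume_type, 500, 3500):.2f}")
```

Most of the io2 bill comes from the IOPS charge rather than the capacity, which is why it only makes sense when you genuinely need its latency and durability guarantees.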

5 Actionable Optimization Strategies (Implement This Week)

1. The gp2 to gp3 Migration (AWS)

Run this AWS CLI command to convert without downtime:

aws ec2 modify-volume --volume-id vol-xxxxxx --volume-type gp3

Expected savings: 20% baseline + ability to right-size IOPS independently.
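
For more than a handful of volumes, the same change can be scripted. A minimal sketch, assuming a boto3 EC2 client is passed in; the function name and the `apply` flag are hypothetical, and by default it only lists candidates without modifying anything:

```python
def migrate_gp2_to_gp3(ec2, apply=False):
    """List gp2 volumes and, if apply=True, request conversion to gp3.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the IDs of all gp2 volumes found.
    """
    paginator = ec2.get_paginator("describe_volumes")
    pages = paginator.paginate(
        Filters=[{"Name": "volume-type", "Values": ["gp2"]}]
    )
    volume_ids = [v["VolumeId"] for page in pages for v in page["Volumes"]]
    if apply:
        for volume_id in volume_ids:
            # Same operation as `aws ec2 modify-volume --volume-type gp3`.
            ec2.modify_volume(VolumeId=volume_id, VolumeType="gp3")
    return volume_ids


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   print(migrate_gp2_to_gp3(boto3.client("ec2")))
```

One caution: gp2 volumes larger than 1,000GB have a baseline above 3,000 IOPS, so for those you would likely want to pass an explicit IOPS value when modifying, otherwise the gp3 default could be a step down.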

2. Implement Storage Tiering for Logs

- Hot tier (last 7 days): EBS gp3 / pd-balanced
- Warm tier (8-90 days): S3 Standard / GCS Standard
- Cold tier (91+ days): S3 Glacier Instant Retrieval / GCS Nearline
- Archive tier (1+ year): S3 Glacier Deep Archive ($0.00099 per GB-month) / GCS Archive

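Example: 10TB of logs, assuming a split of 1TB hot, 3TB warm, 3TB cold, and 3TB archive at AWS us-east-1 list prices
- All in gp3: $800 per month
- Tiered approach: $80 (hot) + $69 (warm) + $12 (cold) + $3 (archive) = $164 per month
→ About 80% savings on storage charges with automated lifecycle policies, before request and retrieval fees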
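
Tiering savings depend heavily on how much data sits in each tier, so it helps to plug in your own split. A minimal sketch with assumed AWS us-east-1 list prices and a hypothetical 1TB/3TB/3TB/3TB split of 10TB; request, retrieval, and minimum-storage-duration fees are left out:

```python
# USD per GB-month; assumed us-east-1 list prices for illustration.
TIER_PRICE = {
    "hot": 0.08,         # EBS gp3
    "warm": 0.023,       # S3 Standard
    "cold": 0.004,       # S3 Glacier Instant Retrieval
    "archive": 0.00099,  # S3 Glacier Deep Archive
}


def tiered_cost(gb_by_tier):
    """Monthly storage cost (USD) for a mapping of tier name to GB stored."""
    return sum(gb * TIER_PRICE[tier] for tier, gb in gb_by_tier.items())


# Hypothetical split of 10TB of logs across the four tiers.
split = {"hot": 1000, "warm": 3000, "cold": 3000, "archive": 3000}
all_hot = sum(split.values()) * TIER_PRICE["hot"]
tiered = tiered_cost(split)
print(f"all gp3: ${all_hot:.2f}, tiered: ${tiered:.2f}, "
      f"saving {1 - tiered / all_hot:.0%}")
```

The more of your data that ages into the cold and archive tiers, the larger the gap becomes.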

3. Right-Size Boot Disks

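Many default AWS AMIs ship with 8GB root volumes, but teams often oversize boot disks to 100GB "just to be safe." Most apps need only 20-30GB.
- Reduce from 100GB to 30GB gp3: Save $5.60 per month per instance
- For 50 instances: $3,360 per year saved
- Caveat: EBS volumes cannot be shrunk in place, so set the smaller size in the launch template or AMI and let it roll out as instances are replaced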

4. Use Instance Store for Ephemeral Workloads (AWS)

For stateless apps (web servers, CI runners), use instance store instead of EBS:
- No separate storage charge (instance store is included in the instance price on instance types that offer it)
- Higher performance (physically attached NVMe)
- Caveat: Data lost on stop/terminate—only for truly ephemeral workloads

5. Schedule Non-Production Disks

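Dev/staging databases don't need 24/7 uptime. Use AWS Instance Scheduler or GCP Start/Stop VM to:
- Shut down non-prod instances nights/weekends
- Save roughly 65% on compute for those environments
Note that attached EBS volumes and persistent disks keep billing while an instance is stopped. To cut the storage bill itself, snapshot and delete the disks of environments that sit idle for long stretches.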
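
A figure near 65% is what you get for the share of hours switched off under an assumed schedule of 12 hours a day, weekdays only. A quick sketch to test other schedules; the function name is hypothetical:

```python
HOURS_PER_WEEK = 7 * 24


def off_fraction(hours_per_day, days_per_week):
    """Fraction of the week an environment is shut down."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


print(f"{off_fraction(12, 5):.0%} of hours off")  # 12h/day, Mon-Fri
print(f"{off_fraction(10, 5):.0%} of hours off")  # 10h/day, Mon-Fri
```

The saving applies only to charges that stop when the environment is off, so check which line items on your bill actually pause.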

The Founder's Checklist: Questions to Ask Your Tech Lead

- Are we still using gp2 volumes anywhere? Why not migrate to gp3?
- What's our actual measured IOPS/throughput for each critical workload?
- Which datasets haven't been accessed in 90+ days? Can we move them to cold storage?
- Are we using SSD disks for workloads that only need sequential throughput?
- What's our backup retention policy? Are we keeping 13 months of backups when compliance requires 12?
- Have we benchmarked pd-balanced versus pd-ssd for our GCP workloads?

The Bottom Line

Storage optimization isn't about finding the cheapest option—it's about matching performance to actual requirements. The biggest savings come from:

- Eliminating over-provisioning (gp2 to gp3 migration alone saves 20%)
- Tiering data by access patterns (hot/warm/cold/archival)
- Measuring before deciding (no more guessing IOPS needs)
- Automating lifecycle policies (move data to cheaper tiers automatically)

Start with one action this week: run aws ec2 describe-volumes --filters "Name=volume-type,Values=gp2" and migrate the top 3 largest gp2 volumes to gp3. You'll see savings on next month's bill—and build momentum for deeper optimization.

Your infrastructure shouldn't cost more than your growth. Choose wisely.