How We Reduced a Client's Cloud Bill by 62% Without Changing a Single Feature
A SaaS startup came to us with a problem they didn't know they had. They were focused on growth — new features, more users, Series A preparation. But when we looked under the hood, their AWS bill told a different story.
$14,200/month. For an app serving 3,000 daily active users.
That's not a cloud bill. That's a leak.
How Cloud Costs Spiral Out of Control
Cloud infrastructure has a unique property: it's incredibly easy to scale up and psychologically difficult to scale down. Here's the pattern we see repeatedly:
- Developer provisions an oversized instance "just to be safe" during a deadline
- Nobody reviews or right-sizes it after launch
- Development environments run 24/7 even though nobody uses them at night
- Old services stay running after being replaced
- Logging and monitoring tools ingest everything at the highest retention tier
- Database snapshots accumulate without cleanup policies
After 12–18 months, these individual decisions compound into a bill that's 3–5x what it should be.
The Audit: What We Found
Here's the actual breakdown from our client's infrastructure audit:
Finding 1: Oversized Compute (Savings: $3,800/month)
Their primary API ran on three m5.2xlarge instances (8 vCPUs, 32GB RAM each) behind a load balancer. Average CPU utilization: 8%. Average memory utilization: 12%.
We right-sized to two m5.large instances (2 vCPUs, 8GB RAM each) with auto-scaling configured to add a third instance only when CPU exceeds 60%. Same reliability, and at least 75% less compute cost even with the third instance running.
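A quick way to sanity-check the size of that reduction is to compare fleet vCPU counts. The sketch below assumes on-demand price scales roughly linearly with instance size within the m5 family, which is an approximation; the function name and lookup table are illustrative and not part of the client's tooling.

```python
# vCPU counts per instance type, taken from the figures quoted above.
VCPUS = {"m5.2xlarge": 8, "m5.large": 2}


def fleet_reduction(old_type, old_count, new_type, new_count):
    """Fractional drop in fleet vCPUs (a proxy for cost, assuming
    price scales linearly with size within one instance family)."""
    old = VCPUS[old_type] * old_count
    new = VCPUS[new_type] * new_count
    return 1 - new / old


# Baseline: two m5.large instances replace three m5.2xlarge instances.
baseline = fleet_reduction("m5.2xlarge", 3, "m5.large", 2)
# Scaled out: the auto-scaling group has added its third instance.
scaled_out = fleet_reduction("m5.2xlarge", 3, "m5.large", 3)

print(f"baseline: {baseline:.0%}, scaled out: {scaled_out:.0%}")
# prints "baseline: 83%, scaled out: 75%"
```

Under that assumption, the 75% figure is the floor: it only applies while the third instance is running.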
Finding 2: Development Environments Running 24/7 (Savings: $2,100/month)
They had three complete environments — staging, QA, and demo — each mirroring production. All three ran 24/7/365, even though:
- Staging was used only during business hours
- QA was used only during sprint testing (2 weeks per month)
- Demo was used for client calls (maybe 10 hours/week)
We implemented scheduled scaling: environments spin up at 8am, spin down at 8pm on weekdays. QA only runs during sprint weeks. Demo runs on-demand with a Slack bot trigger. Same availability for the team, 80% less cost.
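The schedule described above can be expressed as a small decision function. This is a minimal sketch, not the scheduler we deployed: the function name, the `sprint_week` flag, and the `demo_requested` flag (standing in for the Slack bot trigger) are all hypothetical.

```python
from datetime import datetime

BUSINESS_START_HOUR = 8   # environments spin up at 8am
BUSINESS_END_HOUR = 20    # and spin down at 8pm


def should_run(env, now, sprint_week=False, demo_requested=False):
    """Return True if the named non-production environment should be up."""
    if env == "demo":
        # Demo is on-demand only (for example, triggered from a chat bot).
        return demo_requested
    weekday = now.weekday() < 5  # Monday=0 .. Friday=4
    in_hours = BUSINESS_START_HOUR <= now.hour < BUSINESS_END_HOUR
    if env == "qa":
        return sprint_week and weekday and in_hours
    if env == "staging":
        return weekday and in_hours
    raise ValueError(f"unknown environment: {env}")


# Weekday business hours cover 12 h x 5 days = 60 of the 168 hours in a week.
weekly_uptime_fraction = (BUSINESS_END_HOUR - BUSINESS_START_HOUR) * 5 / (24 * 7)
```

A scheduled job could call `should_run` every few minutes and start or stop each environment accordingly. Staging alone drops to about 36% of its previous running hours; QA and demo drop further.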
Finding 3: Unoptimized Database (Savings: $1,400/month)
Their PostgreSQL RDS instance was a db.r5.2xlarge with Multi-AZ enabled, provisioned IOPS at 10,000, and 2TB of allocated storage.
Actual usage: 45GB of data, peak connections at 30, average IOPS at 200.
We migrated to a db.t3.large with general-purpose SSD, disabled Multi-AZ (their RPO tolerance was 24 hours and they had automated backups), and right-sized storage to 100GB with auto-scaling.
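The gap between provisioned and observed capacity is easy to quantify. A small sketch using the audit figures above, treating 2TB as 2,000GB for simplicity and using the average IOPS figure (peak IOPS may be higher):

```python
# Provisioned vs. observed figures from the audit above.
provisioned = {"storage_gb": 2000, "iops": 10_000}
observed = {"storage_gb": 45, "iops": 200}

# Fraction of each provisioned resource that was actually in use.
utilization = {key: observed[key] / provisioned[key] for key in provisioned}

for key, ratio in utilization.items():
    print(f"{key}: {ratio:.1%} of provisioned capacity in use")
```

Both ratios come out in the low single digits, which is why a much smaller instance class and general-purpose storage were a safe fit here.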
Finding 4: Logging and Monitoring Bloat (Savings: $900/month)
CloudWatch was configured to retain all logs for 12 months at full resolution. Application logs were verbose (debug level in production), shipping approximately 50GB/day.
We reduced log verbosity to info level, implemented structured logging with sampling for high-volume endpoints, set retention to 30 days for application logs (90 days for audit logs), and moved historical logs to S3 Glacier for compliance.
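Sampling for high-volume endpoints can be done with a standard `logging.Filter`. The sketch below is one possible approach, not the client's actual code: the `endpoint` record attribute, the `/health` path, and the 1% sample rate are assumptions chosen for illustration.

```python
import logging
import random


class SamplingFilter(logging.Filter):
    """Keep every WARNING+ record, but only a fraction of lower-level
    records that come from high-volume endpoints."""

    def __init__(self, high_volume_endpoints, sample_rate, rng=None):
        super().__init__()
        self.high_volume_endpoints = set(high_volume_endpoints)
        self.sample_rate = sample_rate
        self.rng = rng or random.Random()

    def filter(self, record):
        if record.levelno >= logging.WARNING:
            return True
        # "endpoint" is a hypothetical attribute passed via extra={...}.
        endpoint = getattr(record, "endpoint", None)
        if endpoint not in self.high_volume_endpoints:
            return True
        return self.rng.random() < self.sample_rate


logger = logging.getLogger("api")
logger.setLevel(logging.INFO)  # info level in production, not debug
logger.addFilter(SamplingFilter({"/health"}, sample_rate=0.01))

logger.info("request served", extra={"endpoint": "/health"})
```

Warnings and errors always pass through, so sampling trims volume without hiding failures. Retention is a separate, per-log-group setting; in boto3 it can be set with `put_retention_policy` on the CloudWatch Logs client.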
Finding 5: Orphaned Resources (Savings: $600/month)
- 14 unattached EBS volumes from terminated instances
- 3 Elastic IPs not associated with any instance
- An old Elasticsearch cluster from a feature that was deprecated 6 months ago
- 47 ECR images that were never cleaned up
None of these were serving any purpose. They were just accumulating charges.
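Unattached volumes and idle Elastic IPs are straightforward to detect from the EC2 API. The sketch below shows one way to do it with boto3; the helper names are ours, it assumes VPC-style addresses (where an associated IP carries an `AssociationId`), and it only reports candidates rather than deleting anything.

```python
def unattached_volume_ids(describe_volumes_response):
    """IDs of EBS volumes in the 'available' state, i.e. not attached."""
    return [
        v["VolumeId"]
        for v in describe_volumes_response.get("Volumes", [])
        if v.get("State") == "available"
    ]


def unassociated_address_ips(describe_addresses_response):
    """Public IPs of Elastic IPs that have no association."""
    return [
        a["PublicIp"]
        for a in describe_addresses_response.get("Addresses", [])
        if "AssociationId" not in a
    ]


def scan(region_name):
    """Query one region; requires boto3 and read-only EC2 credentials."""
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2", region_name=region_name)
    volumes = {"Volumes": []}
    for page in ec2.get_paginator("describe_volumes").paginate():
        volumes["Volumes"].extend(page["Volumes"])
    return {
        "volumes": unattached_volume_ids(volumes),
        "elastic_ips": unassociated_address_ips(ec2.describe_addresses()),
    }
```

Keeping the detection logic separate from the API calls makes it easy to review the candidate list with the team before anything is removed.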
The Optimization Playbook
Based on this engagement and dozens of similar ones, here's our systematic approach:
Step 1: Visibility (Week 1)
You can't optimize what you can't measure. We set up cost allocation tags across every resource, configure AWS Cost Explorer with custom dashboards, and establish a per-service cost breakdown.
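A per-service breakdown can also be pulled programmatically from Cost Explorer. The following is a minimal sketch assuming boto3 and the `ce:GetCostAndUsage` permission; the function names are illustrative, and it ignores pagination (`NextPageToken`), which a production script would need to handle.

```python
def cost_by_service(response):
    """Sum UnblendedCost per service across all periods in a
    Cost Explorer get_cost_and_usage response grouped by SERVICE."""
    totals = {}
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    # Largest spenders first.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


def fetch_monthly_costs(start, end):
    """start/end are 'YYYY-MM-DD' strings; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so cost_by_service works without it

    ce = boto3.client("ce")
    return ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
```

Grouping by a cost allocation tag instead of `SERVICE` (a `GroupBy` entry of type `TAG`) gives the same view per team or per environment once tagging is in place.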
Step 2: Quick Wins (Week 2)
Address the obvious waste first — orphaned resources, oversized instances with clear utilization data, and non-production environments running 24/7. These changes are low-risk and high-impact.
Step 3: Architecture Optimization (Weeks 3–4)
This is where the real savings live:
- Move appropriate workloads to spot instances or Fargate
- Implement caching layers (Redis, CloudFront) to reduce compute and database load
- Evaluate serverless for bursty workloads
- Optimize data transfer costs (most companies don't realize inter-AZ transfer is charged)
Step 4: Commitment Optimization (Ongoing)
Once you've right-sized everything, lock in savings with Reserved Instances or Savings Plans for stable workloads. 1-year commitments typically save 30–40% over on-demand pricing.
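Whether a workload is "stable" enough to commit can be framed as a break-even question. The sketch below uses a deliberately simple model, which is our assumption: on-demand is billed only for hours used, while a commitment is billed at the discounted rate for every hour whether or not the resource is running.

```python
def break_even_utilization(discount):
    """Fraction of hours a resource must run for a commitment to beat
    on-demand, assuming the commitment is billed for every hour."""
    return 1 - discount


def committed_saving(discount, utilization):
    """Saving relative to on-demand spend; negative means the
    commitment costs more than paying on-demand."""
    on_demand_cost = utilization   # pay only for hours used
    committed_cost = 1 - discount  # pay the discounted rate for all hours
    return (on_demand_cost - committed_cost) / on_demand_cost
```

With a 30% discount, the break-even point is 70% utilization. An always-on production service clears that easily; an environment that runs only during weekday business hours (60 of 168 hours a week) does not, which is why commitments come after right-sizing and scheduling.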
Step 5: Guardrails (Ongoing)
- Set up billing alerts at 80%, 100%, and 120% of target budget
- Implement tagging policies that prevent untagged resources
- Create automated cleanup scripts for development environments
- Hold a monthly cost review as part of the engineering standup
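The three billing alerts map naturally onto AWS Budgets notifications. Below is a hedged sketch using boto3's `budgets` client; the function names, budget name, and email address are placeholders, and the alerts fire on actual (not forecasted) spend.

```python
def budget_notifications(thresholds_pct, email):
    """Build the NotificationsWithSubscribers list for AWS Budgets,
    one ACTUAL-spend alert per percentage threshold."""
    return [
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": float(pct),
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
        }
        for pct in thresholds_pct
    ]


def create_monthly_budget(account_id, name, limit_usd, email):
    """Requires boto3 and the budgets:CreateBudget permission."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("budgets").create_budget(
        AccountId=account_id,
        Budget={
            "BudgetName": name,
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        NotificationsWithSubscribers=budget_notifications([80, 100, 120], email),
    )
```

The 80% alert is the early warning, 100% means the target has been reached, and 120% signals that something has gone wrong and needs immediate attention.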
Results Timeline
| Week | Monthly Bill | Change |
|---|---|---|
| 0 (baseline) | $14,200 | n/a |
| 2 (quick wins) | $10,800 | -24% |
| 4 (architecture) | $6,900 | -51% |
| 8 (commitments) | $5,400 | -62% |
Total annualized savings: $105,600.
The audit and implementation took 6 weeks of part-time engineering effort. ROI: approximately 15x in the first year.
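The percentages and the annualized figure follow directly from the monthly bills. A short script reproduces the arithmetic:

```python
baseline = 14_200
timeline = {2: 10_800, 4: 6_900, 8: 5_400}  # week -> monthly bill in USD

for week, bill in timeline.items():
    change = (bill - baseline) / baseline
    print(f"week {week}: ${bill:,} ({change:+.0%})")

# Monthly saving at week 8, projected over twelve months.
annualized_savings = (baseline - timeline[8]) * 12
print(f"annualized savings: ${annualized_savings:,}")
# prints "annualized savings: $105,600"
```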
The Performance Surprise
Here's what most people don't expect: the optimized infrastructure actually performed better. Why?
- Right-sized instances meant less overhead and more predictable resource allocation
- Caching layers reduced database load and improved response times
- Structured logging made debugging faster
- Auto-scaling handled traffic spikes more gracefully than static over-provisioning
Lower latency and a 62% smaller bill. Not a tradeoff: a win on both fronts.
When to Do This
If your cloud bill has grown by more than 50% in the last 12 months without a corresponding increase in users or features, you have optimization opportunities. The question is whether you have the internal expertise and bandwidth to find them.
At Devoax, infrastructure optimization is part of our ongoing partnership model. We don't just build and leave — we monitor, measure, and continuously optimize. Because the cheapest server is the one you don't need.
Every dollar saved on infrastructure is a dollar you can invest in product. And unlike revenue, cost savings go straight to the bottom line. If you haven't audited your cloud spend in the last 6 months, you're almost certainly overpaying.