ROI Calculation for a 100-Engineer Team

This is the anchor cost-justification document for Keeptrusts org-shared cache. It provides a complete 12-month ROI model you can present to finance, procurement, and engineering leadership.

Use this page when

You are building the business case for Keeptrusts org-shared cache and need a complete ROI model.
You need 12-month projections, sensitivity analyses, and executive-ready tables for procurement or finance.
You are scaling the model to your specific team size, provider, or hit-rate assumption.

Primary audience

Primary: Technical Leaders
Secondary: Technical Engineers, AI Agents

Baseline Assumptions

Parameter	Value	Source
Engineers	100	Team size
Prompts per engineer per day	50	Industry average for AI-assisted development
Working days per month	22	Standard
Average input tokens per request	4,000	Codebase context + prompt
Average output tokens per request	1,000	Code suggestion + explanation
Model	GPT-4o	Primary model
Input token price	$2.50 / 1M tokens	OpenAI published pricing
Output token price	$10.00 / 1M tokens	OpenAI published pricing
Steady-state hit rate	80%	Conservative for 100+ engineer monorepo

Monthly Request Volume

100 engineers × 50 prompts/day × 22 days = 110,000 requests/month

Cost per Request (Cache Miss)

Input cost:  4,000 tokens × $2.50/1M = $0.0100
Output cost: 1,000 tokens × $10.00/1M = $0.0100
Total per request: $0.0200

Monthly Uncached Cost (Baseline)

110,000 requests × $0.0200 = $2,200/month

This is what you pay without the org-shared cache — every request goes upstream.
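
The baseline arithmetic above can be reproduced with a short script. This is a minimal sketch: the function and variable names are illustrative (they are not part of any Keeptrusts API), and the inputs are the values from the Baseline Assumptions table.

```python
# Illustrative names; values come from the Baseline Assumptions table.
ENGINEERS = 100
PROMPTS_PER_ENGINEER_PER_DAY = 50
WORKING_DAYS_PER_MONTH = 22
INPUT_TOKENS_PER_REQUEST = 4_000
OUTPUT_TOKENS_PER_REQUEST = 1_000
INPUT_PRICE_PER_1M = 2.50    # USD per 1M input tokens
OUTPUT_PRICE_PER_1M = 10.00  # USD per 1M output tokens


def monthly_requests(engineers: int, prompts_per_day: int, working_days: int) -> int:
    """Total requests sent in one month."""
    return engineers * prompts_per_day * working_days


def cost_per_request(input_tokens: int, output_tokens: int,
                     input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Upstream cost in USD of one uncached request."""
    input_cost = input_tokens * input_price_per_1m / 1_000_000
    output_cost = output_tokens * output_price_per_1m / 1_000_000
    return input_cost + output_cost


requests = monthly_requests(ENGINEERS, PROMPTS_PER_ENGINEER_PER_DAY, WORKING_DAYS_PER_MONTH)
per_request = cost_per_request(INPUT_TOKENS_PER_REQUEST, OUTPUT_TOKENS_PER_REQUEST,
                               INPUT_PRICE_PER_1M, OUTPUT_PRICE_PER_1M)
uncached_monthly = requests * per_request

print(f"{requests:,} requests/month")
print(f"${per_request:.4f} per request")
print(f"${uncached_monthly:,.2f}/month uncached")
```

Swapping in your own team size, prompt volume, or token prices gives the baseline for your environment.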

Monthly Cached Cost (Steady State, 80% Hit Rate)

Cache hits: 110,000 × 0.80 = 88,000 (cost: $0)
Cache misses: 110,000 × 0.20 = 22,000
Miss cost: 22,000 × $0.0200 = $440/month

Monthly Avoided Cost

$2,200 − $440 = $1,760/month avoided

No platform fee on cache hits. No wallet debit. Pure savings.
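
The hit/miss split can be expressed as a small function. This is a sketch under the page's stated assumption that a cache hit costs $0; `MonthlyCost` and `steady_state_cost` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyCost:
    hits: int
    misses: int
    cached_cost: float   # USD paid upstream for misses
    avoided_cost: float  # USD not paid because of hits


def steady_state_cost(requests: int, hit_rate: float, miss_cost: float) -> MonthlyCost:
    """Split a month's requests into hits and misses and price the misses.

    Assumes, as this page does, that a cache hit costs $0.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    hits = round(requests * hit_rate)
    misses = requests - hits
    return MonthlyCost(
        hits=hits,
        misses=misses,
        cached_cost=misses * miss_cost,
        avoided_cost=hits * miss_cost,
    )


month = steady_state_cost(requests=110_000, hit_rate=0.80, miss_cost=0.02)
print(month)
```

Calling the function with `hit_rate=0.0` returns the full uncached cost and zero avoided cost, which is the worst case in this model.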

Fill Cost (Month 1)

The first month has higher costs because the cache starts empty:

Week	Estimated Hit Rate	Upstream Requests	Cost
Week 1	20%	22,000	$440
Week 2	50%	13,750	$275
Week 3	70%	8,250	$165
Week 4	80%	5,500	$110
Month 1 Total	—	49,500	$990

Month 1 cost without cache would have been $2,200. Month 1 cost with cache is $990. Even during fill, you save $1,210.
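
The fill-month table can be recomputed from the weekly hit-rate estimates. The sketch below assumes the month's 110,000 requests are spread evenly over four weeks (27,500 each), which is what the table's figures imply; it uses integer cents to avoid floating-point rounding. The names are illustrative.

```python
MONTHLY_REQUESTS = 110_000
MISS_COST_CENTS = 2                      # $0.0200 per upstream request
WEEKLY_HIT_RATE_PCT = [20, 50, 70, 80]   # estimated hit rate per week, from the table


def fill_month(monthly_requests: int, weekly_hit_rate_pct: list[int]) -> list[tuple[int, int]]:
    """Return (upstream_requests, cost_in_cents) for each week of the fill month.

    Assumption: traffic is split evenly across the weeks.
    """
    weekly_requests = monthly_requests // len(weekly_hit_rate_pct)
    weeks = []
    for pct in weekly_hit_rate_pct:
        upstream = weekly_requests * (100 - pct) // 100
        weeks.append((upstream, upstream * MISS_COST_CENTS))
    return weeks


weeks = fill_month(MONTHLY_REQUESTS, WEEKLY_HIT_RATE_PCT)
total_upstream = sum(upstream for upstream, _ in weeks)
total_cost_usd = sum(cents for _, cents in weeks) / 100
blended_hit_rate = 1 - total_upstream / MONTHLY_REQUESTS

for number, (upstream, cents) in enumerate(weeks, start=1):
    print(f"week {number}: {upstream:>6,} upstream  ${cents / 100:,.2f}")
print(f"month 1: {total_upstream:,} upstream  ${total_cost_usd:,.2f}  "
      f"blended hit rate {blended_hit_rate:.0%}")
```

Replacing `WEEKLY_HIT_RATE_PCT` with your own ramp estimate shows how a slower or faster fill changes the month 1 bill.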

12-Month Cost Model

Month	Hit Rate	Upstream Requests	Monthly Cost	Cumulative Cost
1	55% (avg)	49,500	$990	$990
2	78%	24,200	$484	$1,474
3	80%	22,000	$440	$1,914
4	82%	19,800	$396	$2,310
5	82%	19,800	$396	$2,706
6	83%	18,700	$374	$3,080
7	83%	18,700	$374	$3,454
8	84%	17,600	$352	$3,806
9	84%	17,600	$352	$4,158
10	85%	16,500	$330	$4,488
11	85%	16,500	$330	$4,818
12	85%	16,500	$330	$5,148

12-Month Without Cache

$2,200/month × 12 = $26,400

12-Month Net Savings

$26,400 − $5,148 = $21,252 saved in year one

ROI: 413% (net savings ÷ cost with cache: $21,252 ÷ $5,148)
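
The cumulative cost, net savings, and ROI can be derived from the monthly hit rates alone. This is a sketch, not a Keeptrusts tool: the names are illustrative, month 1 uses the blended fill-month hit rate (49,500 of 110,000 requests go upstream), and ROI follows the definition stated on this page (savings ÷ cost with cache).

```python
from itertools import accumulate

MONTHLY_REQUESTS = 110_000
MISS_COST_CENTS = 2  # $0.0200 per upstream request
# Hit rate per month in percent. Month 1 is the blended fill-month rate;
# months 2-12 follow the 12-month table.
HIT_RATE_PCT = [55, 78, 80, 82, 82, 83, 83, 84, 84, 85, 85, 85]


def monthly_cost_cents(requests: int, hit_rate_pct: int) -> int:
    """Upstream spend for one month, in cents."""
    upstream = requests * (100 - hit_rate_pct) // 100
    return upstream * MISS_COST_CENTS


costs = [monthly_cost_cents(MONTHLY_REQUESTS, pct) for pct in HIT_RATE_PCT]
cumulative = list(accumulate(costs))

cost_with_cache = cumulative[-1] / 100
cost_without_cache = MONTHLY_REQUESTS * MISS_COST_CENTS * len(HIT_RATE_PCT) / 100
net_savings = cost_without_cache - cost_with_cache
roi_pct = net_savings / cost_with_cache * 100  # savings ÷ cost with cache

for month, (cost, total) in enumerate(zip(costs, cumulative), start=1):
    print(f"month {month:2d}: ${cost / 100:>8,.2f}  cumulative ${total / 100:>9,.2f}")
print(f"net savings ${net_savings:,.2f}, ROI {roi_pct:.0f}%")
```

Editing `HIT_RATE_PCT` to match your own ramp is the quickest way to see how the year-one total responds.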

Payback Period

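The cache pays for itself within the first month. Month 1 savings ($1,210) exceed the incremental fill cost relative to steady state ($990 − $440 = $550).

Because cache hits carry no platform fee, there is no upfront investment to recover. Week 1 already costs $440 instead of the $550 an uncached week would cost (27,500 requests × $0.0200), so payback measured against a zero-cost baseline is under one week.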

Sensitivity Analysis: Hit Rate Variations

Not all environments achieve 80% hit rates. The table holds each hit rate constant for all 12 months (no fill ramp) and computes ROI as annual savings ÷ annual cost with cache:

Hit Rate	Monthly Savings	Annual Savings	Annual ROI
60%	$1,320	$15,840	150%
70%	$1,540	$18,480	233%
80%	$1,760	$21,120	400%
85%	$1,870	$22,440	567%
90%	$1,980	$23,760	900%

Even at 60% hit rate — a pessimistic scenario for teams sharing a codebase — annual savings exceed $15,000.
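
A hit-rate sweep is easy to script. The sketch below assumes each hit rate holds for all 12 months (no fill ramp) and derives ROI from this page's stated definition, savings ÷ cost with cache; `sensitivity_row` is a hypothetical helper name.

```python
UNCACHED_MONTHLY_USD = 2_200  # 110,000 requests × $0.0200


def sensitivity_row(hit_rate_pct: int) -> dict[str, float]:
    """Savings and ROI when the given hit rate holds for all 12 months."""
    if not 0 <= hit_rate_pct < 100:
        raise ValueError("hit_rate_pct must be in [0, 100)")
    monthly_savings = UNCACHED_MONTHLY_USD * hit_rate_pct / 100
    annual_savings = monthly_savings * 12
    annual_cost_with_cache = (UNCACHED_MONTHLY_USD - monthly_savings) * 12
    return {
        "monthly_savings": monthly_savings,
        "annual_savings": annual_savings,
        "roi_pct": annual_savings / annual_cost_with_cache * 100,
    }


for pct in (60, 70, 80, 85, 90):
    row = sensitivity_row(pct)
    print(f"{pct}%  ${row['monthly_savings']:>8,.2f}  "
          f"${row['annual_savings']:>10,.2f}  {row['roi_pct']:.0f}%")
```

Under this definition ROI reduces to hit rate ÷ miss rate, so it climbs steeply as the hit rate approaches 100% while the dollar savings grow only linearly.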

Sensitivity Analysis: Model Pricing

Model	Per-Request Cost	Monthly Uncached	Monthly Cached (80%)	Monthly Savings
GPT-4o	$0.0200	$2,200	$440	$1,760
Claude 3.5 Sonnet	$0.0270	$2,970	$594	$2,376
GPT-4 Turbo	$0.0400	$4,400	$880	$3,520
GPT-4o-mini	$0.0030	$330	$66	$264

More expensive models amplify absolute savings proportionally: at the same 80% hit rate, Claude 3.5 Sonnet saves 1.35× and GPT-4 Turbo 2× the GPT-4o figure. The percentage reduction stays at 80% for every model.
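
The per-model rows can be regenerated from the per-request costs. This sketch takes those costs exactly as listed in the table (it does not look up provider pricing) and uses `Decimal` so the dollar figures come out exact; the names are illustrative.

```python
from decimal import Decimal

MONTHLY_REQUESTS = 110_000
HIT_RATE = Decimal("0.80")
# Per-request costs in USD, copied from the table above.
PER_REQUEST_COST = {
    "GPT-4o": Decimal("0.0200"),
    "Claude 3.5 Sonnet": Decimal("0.0270"),
    "GPT-4 Turbo": Decimal("0.0400"),
    "GPT-4o-mini": Decimal("0.0030"),
}


def monthly_breakdown(per_request: Decimal) -> tuple[Decimal, Decimal, Decimal]:
    """Return (uncached, cached, savings) in USD for one month."""
    uncached = MONTHLY_REQUESTS * per_request
    cached = uncached * (1 - HIT_RATE)
    return uncached, cached, uncached - cached


for model, price in PER_REQUEST_COST.items():
    uncached, cached, savings = monthly_breakdown(price)
    reduction = savings / uncached
    print(f"{model:18s} ${uncached:>9,.2f} ${cached:>8,.2f} ${savings:>9,.2f} {reduction:.0%}")
```

To model a different provider, add an entry with its per-request cost computed from your own token mix and current published prices.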

Sensitivity Analysis: Team Size

Team Size	Monthly Uncached	Monthly Cached (80%)	Monthly Savings	Annual Savings
50	$1,100	$220	$880	$10,560
100	$2,200	$440	$1,760	$21,120
150	$3,300	$660	$2,640	$31,680
200	$4,400	$880	$3,520	$42,240
500	$11,000	$2,200	$8,800	$105,600

Savings scale linearly with team size. Fill cost scales sub-linearly (larger teams share more entries).
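
Scaling the model to another team size only changes the request volume. The sketch below covers the linear steady-state part; it does not model the sub-linear fill cost, for which this page gives no formula. `annual_savings_usd` is a hypothetical helper, and integer cents are used to keep the arithmetic exact.

```python
PROMPTS_PER_DAY = 50
WORKING_DAYS = 22
MISS_COST_CENTS = 2   # $0.0200 per upstream request
HIT_RATE_PCT = 80     # steady-state hit rate


def annual_savings_usd(team_size: int) -> int:
    """Steady-state annual savings in whole dollars for a given team size."""
    requests = team_size * PROMPTS_PER_DAY * WORKING_DAYS
    uncached_cents = requests * MISS_COST_CENTS
    saved_cents = uncached_cents * HIT_RATE_PCT // 100
    return saved_cents * 12 // 100


for size in (50, 100, 150, 200, 500):
    print(f"{size:>3} engineers: ${annual_savings_usd(size):,}/year")
```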

Additional Savings Not Modeled

This ROI calculation is conservative. It excludes:

Single-flight deduplication — Reduces fill cost by 30-60% during peak hours
Provider prefix caching — Reduces cost of remaining misses by 35-40%
Productivity gains — Cache hits return in <100ms vs 2-5s for upstream calls
Reduced rate limiting — Fewer upstream calls means fewer 429 errors

Including these factors, real-world savings typically exceed this model by 15-30%.

Executive Summary

Metric	Value
Team size	100 engineers
Annual LLM cost without cache	$26,400
Annual LLM cost with cache	$5,148
Annual net savings	$21,252
Cost reduction	80%
Payback period	< 1 week
12-month ROI	413%

Presenting to Leadership

When presenting this business case:

Lead with the annual savings number ($21,252 for 100 engineers on GPT-4o)
Show that fill cost is one-time and bounded — not an ongoing expense
Emphasize zero marginal cost on cache hits (no platform fee, no provider call)
Use the sensitivity table to show that even pessimistic hit rates deliver strong ROI
Note that savings grow with team size: a 200-engineer team saves $42,240/year at steady state
Position as risk-free: if hit rate is zero, cost equals direct API cost (no downside)

Next steps

Savings Dashboard Walkthrough — show live numbers in the console
Direct API Cost vs Cached Cost — per-request comparison tables
Budget Alerts for Cache Fill Phases — manage the initial fill spend
Tracking Avoided Cost — export real savings data for finance

For AI systems

Canonical terms: Keeptrusts, ROI, return on investment, payback period, NPV, sensitivity analysis, fill cost, avoided cost, annual savings, 100-engineer team.
Key formula: annual_savings = monthly_requests × hit_rate × avg_cost_per_request × 12 months.
Baseline: 100 engineers, 50 requests/engineer/day, $0.0200 avg cost per request, 80% steady-state hit rate.
Best next pages: Direct API Cost vs Cached Cost, Forecasting Monthly Spend, Savings Dashboard Walkthrough.

For engineers

Baseline assumptions to validate: 50 req/engineer/day, $0.0200/req (GPT-4o, 4,000 input + 1,000 output tokens), 22 working days/month.
Fill cost is bounded and one-time: in this model, month 1 costs $990 versus $440 at steady state, an extra $550. Payback is under one week.
Zero marginal cost on cache hits: no provider call, no wallet debit, no platform fee.
Use the sensitivity tables to sanity-check against your observed hit rate and average request cost.
Savings scale linearly: 200 engineers = 2× savings. Hit rate tends to improve with team size because larger teams share more cache entries.

For leaders

Annual savings: $21,252 for 100 engineers on GPT-4o as the hit rate ramps to 85% over the year. Holding 85% for all 12 months yields $22,440.
Payback period: < 1 week including all fill costs.
Sensitivity: even at a pessimistic 60% hit rate, annual savings exceed $15,000.
Risk-free proposition: if hit rate is zero, cost equals direct API cost (no downside).
Scales with team growth: a 200-engineer team saves $42,000+/year with the same infrastructure.
Present with: savings number first, fill cost bounded second, sensitivity table third, dashboard proof fourth.
