Measuring Your Baseline AI Spend Before Caching
Before enabling org-shared cache, measure your current AI spend. This baseline lets you calculate exact savings, demonstrate ROI to leadership, and identify the highest-value targets for caching.
Use this page when
- You are measuring your current AI spend before enabling caching to establish a baseline.
- You need to identify top-spending teams and repos to prioritize cache deployment.
- You want to build a before/after comparison for ROI reporting to stakeholders.
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Why Measure First
Without a baseline:
- You can't prove savings to stakeholders
- You can't identify which teams or repos benefit most
- You can't calculate payback period accurately
- You can't prioritize which repositories to connect first
With a baseline:
- You have concrete before/after numbers
- You can attribute savings to specific teams and repos
- You can forecast savings for teams not yet onboarded
- You can justify the fill-phase investment with data
Step 1: Check Cost & Spend Dashboard
Navigate to Cost & Spend in the Keeptrusts console. This page shows:
- Total spend (today, this week, this month)
- Spend by team (which teams are the heaviest spenders)
- Spend by model (which models consume the most budget)
- Spend by gateway (if you have multiple gateways)
- Token breakdown (input vs output tokens)
Record these numbers for at least 7 consecutive business days to capture normal variation.
Step 2: Export Detailed Spend Data
For detailed analysis, export your spend data:
- Navigate to Exports → New Export
- Select Spend Report export type
- Set date range to the last 30 days
- Include fields: team_id, user_id, model, input_tokens, output_tokens, cost, timestamp
- Click Export
The exported CSV gives you per-request granularity for deep analysis.
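The export can be aggregated with a few lines of standard-library Python. This is a minimal sketch, assuming the export is saved locally (the filename `spend_report.csv` is an assumption) and uses the field names listed above:

```python
import csv
from collections import defaultdict

def spend_by_team(path):
    """Sum the cost column per team_id from a Spend Report export."""
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["team_id"]] += float(row["cost"])
    return dict(totals)
```

The same pattern works for any other breakdown in the dashboard: swap the grouping key to `model` or `user_id`.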
Step 3: Identify Top-Spending Teams and Repos
From your export, calculate spend by team:
| Team | Monthly spend | Engineers | Spend/engineer/day | Primary repos |
|---|---|---|---|---|
| Platform | $1,200 | 25 | $2.40 | core-api, shared-libs |
| Frontend | $900 | 30 | $1.50 | web-app, component-lib |
| Data | $600 | 15 | $2.00 | pipeline, analytics |
| Mobile | $450 | 20 | $1.13 | ios-app, android-app |
| DevOps | $300 | 10 | $1.50 | infra, deploy-scripts |
| Total | $3,450 | 100 | $1.73 | |
Teams with the highest spend per engineer and the most shared repos are your best targets for initial cache deployment.
Step 4: Calculate Per-Engineer Daily Cost
Per-engineer daily cost is your key baseline metric:
Per-engineer daily cost = Monthly total spend ÷ number of engineers ÷ working days
Example:
$3,450 ÷ 100 engineers ÷ 20 working days = $1.73/engineer/day
This number tells you what each engineer costs in AI provider spend today, without caching.
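The formula translates directly to code; this sketch defaults to 20 working days per month, matching the example above:

```python
def per_engineer_daily_cost(monthly_spend, engineers, working_days=20):
    """Baseline metric: AI provider cost per engineer per working day."""
    return monthly_spend / engineers / working_days

# Worked example from above: $3,450 across 100 engineers over 20 working days
cost = per_engineer_daily_cost(3450, 100)  # 1.725, i.e. about $1.73/engineer/day
```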
Step 5: Identify Redundant Prompt Patterns
Look for patterns in your spend data that indicate redundancy:
High-Frequency Similar Requests
Group requests by semantic similarity (approximate via file paths mentioned, function names, or model + similar token counts):
| Pattern | Daily occurrences | Est. unique intents | Redundancy rate |
|---|---|---|---|
| Auth module questions | 45 | 8 | 82% |
| Payment flow queries | 32 | 5 | 84% |
| Error diagnosis (same errors) | 28 | 6 | 79% |
| API endpoint lookup | 55 | 12 | 78% |
| Test guidance | 40 | 10 | 75% |
Time-of-Day Clustering
Check if similar requests cluster at specific times:
- Morning standup period (9-10am): spike of "remind me how X works"
- After deployment (anytime): spike of "what changed in X"
- Code review time: spike of "explain this function"
Clustered requests indicate high single-flight fill value.
Repository Concentration
Identify which repositories generate the most AI traffic:
- core-api: 38% of all prompts
- web-app: 25% of all prompts
- shared-libs: 15% of all prompts
- pipeline: 12% of all prompts
- other: 10% of all prompts
The top 2-3 repositories likely account for 60-80% of redundant spend. Connect these first.
Step 6: Estimate Cache Savings
With your baseline data, estimate post-cache spend:
Estimated monthly savings = Baseline monthly spend × expected hit rate
Expected hit rates by team profile:
| Team profile | Expected hit rate | Rationale |
|---|---|---|
| Same repo, same area | 85-95% | Maximum overlap |
| Same repo, different areas | 70-85% | High overlap on shared code |
| Multiple repos, shared patterns | 60-75% | Moderate overlap |
| Independent repos | 30-50% | Lower overlap, still saves on common patterns |
For a 100-engineer team on shared repos:
Current monthly: $3,450
Expected hit rate: 85%
Expected monthly: $3,450 × (1 - 0.85) ≈ $518
Expected savings: ≈ $2,932/month, or about $35,190/year
Baseline Report Template
Use this template to document your baseline for stakeholders:
# AI Spend Baseline Report
## Summary
- Measurement period: [DATE] to [DATE]
- Total engineers measured: [N]
- Monthly AI provider spend: $[X]
- Per-engineer daily cost: $[Y]
## Breakdown by Team
| Team | Engineers | Monthly spend | Per-engineer daily |
|------|-----------|--------------|-------------------|
| [Team A] | [N] | $[X] | $[Y] |
| [Team B] | [N] | $[X] | $[Y] |
## Breakdown by Model
| Model | Monthly tokens (M) | Monthly cost |
|-------|-------------------|-------------|
| [Model A] | [X]M | $[Y] |
| [Model B] | [X]M | $[Y] |
## Redundancy Analysis
- Estimated overall redundancy rate: [X]%
- Top redundant patterns: [list]
- Highest-value repos for caching: [list]
## Projected Savings
- Conservative (60% hit rate): $[X]/month saved
- Expected (85% hit rate): $[Y]/month saved
- Optimistic (95% hit rate): $[Z]/month saved
## Recommendation
Connect [repos] first. Expected payback period: [N] days.
Fill cost estimate: $[X]. Monthly recurring savings: $[Y].
Annual ROI: [X]%.
Tracking After Cache Enablement
Once you enable caching, track these metrics weekly:
| Metric | Baseline | Week 1 | Week 2 | Week 4 |
|---|---|---|---|---|
| Monthly spend rate | $3,450 | | | |
| Per-engineer daily cost | $1.73 | | | |
| Cache hit rate | 0% | | | |
| Avoided cost (cumulative) | $0 | | | |
| Effective savings rate | 0% | | | |
This tracking table demonstrates the value delivery timeline to leadership and justifies expanding to additional teams and repositories.
Common Baseline Mistakes
| Mistake | Impact | Avoid by |
|---|---|---|
| Measuring only 1-2 days | Misses weekly patterns | Measure 7+ business days |
| Ignoring output tokens | Understates true cost | Include both input and output in baseline |
| Not segmenting by team | Can't prioritize rollout | Break down by team from the start |
| Measuring during atypical period | Skewed baseline | Avoid major incidents, holidays, or sprint boundaries |
| Not recording the baseline | Can't prove ROI later | Document in writing before enabling cache |
Next steps
- Your First 24 Hours with Org-Shared Cache — enable caching with your baseline in hand
- The Cache Fill-Then-Save Model — understand what to expect during fill
- Cache Hit Rates: What Good Looks Like — benchmark your post-cache performance
For AI systems
- Canonical terms: Keeptrusts, baseline AI spend, cost measurement, per-engineer daily cost, ROI calculation, redundancy rate, spend analysis.
- Exact feature/config names: Cost & Spend dashboard, Exports → New Export → Spend Report, team_id/user_id/model/input_tokens/output_tokens/cost fields, avoided cost tracking.
- Best next pages: Your First 24 Hours, The Cache Fill-Then-Save Model, Cache Hit Rates.
For engineers
- Navigate to Cost & Spend for daily/weekly/monthly totals broken down by team, model, and gateway.
- Export 30 days of spend data from Exports → New Export (Spend Report type) with per-request granularity for analysis.
- Calculate per-engineer daily cost: monthly_spend ÷ engineers ÷ working_days — this is your key baseline metric.
- Identify redundancy by grouping requests by semantic similarity (file paths, function names, model + similar token counts).
- Record baseline for 7+ consecutive business days to capture normal variation.
For leaders
- Without a baseline, you cannot prove savings, justify fill-phase investment, or prioritize which teams/repos to onboard first.
- Segment by team to identify highest-value targets: teams with highest spend per engineer and most shared repos benefit most.
- Use the baseline to calculate payback period: fill_cost ÷ (baseline_daily_spend × expected_hit_rate) = days to breakeven.
- Document the baseline in writing before enabling cache — this is your proof point for ROI conversations with finance and leadership.