Measuring Your Baseline AI Spend Before Caching
Before enabling org-shared cache, measure your current AI spend. This baseline lets you calculate exact savings, demonstrate ROI to leadership, and identify the highest-value targets for caching.
Use this page when
- You are measuring your current AI spend before enabling caching to establish a baseline.
- You need to identify top-spending teams and repos to prioritize cache deployment.
- You want to build a before/after comparison for ROI reporting to stakeholders.
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Why Measure First
Without a baseline:
- You can't prove savings to stakeholders
- You can't identify which teams or repos benefit most
- You can't calculate payback period accurately
- You can't prioritize which repositories to connect first
With a baseline:
- You have concrete before/after numbers
- You can attribute savings to specific teams and repos
- You can forecast savings for teams not yet onboarded
- You can justify the fill-phase investment with data
Step 1: Check Cost & Spend Dashboard
Navigate to Cost & Spend in the Keeptrusts console. This page shows:
- Total spend (today, this week, this month)
- Spend by team (which teams are the heaviest spenders)
- Spend by model (which models consume the most budget)
- Spend by gateway (if you have multiple gateways)
- Token breakdown (input vs output tokens)
Record these numbers for at least 7 consecutive business days to capture normal variation.
Step 2: Export Detailed Spend Data
For detailed analysis, export your spend data:
- Navigate to Exports → New Export
- Select Spend Report export type
- Set date range to the last 30 days
- Include fields: team_id, user_id, model, input_tokens, output_tokens, cost, timestamp
- Click Export
The exported CSV gives you per-request granularity for deep analysis.
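The export can be aggregated with a few lines of standard-library Python. This is a minimal sketch, assuming the export is saved locally (the filename `spend_report.csv` is an assumption) and uses the field names listed above:

```python
import csv
from collections import defaultdict

def spend_by_team(path):
    """Sum the cost column per team_id from a Spend Report export."""
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["team_id"]] += float(row["cost"])
    return dict(totals)
```

The same pattern works for any other breakdown in the dashboard: swap the grouping key to `model` or `user_id`.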
Step 3: Identify Top-Spending Teams and Repos
From your export, calculate spend by team:
| Team | Monthly spend | Engineers | Spend/engineer/day | Primary repos |
|---|---|---|---|---|
| Platform | $1,200 | 25 | $2.40 | core-api, shared-libs |
| Frontend | $900 | 30 | $1.50 | web-app, component-lib |
| Data | $600 | 15 | $2.00 | pipeline, analytics |
| Mobile | $450 | 20 | $1.13 | ios-app, android-app |
| DevOps | $300 | 10 | $1.50 | infra, deploy-scripts |
| Total | $3,450 | 100 | $1.73 | |
Teams with the highest spend per engineer and the most shared repos are your best targets for initial cache deployment.
Step 4: Calculate Per-Engineer Daily Cost
Per-engineer daily cost is your key baseline metric:
Per-engineer daily cost = Monthly total spend ÷ number of engineers ÷ working days
Example:
$3,450 ÷ 100 engineers ÷ 20 working days = $1.73/engineer/day
This number tells you what each engineer costs in AI provider spend today, without caching.
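The formula translates directly to code; this sketch defaults to 20 working days per month, matching the example above:

```python
def per_engineer_daily_cost(monthly_spend, engineers, working_days=20):
    """Baseline metric: AI provider cost per engineer per working day."""
    return monthly_spend / engineers / working_days

# Worked example from above: $3,450 across 100 engineers over 20 working days
cost = per_engineer_daily_cost(3450, 100)  # 1.725, i.e. about $1.73/engineer/day
```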
Step 5: Identify Redundant Prompt Patterns
Look for patterns in your spend data that indicate redundancy:
High-Frequency Similar Requests
Group requests by semantic similarity (approximate via file paths mentioned, function names, or model + similar token counts):
| Pattern | Daily occurrences | Est. unique intents | Redundancy rate |
|---|---|---|---|
| Auth module questions | 45 | 8 | 82% |
| Payment flow queries | 32 | 5 | 84% |
| Error diagnosis (same errors) | 28 | 6 | 79% |
| API endpoint lookup | 55 | 12 | 78% |
| Test guidance | 40 | 10 | 75% |
Time-of-Day Clustering
Check if similar requests cluster at specific times:
- Morning standup period (9-10am): spike of "remind me how X works"
- After deployment (anytime): spike of "what changed in X"
- Code review time: spike of "explain this function"
Clustered requests indicate high single-flight fill value.
Repository Concentration
Identify which repositories generate the most AI traffic:
- core-api: 38% of all prompts
- web-app: 25% of all prompts
- shared-libs: 15% of all prompts
- pipeline: 12% of all prompts
- other: 10% of all prompts
The top 2-3 repositories likely account for 60-80% of redundant spend. Connect these first.
Step 6: Estimate Cache Savings
With your baseline data, estimate post-cache spend:
Estimated monthly savings = Baseline monthly spend × expected hit rate
Expected hit rates by team profile:
| Team profile | Expected hit rate | Rationale |
|---|---|---|
| Same repo, same area | 85-95% | Maximum overlap |
| Same repo, different areas | 70-85% | High overlap on shared code |
| Multiple repos, shared patterns | 60-75% | Moderate overlap |
| Independent repos | 30-50% | Lower overlap, still saves on common patterns |
For a 100-engineer team on shared repos:
Current monthly: $3,450
Expected hit rate: 85%
Expected monthly: $3,450 × (1 - 0.85) ≈ $518
Expected savings: ≈ $2,932/month, or about $35,190/year
Baseline Report Template
Use this template to document your baseline for stakeholders:
# AI Spend Baseline Report
## Summary
- Measurement period: [DATE] to [DATE]
- Total engineers measured: [N]
- Monthly AI provider spend: $[X]
- Per-engineer daily cost: $[Y]
## Breakdown by Team
| Team | Engineers | Monthly spend | Per-engineer daily |
|------|-----------|--------------|-------------------|
| [Team A] | [N] | $[X] | $[Y] |
| [Team B] | [N] | $[X] | $[Y] |
## Breakdown by Model
| Model | Monthly tokens (M) | Monthly cost |
|-------|-------------------|-------------|
| [Model A] | [X]M | $[Y] |
| [Model B] | [X]M | $[Y] |
## Redundancy Analysis
- Estimated overall redundancy rate: [X]%
- Top redundant patterns: [list]
- Highest-value repos for caching: [list]
## Projected Savings
- Conservative (60% hit rate): $[X]/month saved
- Expected (85% hit rate): $[Y]/month saved
- Optimistic (95% hit rate): $[Z]/month saved
## Recommendation
Connect [repos] first. Expected payback period: [N] days.
Fill cost estimate: $[X]. Monthly recurring savings: $[Y].
Annual ROI: [X]%.
Tracking After Cache Enablement
Once you enable caching, track these metrics weekly:
| Metric | Baseline | Week 1 | Week 2 | Week 4 |
|---|---|---|---|---|
| Monthly spend rate | $3,450 | | | |
| Per-engineer daily cost | $1.73 | | | |
| Cache hit rate | 0% | | | |
| Avoided cost (cumulative) | $0 | | | |
| Effective savings rate | 0% | | | |
This tracking table demonstrates the value delivery timeline to leadership and justifies expanding to additional teams and repositories.
Common Baseline Mistakes
| Mistake | Impact | Avoid by |
|---|---|---|
| Measuring only 1-2 days | Misses weekly patterns | Measure 7+ business days |
| Ignoring output tokens | Understates true cost | Include both input and output in baseline |
| Not segmenting by team | Can't prioritize rollout | Break down by team from the start |
| Measuring during atypical period | Skewed baseline | Avoid major incidents, holidays, or sprint boundaries |
| Not recording the baseline | Can't prove ROI later | Document in writing before enabling cache |
Next steps
- Your First 24 Hours with Org-Shared Cache — enable caching with your baseline in hand
- The Cache Fill-Then-Save Model — understand what to expect during fill
- Cache Hit Rates: What Good Looks Like — benchmark your post-cache performance
For AI systems
- Canonical terms: Keeptrusts, baseline AI spend, cost measurement, per-engineer daily cost, ROI calculation, redundancy rate, spend analysis.
- Exact feature/config names: Cost & Spend dashboard, Exports → New Export → Spend Report, team_id/user_id/model/input_tokens/output_tokens/cost fields, avoided cost tracking.
- Best next pages: Your First 24 Hours, The Cache Fill-Then-Save Model, Cache Hit Rates.
For engineers
- Navigate to Cost & Spend for daily/weekly/monthly totals broken down by team, model, and gateway.
- Export 30 days of spend data from Exports → New Export (Spend Report type) with per-request granularity for analysis.
- Calculate per-engineer daily cost: monthly_spend ÷ engineers ÷ working_days — this is your key baseline metric.
- Identify redundancy by grouping requests by semantic similarity (file paths, function names, model + similar token counts).
- Record baseline for 7+ consecutive business days to capture normal variation.
For leaders
- Without a baseline, you cannot prove savings, justify fill-phase investment, or prioritize which teams/repos to onboard first.
- Segment by team to identify highest-value targets: teams with highest spend per engineer and most shared repos benefit most.
- Use the baseline to calculate payback period: fill_cost ÷ (baseline_daily_spend × expected_hit_rate) = days to breakeven.
- Document the baseline in writing before enabling cache — this is your proof point for ROI conversations with finance and leadership.