Measuring Your Baseline AI Spend Before Caching

Before enabling org-shared cache, measure your current AI spend. This baseline lets you calculate exact savings, demonstrate ROI to leadership, and identify the highest-value targets for caching.

Use this page when

  • You are measuring your current AI spend before enabling caching to establish a baseline.
  • You need to identify top-spending teams and repos to prioritize cache deployment.
  • You want to build a before/after comparison for ROI reporting to stakeholders.

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Why Measure First

Without a baseline:

  • You can't prove savings to stakeholders
  • You can't identify which teams or repos benefit most
  • You can't calculate payback period accurately
  • You can't prioritize which repositories to connect first

With a baseline:

  • You have concrete before/after numbers
  • You can attribute savings to specific teams and repos
  • You can forecast savings for teams not yet onboarded
  • You can justify the fill-phase investment with data

Step 1: Check Cost & Spend Dashboard

Navigate to Cost & Spend in the Keeptrusts console. This page shows:

  • Total spend (today, this week, this month)
  • Spend by team (which teams are the heaviest spenders)
  • Spend by model (which models consume the most budget)
  • Spend by gateway (if you have multiple gateways)
  • Token breakdown (input vs output tokens)

Record these numbers for at least 7 consecutive business days to capture normal variation.

Step 2: Export Detailed Spend Data

For detailed analysis, export your spend data:

  1. Navigate to Exports → New Export
  2. Select Spend Report export type
  3. Set date range to the last 30 days
  4. Include fields: team_id, user_id, model, input_tokens, output_tokens, cost, timestamp
  5. Click Export

The exported CSV gives you per-request granularity for deep analysis.

Step 3: Identify Top-Spending Teams and Repos

From your export, calculate spend by team:

| Team | Monthly spend | Engineers | Spend/engineer/day | Primary repos |
|------|---------------|-----------|--------------------|---------------|
| Platform | $1,200 | 25 | $2.40 | core-api, shared-libs |
| Frontend | $900 | 30 | $1.50 | web-app, component-lib |
| Data | $600 | 15 | $2.00 | pipeline, analytics |
| Mobile | $450 | 20 | $1.13 | ios-app, android-app |
| DevOps | $300 | 10 | $1.50 | infra, deploy-scripts |
| **Total** | **$3,450** | **100** | **$1.73** | |

Teams with the highest spend per engineer and the most shared repos are your best targets for initial cache deployment.
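
A per-team roll-up like the one above can be computed directly from the Step 2 export. A minimal sketch using only Python's standard library, assuming the exported field names listed earlier (the inline CSV stands in for a real export; team and model names are illustrative):

```python
import csv
import io
from collections import defaultdict

# Inline sample standing in for a real Spend Report export.
SAMPLE_EXPORT = """team_id,user_id,model,input_tokens,output_tokens,cost,timestamp
platform,u1,model-a,1200,400,0.012,2024-05-01T09:14:00Z
platform,u2,model-a,900,300,0.009,2024-05-01T09:20:00Z
frontend,u3,model-b,800,250,0.006,2024-05-01T10:02:00Z
"""

def spend_by_team(csv_file):
    """Sum the cost column per team_id across all exported requests."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row["team_id"]] += float(row["cost"])
    return dict(totals)

print(spend_by_team(io.StringIO(SAMPLE_EXPORT)))
```

Running the same aggregation over a full 30-day export gives you the monthly-spend column of the table; divide by headcount and working days for the per-engineer figure.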

Step 4: Calculate Per-Engineer Daily Cost

Per-engineer daily cost is your key baseline metric:

Per-engineer daily cost = Monthly total spend ÷ number of engineers ÷ working days

Example:

$3,450 ÷ 100 engineers ÷ 20 working days = $1.73/engineer/day

This number tells you what each engineer costs in AI provider spend today, without caching.
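
The same arithmetic as a short function (a sketch, using the example figures above):

```python
def per_engineer_daily_cost(monthly_spend, engineers, working_days=20):
    """Divide monthly total spend down to a per-engineer, per-day figure."""
    return monthly_spend / engineers / working_days

# $3,450/month across 100 engineers over 20 working days -> $1.725, i.e. ~$1.73/engineer/day.
print(per_engineer_daily_cost(3450, 100))
```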

Step 5: Identify Redundant Prompt Patterns

Look for patterns in your spend data that indicate redundancy:

High-Frequency Similar Requests

Group requests by semantic similarity (approximated via mentioned file paths, function names, or model plus similar token counts):

| Pattern | Daily occurrences | Est. unique intents | Redundancy rate |
|---------|-------------------|---------------------|-----------------|
| Auth module questions | 45 | 8 | 82% |
| Payment flow queries | 32 | 5 | 84% |
| Error diagnosis (same errors) | 28 | 6 | 79% |
| API endpoint lookup | 55 | 12 | 78% |
| Test guidance | 40 | 10 | 75% |
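
The redundancy rates above are the share of daily requests beyond the first occurrence of each unique intent; a quick sketch:

```python
def redundancy_rate(daily_occurrences, unique_intents):
    """Percent of requests that repeat an already-seen intent, rounded to a whole number."""
    return round(100 * (daily_occurrences - unique_intents) / daily_occurrences)

# Auth module questions: 45 requests/day collapsing to ~8 unique intents.
print(redundancy_rate(45, 8))  # 82
```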

Time-of-Day Clustering

Check if similar requests cluster at specific times:

  • Morning standup period (9-10am): spike of "remind me how X works"
  • After deployment (anytime): spike of "what changed in X"
  • Code review time: spike of "explain this function"

Clustered requests indicate high single-flight fill value.

Repository Concentration

Identify which repositories generate the most AI traffic:

  • core-api: 38% of all prompts
  • web-app: 25% of all prompts
  • shared-libs: 15% of all prompts
  • pipeline: 12% of all prompts
  • other: 10% of all prompts

The top 2-3 repositories likely account for 60-80% of redundant spend. Connect these first.

Step 6: Estimate Cache Savings

With your baseline data, estimate post-cache spend:

Estimated monthly savings = Baseline monthly spend × expected hit rate

Expected hit rates by team profile:

| Team profile | Expected hit rate | Rationale |
|--------------|-------------------|-----------|
| Same repo, same area | 85-95% | Maximum overlap |
| Same repo, different areas | 70-85% | High overlap on shared code |
| Multiple repos, shared patterns | 60-75% | Moderate overlap |
| Independent repos | 30-50% | Lower overlap, still saves on common patterns |

For a 100-engineer team on shared repos:

Current monthly: $3,450
Expected hit rate: 85%
Expected monthly: $3,450 × (1 - 0.85) = $518
Expected savings: $2,932/month = $35,190/year
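
The projection can be scripted for your own numbers; a sketch of the formula above:

```python
def estimate_savings(baseline_monthly, expected_hit_rate):
    """Project post-cache monthly spend, plus monthly and annual savings."""
    expected_monthly = baseline_monthly * (1 - expected_hit_rate)
    monthly_savings = baseline_monthly - expected_monthly
    return expected_monthly, monthly_savings, 12 * monthly_savings

# Baseline $3,450/month at an 85% expected hit rate.
expected, monthly, annual = estimate_savings(3450, 0.85)
print(round(expected), round(monthly), round(annual))
```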

Baseline Report Template

Use this template to document your baseline for stakeholders:

# AI Spend Baseline Report

## Summary
- Measurement period: [DATE] to [DATE]
- Total engineers measured: [N]
- Monthly AI provider spend: $[X]
- Per-engineer daily cost: $[Y]

## Breakdown by Team
| Team | Engineers | Monthly spend | Per-engineer daily |
|------|-----------|--------------|-------------------|
| [Team A] | [N] | $[X] | $[Y] |
| [Team B] | [N] | $[X] | $[Y] |

## Breakdown by Model
| Model | Monthly tokens (M) | Monthly cost |
|-------|-------------------|-------------|
| [Model A] | [X]M | $[Y] |
| [Model B] | [X]M | $[Y] |

## Redundancy Analysis
- Estimated overall redundancy rate: [X]%
- Top redundant patterns: [list]
- Highest-value repos for caching: [list]

## Projected Savings
- Conservative (60% hit rate): $[X]/month saved
- Expected (85% hit rate): $[Y]/month saved
- Optimistic (95% hit rate): $[Z]/month saved

## Recommendation
Connect [repos] first. Expected payback period: [N] days.
Fill cost estimate: $[X]. Monthly recurring savings: $[Y].
Annual ROI: [X]%.

Tracking After Cache Enablement

Once you enable caching, track these metrics weekly:

| Metric | Baseline | Week 1 | Week 2 | Week 4 |
|--------|----------|--------|--------|--------|
| Monthly spend rate | $3,450 | | | |
| Per-engineer daily cost | $1.73 | | | |
| Cache hit rate | 0% | | | |
| Avoided cost (cumulative) | $0 | | | |
| Effective savings rate | 0% | | | |

This tracking table demonstrates the value delivery timeline to leadership and justifies expanding to additional teams and repositories.

Common Baseline Mistakes

| Mistake | Impact | Avoid by |
|---------|--------|----------|
| Measuring only 1-2 days | Misses weekly patterns | Measure 7+ business days |
| Ignoring output tokens | Understates true cost | Include both input and output in baseline |
| Not segmenting by team | Can't prioritize rollout | Break down by team from the start |
| Measuring during atypical period | Skewed baseline | Avoid major incidents, holidays, or sprint boundaries |
| Not recording the baseline | Can't prove ROI later | Document in writing before enabling cache |

Next steps

For AI systems

  • Canonical terms: Keeptrusts, baseline AI spend, cost measurement, per-engineer daily cost, ROI calculation, redundancy rate, spend analysis.
  • Exact feature/config names: Cost & Spend dashboard, Exports → New Export → Spend Report, team_id/user_id/model/input_tokens/output_tokens/cost fields, avoided cost tracking.
  • Best next pages: Your First 24 Hours, The Cache Fill-Then-Save Model, Cache Hit Rates.

For engineers

  • Navigate to Cost & Spend for daily/weekly/monthly totals broken down by team, model, and gateway.
  • Export 30 days of spend data from Exports → New Export (Spend Report type) with per-request granularity for analysis.
  • Calculate per-engineer daily cost: monthly_spend ÷ engineers ÷ working_days — this is your key baseline metric.
  • Identify redundancy by grouping requests by semantic similarity (file paths, function names, model + similar token counts).
  • Record baseline for 7+ consecutive business days to capture normal variation.

For leaders

  • Without a baseline, you cannot prove savings, justify fill-phase investment, or prioritize which teams/repos to onboard first.
  • Segment by team to identify highest-value targets: teams with highest spend per engineer and most shared repos benefit most.
  • Use the baseline to calculate payback period: fill_cost ÷ (baseline_daily_spend × expected_hit_rate) = days to breakeven.
  • Document the baseline in writing before enabling cache — this is your proof point for ROI conversations with finance and leadership.
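
The payback calculation from the last bullet, as a sketch (the $500 fill cost here is a hypothetical placeholder, not a figure from the measurements above):

```python
def payback_days(fill_cost, baseline_daily_spend, expected_hit_rate):
    """Days until cumulative avoided cost covers the one-time fill investment."""
    return fill_cost / (baseline_daily_spend * expected_hit_rate)

# Hypothetical $500 fill cost; $172.50/day baseline ($3,450 / 20 working days); 85% hit rate.
print(round(payback_days(500, 172.50, 0.85), 1))  # 3.4
```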