Cache Savings Dashboard Walkthrough

The cache savings dashboard in the Keeptrusts console gives you a complete view of how the org-shared cache reduces your LLM spend. This guide walks through each section, explains what the numbers mean, and shows how to use them for executive reporting.

Use this page when

You want to understand each section of the Keeptrusts console cache savings dashboard.
You are preparing executive reports using dashboard data and need guidance on framing.
You need to troubleshoot why specific metrics look unexpected (low hit rate, negative net savings, etc.).

Primary audience

Primary: Technical Leaders
Secondary: Technical Engineers, AI Agents

Accessing the Dashboard

Navigate to Cost Center → Cache Savings in the console. The dashboard loads with the current month selected. Use the date picker to change the period.

Section 1: Fill Cost Card

The fill cost card shows what you actually spent on upstream provider calls that populated the cache.

Field	Meaning
Fill Cost	Total spend on cache misses forwarded upstream
Fill Requests	Number of requests that resulted in a cache miss and upstream call
Avg Fill Cost	Average cost per fill request

During the first week of onboarding a new repository, fill cost is high. This is expected and temporary. Once the cache reaches steady state, fill requests drop to a small fraction of total request volume, and fill cost falls with them.
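As a quick check on how the three fields relate, here is a minimal sketch. The `FillCostCard` class and its field names are hypothetical and are not part of any Keeptrusts API; they only mirror the table above.

```python
from dataclasses import dataclass


@dataclass
class FillCostCard:
    """Hypothetical container mirroring the Fill Cost card fields."""
    fill_cost: float     # total spend on cache misses forwarded upstream
    fill_requests: int   # misses that resulted in an upstream call

    @property
    def avg_fill_cost(self) -> float:
        # Guard against a period with no fill requests.
        if self.fill_requests == 0:
            return 0.0
        return self.fill_cost / self.fill_requests


# Illustrative numbers only.
card = FillCostCard(fill_cost=120.0, fill_requests=4000)
print(f"Avg Fill Cost: {card.avg_fill_cost:.4f}")
```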

Section 2: Avoided Provider Cost Card

This card shows the estimated cost you did not incur because cache hits served requests locally.

Field	Meaning
Avoided Cost	Sum of `estimated_avoided_cost` across all cache hits
Cache Hits	Total number of requests served from cache
Avg Avoided per Hit	Average cost saved per cache hit

This is the largest component of your savings. A cache hit means no upstream call, no wallet debit, and no platform fee.
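The aggregation behind this card can be sketched as follows. Only the `estimated_avoided_cost` field name comes from this page; the record shape (a list of dictionaries) and the function name are assumptions made for illustration.

```python
def avoided_cost_card(cache_hits: list) -> dict:
    """Summarize cache-hit records into the three card fields.

    Each record is assumed to be a dict carrying an
    `estimated_avoided_cost` value (hypothetical record shape).
    """
    total = sum(hit["estimated_avoided_cost"] for hit in cache_hits)
    count = len(cache_hits)
    return {
        "avoided_cost": total,
        "cache_hits": count,
        # Avoid dividing by zero when the period has no hits.
        "avg_avoided_per_hit": total / count if count else 0.0,
    }


# Illustrative records only.
hits = [
    {"estimated_avoided_cost": 0.5},
    {"estimated_avoided_cost": 0.25},
    {"estimated_avoided_cost": 0.25},
]
summary = avoided_cost_card(hits)
print(summary)
```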

Section 3: Provider Cached-Token Savings

For requests that do go upstream (cache misses), provider-side prefix caching reduces the input token cost.

Field	Meaning
Provider Cache Savings	Discount from provider prefix caching on miss requests
Cached Token Ratio	Percentage of input tokens served from provider cache
Effective vs Full Cost	What you paid vs what you would have paid without prefix caching

This is a secondary optimization. The org-shared cache delivers the primary savings.
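To make the three fields concrete, the sketch below assumes a simple pricing model: a flat per-token input price and a fractional discount on cached tokens. Both parameters are hypothetical, and real provider pricing may differ.

```python
def provider_cache_savings(input_tokens: int, cached_tokens: int,
                           price_per_token: float,
                           cached_discount: float) -> dict:
    """Estimate prefix-caching savings on a miss request.

    Assumes cached tokens are billed at (1 - cached_discount) of the
    normal input price. This pricing model is an assumption.
    """
    full_cost = input_tokens * price_per_token
    savings = cached_tokens * price_per_token * cached_discount
    ratio = cached_tokens / input_tokens if input_tokens else 0.0
    return {
        "cached_token_ratio": ratio,
        "full_cost": full_cost,
        "effective_cost": full_cost - savings,
        "provider_cache_savings": savings,
    }


# Illustrative numbers only.
result = provider_cache_savings(
    input_tokens=1000, cached_tokens=750,
    price_per_token=0.002, cached_discount=0.5,
)
print(result)
```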

Section 4: Net Savings

The net savings card combines all cost avoidance mechanisms:

Net Savings = Avoided Provider Cost + Provider Cached-Token Savings − Fill Cost

This is the number that matters for ROI calculations. Positive net savings means the cache is saving more than it costs to fill. A negative value is common while the cache is still filling.
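The formula translates directly into code. The function name and the sample figures below are illustrative; only the formula itself comes from this page.

```python
def net_savings(avoided_provider_cost: float,
                provider_cached_token_savings: float,
                fill_cost: float) -> float:
    """Net Savings = Avoided Provider Cost + Provider Cached-Token Savings - Fill Cost."""
    return avoided_provider_cost + provider_cached_token_savings - fill_cost


# Illustrative: a warm-up period where fill cost dominates.
warmup = net_savings(avoided_provider_cost=40.0,
                     provider_cached_token_savings=5.0,
                     fill_cost=120.0)

# Illustrative: a steady-state period where avoided cost dominates.
steady = net_savings(avoided_provider_cost=900.0,
                     provider_cached_token_savings=30.0,
                     fill_cost=150.0)

print(warmup, steady)
```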

Section 5: Hit Rate

The hit rate gauge shows what percentage of total requests are served from cache:

Range	Interpretation
0-30%	Cache is still filling — early days or high context diversity
30-60%	Moderate savings — check for context ordering issues
60-80%	Good steady-state performance
80-95%	Excellent — typical for stable monorepos with 50+ engineers

The gauge includes a trend arrow showing whether hit rate is improving, stable, or declining.
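The interpretation table can be expressed as a small lookup. Because the ranges above share boundary values (30, 60, 80), this sketch assumes each boundary belongs to the higher band. Values above 95% are not covered by the table, so they are reported as such. Both function names are hypothetical.

```python
def hit_rate_pct(hits: int, misses: int) -> float:
    """Percentage of total requests served from cache."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


def interpret_hit_rate(rate_pct: float) -> str:
    """Map a hit rate to the interpretation table (boundary handling is an assumption)."""
    if not 0.0 <= rate_pct <= 100.0:
        raise ValueError("hit rate must be between 0 and 100")
    if rate_pct < 30:
        return "cache is still filling"
    if rate_pct < 60:
        return "moderate savings"
    if rate_pct < 80:
        return "good steady-state performance"
    if rate_pct <= 95:
        return "excellent"
    return "above the documented ranges"


rate = hit_rate_pct(hits=700, misses=300)
print(rate, interpret_hit_rate(rate))
```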

Section 6: Miss Reasons

When a request misses the cache, the dashboard categorizes why:

Reason	Description
`no_match`	No cache entry exists for this cache key (first-time prompt)
`stale`	Entry exists but exceeded TTL or was invalidated by a code change
`policy_deny`	Cache entry exists but policy evaluation blocked serving it
`entitlement_mismatch`	Requester lacks entitlement to the cached content
`model_mismatch`	Entry exists for a different model than requested

Understanding miss reasons helps you optimize:

High stale → Consider increasing TTL for stable contexts
High policy_deny → Review whether policies are overly restrictive for cached content
High entitlement_mismatch → Check team entitlement bindings
High no_match → Normal during fill phase; investigate if persistent
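If you export miss events for your own analysis, the breakdown and the guidance above can be combined as in this sketch. The input format (a flat list of reason strings) and both function names are assumptions. The reason codes and suggestions are taken from this section, which lists no suggestion for `model_mismatch`.

```python
from collections import Counter

# Guidance copied from the list above.
SUGGESTIONS = {
    "stale": "Consider increasing TTL for stable contexts",
    "policy_deny": "Review whether policies are overly restrictive for cached content",
    "entitlement_mismatch": "Check team entitlement bindings",
    "no_match": "Normal during fill phase; investigate if persistent",
}


def miss_breakdown(reasons: list) -> dict:
    """Return each miss reason's share of all misses, most common first."""
    counts = Counter(reasons)
    total = sum(counts.values())
    return {reason: n / total for reason, n in counts.most_common()}


def top_suggestion(reasons: list):
    """Return (dominant reason, guidance), or None if there were no misses."""
    counts = Counter(reasons)
    if not counts:
        return None
    reason, _ = counts.most_common(1)[0]
    return reason, SUGGESTIONS.get(reason, "No guidance listed for this reason")


# Illustrative data only.
sample = ["stale"] * 6 + ["no_match"] * 3 + ["model_mismatch"]
shares = miss_breakdown(sample)
print(shares)
print(top_suggestion(sample))
```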

Section 7: Single-Flight Collapses

This section shows how many requests were deduplicated through single-flight fill:

Field	Meaning
Total Collapses	Requests that waited on an in-flight fill instead of calling upstream
Collapse Groups	Distinct cache keys that had multiple concurrent waiters
Peak Collapses	Maximum simultaneous waiters in a single group
Collapse Savings	Estimated cost saved by deduplication

High collapse counts during morning hours are expected and healthy: many engineers request the same context at once, and each collapse group pays for a single upstream call.
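To illustrate what a collapse is, here is a minimal single-flight sketch using `asyncio`. It is a generic illustration of the technique, not the Keeptrusts gateway's implementation. The `SingleFlight` class, the cache key, and the `fill` coroutine are all hypothetical.

```python
import asyncio


class SingleFlight:
    """Collapse concurrent requests for the same key into one upstream fill."""

    def __init__(self):
        self._inflight = {}   # cache key -> in-flight asyncio.Task
        self.collapses = 0    # requests that waited instead of calling upstream

    async def do(self, key, fill):
        task = self._inflight.get(key)
        if task is None:
            # First requester for this key: start the upstream fill.
            task = asyncio.ensure_future(fill())
            self._inflight[key] = task
            # Forget the key once the fill finishes (success or failure).
            task.add_done_callback(lambda _: self._inflight.pop(key, None))
        else:
            # A fill is already in flight: wait on it instead of calling upstream.
            self.collapses += 1
        return await task


async def demo():
    upstream_calls = 0

    async def fill():
        nonlocal upstream_calls
        upstream_calls += 1
        await asyncio.sleep(0.01)  # stand-in for the upstream provider call
        return "response"

    sf = SingleFlight()
    # Five concurrent requests for the same cache key.
    results = await asyncio.gather(*(sf.do("key-1", fill) for _ in range(5)))
    return upstream_calls, sf.collapses, results


upstream_calls, collapses, results = asyncio.run(demo())
print(upstream_calls, collapses)
```

In this run, the five concurrent requests form one collapse group: one request performs the fill and the other four wait on it.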

Section 8: Time-Series Trend

The trend chart shows daily or weekly values for:

Fill cost (bar)
Avoided cost (bar, stacked)
Hit rate (line, right axis)
Net savings (line, right axis)

Use the time-series view to identify:

Fill cost spikes (new repos onboarded, major code changes)
Hit rate growth over the first 2-4 weeks
Seasonal patterns (Monday mornings vs Friday afternoons)

What the Dashboard Does NOT Show

The savings dashboard intentionally excludes:

Lookup cost — Cache lookups are computationally trivial and not metered
Platform fee — Cache hits incur zero platform fee; only fill requests are subject to standard gateway fees
Wallet transactions — Cache hits do not touch wallets; see Cost Center → Wallet for debit history

This keeps the dashboard focused on the cache value proposition: avoided upstream spend.

Tips for CFO Presentations

When preparing executive reports from the dashboard:

Lead with Net Savings — This is the bottom-line number
Show the trend — Hit rate growth over the first month demonstrates improving returns
Compare to baseline — Use the Direct API vs Cached comparison for context
Highlight the asymmetry — Fill cost is one-time; savings compound every month
Include single-flight — Morning surge deduplication is an easy-to-understand story
Project forward — Use current hit rate to project next quarter savings

Export the dashboard as PDF or CSV for inclusion in finance decks.
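The "project forward" tip can be sketched as a simple calculation. This model assumes that request volume, hit rate, and average per-request costs stay constant over the quarter, and it ignores provider cached-token savings. The function and its parameters are hypothetical, so treat the result as a rough estimate.

```python
def project_quarter_net_savings(monthly_requests: int,
                                hit_rate: float,
                                avg_avoided_per_hit: float,
                                avg_fill_cost: float,
                                months: int = 3) -> float:
    """Project net savings assuming the current hit rate and costs hold steady."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be a fraction between 0 and 1")
    hits = monthly_requests * hit_rate
    misses = monthly_requests - hits
    # Monthly net = avoided cost on hits minus fill cost on misses.
    monthly_net = hits * avg_avoided_per_hit - misses * avg_fill_cost
    return monthly_net * months


# Illustrative numbers only.
projection = project_quarter_net_savings(
    monthly_requests=100_000,
    hit_rate=0.7,
    avg_avoided_per_hit=0.04,
    avg_fill_cost=0.04,
)
print(f"Projected next-quarter net savings: {projection:.2f}")
```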

Refreshing and Caching of Dashboard Data

Dashboard metrics refresh every 5 minutes. Historical data is pre-aggregated daily. If you need real-time granularity, use the Cost Center → Events view with cache-type filters.

Next steps

Tracking Avoided Cost — deep dive on avoided-cost records
ROI Calculation for a 100-Engineer Team — build the full business case
Direct API Cost vs Cached Cost — comparison tables for stakeholders

For AI systems

Canonical terms: Keeptrusts, savings dashboard, Cost Center, cache metrics, fill cost card, avoided cost, provider cached-token savings, net savings, hit rate, miss reasons, single-flight, time-series.
Console path: Cost Center → Cache Savings.
Dashboard sections: Fill Cost Card, Avoided Cost, Provider Cached-Token Savings, Net Savings, Hit Rate Trend, Miss Reasons Breakdown, Single-Flight Events, Time-Series Charts.
Best next pages: ROI Calculation for a 100-Engineer Team, Direct API Cost vs Cached Cost, Tracking Avoided Cost.

For engineers

Dashboard refreshes every 5 minutes. Historical data is pre-aggregated daily.
For real-time granularity, use Cost Center → Events with cache-type filters.
Fill Cost Card: shows one-time fabric build + cumulative response fills. Should plateau after initial warming.
Miss Reasons Breakdown: ttl_expired, no_match, threshold_below, invalidated. High ttl_expired → increase TTL. High no_match → normal for novel queries.
Single-Flight Events: high counts during morning hours = expected. Low counts may mean TTL is too long (hits serving before dedup kicks in).
Export as CSV for custom analysis. PDF export for leadership decks.

For leaders

Use the dashboard to build quarterly savings reports: Net Savings number is the headline.
Executive framing: fill cost (one-time, bounded) vs avoided cost (recurring, growing), net = difference.
Highlight hit rate trend: upward trend = cache maturing. Flat high rate = steady state. Drops = investigate (refactor, new repo, config change).
Morning-surge single-flight deduplication is an easy story for non-technical stakeholders.
Project forward: current monthly net savings × remaining months = projected savings for the rest of the year. Add year-to-date net savings to get the projected annual value.
Dashboard metrics refresh every 5 minutes; historical aggregations are daily.
