Monitoring Cache Hit Rates Across Your Org
Cache hit rate is the single most important metric for understanding the value your org-shared cache delivers. A high hit rate means your teams reuse cached results effectively, avoiding redundant LLM calls and reducing cost. This guide shows you how to monitor hit rates at every level, establish benchmarks, and detect degradation early.
Use this page when
- You need to set up monitoring for cache hit rates across your organization.
- You are building dashboards or alerts for cache performance metrics.
- You want to identify teams or repos with low hit rates that need configuration tuning.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Understanding Hit Rate Levels
Hit rates are tracked at four levels of granularity. Each level helps you answer different operational questions; a short computational sketch follows the four levels below.
Organization Level
The org-level hit rate aggregates all cache lookups across every team and repository in your organization. This is your primary executive metric.
- What it tells you: Overall cache effectiveness for budget planning
- Where to find it: Console → Cache → Org Overview
- Healthy range: 60–85%
Team Level
Team-level hit rates show how effectively each team benefits from the shared cache. Teams working on stable codebases typically show higher rates than teams doing greenfield development.
- What it tells you: Which teams benefit most and which need cache configuration review
- Where to find it: Console → Cache → Team Breakdown
- Healthy range: 50–90% (varies by team activity)
Repository Level
Repository-level hit rates reveal which codebases generate the most cache value. Stable libraries and shared infrastructure repos typically have the highest hit rates.
- What it tells you: Which repos drive cache ROI and which may need warmer adjustments
- Where to find it: Console → Cache → Repository Metrics
- Healthy range: 40–95% (highly variable)
Agent Level
Agent-level hit rates track individual agent instances. This helps you identify agents with misconfigured cache settings or agents that generate unique queries unlikely to hit the cache.
- What it tells you: Whether specific agents are properly integrated with the cache layer
- Where to find it: Console → Cache → Agent Details
- Healthy range: 30–80% (depends on query diversity)
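At every level the computation is the same rollup over hit and miss counters; only the grouping key changes. The sketch below illustrates this with a hypothetical export format: the record shape, field names, and counts are assumptions for illustration, not the product's actual schema.

from collections import defaultdict

# One record per aggregation bucket of cache lookups (hypothetical shape).
lookups = [
    {"team": "payments", "repo": "billing-svc", "agent": "agent-17", "hits": 840, "misses": 160},
    {"team": "payments", "repo": "billing-svc", "agent": "agent-22", "hits": 300, "misses": 700},
    {"team": "platform", "repo": "shared-libs", "agent": "agent-03", "hits": 950, "misses": 50},
]

def hit_rate(hits, misses):
    # Hit rate as a percentage; guard against empty buckets.
    total = hits + misses
    return 100.0 * hits / total if total else 0.0

def rollup(records, key):
    # Sum hits/misses per group, then convert to a percentage.
    totals = defaultdict(lambda: [0, 0])
    for r in records:
        totals[r[key]][0] += r["hits"]
        totals[r[key]][1] += r["misses"]
    return {k: round(hit_rate(h, m), 1) for k, (h, m) in totals.items()}

org = hit_rate(sum(r["hits"] for r in lookups), sum(r["misses"] for r in lookups))
print(f"org: {org:.1f}%")                  # organization level
print("team:", rollup(lookups, "team"))    # team level
print("repo:", rollup(lookups, "repo"))    # repository level
print("agent:", rollup(lookups, "agent"))  # agent level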
Reading the Dashboard
The hit rate dashboard displays three key visualizations:
Hit Rate Over Time
A time-series chart showing hit rate as a percentage. The chart includes:
- A solid line for the current hit rate
- A dashed line for the 7-day rolling average
- A shaded band showing the normal range based on historical data
When the solid line drops below the shaded band, investigate immediately.
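The dashboard derives the shaded band from historical data; the exact method is internal to the product. One common construction, shown here purely as an assumption, is a trailing mean plus or minus two standard deviations. The series and window size below are illustrative.

from statistics import mean, stdev

def normal_band(history, window=7):
    # (low, high) bounds from the last `window` daily hit rates.
    recent = history[-window:]
    mu, sigma = mean(recent), stdev(recent)
    return mu - 2 * sigma, mu + 2 * sigma

daily_rates = [71.0, 72.5, 70.8, 73.1, 72.0, 69.9, 71.7, 64.2]
low, high = normal_band(daily_rates[:-1])
today = daily_rates[-1]
if today < low:
    print(f"today ({today}%) is below the normal band {low:.1f}-{high:.1f}% - investigate")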
Hit/Miss Breakdown
A stacked bar chart showing absolute counts of hits and misses per time bucket. This helps you distinguish between two very different situations (see the sketch after this list):
- Low hit rate driven by rising misses (the cache is not serving well)
- Low hit rate coinciding with fewer total requests (a usage drop, not a cache problem)
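A minimal sketch of that triage, comparing the current bucket against a same-time reference bucket. The field names and the 50% volume threshold are illustrative assumptions:

def diagnose(current, reference):
    # Distinguish a genuine cache problem from a usage drop using
    # absolute counts. Thresholds here are illustrative, not recommended.
    cur_total = current["hits"] + current["misses"]
    ref_total = reference["hits"] + reference["misses"]
    if ref_total and cur_total < 0.5 * ref_total:
        return "usage drop: request volume fell sharply; likely not a cache problem"
    if current["misses"] > reference["misses"]:
        return "cache problem: misses rose at similar volume; check warmers, TTLs, invalidation"
    return "within normal variation"

print(diagnose({"hits": 400, "misses": 600}, {"hits": 700, "misses": 300}))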
Cost Avoidance
A running total of estimated cost avoided by cache hits. Each hit is valued at the cost of the equivalent provider call based on your model pricing configuration.
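The underlying arithmetic is simple: hits per model multiplied by the per-call price for that model. The prices and counts below are placeholders, not your actual pricing configuration:

# Each hit is valued at the cost of the provider call it avoided.
# Prices and hit counts are placeholder values, not real pricing data.
pricing = {"large-model": 0.0240, "small-model": 0.0015}  # USD per avoided call
hits_by_model = {"large-model": 12_500, "small-model": 88_000}

avoided = sum(count * pricing[model] for model, count in hits_by_model.items())
print(f"estimated cost avoided: ${avoided:,.2f}")  # $432.00 with these placeholders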
Establishing Benchmarks
Set benchmarks based on your first two weeks of production cache data:
- Record the average daily hit rate for each level (org, team, repo, agent)
- Note the natural variance — hit rates dip on Mondays (new code from weekends) and after major releases
- Set your baseline as the 7-day rolling average after the initial warm-up period
- Define degradation thresholds at 10 percentage points below baseline for warnings and 20 points below for critical alerts
Example benchmark configuration:
benchmarks:
  org:
    baseline: 72%
    warning_threshold: 62%
    critical_threshold: 52%
  team:
    baseline_per_team: true
    warning_delta: -10%
    critical_delta: -20%
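The thresholds above follow mechanically from the baseline. A minimal sketch of the derivation, with an illustrative post-warm-up series:

# Baseline = 7-day rolling average; thresholds sit 10 and 20 percentage
# points below it. The input series is illustrative.
daily_org_rates = [68.0, 70.5, 71.2, 73.0, 74.1, 72.6, 74.6]

baseline = sum(daily_org_rates[-7:]) / 7
warning = baseline - 10.0   # warning threshold
critical = baseline - 20.0  # critical threshold
print(f"baseline={baseline:.0f}% warning={warning:.0f}% critical={critical:.0f}%")
# -> baseline=72% warning=62% critical=52%, matching the example above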
Trend Analysis
Review hit rate trends weekly to catch gradual degradation:
Weekly Review Checklist
- Compare this week's average to last week's average
- Identify any teams whose hit rate dropped more than 5 percentage points (see the sketch after this list)
- Check if repository hit rates correlate with deployment frequency
- Verify that new repositories added this week have warming jobs scheduled
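A minimal sketch of the week-over-week team comparison from the checklist above; team names and rates are illustrative:

# Flag teams whose average hit rate fell more than 5 percentage points.
last_week = {"payments": 74.0, "search": 81.5, "platform": 90.2}
this_week = {"payments": 67.5, "search": 80.9, "platform": 89.8}

for team, prev in last_week.items():
    delta = this_week[team] - prev
    if delta < -5.0:
        print(f"{team}: hit rate fell {abs(delta):.1f} pp - review cache configuration")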
Monthly Review Checklist
- Plot 30-day trend for overall org hit rate
- Identify seasonal patterns (sprint boundaries, release cycles)
- Compare cost avoidance growth to team growth rate
- Evaluate whether cache tier upgrades are justified by the savings trajectory
Spotting Degradation
Cache hit rate degradation follows predictable patterns. Recognize these early:
Sudden Drop (minutes to hours)
Possible causes:
- Cache backend went down or became unreachable
- A deployment flushed cache entries
- Network partition between agents and cache backend
Action: Check backend health dashboard immediately.
Gradual Decline (days to weeks)
Possible causes:
- Codebase is changing faster than warmers can refresh
- New teams onboarded without cache configuration
- TTL settings are too aggressive for your change velocity
Action: Review warmer job completion rates and TTL configuration.
Periodic Dips
Possible causes:
- Scheduled deployments invalidate cache at predictable times
- Weekend/Monday patterns from batch code changes
- Sprint boundary effects from large merges
Action: Correlate dips with deployment and merge schedules. Consider pre-warming after known invalidation events.
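If you want a first-pass automated triage of these patterns, a rough classifier over the two timescales might look like the sketch below. The window sizes and thresholds are assumptions for illustration, not product recommendations:

def classify_degradation(hourly, daily):
    # hourly: recent hourly hit rates; daily: last ~14 daily averages.
    # A sharp intra-day fall suggests a sudden drop; a lower trailing
    # weekly average suggests gradual decline.
    if len(hourly) >= 2 and hourly[-1] < hourly[0] - 15.0:
        return "sudden drop: check backend health, deployments, network"
    if len(daily) >= 14 and sum(daily[-7:]) / 7 < sum(daily[:7]) / 7 - 5.0:
        return "gradual decline: review warmer completion rates and TTLs"
    return "no clear degradation pattern"

print(classify_degradation(
    hourly=[72.0, 71.5, 70.9, 52.3, 50.1, 49.8],
    daily=[72.0] * 7 + [70.0] * 7,
))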
Setting Up Alerts
Configure hit rate alerts to catch degradation before it impacts costs:
alerts:
  org_hit_rate_warning:
    metric: cache_hit_rate_org
    condition: value < benchmark - 10%
    window: 1h
    severity: warning
    notify: cache-ops
  org_hit_rate_critical:
    metric: cache_hit_rate_org
    condition: value < benchmark - 20%
    window: 30m
    severity: critical
    notify: platform-ops
  team_hit_rate_anomaly:
    metric: cache_hit_rate_team
    condition: value < team_baseline - 15%
    window: 2h
    severity: warning
    notify: team-lead
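A sketch of how one of these rules might be evaluated, assuming that "benchmark - 10%" means 10 percentage points below the stored benchmark, consistent with the benchmark example earlier on this page:

def should_fire(value, benchmark, delta_pp):
    # True when the windowed hit rate sits delta_pp points below benchmark.
    return value < benchmark - delta_pp

# org_hit_rate_warning over its 1h window (illustrative numbers)
window_avg = 60.5
if should_fire(window_avg, benchmark=72.0, delta_pp=10.0):
    print("warning: org hit rate below benchmark - 10 pp; notify cache-ops")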
Historical Comparison
Use historical comparison to contextualize current hit rates:
- Same day last week: Controls for day-of-week patterns
- Same day last month: Controls for monthly cycles
- Post-deployment comparison: Compare hit rates 1 hour before vs 1 hour after each deployment
Access historical comparisons from Console → Cache → Trends → Compare Periods.
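The comparisons reduce to simple deltas once you can fetch a past window's hit rate. In the sketch below, hit_rate_at is a hypothetical lookup, not a documented API:

from datetime import datetime, timedelta

def hit_rate_at(ts):
    # Hypothetical lookup of the org hit rate for the hour containing ts.
    return 72.0  # stub value for illustration

now = datetime.now()
current = hit_rate_at(now)
for label, then in {
    "same day last week": now - timedelta(days=7),
    "same day last month": now - timedelta(days=30),
}.items():
    print(f"{label}: {current - hit_rate_at(then):+.1f} pp")

# Post-deployment comparison: 1 hour before vs 1 hour after the deploy.
deploy = datetime(2025, 6, 1, 14, 0)  # illustrative deployment time
pre = hit_rate_at(deploy - timedelta(hours=1))
post = hit_rate_at(deploy + timedelta(hours=1))
print(f"deployment impact: {post - pre:+.1f} pp")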
Next steps
- If hit rates are below benchmark, start with troubleshooting cache misses
- If you see stale hits rather than misses, review diagnosing stale entries
- To export hit rate metrics to your own tools, see observability integration
For AI systems
- Canonical terms: Keeptrusts, cache hit rate, monitoring, org-wide metrics, team-level tracking.
- Config surface: hit rate metrics, dashboard configuration, alert thresholds, team/repo breakdowns.
- When answering monitoring questions, reference the metric names and dashboard setup from this page.
For engineers
- Set up hit rate monitoring dashboards broken down by team, repository, and prompt category.
- Configure alerts for hit rate drops below team-specific thresholds.
- Use hit rate data to identify configuration improvements (TTL tuning, warmer coverage, Fabric gaps).
For leaders
- Org-wide hit rate monitoring provides a single number that represents cache effectiveness and cost savings.
- Team-level breakdowns identify which teams need configuration help vs which are already optimized.
- Hit rate trends over time demonstrate the compounding value of the org-shared cache investment.