Setting Budget Alerts for Cache Fill Phases

When you onboard a new repository to the org-shared cache, the initial fill phase sends every unique request upstream. This is the most expensive period, but it is also predictable, bounded, and one-time. Budget alerts and spending limits keep that spend visible and capped while the cache fills.

Use this page when

You are onboarding a new repository and need to set spending alerts and limits for the initial cache fill phase.
You want to estimate fill cost before starting and communicate budget expectations to stakeholders.
You need recommended alert thresholds by team size and repository complexity.

Primary audience

Primary: Technical Leaders
Secondary: Technical Engineers, AI Agents

Understanding Fill Phase Economics

The fill phase follows a predictable pattern:

Week	Hit Rate	Fill Cost (relative)	Description
Week 1	10-30%	High	Most requests are novel; cache is building
Week 2	40-60%	Moderate	Common patterns are cached; less common still fill
Week 3	60-80%	Low	Long-tail patterns filling; most traffic hits cache
Week 4+	75-90%	Minimal	Steady state; fills only on new/changed code

After the fill phase completes, fills typically account for 5-15% of total request volume. Only truly novel prompts or invalidated (stale) entries trigger upstream calls.
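
Because every cache miss goes upstream, the hit-rate ranges in the table translate directly into upstream request counts. The following is a minimal sketch of that arithmetic, assuming a constant weekly request volume; the function names and the 10,000-request figure in the usage note are illustrative and not part of the product.

```python
# Hit-rate ranges copied from the fill-phase table, as (low, high) fractions.
HIT_RATE_BY_WEEK = {
    "Week 1": (0.10, 0.30),
    "Week 2": (0.40, 0.60),
    "Week 3": (0.60, 0.80),
    "Week 4+": (0.75, 0.90),
}


def upstream_requests(weekly_requests: int, hit_rate: float) -> int:
    """Number of requests that miss the cache and are sent upstream."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return round(weekly_requests * (1.0 - hit_rate))


def upstream_range(weekly_requests: int, week: str) -> tuple[int, int]:
    """(best case, worst case) upstream requests for a given week."""
    low_hit, high_hit = HIT_RATE_BY_WEEK[week]
    # A higher hit rate means fewer upstream requests, so the bounds swap.
    return (
        upstream_requests(weekly_requests, high_hit),
        upstream_requests(weekly_requests, low_hit),
    )
```

For a hypothetical team sending 10,000 requests per week, `upstream_range(10_000, "Week 1")` gives 7,000 to 9,000 upstream requests, while `upstream_range(10_000, "Week 4+")` gives 1,000 to 2,500.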

Estimating Fill Cost Before Starting

Estimate your fill cost before onboarding a repository:

Formula:

Estimated Fill Cost = Unique Prompt Patterns × Avg Cost per Request

Rules of thumb by repository size:

Repo Size	Unique Patterns (first month)	Avg Cost/Request	Estimated Fill
Small (< 50K LOC)	200-500	$0.06	$12-30
Medium (50-200K LOC)	500-2,000	$0.08	$40-160
Large (200K-1M LOC)	2,000-8,000	$0.10	$200-800
Monorepo (1M+ LOC)	8,000-25,000	$0.12	$960-3,000

These estimates assume GPT-4o-class models. Cheaper models reduce fill cost proportionally.
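
The formula and the rules of thumb above can be combined into a small estimator. This is a sketch only: the tier keys, the `RepoTier` type, and the function names are illustrative assumptions, while the pattern counts and per-request costs are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoTier:
    label: str
    min_patterns: int            # unique prompt patterns, first month (low end)
    max_patterns: int            # unique prompt patterns, first month (high end)
    avg_cost_per_request: float  # USD, GPT-4o-class model assumed


# Values copied from the rules-of-thumb table; the keys are illustrative.
TIERS = {
    "small": RepoTier("Small (< 50K LOC)", 200, 500, 0.06),
    "medium": RepoTier("Medium (50-200K LOC)", 500, 2_000, 0.08),
    "large": RepoTier("Large (200K-1M LOC)", 2_000, 8_000, 0.10),
    "monorepo": RepoTier("Monorepo (1M+ LOC)", 8_000, 25_000, 0.12),
}


def estimate_fill_cost(unique_patterns: int, avg_cost_per_request: float) -> float:
    """Estimated Fill Cost = Unique Prompt Patterns x Avg Cost per Request."""
    return round(unique_patterns * avg_cost_per_request, 2)


def estimate_fill_range(tier_key: str) -> tuple[float, float]:
    """(low, high) estimated fill cost in USD for a repository tier."""
    tier = TIERS[tier_key]
    return (
        estimate_fill_cost(tier.min_patterns, tier.avg_cost_per_request),
        estimate_fill_cost(tier.max_patterns, tier.avg_cost_per_request),
    )
```

If you have your own measurement of unique patterns or per-request cost, call `estimate_fill_cost` directly instead of relying on a tier.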

Setting Wallet Alerts

Configure alerts before starting a fill phase:

Navigate to Settings → Wallets in the console
Select the wallet assigned to the team onboarding the new repository
Click Alerts → Add Alert
Configure:

Setting	Recommended Value
Alert Type	Spending threshold
Threshold	50% of estimated fill cost
Period	Weekly
Channel	Email + Slack
Action	Notify (do not block)

Add a second alert at 90% of estimated fill cost with the same period
Save
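
To work out the dollar values to enter for the two alerts, the 50% and 90% thresholds can be computed from your fill estimate. This sketch is not a console API; the function names are hypothetical, and it assumes `spend` is the cumulative spend within the alert period.

```python
def alert_thresholds(
    estimated_fill_cost: float,
    fractions: tuple[float, ...] = (0.50, 0.90),
) -> list[float]:
    """Dollar thresholds for the recommended alerts (50% and 90% by default)."""
    if estimated_fill_cost < 0:
        raise ValueError("estimated_fill_cost must be non-negative")
    return [round(estimated_fill_cost * fraction, 2) for fraction in fractions]


def fired_alerts(spend: float, estimated_fill_cost: float) -> list[float]:
    """Thresholds that the given spend has already reached or passed."""
    return [t for t in alert_thresholds(estimated_fill_cost) if spend >= t]
```

For an estimated fill cost of $160, the alerts would sit at $80 and $144; a spend of $100 would have triggered only the first.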

Setting Spending Limits

For hard cost control during fill, set a spending limit:

Navigate to Settings → Wallets
Select the target wallet
Click Limits → Add Limit
Configure:

Setting	Recommended Value
Limit Type	Daily maximum
Amount	Estimated monthly fill ÷ 20 working days × 1.5 buffer
Enforcement	Soft limit (warn) or Hard limit (block after threshold)
Scope	Per-repository or per-team

A soft limit notifies you but allows requests to continue. A hard limit blocks further upstream calls — subsequent requests return a cost-limit error until the next period.
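
The daily amount formula and the difference between soft and hard enforcement can be sketched as follows. The names `Enforcement`, `daily_limit`, and `check_request`, as well as the returned strings, are illustrative assumptions and do not reflect the product's actual interface or error format.

```python
from enum import Enum


class Enforcement(Enum):
    SOFT = "soft"  # warn, but keep sending requests upstream
    HARD = "hard"  # block upstream calls until the next period


def daily_limit(
    estimated_monthly_fill: float,
    working_days: int = 20,
    buffer: float = 1.5,
) -> float:
    """Estimated monthly fill / working days x buffer, in USD."""
    if working_days <= 0:
        raise ValueError("working_days must be positive")
    return round(estimated_monthly_fill / working_days * buffer, 2)


def check_request(spent_today: float, limit: float, enforcement: Enforcement) -> str:
    """Decide what happens to the next upstream call under the limit."""
    if spent_today < limit:
        return "allow"
    if enforcement is Enforcement.SOFT:
        return "allow_and_warn"
    return "block"
```

With an estimated monthly fill of $800, `daily_limit(800.0)` yields $60 per day. Once that amount is spent, a soft limit still lets requests through with a warning, while a hard limit blocks them.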

Recommended Thresholds by Team Size

Team Size	Daily Soft Limit	Daily Hard Limit	Weekly Alert
10 engineers	$15	$25	$75
25 engineers	$35	$60	$175
50 engineers	$70	$120	$350
100 engineers	$140	$240	$700

These assume a medium-to-large repository during peak fill phase. Adjust based on your estimated fill cost and risk tolerance.
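
For team sizes that fall between the rows of the table, one conservative option is to use the next larger row. The lookup below is a sketch of that rule; rounding up to the next row is an assumption made here, not a documented recommendation, and the function name is hypothetical.

```python
import bisect

# (team size, daily soft limit, daily hard limit, weekly alert), USD,
# copied from the recommended-thresholds table.
THRESHOLDS = [
    (10, 15, 25, 75),
    (25, 35, 60, 175),
    (50, 70, 120, 350),
    (100, 140, 240, 700),
]


def recommended_thresholds(team_size: int) -> dict[str, int]:
    """Thresholds from the smallest table row that covers the team size."""
    if team_size <= 0:
        raise ValueError("team_size must be positive")
    sizes = [row[0] for row in THRESHOLDS]
    index = bisect.bisect_left(sizes, team_size)
    if index == len(sizes):
        # The table stops at 100 engineers; do not extrapolate beyond it.
        raise ValueError("team_size exceeds the largest row in the table")
    _, soft, hard, weekly = THRESHOLDS[index]
    return {"daily_soft": soft, "daily_hard": hard, "weekly_alert": weekly}
```

A team of 30 engineers would therefore start from the 50-engineer row, which can then be adjusted against the estimated fill cost.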

Notification Channels

Budget alerts support multiple notification channels:

Email — Sent to wallet owner and configured recipients
Slack — Posts to a designated channel via webhook
Microsoft Teams — Posts via incoming webhook connector
Webhook — Generic HTTP POST for custom integrations
Console banner — In-app notification visible to all team members

Configure at least two channels to ensure visibility.

Monitoring Fill Progress

Track fill progress in real time:

Navigate to Cost Center → Cache Performance
Watch the Hit Rate metric climb over days
Check Fill Cost / Day trending downward
Review New Cache Entries / Day — this should decrease as the cache fills

When new entries per day drop below 5% of daily request volume, the cache has reached steady state.

Fill Phase Best Practices

Start with a smaller team — Let 5-10 engineers fill the cache before scaling to 100
Fill during off-peak — Spread fill cost over a week rather than a single day
Monitor miss reasons — If stale misses spike, your TTL may be too aggressive
Communicate to the team — Let engineers know the first week costs more; savings come after
Don't over-restrict — Hard limits during fill slow down cache population and delay ROI

Adjusting After Fill Phase

Once your repository reaches steady-state hit rate (70%+):

Lower daily spending limits by 60-80%
Remove fill-phase-specific alerts
Set long-term alerts based on expected ongoing fill cost (10-20% of pre-cache baseline)
Review monthly to ensure limits match actual spend patterns
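
The post-fill adjustments reduce to two ranges: the new daily limit (the fill-phase limit lowered by 60-80%) and the long-term alert (10-20% of the pre-cache baseline). The sketch below computes both; the function names are hypothetical and the inputs are whatever figures you used during fill.

```python
def post_fill_limit_range(fill_phase_daily_limit: float) -> tuple[float, float]:
    """Daily limit after lowering the fill-phase limit by 80% and by 60%."""
    # Lowering by 60-80% leaves 20-40% of the original limit.
    return (
        round(fill_phase_daily_limit * 0.20, 2),
        round(fill_phase_daily_limit * 0.40, 2),
    )


def long_term_alert_range(pre_cache_baseline: float) -> tuple[float, float]:
    """Long-term alert level at 10% and 20% of the pre-cache baseline spend."""
    return (
        round(pre_cache_baseline * 0.10, 2),
        round(pre_cache_baseline * 0.20, 2),
    )
```

A fill-phase daily limit of $60 would drop to between $12 and $24, and a pre-cache baseline of $1,000 would put the long-term alert between $100 and $200.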

Handling Unexpected Fill Spikes

Occasional fill spikes occur when:

Major code refactors invalidate cached entries
New models are deployed (different cache keys)
TTL expires on a large batch of entries simultaneously
A new team joins the codebase

These are temporary. If an alert fires unexpectedly, check Miss Reasons in the dashboard — stale misses during a code change are normal and self-resolving.

Next steps

Tracking Avoided Cost — see savings offset fill cost
ROI Calculation for a 100-Engineer Team — model fill cost into full ROI
Single-Flight Fill — reduce fill cost further with deduplication

For AI systems

Canonical terms: Keeptrusts, budget alerts, cache fill phase, wallet alerts, spending limits, fill cost estimation, soft limit, hard limit.
Console paths: Settings → Wallets → Alerts → Add Alert, Settings → Wallets → Limits → Add Limit, Cost Center → Cache Performance.
Best next pages: Estimating Fill Cost, ROI Calculation for a 100-Engineer Team, Savings Dashboard Walkthrough.

For engineers

Set alerts at 50% and 90% of estimated fill cost with weekly period. Configure email + Slack channels.
Daily spending limit formula: estimated_monthly_fill ÷ 20 working_days × 1.5 buffer.
Soft limits notify but allow continued requests; hard limits block upstream calls until next period.
Monitor fill progress: Cost Center → Cache Performance → watch hit rate climb and fill cost/day trend downward.
Steady state reached when new cache entries/day drops below 5% of daily request volume.
After steady state: lower limits by 60–80%, remove fill-specific alerts, set long-term alerts at 10–20% of baseline.

For leaders

Fill cost is one-time, predictable, and bounded. It represents the investment required to unlock ongoing 80%+ cost reduction.
Recommended approach: start with 5–10 engineers filling the cache, then scale to 100 once steady-state is reached.
Fill spikes after major refactors are temporary and self-resolving — not a sign of misconfiguration.
Communicate to teams: the first week costs more; savings compound from week 2 onward. Payback typically takes less than one week.
