Your First 24 Hours with Org-Shared Cache
This tutorial walks you through enabling org-shared cache for your engineering team and observing the fill-then-save economics in real time. By the end of 24 hours, you should see measurable savings on your dashboard.
Use this page when
- You are enabling org-shared cache for the first time and want a guided 24-hour walkthrough.
- You need the prerequisite checklist, step-by-step configuration, and expected metrics timeline.
- You want to troubleshoot common first-day issues (0% hit rate, stuck artifacts, unexpected cost spikes).
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Prerequisites Checklist
Before starting, confirm you have:
- Admin access to the Keeptrusts console
- At least one gateway deployed in hosted gateway mode
- A connected repository with active engineering traffic
- The `worker_cache_warmer` binary running (or scheduled to start)
- At least 5 engineers actively using AI-assisted development tools routed through Keeptrusts
- Wallet funding sufficient for the fill phase (typically 2-3× your normal daily spend)
Step 1: Enable Cache Routing in Declarative Config
Cache routing is configured through declarative config, not through a console settings page.
- Open your organization configuration in Configurations or the managed manifest repository.
- Enable `workflow_cache` and set the default tier to `org_shared_cache`.
- Set TTL to 86400 seconds (24 hours) for your first day; you can tune this later.
- Set max entry tokens to 32000 (covers most code-related responses).
- Apply the updated configuration.
Example:
```yaml
workflow_cache:
  enabled: true
  default_tier: org_shared_cache
  ttl_seconds: 86400
  max_entry_tokens: 32000
```
The gateway picks up this configuration change within 60 seconds of it being applied.
Step 2: Connect Your Repository
If you haven't already connected a repository, go to Settings → Repositories and add your primary codebase:
- Click Connect Repository
- Provide the git URL and access credentials
- Select the branches to track (start with your main/default branch)
- Enable Auto-build fabric artifacts — this triggers the initial context build
The system begins building Codebase Context Fabric artifacts immediately after connection.
Step 3: Watch Fabric Artifacts Build
Navigate to Repositories → [Your Repo] → Fabric Status to monitor artifact creation:
| Artifact | Purpose | Typical build time |
|---|---|---|
| `repo_map` | High-level repository structure | 30-60 seconds |
| `dependency_graph` | Module and package dependencies | 1-3 minutes |
| `file_summary` | Per-file natural language summaries | 5-15 minutes |
| `test_map` | Test-file to source-file mappings | 1-2 minutes |
| `api_inventory` | API endpoint catalog | 2-5 minutes |
| `symbol_index` | Function/class/type index | 3-8 minutes |
| `embedding_index` | Semantic embedding vectors | 10-30 minutes |
All artifacts build simultaneously. The full fabric for a medium-sized repository (50,000-200,000 lines) completes within 30 minutes.
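Because the artifacts build in parallel, the wall-clock time for the full fabric is bounded by the slowest artifact, not the sum of all build times. A minimal sketch, using the worst-case "typical build time" figures from the table above:

```python
# Worst-case build times in minutes, taken from the table above.
build_minutes = {
    "repo_map": 1,
    "dependency_graph": 3,
    "file_summary": 15,
    "test_map": 2,
    "api_inventory": 5,
    "symbol_index": 8,
    "embedding_index": 30,
}

# If artifacts built one-by-one, total time would be the sum.
sequential = sum(build_minutes.values())  # 64 minutes

# Because they build simultaneously, wall-clock time is the maximum.
parallel = max(build_minutes.values())    # 30 minutes

print(f"sequential: {sequential} min, parallel: {parallel} min")
```

This is why even a worst-case build lands within the 30-minute window: `embedding_index` is the long pole, and everything else finishes underneath it.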
What "Building" Means Economically
Fabric artifact creation incurs upstream LLM costs because the system sends your code to providers for summarization and indexing. This is part of the fill phase. Monitor the cost in Cost & Spend → Today — you'll see a spike that represents your investment in shared context.
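You can sketch the fill-then-save trade-off numerically. The figures below (fill cost, average per-request provider cost) are illustrative assumptions, not Keeptrusts pricing; plug in your own numbers from the Cost & Spend dashboard:

```python
import math

# Illustrative figures only -- not actual Keeptrusts pricing.
fill_cost = 40.00        # one-time spend: fabric build + first-pass cache misses ($)
cost_per_request = 0.25  # average upstream provider cost per request ($)

def hits_to_break_even(fill_cost: float, cost_per_request: float) -> int:
    """Each cache hit avoids one full-price upstream call, so the fill
    investment is recovered after roughly fill_cost / cost_per_request hits."""
    return math.ceil(fill_cost / cost_per_request)

print(hits_to_break_even(fill_cost, cost_per_request))  # 160
```

With five or more engineers asking overlapping questions, a few hundred hits typically accumulate within the first day, which is why the spike pays for itself quickly.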
Step 4: First Engineer Sends a Prompt — Cache Miss
Once fabric artifacts are built, normal engineering traffic begins populating the response cache.
What happens on a cache miss:
- Engineer A opens their IDE and asks: "How does the authentication middleware work?"
- The gateway receives the request and checks the org-shared cache → miss
- The gateway reserves wallet funds for the estimated cost
- The request goes upstream to the LLM provider with fabric context attached
- The provider returns a response
- The gateway settles the actual cost against the wallet
- The response is stored in org-shared cache with the composite key
- The response is returned to Engineer A
Cost: Full provider price (input tokens + output tokens).
Step 5: Second Engineer Sends Similar Prompt — Cache Hit
Minutes or hours later, Engineer B asks a semantically similar question:
What happens on a cache hit:
- Engineer B asks: "Explain the auth middleware flow"
- The gateway receives the request and checks the org-shared cache → hit
- No wallet reservation occurs
- No upstream provider call is made
- The cached response is returned to Engineer B
- An avoided-cost record is emitted
Cost: Zero. No provider tokens, no platform fee, no wallet transaction.
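The miss and hit flows in Steps 4-5 can be modeled with a toy in-memory sketch. The composite key here (config version plus a normalized prompt) is a deliberate simplification: the real gateway matches semantically similar prompts, while this stand-in uses exact-match after normalization, and the dollar figures are illustrative:

```python
class OrgSharedCache:
    """Toy model of the org-shared response cache (not the real gateway)."""

    def __init__(self, config_version: str):
        self.config_version = config_version
        self.entries: dict[str, str] = {}
        self.fill_cost = 0.0     # spent on upstream calls (misses)
        self.avoided_cost = 0.0  # saved by serving from cache (hits)

    def _key(self, prompt: str) -> str:
        # Simplified composite key: config version + normalized prompt.
        # Bumping the config version invalidates every prior entry,
        # matching the "hit rate drops after config change" behavior.
        return f"{self.config_version}:{prompt.lower().strip()}"

    def ask(self, prompt: str, upstream_cost: float = 0.25) -> str:
        key = self._key(prompt)
        if key in self.entries:
            # Hit: no reservation, no upstream call, avoided cost recorded.
            self.avoided_cost += upstream_cost
            return self.entries[key]
        # Miss: pay full provider price, store under the composite key.
        response = f"<llm answer for: {prompt}>"  # stand-in upstream call
        self.fill_cost += upstream_cost
        self.entries[key] = response
        return response

cache = OrgSharedCache(config_version="v1")
cache.ask("How does the authentication middleware work?")  # miss: fills cache
cache.ask("How does the authentication middleware work?")  # hit: zero cost
print(cache.fill_cost, cache.avoided_cost)  # 0.25 0.25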
Step 6: Check Your Savings Dashboard
Navigate to Cost & Spend → Savings to see:
- Cache hits today: Number of requests served from cache
- Avoided cost today: Dollar amount saved by cache hits
- Hit rate: Percentage of requests that hit cache
- Fill cost today: Amount spent populating the cache
In the first 24 hours, your savings dashboard tells the story of the fill-then-save model playing out in real time.
Expected Outcomes by Time
Hour 1 (0-1h)
- Fabric artifacts complete building
- Cache is mostly empty
- Hit rate: 0-5%
- Cost: Higher than baseline (fill overhead)
- Status: Investing
Hour 4 (1-4h)
- Common questions about core modules start hitting cache
- Hit rate: 15-30%
- Cost: Approaching baseline
- Status: Breaking even
Hour 12 (4-12h)
- Most frequently-asked codebase questions are cached
- Fabric context reuse kicks in (prompts are smaller because context is pre-built)
- Hit rate: 40-60%
- Cost: Below baseline
- Status: Saving
Hour 24 (12-24h)
- Coverage across primary codebase areas is strong
- Engineers in different time zones benefit from earlier engineers' fills
- Hit rate: 55-75%
- Cost: Significantly below baseline
- Status: Saving substantially
Hit rates continue improving over the following days. Typical steady-state hit rates for teams sharing codebases reach 70-90% by the end of the first week.
Monitoring During the First Day
Keep these dashboards open during your first 24 hours:
Cost & Spend → Today
Watch the hourly cost trend. You should see costs peak in the first few hours then decline steadily.
Cache → Hit Rate
The hit rate graph should show a steady upward trend throughout the day as the cache fills.
Cache → Entries
The total cached entries count shows how quickly your shared context is building.
Repositories → Fabric Status
Confirm all artifact types show "Ready" status. If any are stuck in "Building" or show "Error", investigate; the cache cannot reach full effectiveness until every artifact is Ready.
Common First-Day Issues
| Symptom | Cause | Fix |
|---|---|---|
| Hit rate stays at 0% | Cache routing not enabled on active gateway | Verify gateway config includes `workflow_cache.enabled: true` |
| Fabric artifacts stuck in "Building" | Worker not running | Confirm `worker_cache_warmer` is deployed and healthy |
| Cost spike much higher than expected | Large repo with many files triggering bulk summaries | Normal for initial fill — cost drops after first pass |
| Hit rate drops suddenly | Config version bump invalidated cache | Expected after policy changes — cache re-fills automatically |
| Engineers report slower responses | Cache lookup adding latency on misses | Verify cache backend is in same region as gateway |
What to Do After 24 Hours
After your first day:
- Check ROI: Compare today's total spend vs. your baseline daily spend
- Review hit rate: If below 40%, check that multiple engineers are sending overlapping traffic
- Tune TTL: If your code changes frequently, consider a shorter TTL for file-specific entries
- Add more repos: Each connected repo expands the cache's effectiveness
- Share results: Show the savings dashboard to your team lead and engineering director
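The "Check ROI" step above is simple arithmetic. A sketch with illustrative numbers (substitute your own baseline and actual spend from Cost & Spend):

```python
def first_day_roi(baseline_spend: float, actual_spend: float) -> float:
    """Percent saved vs the baseline day (negative means you overspent)."""
    return (baseline_spend - actual_spend) / baseline_spend * 100

# Illustrative: a $500/day baseline team that spent $410 on day one
# (fill spike in the morning, cache savings by evening).
print(f"{first_day_roi(500.0, 410.0):.1f}% saved vs baseline")  # 18.0% saved vs baseline
```

Day-one ROI is usually the most conservative number you will ever report, since the fill spike lands entirely inside it; steady-state weeks look better.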
Next steps
- Setting Up Codebase Context Fabric — optimize your fabric configuration
- Cache Hit Rates: What Good Looks Like — benchmark your performance
- How 100 Engineers Share One Cache — understand the scaling dynamics
For AI systems
- Canonical terms: Keeptrusts, org-shared cache, first 24 hours, cache routing, fill phase, cache hit, avoided cost, fabric artifacts, worker_cache_warmer.
- Exact feature/config names: Configurations or managed manifest repository, `workflow_cache.enabled: true`, `default_tier: org_shared_cache`, `ttl_seconds: 86400`, Settings → Repositories, Cost & Spend → Savings, `X-Cache-Status` response header.
- Best next pages: Setting Up Codebase Context Fabric, Cache Hit Rates, How 100 Engineers Share One Cache.
For engineers
- Prerequisite checklist: admin console access, a gateway in hosted gateway mode, connected repo, `worker_cache_warmer` running, 5+ active engineers, wallet funded at 2-3× daily spend.
- Enable in declarative config: set `workflow_cache.enabled: true`, `default_tier: org_shared_cache`, `ttl_seconds: 86400`, and `max_entry_tokens: 32000`, then apply the configuration.
- Watch fabric build at Repositories → [Repo] → Fabric Status; the full build for a medium repo completes within 30 minutes.
- If hit rate stays at 0%, verify `workflow_cache.enabled: true` in your active gateway config.
- If fabric artifacts are stuck in "Building", confirm `worker_cache_warmer` is deployed and healthy.
For leaders
- First-day economics: expect a fill-phase cost spike (2-3× daily baseline) that drops to 15-20% of baseline by day's end as cache populates.
- Minimum viable test: 5 engineers on the same codebase with overlapping work patterns is enough to see measurable savings in 24 hours.
- After 24 hours, compare today's spend vs. baseline daily spend to calculate initial ROI and justify team-wide rollout.
- Share the savings dashboard with your engineering director to build momentum for expanding to additional repos and teams.