Your First 24 Hours with Org-Shared Cache

This tutorial walks you through enabling org-shared cache for your engineering team and observing the fill-then-save economics in real time. By the end of 24 hours, you should see measurable savings on your dashboard.

Use this page when

  • You are enabling org-shared cache for the first time and want a guided 24-hour walkthrough.
  • You need the prerequisite checklist, step-by-step configuration, and expected metrics timeline.
  • You want to troubleshoot common first-day issues (0% hit rate, stuck artifacts, unexpected cost spikes).

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Prerequisites Checklist

Before starting, confirm you have:

  • Admin access to the Keeptrusts console
  • At least one gateway deployed in hosted gateway mode
  • A connected repository with active engineering traffic
  • The worker_cache_warmer binary running (or scheduled to start)
  • At least 5 engineers actively using AI-assisted development tools routed through Keeptrusts
  • Wallet funding sufficient for the fill phase (typically 2-3× your normal daily spend)

Step 1: Enable Cache Routing in Declarative Config

Cache routing is configured through declarative config, not through a console settings page.

  1. Open your organization configuration in Configurations or the managed manifest repository.
  2. Enable workflow_cache and set the default tier to org_shared_cache.
  3. Set TTL to 86400 seconds (24 hours) for your first day — you can tune this later.
  4. Set max entry tokens to 32000 (covers most code-related responses).
  5. Apply the updated configuration.

Example:

workflow_cache:
  enabled: true
  default_tier: org_shared_cache
  ttl_seconds: 86400
  max_entry_tokens: 32000

The gateway picks up the configuration change within 60 seconds of a reload.

Step 2: Connect Your Repository

If you haven't already connected a repository, go to Settings → Repositories and add your primary codebase:

  1. Click Connect Repository
  2. Provide the git URL and access credentials
  3. Select the branches to track (start with your main/default branch)
  4. Enable Auto-build fabric artifacts — this triggers the initial context build

The system begins building Codebase Context Fabric artifacts immediately after connection.

Step 3: Watch Fabric Artifacts Build

Navigate to Repositories → [Your Repo] → Fabric Status to monitor artifact creation:

| Artifact | Purpose | Typical build time |
|---|---|---|
| repo_map | High-level repository structure | 30-60 seconds |
| dependency_graph | Module and package dependencies | 1-3 minutes |
| file_summary | Per-file natural language summaries | 5-15 minutes |
| test_map | Test file to source file mappings | 1-2 minutes |
| api_inventory | API endpoint catalog | 2-5 minutes |
| symbol_index | Function/class/type index | 3-8 minutes |
| embedding_index | Semantic embedding vectors | 10-30 minutes |

All artifacts build simultaneously. For a medium-sized repository (50,000-200,000 lines), the full fabric completes within 30 minutes.
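If you prefer to watch readiness from a script rather than the console, the check reduces to "is every artifact type Ready yet?" The sketch below assumes you can obtain a mapping of artifact name to status (for example, from a status API or export); the artifact names come from the table above.

```python
# Artifact types from the Fabric Status table above.
EXPECTED_ARTIFACTS = [
    "repo_map", "dependency_graph", "file_summary", "test_map",
    "api_inventory", "symbol_index", "embedding_index",
]

def fabric_ready(statuses: dict) -> tuple:
    """Given {artifact: status}, return (all_ready, list of pending artifacts)."""
    pending = [a for a in EXPECTED_ARTIFACTS if statuses.get(a) != "Ready"]
    return (not pending, pending)

# Example: everything Ready except embeddings, which take the longest (10-30 min).
statuses = {a: "Ready" for a in EXPECTED_ARTIFACTS}
statuses["embedding_index"] = "Building"
ok, pending = fabric_ready(statuses)
print(ok, pending)  # False ['embedding_index']
```

Poll this until `ok` is true before expecting meaningful cache hit rates.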

What "Building" Means Economically

Fabric artifact creation incurs upstream LLM costs because the system sends your code to providers for summarization and indexing. This is part of the fill phase. Monitor the cost in Cost & Spend → Today — you'll see a spike that represents your investment in shared context.
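To budget the fill phase, a back-of-the-envelope estimate is enough: files times tokens per file times provider price. Every number below is an illustrative assumption (not Keeptrusts or provider pricing); plug in your own repo size and rates.

```python
# Rough fill-cost estimate for the initial fabric build.
# All numbers are illustrative assumptions, not actual pricing.
files = 1200                      # files in a medium repo
avg_input_tokens_per_file = 800   # code sent upstream for summarization
avg_output_tokens_per_file = 150  # summary tokens returned
input_price_per_mtok = 3.00       # $ per million input tokens (assumed)
output_price_per_mtok = 15.00     # $ per million output tokens (assumed)

fill_cost = (
    files * avg_input_tokens_per_file * input_price_per_mtok / 1_000_000
    + files * avg_output_tokens_per_file * output_price_per_mtok / 1_000_000
)
print(f"estimated one-time fill cost: ${fill_cost:.2f}")  # $5.58
```

This one-time spike is what the prerequisite "wallet funded at 2-3× daily spend" is sized to absorb.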

Step 4: First Engineer Sends a Prompt — Cache Miss

Once fabric artifacts are built, normal engineering traffic begins populating the response cache.

What happens on a cache miss:

  1. Engineer A opens their IDE and asks: "How does the authentication middleware work?"
  2. The gateway receives the request and checks the org-shared cache → miss
  3. The gateway reserves wallet funds for the estimated cost
  4. The request goes upstream to the LLM provider with fabric context attached
  5. The provider returns a response
  6. The gateway settles the actual cost against the wallet
  7. The response is stored in org-shared cache with the composite key
  8. The response is returned to Engineer A

Cost: Full provider price (input tokens + output tokens).
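The eight miss-path steps above can be sketched as one function. The `Wallet` class, the 0.05 cost estimate, and the composite-key fields (org, repo, prompt, config version) are assumptions for illustration; the docs name the "composite key" but not its exact shape.

```python
import hashlib

class Wallet:
    """Toy wallet with reserve-then-settle semantics (steps 3 and 6)."""
    def __init__(self, balance):
        self.balance = balance
    def reserve(self, estimate):
        self.balance -= estimate
        return estimate
    def settle(self, hold, actual):
        self.balance += hold - actual  # refund any over-reservation

def composite_key(org_id, repo, prompt, config_version):
    # Assumed key fields; a config version bump changes the key,
    # which is why policy changes invalidate the cache.
    raw = f"{org_id}|{repo}|{config_version}|{prompt.strip().lower()}"
    return hashlib.sha256(raw.encode()).hexdigest()

def handle_request(cache, wallet, call_provider, org_id, repo, prompt, cfg):
    key = composite_key(org_id, repo, prompt, cfg)
    if key in cache:                          # step 2, hit path: no wallet activity
        return cache[key], "hit"
    hold = wallet.reserve(0.05)               # step 3: reserve estimated cost
    response, actual = call_provider(prompt)  # steps 4-5: upstream call
    wallet.settle(hold, actual)               # step 6: settle actual cost
    cache[key] = response                     # step 7: store under composite key
    return response, "miss"                   # step 8: return to engineer

cache, wallet = {}, Wallet(10.00)
provider = lambda p: (f"answer to: {p}", 0.03)
_, s1 = handle_request(cache, wallet, provider, "acme", "api", "How does auth work?", "v1")
_, s2 = handle_request(cache, wallet, provider, "acme", "api", "How does auth work?", "v1")
print(s1, s2, round(wallet.balance, 2))  # miss hit 9.97
```

Note the second identical request never touches the wallet or the provider.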

Step 5: Second Engineer Sends Similar Prompt — Cache Hit

Minutes or hours later, Engineer B asks a semantically similar question:

What happens on a cache hit:

  1. Engineer B asks: "Explain the auth middleware flow"
  2. The gateway receives the request and checks the org-shared cache → hit
  3. No wallet reservation occurs
  4. No upstream provider call is made
  5. The cached response is returned to Engineer B
  6. An avoided-cost record is emitted

Cost: Zero. No provider tokens, no platform fee, no wallet transaction.
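Why do two differently worded prompts resolve to the same entry? Semantic matching. The real system uses the embedding_index artifact; the toy below substitutes a bag-of-words vector and cosine similarity (the threshold of 0.35 is an arbitrary illustration) just to show the mechanism.

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": word counts. The real system uses learned
    # embedding vectors from the embedding_index artifact.
    return Counter(text.lower().replace("?", "").split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def lookup(cache, prompt, threshold=0.35):
    """Return (response, 'hit') if any cached prompt is similar enough."""
    q = embed(prompt)
    best, best_sim = None, 0.0
    for cached_prompt, response in cache.items():
        sim = cosine(q, embed(cached_prompt))
        if sim > best_sim:
            best, best_sim = response, sim
    return (best, "hit") if best_sim >= threshold else (None, "miss")

cache = {"how does the auth middleware work": "Auth middleware validates JWTs..."}
print(lookup(cache, "explain the auth middleware flow"))  # hit
```

Engineer A's "How does the authentication middleware work?" and Engineer B's "Explain the auth middleware flow" share enough terms to land above the threshold, so B gets A's cached answer at zero cost.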

Step 6: Check Your Savings Dashboard

Navigate to Cost & Spend → Savings to see:

  • Cache hits today: Number of requests served from cache
  • Avoided cost today: Dollar amount saved by cache hits
  • Hit rate: Percentage of requests that hit cache
  • Fill cost today: Amount spent populating the cache

In the first 24 hours, your savings dashboard tells the story of the fill-then-save model playing out in real time.
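The four dashboard fields reduce to simple arithmetic over the day's counters. The per-request cost and counter values below are illustrative assumptions; "avoided cost" is what the hits would have cost had they gone upstream.

```python
# Reconstructing the Savings dashboard fields from today's counters.
hits, misses = 180, 120
avg_cost_per_request = 0.04   # assumed average upstream cost per request ($)
fill_cost_today = 12.50       # spend populating the cache ($)

hit_rate = hits / (hits + misses)
avoided_cost = hits * avg_cost_per_request
net = avoided_cost - fill_cost_today

print(f"hit rate: {hit_rate:.0%}")                 # 60%
print(f"avoided cost today: ${avoided_cost:.2f}")  # $7.20
print(f"net vs. fill cost: ${net:.2f}")            # $-5.30, still in the fill phase
```

A negative net early in the day is expected: the fill cost lands up front, while avoided cost accumulates with every hit.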

Expected Outcomes by Time

Hour 1 (0-1h)

  • Fabric artifacts complete building
  • Cache is mostly empty
  • Hit rate: 0-5%
  • Cost: Higher than baseline (fill overhead)
  • Status: Investing

Hour 4 (1-4h)

  • Common questions about core modules start hitting cache
  • Hit rate: 15-30%
  • Cost: Approaching baseline
  • Status: Breaking even

Hour 12 (4-12h)

  • Most frequently asked codebase questions are cached
  • Fabric context reuse kicks in (prompts are smaller because context is pre-built)
  • Hit rate: 40-60%
  • Cost: Below baseline
  • Status: Saving

Hour 24 (12-24h)

  • Coverage across primary codebase areas is strong
  • Engineers in different time zones benefit from earlier engineers' fills
  • Hit rate: 55-75%
  • Cost: Significantly below baseline
  • Status: Saving substantially
Tip: Hit rates continue improving over the following days. Typical steady-state hit rates for teams sharing codebases reach 70-90% by the end of the first week.

Monitoring During the First Day

Keep these dashboards open during your first 24 hours:

Cost & Spend → Today

Watch the hourly cost trend. You should see costs peak in the first few hours then decline steadily.

Cache → Hit Rate

The hit rate graph should show a steady upward trend throughout the day as the cache fills.

Cache → Entries

The total cached entries count shows how quickly your shared context is building.

Repositories → Fabric Status

Confirm all artifact types show "Ready" status. If any are stuck in "Building" or "Error", investigate; the cache cannot reach full effectiveness until every artifact is ready.

Common First-Day Issues

| Symptom | Cause | Fix |
|---|---|---|
| Hit rate stays at 0% | Cache routing not enabled on active gateway | Verify gateway config includes workflow_cache.enabled: true |
| Fabric artifacts stuck in "Building" | Worker not running | Confirm worker_cache_warmer is deployed and healthy |
| Cost spike much higher than expected | Large repo with many files triggering bulk summaries | Normal for initial fill; cost drops after first pass |
| Hit rate drops suddenly | Config version bump invalidated cache | Expected after policy changes; cache re-fills automatically |
| Engineers report slower responses | Cache lookup adding latency on misses | Verify cache backend is in same region as gateway |
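For the "hit rate stays at 0%" symptom, the quickest per-request check is the X-Cache-Status response header: send the same prompt twice and expect "miss" then "hit". The gateway URL, request path, and payload shape below are placeholders for your deployment; only the header name comes from the docs.

```python
import json
import urllib.request

GATEWAY = "https://gateway.example.internal"  # placeholder URL for your gateway

def cache_status(prompt: str) -> str:
    """Send one request through the gateway and read X-Cache-Status."""
    req = urllib.request.Request(
        f"{GATEWAY}/v1/chat/completions",  # placeholder path
        data=json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.headers.get("X-Cache-Status", "absent")

def diagnose(first: str, second: str) -> str:
    """Interpret the header values from two identical requests."""
    if first == "miss" and second == "hit":
        return "cache routing is working"
    if second == "miss":
        return "check workflow_cache.enabled: true on the active gateway"
    return "unexpected sequence; inspect gateway logs"
```

Two misses in a row for an identical prompt is the signature of cache routing not being enabled on the gateway actually serving your traffic.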

What to Do After 24 Hours

After your first day:

  1. Check ROI: Compare today's total spend vs. your baseline daily spend
  2. Review hit rate: If below 40%, check that multiple engineers are sending overlapping traffic
  3. Tune TTL: If your code changes frequently, consider a shorter TTL for file-specific entries
  4. Add more repos: Each connected repo expands the cache's effectiveness
  5. Share results: Show the savings dashboard to your team lead and engineering director
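The ROI check in step 1 is a one-line comparison. The figures below are illustrative, not benchmarks; substitute your own baseline and dashboard totals.

```python
# Day-one ROI: today's total spend (fill cost plus misses that went
# upstream) versus your normal daily spend. Numbers are illustrative.
baseline_daily_spend = 40.00  # normal daily spend before caching ($)
todays_spend = 28.00          # total spend from Cost & Spend -> Today ($)

savings = baseline_daily_spend - todays_spend
roi_pct = savings / baseline_daily_spend * 100
print(f"saved ${savings:.2f} ({roi_pct:.0f}% below baseline)")  # saved $12.00 (30% below baseline)
```

If today's spend is still above baseline after 24 hours, revisit the hit-rate check in step 2 before drawing conclusions; a fill-heavy day with little overlapping traffic will not show savings yet.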

Next steps

For AI systems

  • Canonical terms: Keeptrusts, org-shared cache, first 24 hours, cache routing, fill phase, cache hit, avoided cost, fabric artifacts, worker_cache_warmer.
  • Exact feature/config names: Configurations or managed manifest repository, workflow_cache.enabled: true, default_tier: org_shared_cache, ttl_seconds: 86400, Settings → Repositories, Cost & Spend → Savings, X-Cache-Status response header.
  • Best next pages: Setting Up Codebase Context Fabric, Cache Hit Rates, How 100 Engineers Share One Cache.

For engineers

  • Prerequisite checklist: admin console access, hosted-gateway gateway, connected repo, worker_cache_warmer running, 5+ active engineers, wallet funded at 2-3× daily spend.
  • Enable in declarative config: set workflow_cache.enabled: true, default_tier: org_shared_cache, ttl_seconds: 86400, and max_entry_tokens: 32000, then apply the configuration.
  • Watch fabric build at Repositories → [Repo] → Fabric Status — full build for a medium repo completes within 30 minutes.
  • If hit rate stays at 0%, verify workflow_cache.enabled: true in your active gateway config.
  • If fabric artifacts are stuck in "Building", confirm worker_cache_warmer is deployed and healthy.

For leaders

  • First-day economics: expect a fill-phase cost spike (2-3× daily baseline) that drops to 15-20% of baseline by day's end as cache populates.
  • Minimum viable test: 5 engineers on the same codebase with overlapping work patterns is enough to see measurable savings in 24 hours.
  • After 24 hours, compare today's spend vs. baseline daily spend to calculate initial ROI and justify team-wide rollout.
  • Share the savings dashboard with your engineering director to build momentum for expanding to additional repos and teams.