Cost Savings for Mono-Repo Teams
Monorepos are the ideal topology for org-shared caching. A single codebase identity means every engineer working across packages shares the same cache pool — one set of fabric artifacts, maximum semantic overlap, and hit rates that routinely reach 85–95%.
Use this page when
- You have a monorepo and want to understand the cost savings achievable with org-shared caching.
- You need configuration examples for monorepo_group mode and expected hit rate benchmarks.
- You are presenting the monorepo caching case to leadership with savings projections.
Primary audience
- Primary: Technical Leaders
- Secondary: Technical Engineers, AI Agents
Why Monorepos Excel at Caching
When your entire engineering organization works in one repository, the conditions for cache efficiency compound:
- Shared context: Engineers ask similar questions about the same modules, utilities, and patterns.
- Unified fabric artifacts: A single fabric definition covers all packages, so cache keys align naturally.
- High contributor density: 100+ engineers touching overlapping code paths generate repeated queries that resolve from cache.
- Consistent coding patterns: Monorepo style guides and shared libraries mean similar prompts tend to produce semantically equivalent responses.
Configuring Monorepo Group Mode
To enable monorepo-aware caching, set the codebase_identity_mode to monorepo_group in your policy configuration:
cache:
  codebase_identity_mode: monorepo_group
  monorepo_group_id: "platform-monorepo"
  monorepo_repo_ids:
    - "github.com/acme/platform"
  fabric_scope: org
  ttl_seconds: 86400
Configuration Fields
| Field | Description |
|---|---|
| codebase_identity_mode | Set to monorepo_group to treat multiple packages as one cache pool. |
| monorepo_group_id | A unique identifier for your monorepo group. All packages within share this identity. |
| monorepo_repo_ids | The repository identifiers that belong to this group. |
| fabric_scope | Set to org for org-wide sharing across all teams. |
| ttl_seconds | How long cached responses remain valid, in seconds. |
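To picture how these fields fit together, the following Python sketch maps a repository to a cache pool. It is illustrative only: the helper name `resolve_cache_pool`, the dictionary form of the configuration, and the fallback for repositories outside the group are assumptions, not Keeptrusts internals.

```python
def resolve_cache_pool(repo_id: str, cache_config: dict) -> str:
    """Return the cache pool identity for a repository (hypothetical helper)."""
    mode = cache_config.get("codebase_identity_mode")
    group_repos = cache_config.get("monorepo_repo_ids", [])
    if mode == "monorepo_group" and repo_id in group_repos:
        # Every repository listed in the group shares one pool identity.
        return cache_config["monorepo_group_id"]
    # Assumed fallback: a repository outside the group keeps its own pool.
    return repo_id


# Dictionary mirror of the YAML configuration shown on this page.
example_config = {
    "codebase_identity_mode": "monorepo_group",
    "monorepo_group_id": "platform-monorepo",
    "monorepo_repo_ids": ["github.com/acme/platform"],
    "fabric_scope": "org",
    "ttl_seconds": 86400,
}

print(resolve_cache_pool("github.com/acme/platform", example_config))
```

Under these assumptions, any request that originates from `github.com/acme/platform` lands in the `platform-monorepo` pool regardless of which package the engineer is working in.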
Example: Platform Monorepo with api/, cli/, console/
Consider a platform team with three packages in one repository:
platform/
├── api/ # Rust Axum backend
├── cli/ # Rust CLI binary
└── console/ # Next.js frontend
All three packages share utilities, types, and patterns. When an engineer on the console team asks about error handling, the cached response from an api/ engineer's identical question resolves instantly.
cache:
  codebase_identity_mode: monorepo_group
  monorepo_group_id: "platform-monorepo"
  monorepo_repo_ids:
    - "github.com/acme/platform"
  fabric_scope: org
  ttl_seconds: 86400
  semantic_similarity_threshold: 0.92
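The semantic_similarity_threshold value can be read as a cutoff on how close a new prompt must be to a cached one before the cached response is reused. The sketch below assumes prompts are compared as embedding vectors using cosine similarity; this page does not say which metric Keeptrusts uses, so treat the metric and the function names as hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_cache_hit(query_embedding: list[float],
                 cached_embedding: list[float],
                 threshold: float = 0.92) -> bool:
    """Assumed rule: reuse the cached response only at or above the threshold."""
    return cosine_similarity(query_embedding, cached_embedding) >= threshold


# Toy 2-D vectors stand in for real embeddings.
print(is_cache_hit([1.0, 0.1], [1.0, 0.0]))  # nearly parallel vectors
print(is_cache_hit([1.0, 1.0], [1.0, 0.0]))  # 45 degrees apart
```

Raising the threshold makes matches stricter, which lowers the hit rate but also lowers the chance that a question about one package is answered with a response cached for a different one.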
Expected Hit Rates by Package Overlap
| Scenario | Typical Hit Rate | Share of Upstream Costs Avoided (100 engineers) |
|---|---|---|
| All packages share types and utilities | 90–95% | 90–95% of upstream costs avoided |
| Moderate shared code (50% overlap) | 85–90% | 85–90% of upstream costs avoided |
| Loosely coupled packages | 75–85% | 75–85% of upstream costs avoided |
How Cache Hits Save Money
Every cache hit in a monorepo avoids three costs simultaneously:
- No upstream provider call — the LLM provider is never contacted.
- No wallet reserve/settle — your team's wallet balance is not debited.
- No platform fee — Keeptrusts does not charge for cached responses.
With 100 engineers making 50 requests per day, your organization sends 5,000 requests daily. At a 90% hit rate, 4,500 of those never reach the upstream provider. At an average cost of $0.03 per request, that represents $135/day, or approximately $4,050 over a 30-day month.
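The arithmetic above can be reproduced with a few lines of Python, which makes it easy to substitute your own headcount, request volume, or per-request cost. The function name and the 30-day month are assumptions made for this sketch; the input figures are the ones quoted on this page.

```python
def estimate_savings(engineers: int,
                     requests_per_day: int,
                     hit_rate: float,
                     cost_per_request: float,
                     days_per_month: int = 30) -> tuple[float, float, float]:
    """Return (avoided calls per day, daily savings, monthly savings)."""
    daily_requests = engineers * requests_per_day
    avoided_calls = daily_requests * hit_rate
    daily_savings = avoided_calls * cost_per_request
    return avoided_calls, daily_savings, daily_savings * days_per_month


avoided, daily, monthly = estimate_savings(
    engineers=100, requests_per_day=50, hit_rate=0.90, cost_per_request=0.03
)
print(f"Avoided calls/day: {avoided:.0f}")   # 4500
print(f"Daily savings:     ${daily:.2f}")    # $135.00
print(f"Monthly savings:   ${monthly:.2f}")  # $4050.00
```

This estimate covers avoided upstream provider cost only and treats every request as costing the same amount; real per-request costs vary with prompt and response length.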
Monitoring Monorepo Cache Performance
Use the savings dashboard to track:
- Hit rate by package: Identify which packages benefit most from shared context.
- Estimated avoided cost: The total dollar amount saved through cache hits.
- Fill frequency: How often new responses enter the cache pool.
- Peak sharing hours: When cross-team cache sharing is highest.
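If you export request records for your own analysis, per-package hit rate is a simple ratio of hits to total requests. The sketch below assumes each record is a dictionary with a `package` name and a boolean `cache_hit` flag; that record shape is hypothetical and is not a documented Keeptrusts export format.

```python
from collections import defaultdict


def hit_rate_by_package(records: list[dict]) -> dict[str, float]:
    """Compute hits / total requests for each package in the records."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for record in records:
        package = record["package"]
        totals[package] += 1
        if record["cache_hit"]:
            hits[package] += 1
    return {package: hits[package] / totals[package] for package in totals}


# Hypothetical sample records for two packages.
sample_records = [
    {"package": "api", "cache_hit": True},
    {"package": "api", "cache_hit": True},
    {"package": "api", "cache_hit": True},
    {"package": "api", "cache_hit": False},
    {"package": "console", "cache_hit": True},
    {"package": "console", "cache_hit": False},
]

for package, rate in sorted(hit_rate_by_package(sample_records).items()):
    print(f"{package}: {rate:.0%}")
```

A package whose ratio sits well below the others is a candidate for the weekly review described in the best practices.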
Best Practices for Monorepo Caching
- Keep semantic_similarity_threshold at 0.92 or above to avoid false positives.
- Set ttl_seconds to 86400 (24 hours) for stable codebases, or shorter for rapidly evolving packages.
- Use org-scoped fabric so all teams within the monorepo benefit from each other's cache fills.
- Review the savings dashboard weekly to identify packages with unexpectedly low hit rates.
- Ensure your fabric artifacts cover shared types and utilities that appear across all packages.
Scaling Considerations
As your monorepo grows beyond 200 engineers, cache hit rates typically improve further because query diversity plateaus while contributor count continues rising. The marginal engineer adds more cache hits than cache misses.
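A toy model shows why a plateau in query diversity pushes hit rates up. It assumes a fixed number of distinct queries, each equally likely, exact-match lookups, a cache that starts empty, and no expiry within the window. None of these assumptions come from Keeptrusts measurements; the point is only the direction of the trend.

```python
def expected_hit_rate(distinct_queries: int, requests_in_window: int) -> float:
    """Expected hit rate when requests are drawn uniformly from a fixed query set.

    The first occurrence of each query is a miss (a cache fill); every repeat
    is a hit. The expected number of distinct queries seen after N draws from
    Q equally likely options is Q * (1 - (1 - 1/Q) ** N).
    """
    if distinct_queries <= 0 or requests_in_window <= 0:
        return 0.0
    expected_fills = distinct_queries * (
        1 - (1 - 1 / distinct_queries) ** requests_in_window
    )
    return 1 - expected_fills / requests_in_window


# Hypothetical: 2,000 distinct queries, 50 requests per engineer per day.
for engineers in (100, 200, 400):
    rate = expected_hit_rate(distinct_queries=2000, requests_in_window=engineers * 50)
    print(f"{engineers} engineers: {rate:.1%} expected hit rate")
```

In this model the number of fills is capped by the number of distinct queries, so every additional request beyond that cap is almost certainly a hit.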
For monorepos with distinct sub-teams that rarely share code, consider splitting into multiple monorepo_group_id values to keep cache pools focused and hit rates high.
For AI systems
- Canonical terms: Keeptrusts, monorepo, org-shared cache, monorepo_group, codebase_identity_mode, cache hit rate, semantic_similarity_threshold, fabric_scope.
- Config keys: cache.codebase_identity_mode: monorepo_group, cache.monorepo_group_id, cache.monorepo_repo_ids, cache.fabric_scope: org, cache.semantic_similarity_threshold.
- Best next pages: Configuring Monorepo Group Caching, Cost Savings for Multi-Repo Teams, ROI Calculation for a 100-Engineer Team.
For engineers
- Set codebase_identity_mode: monorepo_group with your monorepo’s repository ID.
- Use fabric_scope: org for org-wide sharing across all teams within the monorepo.
- Keep semantic_similarity_threshold at 0.92+ to avoid false-positive cross-package cache hits.
- Set ttl_seconds: 86400 (24h) for stable codebases; shorter for rapidly evolving packages.
- Monitor per-package hit rates in the savings dashboard to identify underperforming packages.
For leaders
- Monorepos with shared code typically achieve 85–95% hit rates, the highest of any topology, because all engineers share one cache pool. Loosely coupled packages land closer to 75–85%.
- 100 engineers making 50 requests/day at 90% hit rate: avoid 4,500 upstream calls daily = ~$4,050/month savings.
- Every cache hit avoids all three costs simultaneously: no provider call, no wallet debit, no platform fee.
- As the monorepo grows past 200 engineers, hit rates improve because query diversity plateaus while contributors increase.
- Review weekly: identify packages with low hit rates for potential excluded_agents or threshold adjustments.
Next steps
- Configuring Monorepo Group Caching — detailed setup guide
- Cost Savings for Multi-Repo Teams — alternative topology
- ROI Calculation for a 100-Engineer Team — full business case