Cost Savings for Mono-Repo Teams

Monorepos are the ideal topology for org-shared caching. A single codebase identity means every engineer working across packages shares the same cache pool — one set of fabric artifacts, maximum semantic overlap, and hit rates that routinely reach 85–95%.

Use this page when

  • You have a monorepo and want to understand the cost savings achievable with org-shared caching.
  • You need configuration examples for monorepo_group mode and expected hit rate benchmarks.
  • You are presenting the monorepo caching case to leadership with savings projections.

Primary audience

  • Primary: Technical Leaders
  • Secondary: Technical Engineers, AI Agents

Why Monorepos Excel at Caching

When your entire engineering organization works in one repository, the conditions for cache efficiency compound:

  • Shared context: Engineers ask similar questions about the same modules, utilities, and patterns.
  • Unified fabric artifacts: A single fabric definition covers all packages, so cache keys align naturally.
  • High contributor density: 100+ engineers touching overlapping code paths generate repeated queries that resolve from cache.
  • Consistent coding patterns: Monorepo style guides and shared libraries mean prompts produce semantically identical responses.

Configuring Monorepo Group Mode

To enable monorepo-aware caching, set the codebase_identity_mode to monorepo_group in your policy configuration:

cache:
  codebase_identity_mode: monorepo_group
  monorepo_group_id: "platform-monorepo"
  monorepo_repo_ids:
    - "github.com/acme/platform"
  fabric_scope: org
  ttl_seconds: 86400

Configuration Fields

Field | Description
codebase_identity_mode | Set to monorepo_group to treat multiple packages as one cache pool.
monorepo_group_id | A unique identifier for your monorepo group. All packages within share this identity.
monorepo_repo_ids | The repository identifiers that belong to this group.
fabric_scope | Set to org for org-wide sharing across all teams.
ttl_seconds | How long cached responses remain valid.

Example: Platform Monorepo with api/, cli/, console/

Consider a platform team with three packages in one repository:

platform/
├── api/ # Rust Axum backend
├── cli/ # Rust CLI binary
└── console/ # Next.js frontend

All three packages share utilities, types, and patterns. When an engineer on the console team asks about error handling, the cached response from an api/ engineer's semantically equivalent question resolves instantly.

cache:
  codebase_identity_mode: monorepo_group
  monorepo_group_id: "platform-monorepo"
  monorepo_repo_ids:
    - "github.com/acme/platform"
  fabric_scope: org
  ttl_seconds: 86400
  semantic_similarity_threshold: 0.92

Expected Hit Rates by Package Overlap

Scenario | Typical Hit Rate | Monthly Savings (100 engineers)
All packages share types and utilities | 90–95% | 90–95% of upstream costs avoided
Moderate shared code (50% overlap) | 85–90% | 85–90% of upstream costs avoided
Loosely coupled packages | 75–85% | 75–85% of upstream costs avoided

How Cache Hits Save Money

Every cache hit in a monorepo avoids three costs simultaneously:

  1. No upstream provider call — the LLM provider is never contacted.
  2. No wallet reserve/settle — your team's wallet balance is not debited.
  3. No platform fee — Keeptrusts does not charge for cached responses.

With a 90% hit rate across 100 engineers making 50 requests per day, you avoid 4,500 upstream calls daily. At an average cost of $0.03 per request, that represents $135/day or approximately $4,050/month in savings.
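The arithmetic above can be sketched in a few lines of Python. The engineer count, request rate, hit rate, and per-request cost are the example figures from this page, not platform defaults:

```python
# Back-of-envelope savings estimate using the example figures above.
engineers = 100
requests_per_day = 50
hit_rate = 0.90
cost_per_request = 0.03  # average upstream cost in USD

daily_requests = engineers * requests_per_day     # 5,000 requests/day
avoided_calls = daily_requests * hit_rate         # 4,500 cache hits/day
daily_savings = avoided_calls * cost_per_request  # $135/day
monthly_savings = daily_savings * 30              # ~$4,050/month

print(f"Avoided upstream calls/day: {avoided_calls:.0f}")
print(f"Savings: ${daily_savings:.2f}/day, ~${monthly_savings:,.0f}/month")
```

Substitute your own team's numbers to build a projection for leadership.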

Monitoring Monorepo Cache Performance

Use the savings dashboard to track:

  • Hit rate by package: Identify which packages benefit most from shared context.
  • Estimated avoided cost: The total dollar amount saved through cache hits.
  • Fill frequency: How often new responses enter the cache pool.
  • Peak sharing hours: When cross-team cache sharing is highest.

Best Practices for Monorepo Caching

  • Keep semantic_similarity_threshold at 0.92 or above to avoid false positives.
  • Set ttl_seconds to 86400 (24 hours) for stable codebases, or shorter for rapidly evolving packages.
  • Use org-scoped fabric so all teams within the monorepo benefit from each other's cache fills.
  • Review the savings dashboard weekly to identify packages with unexpectedly low hit rates.
  • Ensure your fabric artifacts cover shared types and utilities that appear across all packages.

Scaling Considerations

As your monorepo grows beyond 200 engineers, cache hit rates typically improve further because query diversity plateaus while contributor count continues rising. The marginal engineer adds more cache hits than cache misses.

For monorepos with distinct sub-teams that rarely share code, consider splitting into multiple monorepo_group_id values to keep cache pools focused and hit rates high.
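As a sketch, such a split could define one policy per sub-team group. The group IDs below are illustrative, and how repository identifiers map onto multiple groups depends on your policy setup:

# Hypothetical policy for the core sub-teams' packages
cache:
  codebase_identity_mode: monorepo_group
  monorepo_group_id: "platform-core"
  monorepo_repo_ids:
    - "github.com/acme/platform"
  fabric_scope: org
  ttl_seconds: 86400

# Hypothetical policy for a loosely coupled sub-team, kept in its own pool
cache:
  codebase_identity_mode: monorepo_group
  monorepo_group_id: "platform-data"
  monorepo_repo_ids:
    - "github.com/acme/platform"
  fabric_scope: org
  ttl_seconds: 86400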

For AI systems

For engineers

  • Set codebase_identity_mode: monorepo_group with your monorepo’s repository ID.
  • Use fabric_scope: org for org-wide sharing across all teams within the monorepo.
  • Keep semantic_similarity_threshold at 0.92+ to avoid false-positive cross-package cache hits.
  • Set ttl_seconds: 86400 (24h) for stable codebases; shorter for rapidly evolving packages.
  • Monitor per-package hit rates in the savings dashboard to identify underperforming packages.

For leaders

  • Monorepos achieve 85–95% hit rates — the highest of any topology — because all engineers share one cache pool.
  • 100 engineers making 50 requests/day at 90% hit rate: avoid 4,500 upstream calls daily = ~$4,050/month savings.
  • Every cache hit avoids all three costs simultaneously: no provider call, no wallet debit, no platform fee.
  • As the monorepo grows past 200 engineers, hit rates improve because query diversity plateaus while contributors increase.
  • Review weekly: identify packages with low hit rates for potential excluded_agents or threshold adjustments.

Next steps