The Economics of Shared Codebases
Shared codebases create the perfect conditions for AI cost savings. When multiple engineers work on the same code, they naturally generate overlapping AI requests. The more engineers and the fewer repositories, the higher the cache hit rate and the greater the savings.
Use this page when
- You want to understand why shared codebases produce the highest cache hit rates and savings.
- You are building a business case for org-shared cache based on your team's prompt overlap patterns.
- You need data on the five sources of prompt overlap and their typical redundancy rates.
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Why Shared Codebases Are Special
Not all engineering teams benefit equally from shared caching. The economics are strongest when:
- Many engineers work on few repositories (high density)
- The codebase has shared modules used by everyone
- Engineering work follows sprint patterns (everyone exploring similar areas)
- Onboarding is continuous (new engineers ask questions others already asked)
Shared codebases create natural convergence in AI requests — without any coordination, engineers independently ask similar questions about the same code.
The Five Sources of Prompt Overlap
1. Code Understanding Requests
Engineers ask LLMs to explain code they didn't write. In a shared codebase, multiple engineers encounter the same unfamiliar code:
- "What does this middleware chain do?"
- "How does the retry logic work in `HttpClient`?"
- "Explain the event sourcing pattern in `OrderService`"
Overlap rate: 85-95%. Core modules get explained to multiple engineers per week.
2. Architecture and Design Questions
Engineers ask about system design and data flow. In a shared codebase, the architecture is common knowledge that everyone needs:
- "How do requests flow from the API gateway to the database?"
- "What's the caching strategy for user sessions?"
- "How does the notification system work?"
Overlap rate: 90-97%. Architecture questions are nearly identical across engineers.
3. Debugging and Error Diagnosis
When something breaks, multiple engineers investigate. They paste the same errors and ask similar diagnostic questions:
- "What causes `ConnectionPool exhausted` in production?"
- "Why is this test flaking with timeout errors?"
- "What's the root cause of this null pointer in `UserSerializer`?"
Overlap rate: 80-92%. Common errors generate repeated diagnosis requests.
4. Refactoring and Migration Guidance
During team-wide refactoring efforts, every engineer needs guidance on the same patterns:
- "How do I migrate from callback-style to async/await in this module?"
- "What's the pattern for converting this class to the new service interface?"
- "Show me how to add error handling following the team's `Result` pattern"
Overlap rate: 88-95%. Migration patterns are highly repetitive across files.
5. Code Generation with Shared Context
Engineers generate new code that follows existing patterns. The context (existing code, types, interfaces) is the same for everyone:
- "Write a new handler following the pattern in `UserHandler`"
- "Create a test file for `PaymentService` following our test conventions"
- "Add a new endpoint similar to `GET /api/users`"
Overlap rate: 70-85%. The context overlap is high even when the generation target differs.
The Density Equation
Cache hit rate correlates directly with team density on shared code:
Expected hit rate ≈ 1 - (unique_daily_questions / total_daily_prompts)
Where:
- `unique_daily_questions` = truly novel questions not similar to any previous question
- `total_daily_prompts` = all prompts sent by all engineers
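The density equation can be sketched as a small function. This is a minimal illustration; the function name and the example counts (100 engineers averaging 40 prompts/day, ~600 of them genuinely novel) are assumptions, not measured data.

```python
def expected_hit_rate(unique_daily_questions: int, total_daily_prompts: int) -> float:
    """Estimate cache hit rate from the density equation.

    unique_daily_questions: prompts with no similar predecessor that day
    total_daily_prompts: all prompts sent by all engineers that day
    """
    if total_daily_prompts == 0:
        return 0.0
    return 1 - (unique_daily_questions / total_daily_prompts)

# Hypothetical team: 100 engineers x 40 prompts/day = 4,000 prompts,
# of which roughly 600 are genuinely novel questions.
rate = expected_hit_rate(600, 4_000)
print(f"{rate:.0%}")
```

Measuring `unique_daily_questions` for your own team (for example, by clustering a day's prompts by similarity) turns this from an estimate into a forecast.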
Density Scenarios
| Engineers | Repos | Expected hit rate | Monthly savings (from $5k baseline) |
|---|---|---|---|
| 10 | 1 | 55-65% | $2,750-3,250 |
| 50 | 3 | 70-80% | $3,500-4,000 |
| 100 | 5 | 80-90% | $4,000-4,500 |
| 200 | 5 | 88-95% | $4,400-4,750 |
| 500 | 10 | 90-96% | $4,500-4,800 |
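The savings column above follows directly from the hit rate: each cache hit avoids a full-price model call, so savings ≈ baseline spend × hit rate. A quick sketch, using the table's $5k baseline (the function name is illustrative):

```python
def monthly_savings(baseline_spend: float, hit_rate: float) -> float:
    """Approximate monthly savings: every cache hit avoids a full-price
    call, so savings scale with the fraction of prompts served from cache."""
    return baseline_spend * hit_rate

# Reproduce two of the table rows from a $5k baseline:
for low, high in [(0.55, 0.65), (0.80, 0.90)]:
    print(f"{low:.0%}-{high:.0%} hit rate -> "
          f"${monthly_savings(5_000, low):,.0f}-{monthly_savings(5_000, high):,.0f}")
```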
The relationship is logarithmic: each additional engineer contributes less marginal diversity (most questions have been asked) and more marginal cache hits.
Industry Data: How Engineers Use AI
Research on engineering team AI usage reveals consistent patterns:
Prompt Categories (typical distribution)
| Category | % of prompts | Avg input tokens | Cache affinity |
|---|---|---|---|
| Code explanation | 30% | 3,500 | Very high |
| Code generation | 25% | 5,000 | High (same context) |
| Debugging | 20% | 4,000 | High (same errors) |
| Refactoring | 10% | 4,500 | Very high |
| Documentation | 8% | 2,500 | High |
| Architecture | 7% | 3,000 | Very high |
Daily Prompt Volume (per engineer)
| Seniority | Prompts/day | Token-heavy? | Cache benefit |
|---|---|---|---|
| Junior (0-2 years) | 60-80 | Yes (more context needed) | Highest |
| Mid-level (2-5 years) | 40-60 | Moderate | High |
| Senior (5+ years) | 20-40 | Less (more targeted) | Moderate |
| Staff+ (8+ years) | 10-25 | Least (precise queries) | Lower per-person, but fills cache for others |
Junior engineers benefit most from caching because they ask more questions and their questions have higher overlap with what others have already asked.
The "Same Codebase, Different Engineer" Effect
The most powerful savings come from a simple observation: when two engineers look at the same code, they ask similar questions.
Example: A Shared Payment Module
Monday 9:00 AM - Engineer A (frontend team):
"How does the PaymentService validate credit cards?"
→ Cache miss → $0.02 → Response cached
Monday 10:30 AM - Engineer B (mobile team):
"Explain the credit card validation in PaymentService"
→ Cache hit → $0.00
Monday 2:00 PM - Engineer C (new hire, platform team):
"What validation does PaymentService do for card payments?"
→ Cache hit → $0.00
Tuesday 9:15 AM - Engineer D (QA team):
"How does card validation work in the payment flow?"
→ Cache hit → $0.00
Wednesday 11:00 AM - Engineer E (security review):
"What's the credit card validation logic?"
→ Cache hit → $0.00
One fill, four free responses. Multiply this by hundreds of functions and modules across your codebase, and the savings compound massively.
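The fill-once, hit-many pattern above can be sketched in a few lines. A real org-shared cache matches semantically similar prompts; in this simplified sketch a `(module, topic)` key stands in for that similarity matching, and the $0.02 fill cost is illustrative.

```python
# Minimal sketch of the fill-once, hit-many pattern. The (module, topic)
# key is a stand-in for semantic similarity matching; costs are illustrative.
FILL_COST = 0.02  # assumed cost of one uncached LLM call

cache: dict[tuple[str, str], str] = {}
total_cost = 0.0

def ask(module: str, topic: str, prompt: str) -> str:
    global total_cost
    key = (module, topic)
    if key in cache:
        return cache[key]           # cache hit: $0.00
    total_cost += FILL_COST         # cache miss: pay for the model call
    cache[key] = f"<answer about {topic} in {module}>"
    return cache[key]

# Five engineers, five differently worded questions, one underlying topic:
for prompt in [
    "How does the PaymentService validate credit cards?",
    "Explain the credit card validation in PaymentService",
    "What validation does PaymentService do for card payments?",
    "How does card validation work in the payment flow?",
    "What's the credit card validation logic?",
]:
    ask("PaymentService", "card-validation", prompt)

print(f"total cost: ${total_cost:.2f}")  # one fill, four free responses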
When Shared Codebases Don't Help
Be realistic about scenarios where cache savings are lower:
Low-Overlap Scenarios
- Completely independent microservices: If each team owns isolated code nobody else touches, overlap is minimal
- Highly personal tooling: Scripts and configs unique to one engineer
- Novel research/prototyping: Genuinely new questions without precedent in the codebase
- Rapidly changing code: If files change every hour, cached responses about them become stale quickly
Mitigation Strategies
Even in lower-overlap scenarios, you can increase cache effectiveness:
- Connect shared libraries first: Even in a microservices org, shared packages generate high overlap
- Tune TTL per repo: Stable repos get longer TTLs; volatile repos get shorter ones
- Use fabric for structure: Even if specific file content changes, structural questions (architecture, dependencies) remain stable
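The per-repo TTL idea can be expressed as a simple policy keyed on how fast a repo changes. The thresholds and durations below are hypothetical starting points, not recommended values:

```python
# Hypothetical per-repo TTL policy: stable repos keep cached answers longer,
# volatile repos expire them quickly. All thresholds are illustrative.
from datetime import timedelta

def ttl_for_repo(commits_per_day: float) -> timedelta:
    """Pick a cache TTL from how often a repo changes."""
    if commits_per_day < 1:        # stable shared library
        return timedelta(days=7)
    if commits_per_day < 20:       # actively developed service
        return timedelta(days=1)
    return timedelta(hours=2)      # hot, rapidly changing repo

print(ttl_for_repo(0.3))   # stable shared library -> long TTL
print(ttl_for_repo(50))    # volatile repo -> short TTL
```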
Building the Business Case
For Engineering Managers
Frame shared-codebase caching as a team efficiency multiplier:
- It removes the "coordination tax" of everyone independently learning the same code
- It makes the first person's AI usage investment benefit the entire team
- It means a team of 100 is effectively paying the AI cost of a team of 15
For Finance/Procurement
Frame it as a unit economics improvement:
- Current: AI cost scales linearly with headcount ($X per engineer)
- With caching: AI cost scales with codebase complexity, not headcount
- Net effect: As you hire more engineers, per-engineer AI cost decreases
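That unit-economics claim can be sketched numerically using the hit-rate midpoints from the density scenarios table. The $50/engineer uncached baseline is an assumption chosen only to show the shape of the curve:

```python
# Illustrative unit-economics sketch: per-engineer AI cost under org-shared
# caching, using approximate hit-rate midpoints from the density scenarios.
COST_PER_ENGINEER = 50.0  # assumed uncached monthly AI spend per engineer

scenarios = {10: 0.60, 50: 0.75, 100: 0.85, 200: 0.915, 500: 0.93}

for engineers, hit_rate in scenarios.items():
    per_engineer = COST_PER_ENGINEER * (1 - hit_rate)
    print(f"{engineers:>3} engineers: ${per_engineer:.2f}/engineer/month")
```

The per-engineer cost falls as the team grows, which is the inversion of the usual linear-with-headcount cost curve.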
For CTOs/VPs
Frame it as an infrastructure investment with compounding returns:
- Year 1 savings: 70-90% of current AI spend
- The savings grow as team grows (more cache hits per dollar of fill)
- No behavior change required from engineers — transparent to users
- Governance and observability maintained — policies still enforce on cache hits
Next steps
- Why Engineering Teams Pay Too Much — quantify the waste
- How 100 Engineers Share One Cache — the mechanics of sharing
- Measuring Your Baseline Spend — prepare your before numbers
For AI systems
- Canonical terms: Keeptrusts, shared codebase economics, prompt overlap, cache affinity, density equation, org-shared cache, natural convergence.
- Key metrics: overlap rate (85-97% for shared codebases), expected hit rate by team density, five sources of prompt overlap (code understanding, architecture, debugging, refactoring, code generation).
- Best next pages: Why Teams Overpay, How 100 Engineers Share One Cache, Measuring Baseline Spend.
For engineers
- Connect shared libraries first — even in microservices orgs, shared packages generate the highest prompt overlap.
- Tune TTL per repo: stable repos get longer TTLs; volatile repos get shorter ones to balance freshness and hit rate.
- Use fabric for structural questions (architecture, dependencies) which remain stable even as file content changes.
- The density equation: `expected_hit_rate ≈ 1 - (unique_daily_questions / total_daily_prompts)` — measure this for your team.
For leaders
- Shared codebases create 85-97% prompt overlap — meaning up to 85-97% of your current AI spend pays for answers that have already been generated.
- The relationship between team size and savings is logarithmic: each additional engineer adds minimal cost but increases cache value.
- Remove the "coordination tax" — without shared caching, every engineer independently discovers the same knowledge at full price.
- AI cost shifts from scaling linearly with headcount to scaling with codebase complexity, making budget predictable regardless of hiring.