The Economics of Shared Codebases
Shared codebases create the perfect conditions for AI cost savings. When multiple engineers work on the same code, they naturally generate overlapping AI requests. The more engineers and the fewer repositories, the higher the cache hit rate and the greater the savings.
Use this page when
- You want to understand why shared codebases produce the highest cache hit rates and savings.
- You are building a business case for org-shared cache based on your team's prompt overlap patterns.
- You need data on the five sources of prompt overlap and their typical redundancy rates.
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Why Shared Codebases Are Special
Not all engineering teams benefit equally from shared caching. The economics are strongest when:
- Many engineers work on few repositories (high density)
- The codebase has shared modules used by everyone
- Engineering work follows sprint patterns (everyone exploring similar areas)
- Onboarding is continuous (new engineers ask questions others already asked)
Shared codebases create natural convergence in AI requests — without any coordination, engineers independently ask similar questions about the same code.
The Five Sources of Prompt Overlap
1. Code Understanding Requests
Engineers ask LLMs to explain code they didn't write. In a shared codebase, multiple engineers encounter the same unfamiliar code:
- "What does this middleware chain do?"
- "How does the retry logic work in `HttpClient`?"
- "Explain the event sourcing pattern in `OrderService`"
Overlap rate: 85-95%. Core modules get explained to multiple engineers per week.
2. Architecture and Design Questions
Engineers ask about system design and data flow. In a shared codebase, the architecture is common knowledge that everyone needs:
- "How do requests flow from the API gateway to the database?"
- "What's the caching strategy for user sessions?"
- "How does the notification system work?"
Overlap rate: 90-97%. Architecture questions are nearly identical across engineers.
3. Debugging and Error Diagnosis
When something breaks, multiple engineers investigate. They paste the same errors and ask similar diagnostic questions:
- "What causes `ConnectionPool exhausted` in production?"
- "Why is this test flaking with timeout errors?"
- "What's the root cause of this null pointer in `UserSerializer`?"
Overlap rate: 80-92%. Common errors generate repeated diagnosis requests.
4. Refactoring and Migration Guidance
During team-wide refactoring efforts, every engineer needs guidance on the same patterns:
- "How do I migrate from callback-style to async/await in this module?"
- "What's the pattern for converting this class to the new service interface?"
- "Show me how to add error handling following the team's `Result` pattern"
Overlap rate: 88-95%. Migration patterns are highly repetitive across files.
5. Code Generation with Shared Context
Engineers generate new code that follows existing patterns. The context (existing code, types, interfaces) is the same for everyone:
- "Write a new handler following the pattern in `UserHandler`"
- "Create a test file for `PaymentService` following our test conventions"
- "Add a new endpoint similar to `GET /api/users`"
Overlap rate: 70-85%. The context overlap is high even when the generation target differs.
The Density Equation
Cache hit rate correlates directly with team density on shared code:
Expected hit rate ≈ 1 - (unique_daily_questions / total_daily_prompts)
Where:
- `unique_daily_questions` = truly novel questions not similar to any previous question
- `total_daily_prompts` = all prompts sent by all engineers
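The density equation can be sketched as a small function. This is a minimal illustration; the function name and the example counts (100 engineers averaging 40 prompts/day, ~600 of them genuinely novel) are assumptions, not measured data.

```python
def expected_hit_rate(unique_daily_questions: int, total_daily_prompts: int) -> float:
    """Estimate cache hit rate from the density equation.

    unique_daily_questions: prompts with no similar predecessor that day
    total_daily_prompts: all prompts sent by all engineers that day
    """
    if total_daily_prompts == 0:
        return 0.0
    return 1 - (unique_daily_questions / total_daily_prompts)

# Hypothetical team: 100 engineers x 40 prompts/day = 4,000 prompts,
# of which roughly 600 are genuinely novel questions.
rate = expected_hit_rate(600, 4_000)
print(f"{rate:.0%}")
```

Measuring `unique_daily_questions` for your own team (for example, by clustering a day's prompts by similarity) turns this from an estimate into a forecast.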
Density Scenarios
| Engineers | Repos | Expected hit rate | Monthly savings (from $5k baseline) |
|---|---|---|---|
| 10 | 1 | 55-65% | $2,750-3,250 |
| 50 | 3 | 70-80% | $3,500-4,000 |
| 100 | 5 | 80-90% | $4,000-4,500 |
| 200 | 5 | 88-95% | $4,400-4,750 |
| 500 | 10 | 90-96% | $4,500-4,800 |
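The savings column above follows directly from the hit rate: each cache hit avoids a full-price model call, so savings ≈ baseline spend × hit rate. A quick sketch, using the table's $5k baseline (the function name is illustrative):

```python
def monthly_savings(baseline_spend: float, hit_rate: float) -> float:
    """Approximate monthly savings: every cache hit avoids a full-price
    call, so savings scale with the fraction of prompts served from cache."""
    return baseline_spend * hit_rate

# Reproduce two of the table rows from a $5k baseline:
for low, high in [(0.55, 0.65), (0.80, 0.90)]:
    print(f"{low:.0%}-{high:.0%} hit rate -> "
          f"${monthly_savings(5_000, low):,.0f}-{monthly_savings(5_000, high):,.0f}")
```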
The relationship is logarithmic: each additional engineer contributes less marginal diversity (most questions have been asked) and more marginal cache hits.
Industry Data: How Engineers Use AI
Research on engineering team AI usage reveals consistent patterns:
Prompt Categories (typical distribution)
| Category | % of prompts | Avg input tokens | Cache affinity |
|---|---|---|---|
| Code explanation | 30% | 3,500 | Very high |
| Code generation | 25% | 5,000 | High (same context) |
| Debugging | 20% | 4,000 | High (same errors) |
| Refactoring | 10% | 4,500 | Very high |
| Documentation | 8% | 2,500 | High |
| Architecture | 7% | 3,000 | Very high |
Daily Prompt Volume (per engineer)
| Seniority | Prompts/day | Token-heavy? | Cache benefit |
|---|---|---|---|
| Junior (0-2 years) | 60-80 | Yes (more context needed) | Highest |
| Mid-level (2-5 years) | 40-60 | Moderate | High |
| Senior (5+ years) | 20-40 | Less (more targeted) | Moderate |
| Staff+ (8+ years) | 10-25 | Least (precise queries) | Lower per-person, but fills cache for others |
Junior engineers benefit most from caching because they ask more questions and their questions have higher overlap with what others have already asked.
The "Same Codebase, Different Engineer" Effect
The most powerful savings come from a simple observation: when two engineers look at the same code, they ask similar questions.
Example: A Shared Payment Module
Monday 9:00 AM - Engineer A (frontend team):
"How does the PaymentService validate credit cards?"
→ Cache miss → $0.02 → Response cached
Monday 10:30 AM - Engineer B (mobile team):
"Explain the credit card validation in PaymentService"
→ Cache hit → $0.00
Monday 2:00 PM - Engineer C (new hire, platform team):
"What validation does PaymentService do for card payments?"
→ Cache hit → $0.00
Tuesday 9:15 AM - Engineer D (QA team):
"How does card validation work in the payment flow?"
→ Cache hit → $0.00
Wednesday 11:00 AM - Engineer E (security review):
"What's the credit card validation logic?"
→ Cache hit → $0.00
One fill, four free responses. Multiply this by hundreds of functions and modules across your codebase, and the savings compound massively.
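The fill-once, hit-many pattern above can be sketched in a few lines. A real org-shared cache matches semantically similar prompts; in this simplified sketch a `(module, topic)` key stands in for that similarity matching, and the $0.02 fill cost is illustrative.

```python
# Minimal sketch of the fill-once, hit-many pattern. The (module, topic)
# key is a stand-in for semantic similarity matching; costs are illustrative.
FILL_COST = 0.02  # assumed cost of one uncached LLM call

cache: dict[tuple[str, str], str] = {}
total_cost = 0.0

def ask(module: str, topic: str, prompt: str) -> str:
    global total_cost
    key = (module, topic)
    if key in cache:
        return cache[key]           # cache hit: $0.00
    total_cost += FILL_COST         # cache miss: pay for the model call
    cache[key] = f"<answer about {topic} in {module}>"
    return cache[key]

# Five engineers, five differently worded questions, one underlying topic:
for prompt in [
    "How does the PaymentService validate credit cards?",
    "Explain the credit card validation in PaymentService",
    "What validation does PaymentService do for card payments?",
    "How does card validation work in the payment flow?",
    "What's the credit card validation logic?",
]:
    ask("PaymentService", "card-validation", prompt)

print(f"total cost: ${total_cost:.2f}")  # one fill, four free responses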
When Shared Codebases Don't Help
Be realistic about scenarios where cache savings are lower:
Low-Overlap Scenarios
- Completely independent microservices: If each team owns isolated code nobody else touches, overlap is minimal
- Highly personal tooling: Scripts and configs unique to one engineer
- Novel research/prototyping: Genuinely new questions without precedent in the codebase
- Rapidly changing code: If files change every hour, cached responses about them become stale quickly
Mitigation Strategies
Even in lower-overlap scenarios, you can increase cache effectiveness:
- Connect shared libraries first: Even in a microservices org, shared packages generate high overlap
- Tune TTL per repo: Stable repos get longer TTLs; volatile repos get shorter ones
- Use fabric for structure: Even if specific file content changes, structural questions (architecture, dependencies) remain stable
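The per-repo TTL idea can be expressed as a simple policy keyed on how fast a repo changes. The thresholds and durations below are hypothetical starting points, not recommended values:

```python
# Hypothetical per-repo TTL policy: stable repos keep cached answers longer,
# volatile repos expire them quickly. All thresholds are illustrative.
from datetime import timedelta

def ttl_for_repo(commits_per_day: float) -> timedelta:
    """Pick a cache TTL from how often a repo changes."""
    if commits_per_day < 1:        # stable shared library
        return timedelta(days=7)
    if commits_per_day < 20:       # actively developed service
        return timedelta(days=1)
    return timedelta(hours=2)      # hot, rapidly changing repo

print(ttl_for_repo(0.3))   # stable shared library -> long TTL
print(ttl_for_repo(50))    # volatile repo -> short TTL
```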
Building the Business Case
For Engineering Managers
Frame shared-codebase caching as a team efficiency multiplier:
- It removes the "coordination tax" of everyone independently learning the same code
- It makes the first person's AI usage investment benefit the entire team
- It means a team of 100 is effectively paying the AI cost of a team of 15
For Finance/Procurement
Frame it as a unit economics improvement:
- Current: AI cost scales linearly with headcount ($X per engineer)
- With caching: AI cost scales with codebase complexity, not headcount
- Net effect: As you hire more engineers, per-engineer AI cost decreases
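That unit-economics claim can be sketched numerically using the hit-rate midpoints from the density scenarios table. The $50/engineer uncached baseline is an assumption chosen only to show the shape of the curve:

```python
# Illustrative unit-economics sketch: per-engineer AI cost under org-shared
# caching, using approximate hit-rate midpoints from the density scenarios.
COST_PER_ENGINEER = 50.0  # assumed uncached monthly AI spend per engineer

scenarios = {10: 0.60, 50: 0.75, 100: 0.85, 200: 0.915, 500: 0.93}

for engineers, hit_rate in scenarios.items():
    per_engineer = COST_PER_ENGINEER * (1 - hit_rate)
    print(f"{engineers:>3} engineers: ${per_engineer:.2f}/engineer/month")
```

The per-engineer cost falls as the team grows, which is the inversion of the usual linear-with-headcount cost curve.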
For CTOs/VPs
Frame it as an infrastructure investment with compounding returns:
- Year 1 savings: 70-90% of current AI spend
- The savings grow as team grows (more cache hits per dollar of fill)
- No behavior change required from engineers — transparent to users
- Governance and observability maintained — policies still enforce on cache hits
Next steps
- Why Engineering Teams Pay Too Much — quantify the waste
- How 100 Engineers Share One Cache — the mechanics of sharing
- Measuring Your Baseline Spend — prepare your before numbers
For AI systems
- Canonical terms: Keeptrusts, shared codebase economics, prompt overlap, cache affinity, density equation, org-shared cache, natural convergence.
- Key metrics: overlap rate (85-97% for shared codebases), expected hit rate by team density, five sources of prompt overlap (code understanding, architecture, debugging, refactoring, code generation).
- Best next pages: Why Teams Overpay, How 100 Engineers Share One Cache, Measuring Baseline Spend.
For engineers
- Connect shared libraries first — even in microservices orgs, shared packages generate the highest prompt overlap.
- Tune TTL per repo: stable repos get longer TTLs; volatile repos get shorter ones to balance freshness and hit rate.
- Use fabric for structural questions (architecture, dependencies) which remain stable even as file content changes.
- The density equation: `expected_hit_rate ≈ 1 - (unique_daily_questions / total_daily_prompts)` — measure this for your team.
For leaders
- Shared codebases create 85-97% prompt overlap — meaning up to 85-97% of your current AI spend pays for answers that have already been generated.
- The relationship between team size and savings is logarithmic: each additional engineer adds minimal cost but increases cache value.
- Remove the "coordination tax" — without shared caching, every engineer independently discovers the same knowledge at full price.
- AI cost shifts from scaling linearly with headcount to scaling with codebase complexity, making budget predictable regardless of hiring.