Estimating Fill Cost for a New Repository

The initial cache fill is a one-time investment that unlocks ongoing savings. Before connecting a new repository, estimate the fill cost so you can budget appropriately and set expectations with stakeholders.

Use this page when

You are connecting a new repository and need to budget the initial cache fill cost.
You want per-artifact-type cost formulas to estimate fabric build expense.
You are comparing fill cost against annual uncached spend to justify the investment.

Primary audience

Primary: Technical Leaders
Secondary: Technical Engineers, AI Agents

What "Fill Cost" Means

Fill cost is the total LLM provider spend required to:

Build fabric artifacts — summarize files, generate embeddings, map dependencies
Populate the response cache — the first request for each common question pays full price

Fabric artifact build is a predictable, calculable cost. Response cache population is driven by organic traffic and happens gradually over days.

Fill Cost Formula

Total fill cost = Fabric build cost + Response cache fill cost

Where:

Fabric build cost = Σ (artifact_type_cost × count_per_type)
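Response cache fill cost ≈ daily_uncached_spend × fill_days × (1 - early_hit_rate)

Here daily_uncached_spend is daily_traffic × avg_cost_per_request, and early_hit_rate is the average cache hit rate over the fill window.

In practice, fabric build cost is the more predictable component, while response cache fill is usually the larger one in dollar terms because it scales with traffic rather than repository size.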
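As a rough sketch of how the two terms combine, the Python below encodes the formula directly. The function names and the sample inputs (two artifact types, a $75 daily uncached spend, a 5-day window, an average early hit rate of 0.496) are illustrative assumptions, not values read from any Keeptrusts API.

```python
def fabric_build_cost(unit_costs: dict[str, float], counts: dict[str, int]) -> float:
    """Sum artifact_type_cost x count_per_type over every artifact type."""
    return sum(unit_costs[name] * counts[name] for name in unit_costs)


def response_cache_fill_cost(daily_uncached_spend: float, fill_days: int,
                             early_hit_rate: float) -> float:
    """Approximate spend on cache misses while the response cache warms up."""
    return daily_uncached_spend * fill_days * (1 - early_hit_rate)


def total_fill_cost(fabric: float, response: float) -> float:
    """Total fill cost = fabric build cost + response cache fill cost."""
    return fabric + response


# Illustrative inputs only (assumptions, not measured data).
fabric = fabric_build_cost(
    unit_costs={"file_summary": 0.0054, "repo_map": 0.05},
    counts={"file_summary": 1000, "repo_map": 1},
)
response = response_cache_fill_cost(daily_uncached_spend=75.0, fill_days=5,
                                    early_hit_rate=0.496)
print(f"fabric=${fabric:.2f} response=${response:.2f} "
      f"total=${total_fill_cost(fabric, response):.2f}")
```

With these inputs the response term is far larger than the fabric term, which is why the response side is worth estimating even though it is less predictable.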

Fabric Build Cost by Artifact Type

`repo_map`

Generates a high-level structural map of the repository.

Factor	Impact on cost
Number of top-level directories	Low (1-3 LLM calls total)
Repository documentation	Low (adds context to calls)

Typical cost: $0.02-0.10 per repository

Formula: repo_map_cost ≈ num_calls × avg_call_cost, where num_calls is 1-3 (use 3 for an upper bound)

`file_summary`

Generates natural language summaries for each source file.

Factor	Impact on cost
Number of source files	High (1 call per file)
Average file size	Medium (larger files = more input tokens)
Language complexity	Low (all languages similar cost)

Typical cost: $0.002-0.008 per file

Formula: file_summary_cost = num_files × avg_tokens_per_file × input_cost_per_token + num_files × avg_summary_tokens × output_cost_per_token

Example for 1,000 files:

Input:  1,000 files × 800 avg tokens × $3/1M = $2.40
Output: 1,000 files × 200 avg summary tokens × $15/1M = $3.00
Total:  $5.40
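The same arithmetic can be sketched as a small Python helper, which makes it easy to plug in your own file count and token prices. The function and parameter names are hypothetical; the $3 and $15 per million token prices are simply the ones used in the example above and will differ by provider and model.

```python
def file_summary_cost(num_files: int, avg_tokens_per_file: float,
                      avg_summary_tokens: float,
                      input_usd_per_million: float,
                      output_usd_per_million: float) -> tuple[float, float, float]:
    """Return (input_cost, output_cost, total) in USD for summarizing every file."""
    input_cost = num_files * avg_tokens_per_file * input_usd_per_million / 1_000_000
    output_cost = num_files * avg_summary_tokens * output_usd_per_million / 1_000_000
    return input_cost, output_cost, input_cost + output_cost


# Values taken from the 1,000-file example above.
input_cost, output_cost, total = file_summary_cost(
    num_files=1000,
    avg_tokens_per_file=800,
    avg_summary_tokens=200,
    input_usd_per_million=3.0,
    output_usd_per_million=15.0,
)
print(f"input=${input_cost:.2f} output=${output_cost:.2f} total=${total:.2f}")
```

Note that output tokens cost more in total than input tokens here even though summaries are a quarter of the input length, so summary length is worth watching.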

`dependency_graph`

Maps module and package dependencies.

Factor	Impact on cost
Number of modules	Low (static analysis dominant)
Dependency depth	Low (graph traversal, not LLM)

Typical cost: $0.01-0.05 per repository (mostly static analysis, minimal LLM)

`test_map`

Maps test files to source files.

Factor	Impact on cost
Number of test files	Low-Medium
Test framework complexity	Low

Typical cost: $0.05-0.30 per repository

Formula: test_map_cost ≈ num_test_files × $0.001

`api_inventory`

Catalogs API endpoints and their schemas.

Factor	Impact on cost
Number of route handlers	Medium
Middleware complexity	Low
Number of endpoints	Medium

Typical cost: $0.10-1.00 per repository (depends on API surface area)

Formula: api_inventory_cost ≈ num_endpoints × $0.005

`symbol_index`

Indexes functions, classes, and types.

Factor	Impact on cost
Any	None — pure static analysis

Typical cost: $0.00 (no LLM calls, AST parsing only)

`embedding_index`

Generates semantic embedding vectors for code chunks.

Factor	Impact on cost
Total lines of code	High (determines number of chunks)
Chunk size setting	Medium (smaller chunks = more embeddings)

Typical cost: $0.0001 per chunk (embedding models are cheap)

Formula: embedding_cost = (total_lines ÷ chunk_size) × embedding_cost_per_chunk

Example for 100,000 lines (200-line chunks):

Chunks: 100,000 ÷ 200 = 500 chunks
Cost: 500 × $0.0001 = $0.05
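A minimal Python sketch of the embedding formula follows. Rounding the chunk count up with `math.ceil` is an assumption (it treats a partial final chunk as a full one), and the function name is hypothetical.

```python
import math


def embedding_index_cost(total_lines: int, chunk_size: int,
                         cost_per_chunk: float = 0.0001) -> tuple[int, float]:
    """Return (number_of_chunks, cost_in_usd) for embedding a repository."""
    chunks = math.ceil(total_lines / chunk_size)  # assumption: partial chunk counts as one
    return chunks, chunks * cost_per_chunk


# The 100,000-line example above with 200-line chunks.
chunks, cost = embedding_index_cost(total_lines=100_000, chunk_size=200)
print(f"{chunks} chunks, ${cost:.2f}")

# Smaller chunks mean more embeddings: 50-line chunks quadruple the count.
small_chunks, small_cost = embedding_index_cost(total_lines=100_000, chunk_size=50)
print(f"{small_chunks} chunks, ${small_cost:.2f}")
```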

`recent_change_summary`

Summarizes recent commit history.

Factor	Impact on cost
Number of recent commits	Low-Medium
Commit message quality	Low

Typical cost: $0.05-0.20 per repository

`known_failure_fingerprint`

Catalogs known error patterns and their resolutions.

Factor	Impact on cost
CI/CD log availability	Medium
Number of known failures	Low

Typical cost: $0.02-0.10 per repository
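To see how the per-artifact figures add up for a specific repository, the sketch below combines the formulas quoted above with the flat per-repository ranges for artifacts that have no formula. The `RepoStats` type, the $0.02 average call cost for `repo_map`, and the $0.0054 per-file summary cost are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RepoStats:
    num_files: int
    num_test_files: int
    num_endpoints: int
    total_lines: int
    chunk_size: int = 200


# (low, high) USD per repository, copied from the typical costs listed above.
FLAT_RANGES = {
    "dependency_graph": (0.01, 0.05),
    "recent_change_summary": (0.05, 0.20),
    "known_failure_fingerprint": (0.02, 0.10),
    "symbol_index": (0.00, 0.00),
}


def formula_costs(stats: RepoStats, avg_call_cost: float = 0.02,
                  cost_per_file: float = 0.0054,
                  cost_per_chunk: float = 0.0001) -> dict[str, float]:
    """Artifacts whose cost scales with repository contents."""
    return {
        "repo_map": 3 * avg_call_cost,  # upper bound of 3 calls
        "file_summary": stats.num_files * cost_per_file,
        "test_map": stats.num_test_files * 0.001,
        "api_inventory": stats.num_endpoints * 0.005,
        "embedding_index": stats.total_lines / stats.chunk_size * cost_per_chunk,
    }


def fabric_build_range(stats: RepoStats) -> tuple[float, float]:
    """Return a (low, high) estimate for the whole fabric build."""
    base = sum(formula_costs(stats).values())
    low = base + sum(lo for lo, _ in FLAT_RANGES.values())
    high = base + sum(hi for _, hi in FLAT_RANGES.values())
    return low, high


stats = RepoStats(num_files=1000, num_test_files=150,
                  num_endpoints=80, total_lines=100_000)
low, high = fabric_build_range(stats)
print(f"fabric build estimate: ${low:.2f}-${high:.2f}")
```

For this hypothetical repository nearly all of the estimate comes from `file_summary`, so the file count and average file size are the inputs most worth measuring accurately.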

Cost Estimation by Repository Size

Small Repository (fewer than 10,000 lines, fewer than 100 files)

Artifact	Estimated cost
repo_map	$0.03
file_summary	$0.50
dependency_graph	$0.02
test_map	$0.05
api_inventory	$0.10
symbol_index	$0.00
embedding_index	$0.01
recent_change_summary	$0.05
known_failure_fingerprint	$0.03
Total fabric build	$0.79

Medium Repository (10,000-100,000 lines, 100-1,000 files)

Artifact	Estimated cost
repo_map	$0.05
file_summary	$3.00-8.00
dependency_graph	$0.03
test_map	$0.15
api_inventory	$0.40
symbol_index	$0.00
embedding_index	$0.05-0.10
recent_change_summary	$0.10
known_failure_fingerprint	$0.05
Total fabric build	$3.83-8.88

Large Repository (100,000-500,000 lines, 1,000-5,000 files)

Artifact	Estimated cost
repo_map	$0.08
file_summary	$8.00-30.00
dependency_graph	$0.05
test_map	$0.30
api_inventory	$1.00-3.00
symbol_index	$0.00
embedding_index	$0.10-0.50
recent_change_summary	$0.15
known_failure_fingerprint	$0.08
Total fabric build	$9.76-34.16

Very Large Repository (500,000+ lines, 5,000+ files)

Artifact	Estimated cost
repo_map	$0.10
file_summary	$30.00-80.00
dependency_graph	$0.08
test_map	$0.50
api_inventory	$3.00-8.00
symbol_index	$0.00
embedding_index	$0.50-2.50
recent_change_summary	$0.20
known_failure_fingerprint	$0.10
Total fabric build	$34.48-91.48
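The bracket lookup can be sketched in Python as below. The function name is hypothetical, and two choices are assumptions: a repository whose line count and file count fall in different brackets is placed in the larger one, and a value sitting exactly on a boundary (for example 100,000 lines) goes to the larger bracket.

```python
# (name, max_lines, max_files, (low_usd, high_usd)) from the tables above.
BRACKETS = [
    ("Small", 10_000, 100, (0.79, 0.79)),
    ("Medium", 100_000, 1_000, (3.83, 8.88)),
    ("Large", 500_000, 5_000, (9.76, 34.16)),
]
VERY_LARGE = ("Very Large", (34.48, 91.48))


def size_bracket(total_lines: int, num_files: int) -> tuple[str, tuple[float, float]]:
    """Return (bracket_name, (low, high)) for the total fabric build estimate."""
    for name, max_lines, max_files, cost_range in BRACKETS:
        # Both limits must hold; otherwise fall through to the next bracket.
        if total_lines < max_lines and num_files < max_files:
            return name, cost_range
    return VERY_LARGE


for lines, files in [(5_000, 50), (50_000, 400), (5_000, 500), (600_000, 6_000)]:
    name, (low, high) = size_bracket(lines, files)
    print(f"{lines:>7} lines, {files:>5} files -> {name}: ${low:.2f}-${high:.2f}")
```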

Response Cache Fill Cost

Beyond fabric, the response cache fills as engineers work. This cost is:

Response fill cost ≈ daily_traffic × (1 - early_hit_rate) × avg_cost_per_request × fill_days

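For a 100-engineer team sending 5,000 requests per day at an average of $0.015 per request, with the hit rate climbing from 10% to 78% over five days: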

Day 1: 5,000 requests × (1 - 0.10) × $0.015 = $67.50
Day 2: 5,000 requests × (1 - 0.35) × $0.015 = $48.75
Day 3: 5,000 requests × (1 - 0.55) × $0.015 = $33.75
Day 4: 5,000 requests × (1 - 0.70) × $0.015 = $22.50
Day 5: 5,000 requests × (1 - 0.78) × $0.015 = $16.50

Total 5-day response fill cost: ~$189
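The day-by-day schedule above can be reproduced with a short Python sketch, which is handy for trying a different traffic level or hit-rate curve. The function name is hypothetical; the hit rates are the ones from the example, not a guaranteed ramp.

```python
def response_fill_schedule(daily_requests: int, avg_cost_per_request: float,
                           hit_rates: list[float]) -> list[float]:
    """Cost of cache misses for each day of the fill window, in USD."""
    return [daily_requests * (1 - rate) * avg_cost_per_request for rate in hit_rates]


# Hit rates for days 1-5 from the 100-engineer example above.
hit_rates = [0.10, 0.35, 0.55, 0.70, 0.78]
daily_costs = response_fill_schedule(5000, 0.015, hit_rates)

for day, cost in enumerate(daily_costs, start=1):
    print(f"Day {day}: ${cost:.2f}")
print(f"Total: ${sum(daily_costs):.2f}")
```

Because each day's cost is linear in the miss rate, the total equals five days of uncached spend ($75 per day) multiplied by the average miss rate over the window.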

This is not "extra" cost — it's what you would have spent anyway, just with the bonus of building the cache layer for future savings.

Total Fill Cost vs. Annual Uncached Spend

The fill cost is small relative to annual uncached spend:

Repository size	Fabric build	Response fill (5 days)	Total fill	Annual uncached spend	Fill as % of annual
Small	$0.79	$50	~$51	$12,000	0.4%
Medium	$6.00	$120	~$126	$36,000	0.4%
Large	$20.00	$189	~$209	$48,000	0.4%
Very Large	$60.00	$250	~$310	$60,000	0.5%

The fill cost is consistently under 1% of annual uncached spend. It pays for itself within the first week.
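As a quick check of the percentage column, the sketch below recomputes fill cost as a share of annual uncached spend from the table's own figures. The function name is hypothetical, and the results are unrounded, so they can differ slightly from the table's one-decimal values.

```python
def fill_share_of_annual(fabric_build: float, response_fill: float,
                         annual_uncached_spend: float) -> float:
    """Total fill cost as a percentage of annual uncached spend."""
    return (fabric_build + response_fill) / annual_uncached_spend * 100


# (fabric build, 5-day response fill, annual uncached spend) from the table above.
rows = {
    "Small": (0.79, 50, 12_000),
    "Medium": (6.00, 120, 36_000),
    "Large": (20.00, 189, 48_000),
    "Very Large": (60.00, 250, 60_000),
}

shares = {size: fill_share_of_annual(*values) for size, values in rows.items()}
for size, share in shares.items():
    print(f"{size}: {share:.2f}% of annual uncached spend")
```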

Ongoing Refill Cost

After initial fill, the cache requires maintenance fills:

Daily Incremental Cost

Daily refill = changed_files × file_summary_cost + new_questions × miss_cost

For a typical day (30 files changed, 500 new unique questions):

Fabric refresh: 30 × $0.005 = $0.15
New response fills: 500 × $0.015 = $7.50
Total daily refill: $7.65

Compare this to daily savings of $150-200: the refill is roughly 4-5% of savings.
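A small Python sketch of the daily refill formula, using the typical-day figures above, makes the comparison with savings explicit. The function name is hypothetical.

```python
def daily_refill(changed_files: int, file_summary_cost: float,
                 new_questions: int, miss_cost: float) -> float:
    """Daily refill = fabric refresh for changed files + first-time response fills."""
    return changed_files * file_summary_cost + new_questions * miss_cost


refill = daily_refill(changed_files=30, file_summary_cost=0.005,
                      new_questions=500, miss_cost=0.015)
print(f"Daily refill: ${refill:.2f}")

# Share of daily savings at both ends of the $150-200 range quoted above.
for savings in (150, 200):
    print(f"  vs ${savings}/day savings: {refill / savings * 100:.1f}%")
```

New response fills account for almost all of the refill, so the number of new unique questions per day matters far more than the number of changed files.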

Monthly Maintenance Budget

Team size	Daily refill	Monthly refill (20 working days)	Monthly savings	Net monthly savings
50 engineers	$4.00	$80	$2,500	$2,420
100 engineers	$7.65	$153	$4,000	$3,847
200 engineers	$12.00	$240	$8,000	$7,760
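The table's arithmetic can be sketched as follows. The monthly refill figures correspond to 20 working days per month (for example, $4.00 × 20 = $80), which this sketch takes as its default; the function name is hypothetical.

```python
def monthly_net_savings(daily_refill: float, monthly_savings: float,
                        working_days: int = 20) -> tuple[float, float]:
    """Return (monthly_refill, net_monthly_savings) in USD."""
    monthly_refill = daily_refill * working_days
    return monthly_refill, monthly_savings - monthly_refill


# (daily refill, monthly savings) per team size, from the table above.
teams = {50: (4.00, 2_500), 100: (7.65, 4_000), 200: (12.00, 8_000)}

for engineers, (refill, savings) in teams.items():
    monthly_refill, net = monthly_net_savings(refill, savings)
    print(f"{engineers} engineers: refill ${monthly_refill:.0f}, net ${net:.0f}")
```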

Planning Your Fill Budget

Step 1: Identify Repository Size

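# Count source files (exclude node_modules, vendor, and build output)
find . -type f \( -name "*.ts" -o -name "*.rs" -o -name "*.py" \) \
  ! -path "*/node_modules/*" ! -path "*/vendor/*" ! -path "*/target/*" | wc -l

# Count total lines across the same files (safe for paths containing spaces)
find . -type f \( -name "*.ts" -o -name "*.rs" -o -name "*.py" \) \
  ! -path "*/node_modules/*" ! -path "*/vendor/*" ! -path "*/target/*" \
  -print0 | xargs -0 cat | wc -l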
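If you would rather get both numbers in one pass, or need the counts inside a script, a Python equivalent might look like the sketch below. The extension set and the excluded directory names are assumptions to adjust for your stack, and `count_source` is a hypothetical helper name.

```python
import os
from pathlib import Path

SOURCE_EXTENSIONS = {".ts", ".rs", ".py"}
EXCLUDED_DIRS = {"node_modules", "vendor", "target", "dist", "build", ".git"}


def count_source(root: str) -> tuple[int, int]:
    """Return (num_files, total_lines) for source files under root."""
    num_files = 0
    total_lines = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            if Path(name).suffix in SOURCE_EXTENSIONS:
                num_files += 1
                # Read as bytes so files with unusual encodings still count.
                with open(os.path.join(dirpath, name), "rb") as handle:
                    total_lines += sum(1 for _ in handle)
    return num_files, total_lines
```

Calling `count_source(".")` from the repository root returns the two values needed to pick a size bracket.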

Step 2: Estimate Using Size Bracket

Match your repo to the size brackets above and use the estimated total.

Step 3: Add Response Fill Budget

Budget 3-5 days of normal spend for response fill. Using 5 days gives a conservative figure:

Response fill budget = current_daily_AI_spend × 5

Step 4: Set Wallet Balance

Ensure your org wallet has:

Required balance = (fabric_build_estimate + response_fill_budget) × 1.20
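Steps 3 and 4 can be combined into one calculation, sketched below in Python. It reads the 20% buffer as 20% of the combined fabric and response fill budget, which is an interpretation; the function name and the sample inputs ($8.88 fabric estimate, $75 daily AI spend) are illustrative assumptions.

```python
def required_wallet_balance(fabric_build_estimate: float,
                            current_daily_ai_spend: float,
                            fill_days: int = 5,
                            buffer: float = 0.20) -> float:
    """Fabric estimate plus response fill budget, with a safety buffer on top."""
    response_fill_budget = current_daily_ai_spend * fill_days
    return (fabric_build_estimate + response_fill_budget) * (1 + buffer)


balance = required_wallet_balance(fabric_build_estimate=8.88,
                                  current_daily_ai_spend=75.0)
print(f"Required wallet balance: ${balance:.2f}")
```

With these inputs the response fill budget ($375) dominates, so an accurate figure for current daily AI spend matters more than refining the fabric estimate.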

Step 5: Communicate to Stakeholders

Frame the fill cost as:

A one-time investment that pays back within 1 week
Less than 1% of what you'd spend annually without caching
The "price of admission" to 80-95% ongoing savings

Next steps

Connecting Your First Repository — start the fill process
Your First 24 Hours — watch fill economics in real time
Cache Hit Rates: What Good Looks Like — track your return on the fill investment

For AI systems

Canonical terms: Keeptrusts, fill cost, fabric build cost, response cache fill, per-artifact cost, file_summary, embedding_index, symbol_index, cache investment.
Formulas: Total fill = Fabric build + Response cache fill, Fabric build = Σ(artifact_type_cost × count), Response fill ≈ daily_traffic × (1 - early_hit_rate) × avg_cost × fill_days.
Best next pages: Budget Alerts for Fill Phases, Cache Hit Rates, ROI Calculation for a 100-Engineer Team.

For engineers

symbol_index costs $0 (pure AST parsing). embedding_index is cheap ($0.0001/chunk). file_summary dominates cost (1 LLM call per file).
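Size your repo: `find . -name "*.ts" -o -name "*.rs" | wc -l` gives the file count; piping the same find into `xargs wc -l | tail -1` gives LOC. Exclude vendored and generated directories first.
Small repo (fewer than 100 files): ~$0.79 fabric. Medium (100-1,000 files): ~$4-$9. Large (1,000-5,000 files): ~$10-$34. Very large (5,000+ files): ~$34-$91.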
Response fill budget: multiply current daily AI spend by 5 days as fill budget.
Wallet balance needed: fabric_estimate + response_fill_budget + 20% buffer.
Ongoing daily refill is typically less than 5% of daily savings.

For leaders

Fill cost is consistently less than 1% of annual uncached spend. It pays for itself within the first week.
Total fill for a medium repo: ~$126 against ~$36,000 of annual uncached spend. At an 80-95% ongoing reduction, that is roughly $28,800-$34,200 saved per year, about 230-270× the fill cost.
The cost is not additional spend — response fills are requests you’d pay for anyway, just with the bonus of populating cache.
Frame to stakeholders: one-time investment, less than 1% of annual cost, enables 80-95% ongoing reduction.
