Setting Up Codebase Context Fabric
Codebase Context Fabric is the system that pre-builds shared context artifacts from your repositories. These artifacts — repo maps, file summaries, dependency graphs, and more — serve as reusable context that reduces the tokens sent per request and powers high cache hit rates across your engineering team.
Use this page when
- You are configuring Codebase Context Fabric for the first time.
- You need the declarative YAML configuration and artifact type reference.
- You want to verify fabric is working (context attachment, refresh on push, token reduction).
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
What Codebase Context Fabric Does
Without fabric, every AI prompt about your codebase must include raw source files as context. This means:
- Large token counts per request (expensive)
- Redundant context across engineers (wasteful)
- No pre-computed understanding (slower responses)
With fabric, the system pre-builds structured context artifacts once and reuses them across all requests:
- Smaller token counts per request (cheaper)
- Shared artifacts across engineers (efficient)
- Pre-computed codebase understanding (faster responses)
Fabric artifacts are the foundation of the fill-then-save model. The "fill" includes building these artifacts. The "save" comes from reusing them thousands of times.
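The fill-then-save trade-off can be sketched as a break-even calculation. The snippet below is an illustration only: the function name and the token figures are hypothetical and do not come from any measured Keeptrusts deployment.

```python
def break_even_requests(fill_tokens: int, saved_tokens_per_request: int) -> int:
    """Return how many requests it takes for cumulative savings to cover the fill.

    Both arguments are token counts; integer ceiling division avoids float rounding.
    """
    if fill_tokens < 0:
        raise ValueError("fill_tokens must not be negative")
    if saved_tokens_per_request <= 0:
        raise ValueError("saved_tokens_per_request must be positive")
    return -(-fill_tokens // saved_tokens_per_request)


# Hypothetical figures: a 2,000,000-token fill and 6,000 tokens saved per request.
print(break_even_requests(2_000_000, 6_000))  # prints 334
```

Every request after the break-even point is net savings, which is why artifacts that are reused by many engineers pay back fastest.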
Declarative Configuration
Add the following to your gateway configuration to enable Codebase Context Fabric:
workflow_cache:
  enabled: true
  default_tier: org_shared_cache
  org_shared_enabled: true
  ttl_seconds: 86400
  max_entry_tokens: 32000
fabric:
  enabled: true
  auto_build: true
  refresh_on_push: true
  artifact_types:
    - repo_map
    - file_summary
    - dependency_graph
    - test_map
    - api_inventory
    - symbol_index
    - embedding_index
    - recent_change_summary
    - known_failure_fingerprint
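Before deploying a configuration like the one above, it can help to lint the fabric block. The following is a minimal sketch, assuming the YAML has already been parsed into a Python dict and that `fabric` is a top-level key. `validate_fabric_config` is a hypothetical helper, not part of the gateway.

```python
# Artifact type names as listed on this page.
KNOWN_ARTIFACT_TYPES = frozenset({
    "repo_map", "file_summary", "dependency_graph", "test_map",
    "api_inventory", "symbol_index", "embedding_index",
    "recent_change_summary", "known_failure_fingerprint",
})


def validate_fabric_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the fabric block looks valid."""
    fabric = config.get("fabric")
    if not isinstance(fabric, dict):
        return ["missing 'fabric' mapping"]

    problems = []
    for key in ("enabled", "auto_build", "refresh_on_push"):
        if not isinstance(fabric.get(key), bool):
            problems.append(f"fabric.{key} must be a boolean")

    types = fabric.get("artifact_types")
    if not isinstance(types, list) or not types:
        problems.append("fabric.artifact_types must be a non-empty list")
    else:
        unknown = [str(t) for t in types
                   if not (isinstance(t, str) and t in KNOWN_ARTIFACT_TYPES)]
        if unknown:
            problems.append("unknown artifact types: " + ", ".join(unknown))
    return problems


config = {"fabric": {"enabled": True, "auto_build": True,
                     "refresh_on_push": True,
                     "artifact_types": ["repo_map", "file_sumary"]}}
print(validate_fabric_config(config))  # flags the misspelled 'file_sumary'
```

A check like this catches typos in artifact type names early. How the gateway itself treats unknown types is not described on this page.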
Configuration Fields
| Field | Type | Description |
|---|---|---|
| fabric.enabled | boolean | Master switch for Codebase Context Fabric |
| fabric.auto_build | boolean | Automatically build artifacts when a repo is connected |
| fabric.refresh_on_push | boolean | Rebuild affected artifacts when new commits are pushed |
| fabric.artifact_types | list | Which artifact types to build (see below) |
Artifact Types
Each artifact type captures a different aspect of your codebase and serves different query patterns:
repo_map
A high-level structural map of the repository: top-level directories, key files, module boundaries, and entry points.
- Build cost: Low (single LLM call per repo)
- Cache value: High — answers "where is X?" and "what's the structure?" questions
- Refresh trigger: New top-level directories or significant structural changes
file_summary
Natural language summaries of individual files: purpose, key exports, dependencies, and notable patterns.
- Build cost: Medium (one LLM call per file, parallelized)
- Cache value: Very high — most common context attached to prompts
- Refresh trigger: File content changes
dependency_graph
Module-level dependency relationships: imports, exports, package dependencies, and circular dependency detection.
- Build cost: Low (static analysis, minimal LLM calls)
- Cache value: High — answers "what depends on X?" and "what does X use?"
- Refresh trigger: Import/export changes, package.json/Cargo.toml changes
test_map
Mapping between test files and their source targets, coverage relationships, and test categories.
- Build cost: Low (static analysis with light LLM enrichment)
- Cache value: Medium — answers "how is X tested?" and "what tests should I run?"
- Refresh trigger: New/deleted test files, changed test imports
api_inventory
Catalog of API endpoints, route handlers, request/response schemas, and middleware chains.
- Build cost: Medium (pattern matching + LLM enrichment)
- Cache value: High for API-heavy codebases — answers "what endpoints exist?" and "how does endpoint X work?"
- Refresh trigger: Route file changes, handler modifications
symbol_index
Index of functions, classes, types, interfaces, and their signatures across the codebase.
- Build cost: Low (primarily static analysis)
- Cache value: High — enables precise lookup instead of file-level context
- Refresh trigger: Any source code change that modifies exports
embedding_index
Semantic embedding vectors for code chunks, enabling similarity-based retrieval for semantically equivalent but differently-worded queries.
- Build cost: High (requires embedding model calls for every code chunk)
- Cache value: Very high — enables semantic cache matching
- Refresh trigger: File content changes (incremental re-embedding)
recent_change_summary
Summaries of recent commits, PRs, and code changes — what changed, why, and what it affects.
- Build cost: Low (commit messages + small diff summaries)
- Cache value: Medium — answers "what changed recently?" and "why was X modified?"
- Refresh trigger: New commits (continuous)
known_failure_fingerprint
Catalog of known test failures, error patterns, and their resolutions.
- Build cost: Low (pattern extraction from CI/CD logs)
- Cache value: High during incidents — answers "I'm seeing error X, what's the fix?"
- Refresh trigger: New CI failures or resolved incidents
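The refresh triggers listed above suggest how a push could be mapped to the artifacts that need rebuilding. The sketch below is a rough approximation, not the system's actual logic. It covers only triggers that can be decided from file paths, and the test-file naming patterns in `is_test_file` are an assumed convention.

```python
from pathlib import PurePosixPath

# Manifest files named in the dependency_graph refresh trigger.
MANIFEST_NAMES = {"package.json", "Cargo.toml"}


def is_test_file(path: str) -> bool:
    """Guess whether a path is a test file (hypothetical naming convention)."""
    name = PurePosixPath(path).name
    return name.startswith("test_") or ".test." in name or ".spec." in name


def artifacts_to_refresh(changed_paths: list[str]) -> set[str]:
    """Approximate which artifact types a set of changed paths would make stale."""
    if not changed_paths:
        return set()
    # Any content change affects per-file artifacts and the change summary.
    stale = {"recent_change_summary", "file_summary", "embedding_index"}
    for path in changed_paths:
        if PurePosixPath(path).name in MANIFEST_NAMES:
            stale.add("dependency_graph")
        if is_test_file(path):
            stale.add("test_map")
    return stale


print(sorted(artifacts_to_refresh(["web/package.json", "src/app.ts"])))
```

Triggers that depend on what changed inside a file (exports, routes, top-level structure) need content analysis and are left out of this sketch.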
Connecting Repos for Fabric
Navigate to Settings → Repositories in the console:
- Click Connect Repository
- Enter the git URL (HTTPS or SSH)
- Provide credentials (personal access token or deploy key)
- Select branches to track
- Toggle Build fabric artifacts on connect
- Click Connect
The system immediately enqueues artifact warmers for all enabled artifact types.
Monitoring Artifact Creation
After connecting a repo, monitor progress in Repositories → [Repo Name] → Fabric Status.
Each artifact shows one of these states:
| State | Meaning |
|---|---|
| Queued | Waiting for warmer worker capacity |
| Building | Actively processing (LLM calls in progress) |
| Ready | Complete and available for cache attachment |
| Stale | Code changed since last build; refresh queued |
| Error | Build failed (check logs for details) |
All artifacts should reach Ready status within 30 minutes for a medium-sized repository (50k-200k lines of code).
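The states in the table can be modelled as a small state machine, which is handy when scripting checks against status data. This is a sketch under assumptions: the state names come from the table, but the transition set (in particular the retry path from Error back to Queued) is inferred rather than documented.

```python
from enum import Enum


class ArtifactState(Enum):
    QUEUED = "Queued"
    BUILDING = "Building"
    READY = "Ready"
    STALE = "Stale"
    ERROR = "Error"


# Assumed transitions; only Stale -> Building -> Ready is stated on this page.
ALLOWED_TRANSITIONS = {
    ArtifactState.QUEUED: {ArtifactState.BUILDING},
    ArtifactState.BUILDING: {ArtifactState.READY, ArtifactState.ERROR},
    ArtifactState.READY: {ArtifactState.STALE},
    ArtifactState.STALE: {ArtifactState.BUILDING},
    ArtifactState.ERROR: {ArtifactState.QUEUED},  # assumed retry path
}


def is_valid_sequence(states: list[ArtifactState]) -> bool:
    """Check that each consecutive pair of states is an allowed transition."""
    return all(nxt in ALLOWED_TRANSITIONS[cur]
               for cur, nxt in zip(states, states[1:]))


first_build = [ArtifactState.QUEUED, ArtifactState.BUILDING, ArtifactState.READY]
print(is_valid_sequence(first_build))  # prints True
```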
Validation Steps
After fabric setup, verify everything is working:
1. Confirm All Artifacts Are Ready
Settings → Repositories → [Repo] → Fabric Status
All artifact types: Ready ✓
2. Verify Cache Is Using Fabric Context
Send a test prompt about your codebase through the gateway. In Events → [Latest Event], check that the request metadata includes:
fabric_context_attached: true
fabric_artifacts_used: ["repo_map", "file_summary", ...]
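This metadata check can be automated once the event has been exported. The sketch below assumes the event metadata is available as a Python dict with the two field names shown above. How you fetch the event is not covered here, and `fabric_attached` is a hypothetical helper.

```python
def fabric_attached(metadata: dict) -> bool:
    """Return True if the event metadata shows fabric context was attached and used."""
    attached = metadata.get("fabric_context_attached") is True
    used = metadata.get("fabric_artifacts_used")
    return attached and isinstance(used, list) and len(used) > 0


event_metadata = {
    "fabric_context_attached": True,
    "fabric_artifacts_used": ["repo_map", "file_summary"],
}
print(fabric_attached(event_metadata))  # prints True
```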
3. Check Token Reduction
Compare the input token count of a fabric-enriched request vs. a raw context request. Fabric should reduce input tokens by 40-70% because structured summaries are more token-efficient than raw source code.
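The comparison is simple arithmetic. The sketch below computes the reduction and checks it against the 40-70% range quoted above. The token counts are made-up example values, and the function names are illustrative.

```python
def token_reduction_pct(raw_tokens: int, fabric_tokens: int) -> float:
    """Percentage reduction in input tokens relative to the raw-context request."""
    if raw_tokens <= 0:
        raise ValueError("raw_tokens must be positive")
    return 100.0 * (raw_tokens - fabric_tokens) / raw_tokens


def within_expected_range(pct: float, low: float = 40.0, high: float = 70.0) -> bool:
    """Check the reduction against the 40-70% range stated on this page."""
    return low <= pct <= high


# Example values only: 20,000 raw input tokens vs. 9,000 with fabric context.
pct = token_reduction_pct(20_000, 9_000)
print(pct, within_expected_range(pct))  # prints 55.0 True
```

Compare requests that ask the same question about the same code. Otherwise differences in prompt size can mask the effect of fabric.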
4. Confirm Refresh on Push
Make a commit to a connected branch. Within 5 minutes, the affected artifacts should transition through Stale → Building → Ready.
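If you record the states you observe after a push, the timing check can be scripted. The sketch below assumes you have collected (timestamp, state) pairs yourself, for example by noting what Fabric Status shows. It reads the 5-minute window as the time to reach Ready, which is one interpretation of the statement above.

```python
def refreshed_in_time(observations: list[tuple[float, str]],
                      push_time: float,
                      deadline_seconds: float = 300.0) -> bool:
    """Check that Stale, Building, Ready were observed in order within the deadline.

    observations: (unix_timestamp, state) pairs sorted by timestamp.
    """
    expected = ["Stale", "Building", "Ready"]
    idx = 0
    for timestamp, state in observations:
        if timestamp < push_time:
            continue  # ignore states seen before the push
        if timestamp > push_time + deadline_seconds:
            break  # anything later is past the deadline
        if state == expected[idx]:
            idx += 1
            if idx == len(expected):
                return True
    return False


# Hypothetical observations, with timestamps in seconds relative to an arbitrary origin.
observed = [(0.0, "Ready"), (10.0, "Stale"), (30.0, "Building"), (120.0, "Ready")]
print(refreshed_in_time(observed, push_time=5.0))  # prints True
```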
Operational Requirements
For fabric to function correctly:
- worker_cache_warmer must be running: This binary processes the artifact build queue. Without it, artifacts remain in "Queued" state indefinitely.
- Repository access must be maintained: If credentials expire, fabric cannot refresh artifacts.
- Sufficient wallet balance: Initial artifact build incurs LLM costs. Ensure your org wallet has funds.
- Gateway in hosted gateway mode: Fabric context attachment requires hosted-gateway routing.
Tuning for Large Repositories
For repositories over 500,000 lines:
- Consider disabling embedding_index initially (highest build cost) and enabling it after other artifacts are ready
- Set file_summary to target high-traffic directories first using path filters
- Monitor the fill cost in Cost & Spend during initial build
- Expect initial build to take 1-2 hours for very large repositories
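The first recommendation above can be expressed as a small selection rule for the initial fabric.artifact_types list. This is a sketch only: `initial_artifact_types` is a hypothetical helper, and the 500,000-line threshold is taken from this section.

```python
# Artifact type names as listed on this page.
ALL_ARTIFACT_TYPES = [
    "repo_map", "file_summary", "dependency_graph", "test_map",
    "api_inventory", "symbol_index", "embedding_index",
    "recent_change_summary", "known_failure_fingerprint",
]

LARGE_REPO_LINES = 500_000  # threshold used in this section


def initial_artifact_types(lines_of_code: int) -> list[str]:
    """Pick artifact types for the first build, deferring embedding_index on large repos."""
    if lines_of_code > LARGE_REPO_LINES:
        return [t for t in ALL_ARTIFACT_TYPES if t != "embedding_index"]
    return list(ALL_ARTIFACT_TYPES)


print(len(initial_artifact_types(750_000)))  # prints 8
```

Once the remaining artifacts reach Ready, add embedding_index back to the list so the highest-cost build runs last.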
Next steps
- Connecting Your First Repository — detailed repo connection walkthrough
- Cache Tiers Explained — understand where fabric fits in the cache hierarchy
- Estimating Fill Cost — budget for your fabric build
For AI systems
- Canonical terms: Keeptrusts, Codebase Context Fabric, fabric artifacts, repo_map, file_summary, dependency_graph, test_map, api_inventory, symbol_index, embedding_index, recent_change_summary, known_failure_fingerprint.
- Exact feature/config names: fabric.enabled: true, fabric.auto_build, fabric.refresh_on_push, fabric.artifact_types, worker_cache_warmer binary, fabric_context_attached: true event field.
- Best next pages: Connecting Your First Repository, Cache Tiers Explained, Estimating Fill Cost.
For engineers
- Add fabric: { enabled: true, auto_build: true, refresh_on_push: true } to your gateway configuration.
- List desired artifact types in fabric.artifact_types; each type has a different build cost and query value.
- Verify setup: check event metadata for fabric_context_attached: true and fabric_artifacts_used listing used types.
- For repos over 500K lines, consider disabling embedding_index initially (highest cost) and enabling it after other artifacts are ready.
- Confirm refresh works: push a commit and verify affected artifacts transition Stale → Building → Ready within 5 minutes.
- Operational requirement: worker_cache_warmer must be running or artifacts stay in "Queued" state.
For leaders
- Fabric reduces per-request token costs by 40-70% by replacing raw source code with structured summaries.
- Initial build cost for a medium repo (1,000 files): approximately $10-25 in LLM calls; paid once, reused thousands of times.
- Refresh is incremental — only changed files re-index on push, so ongoing cost is minimal relative to the initial investment.
- Gateway must run in hosted gateway mode for fabric context attachment; local-mode gateways cannot use fabric.