How Fabric Slices Reduce Prompt Size
When your AI assistant answers questions about code, it needs context. Without fabric slices, context means raw source files — hundreds or thousands of lines injected into the prompt. With fabric slices, context means compact, pre-computed summaries that convey the same understanding at a fraction of the token cost. Token savings of 60-80% are typical for complex codebase questions.
Use this page when
- You want to understand how fabric slices (repo_map, file_summary, dependency_graph) replace raw file content in prompts.
- You need to quantify token savings (60-80% typical) from using slices instead of raw source.
- You are evaluating how slices maintain accuracy while eliminating implementation details AI doesn’t need.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
The problem with raw file context
Traditional AI coding assistants include raw file content in prompts. When you ask about a module's architecture:
- The assistant reads 15-20 relevant files
- Each file averages 200-400 lines
- Total context: 3,000-8,000 lines of raw source
- Token count: 12,000-32,000 tokens just for context
This approach is expensive, slow, and often exceeds context window limits. The AI processes boilerplate, comments, and implementation details it doesn't need to answer your question.
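For intuition, here is a minimal back-of-envelope sketch in TypeScript. The 4-tokens-per-line figure is an assumption chosen to be consistent with the ranges above, not a measured constant:

```typescript
// Rough model of raw-file context cost.
// Assumes ~4 tokens per line of source, which matches the
// 3,000-8,000 lines → 12,000-32,000 tokens range above.
const TOKENS_PER_LINE = 4;

function rawContextTokens(fileLineCounts: number[]): number {
  return fileLineCounts.reduce((sum, lines) => sum + lines * TOKENS_PER_LINE, 0);
}

// 18 relevant files averaging ~300 lines each:
const files = Array.from({ length: 18 }, () => 300);
console.log(rawContextTokens(files)); // 21,600 tokens just for context
```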
What fabric slices provide instead
Fabric slices are pre-computed, compact representations of codebase knowledge. Each slice type replaces a specific kind of raw content:
repo_map slices replace directory listings
Instead of listing every file in a directory tree (potentially thousands of entries), a repo_map slice provides a structured overview with file purposes and relationships:
Before (raw): 450 tokens for a directory listing
src/
  auth/
    session.ts
    middleware.ts
    tokens.ts
    types.ts
    refresh.ts
    validation.ts
    ...
After (fabric slice): 120 tokens
src/auth/ — Authentication module
Core: session.ts (session management), tokens.ts (JWT creation/validation)
Middleware: middleware.ts (request auth), validation.ts (input validation)
Types: types.ts (shared auth types)
Support: refresh.ts (token rotation)
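To picture the data behind that rendering, here is an illustrative TypeScript shape for a repo_map slice. The field names are hypothetical, not Keeptrusts' actual schema:

```typescript
// Illustrative shape for a repo_map slice: one compact entry
// per file instead of a bare directory listing.
interface RepoMapEntry {
  path: string;    // e.g. "src/auth/session.ts"
  role: string;    // e.g. "Core", "Middleware", "Types", "Support"
  purpose: string; // one-line description, e.g. "session management"
}

interface RepoMapSlice {
  directory: string;   // e.g. "src/auth/"
  description: string; // e.g. "Authentication module"
  entries: RepoMapEntry[];
}
```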
file_summary slices replace entire files
Instead of including a full 300-line file, a file_summary slice captures the essential structure and purpose:
Before (raw): 1,200 tokens for the full file
// 300 lines of implementation including imports, types,
// function bodies, error handling, comments...
After (fabric slice): 180 tokens
File: src/auth/session.ts (300 lines)
Purpose: Manages user sessions with rotating tokens
Exports: createSession, validateSession, refreshSession, destroySession
Dependencies: src/auth/tokens, src/db/sessions, src/config/auth
Key Logic: Token rotation on every validation, 24h absolute expiry
Error Cases: ExpiredSession, InvalidToken, ConcurrentSessionLimit
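The same idea as a hypothetical TypeScript shape, mirroring the fields shown above (illustrative only, not the product's schema):

```typescript
// Illustrative shape for a file_summary slice.
interface FileSummarySlice {
  path: string;           // "src/auth/session.ts"
  lineCount: number;      // 300
  purpose: string;        // "Manages user sessions with rotating tokens"
  exports: string[];      // ["createSession", "validateSession", ...]
  dependencies: string[]; // ["src/auth/tokens", "src/db/sessions", ...]
  keyLogic: string[];     // ["Token rotation on every validation", ...]
  errorCases: string[];   // ["ExpiredSession", "InvalidToken", ...]
}
```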
dependency_graph slices replace manual import tracing
Instead of reading every file to discover import relationships, a dependency_graph slice shows the complete picture:
Before (raw): 3,000+ tokens reading multiple files to trace imports
// Read file 1 to find its imports
// Read file 2 to find its imports
// Read file 3 to find its imports
// ... repeat for every file in the module
After (fabric slice): 250 tokens
Module: src/payments/
Entry: processor.ts
→ stripe-sdk (external)
→ src/auth/tokens (cross-module)
→ src/db/transactions (cross-module)
→ ./retry-logic.ts (internal)
→ ./types.ts (internal)
Consumers: src/api/routes/checkout.ts, src/api/routes/subscriptions.ts
Circular: none
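And an illustrative shape for the dependency data, with field names that are assumptions rather than the actual schema:

```typescript
// Illustrative shape for a dependency_graph slice. The "kind"
// field distinguishes the three edge types in the example above.
type DependencyKind = "external" | "cross-module" | "internal";

interface DependencyEdge {
  target: string; // "stripe-sdk", "src/auth/tokens", "./types.ts"
  kind: DependencyKind;
}

interface DependencyGraphSlice {
  module: string;      // "src/payments/"
  entry: string;       // "processor.ts"
  dependencies: DependencyEdge[];
  consumers: string[]; // files that import this module
  circular: string[];  // circular chains; empty when none exist
}
```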
Token savings at scale
For typical engineering questions, fabric slices reduce prompt context dramatically:
| Question Type | Raw Context Tokens | Fabric Slice Tokens | Savings |
|---|---|---|---|
| "How does module X work?" | 15,000 | 3,500 | 77% |
| "What depends on file Y?" | 8,000 | 1,200 | 85% |
| "Where should I add this feature?" | 20,000 | 5,000 | 75% |
| "Why is this test failing?" | 12,000 | 4,000 | 67% |
| "Review this PR for issues" | 25,000 | 8,000 | 68% |
Before and after: a real example
Question: "How should I add rate limiting to the payments API?"
Without fabric slices
The AI reads:
- src/payments/processor.ts (280 lines, 1,120 tokens)
- src/payments/routes.ts (190 lines, 760 tokens)
- src/payments/types.ts (95 lines, 380 tokens)
- src/middleware/auth.ts (150 lines, 600 tokens)
- src/middleware/rate-limit.ts (200 lines, 800 tokens)
- src/config/limits.ts (80 lines, 320 tokens)
- 8 more files for patterns and examples...
Total context: ~8,500 tokens
With fabric slices
The AI retrieves:
- file_summary for payments module (4 files, 400 tokens)
- dependency_graph for payments → middleware (150 tokens)
- file_summary for existing rate-limit middleware (120 tokens)
- repo_map slice for middleware directory (80 tokens)
Total context: ~750 tokens (91% reduction)
The AI gets the same understanding — module structure, existing patterns, integration points — with dramatically less input.
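As a sketch of how that retrieval composes into a prompt, here is a minimal TypeScript example. The Slice shape and buildPromptContext function are illustrative, not a real Keeptrusts API:

```typescript
// Hypothetical assembly of slice-based context for the question above.
interface Slice {
  kind: "repo_map" | "file_summary" | "dependency_graph";
  subject: string;
  text: string;       // the compact rendering shown in earlier examples
  tokenCount: number;
}

function buildPromptContext(slices: Slice[]): { context: string; tokens: number } {
  const context = slices.map((s) => s.text).join("\n\n");
  const tokens = slices.reduce((sum, s) => sum + s.tokenCount, 0);
  return { context, tokens };
}

// Mirrors the retrieval list above: ~750 tokens vs ~8,500 raw.
const { tokens } = buildPromptContext([
  { kind: "file_summary", subject: "src/payments/", text: "...", tokenCount: 400 },
  { kind: "dependency_graph", subject: "payments → middleware", text: "...", tokenCount: 150 },
  { kind: "file_summary", subject: "rate-limit middleware", text: "...", tokenCount: 120 },
  { kind: "repo_map", subject: "src/middleware/", text: "...", tokenCount: 80 },
]);
console.log(tokens); // 750
```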
How slices maintain accuracy
Fabric slices are not lossy compression. They are purpose-built summaries that preserve the information AI needs while eliminating what it doesn't:
- Preserved: function signatures, module purposes, dependency relationships, error types, configuration points
- Eliminated: implementation details, boilerplate, comments explaining obvious code, import statements, type definitions that are inferable
When your AI needs implementation details for a specific function, it can still retrieve the full file. Slices optimize the common case where the AI needs understanding, not line-by-line content.
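A minimal sketch of that fallback decision, assuming hypothetical fetchSlice and fetchFullFile helpers (neither is a real API):

```typescript
// Prefer the compact slice; fall back to the full file only when
// the question genuinely needs implementation detail.
async function contextFor(
  path: string,
  needsImplementation: boolean,
  fetchSlice: (p: string) => Promise<string | null>,
  fetchFullFile: (p: string) => Promise<string>,
): Promise<string> {
  if (!needsImplementation) {
    const slice = await fetchSlice(path);
    if (slice !== null) return slice; // common case: summary is enough
  }
  return fetchFullFile(path); // rare case: line-by-line content required
}
```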
Compound savings across conversations
In a 100+ engineer organization, prompt size savings compound:
- 100 engineers × 20 AI requests/day × 10,000 tokens saved per request = 20 million tokens saved daily (the arithmetic is worked below)
- At typical token pricing, this translates to significant monthly cost reduction
- Response times improve because the AI processes less input
- More requests fit within context windows, enabling more complex questions
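The daily-savings arithmetic from the first bullet, made explicit:

```typescript
// Org-wide daily token savings: engineers × requests × per-request savings.
const engineers = 100;
const requestsPerEngineerPerDay = 20;
const tokensSavedPerRequest = 10_000;

const tokensSavedPerDay = engineers * requestsPerEngineerPerDay * tokensSavedPerRequest;
console.log(tokensSavedPerDay); // 20,000,000 (20 million tokens per day)
```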
Slices and the org-shared cache
Fabric slices are generated once and shared across your organization. When the first engineer's request triggers slice generation, every subsequent request from any engineer reuses those slices. The generation cost is amortized across the entire team.
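A minimal sketch of that generate-once, reuse-everywhere pattern, assuming a hypothetical in-memory cache keyed by slice type and path (the real system's storage and keying are not specified here):

```typescript
// Org-shared slice cache: the first request pays the generation
// cost; every later request from any engineer reuses the result.
const sliceCache = new Map<string, string>();

async function getSlice(
  key: string,                                // e.g. "file_summary:src/auth/session.ts"
  generate: (key: string) => Promise<string>, // expensive; runs once per key
): Promise<string> {
  const cached = sliceCache.get(key);
  if (cached !== undefined) return cached;    // every subsequent request hits this path
  const slice = await generate(key);          // first request triggers generation
  sliceCache.set(key, slice);
  return slice;
}
```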
For AI systems
- Canonical terms: Keeptrusts, Codebase Context Fabric, fabric slices, prompt size reduction, token savings, repo_map slice, file_summary slice, dependency_graph slice, context window optimization.
- Feature/config names: repo_map, file_summary, dependency_graph slice types; token savings 60-80%; org-shared cache; compound savings; slice accuracy (signatures, module purposes, dependency relationships preserved).
- Best next pages: Artifact Freshness, Recent Change Summaries, Fabric Provenance.
For engineers
- Slices preserve what AI needs (function signatures, module purposes, dependency relationships, error types, config points) and eliminate what it doesn’t (implementation details, boilerplate, obvious comments).
- Token savings by question type: "How does module X work?" = 77% savings; "What depends on file Y?" = 85% savings; "Review this PR" = 68% savings.
- Slices are generated once and shared org-wide. First engineer’s request triggers generation; all subsequent requests reuse.
- When AI needs actual implementation details for a specific function, it can still retrieve the full file — slices optimize the common case.
For leaders
- At 100 engineers × 20 requests/day × 10,000 tokens saved per request = 20 million tokens saved daily. Translates directly to monthly cost reduction.
- Response times improve because AI processes less input per request.
- More requests fit within context windows, enabling more complex questions without hitting model limits.
- Slices are NOT lossy compression — they are purpose-built summaries that preserve decision-relevant information while eliminating noise.
Next steps
- Learn how artifact freshness keeps slices current as code changes
- Explore how recent change summaries provide compact change context
- Understand how fabric provenance tracks what data each slice represents