File Summaries: Smart Context for Every File
The file_summary artifact provides a concise, structured representation of a file's contents, role, and exports. Instead of sending entire file contents as prompt context, you send a compact summary that communicates what the file does, what it exports, and how it relates to the broader codebase.
Use this page when
- You want to understand how file_summary artifacts provide concise, structured file intelligence as prompt context.
- You need to know which files get summaries (high-traffic, recently changed, core architecture, explicitly configured).
- You are evaluating token savings from summaries vs. raw file content (10x reduction typical).
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
What a File Summary Contains
Each file_summary artifact captures the following for a single file:
Summary Text
A natural language description of the file's purpose and behavior, typically 2–5 sentences. This tells the AI what the file does without requiring it to parse raw source code.
File Path
The repository-relative path, establishing location context for the AI to understand the file's role within the project structure.
Language
The programming language or markup format, enabling language-aware reasoning about the file's patterns and conventions.
Exported Symbols
A list of public functions, types, classes, constants, and interfaces that the file makes available to other modules. This is the file's public API surface.
Source Digest
A hash of the file's contents at the time the summary was built. When the digest changes, the summary regenerates automatically.
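As a rough sketch, the five fields above could be modeled as a single record. The class name, field names, and example values here are illustrative assumptions rather than Context Fabric's actual schema, and SHA-256 is assumed for the digest because this page does not name the hash algorithm.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileSummary:
    """Hypothetical shape of a file_summary artifact (names are assumptions)."""
    summary_text: str                  # 2-5 sentence natural language description
    file_path: str                     # repository-relative path
    language: str                      # programming language or markup format
    exported_symbols: tuple[str, ...]  # public API surface of the file
    source_digest: str                 # content hash taken at build time


def compute_digest(content: bytes) -> str:
    # SHA-256 is an assumption; the actual hash algorithm is not specified.
    return hashlib.sha256(content).hexdigest()


source = b"export function add(a: number, b: number) { return a + b; }\n"
summary = FileSummary(
    summary_text="Exports a single helper that adds two numbers.",
    file_path="src/math/add.ts",
    language="TypeScript",
    exported_symbols=("add",),
    source_digest=compute_digest(source),
)
```

Making the record immutable reflects the idea that a summary is tied to one digest: a changed file yields a new record instead of a mutated one.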
How File Summaries Reduce Token Usage
Consider a typical 500-line TypeScript file. Sending the full contents costs approximately 2,000–3,000 tokens. A file summary for the same file costs 150–300 tokens — a 10x reduction.
For a typical AI interaction that references 5–10 files, this translates to:
| Approach | Tokens per interaction | Daily total for 100 engineers (one interaction each) |
|---|---|---|
| Full file contents | 15,000–30,000 | 1.5M–3M tokens/day |
| File summaries | 1,500–3,000 | 150K–300K tokens/day |
The savings compound as your team grows. Every engineer benefits from the same pre-built summaries without regenerating them.
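The daily totals in the table can be reproduced with a few lines of arithmetic. This sketch assumes one interaction per engineer per day, which is what the table's figures imply; the function name and parameters are hypothetical.

```python
def daily_tokens(tokens_per_interaction: int, engineers: int,
                 interactions_per_engineer: int = 1) -> int:
    """Total tokens per day across a team."""
    return tokens_per_interaction * engineers * interactions_per_engineer


# Lower-bound figures from the table above.
full_low = daily_tokens(15_000, engineers=100)
summary_low = daily_tokens(1_500, engineers=100)
reduction = full_low / summary_low

print(full_low, summary_low, reduction)  # 1500000 150000 10.0
```

Raising `interactions_per_engineer` scales both rows equally, so the ratio between them stays at 10x.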
Which Files Get Summaries
Context Fabric does not summarize every file in your repository. It prioritizes:
High-Traffic Files
Files that engineers frequently reference in AI interactions. The system tracks which files appear most often in prompt context and ensures those have current summaries.
Recently Changed Files
Files modified in recent commits. These are likely to be discussed in AI interactions about current work, so fresh summaries provide immediate value.
Core Architecture Files
Entry points, configuration files, and shared utilities that many interactions reference. These provide foundational context that improves answer quality across topics.
Explicitly Configured Files
You can specify file patterns that should always have summaries, regardless of traffic or change frequency. Use this for files that are critical to your domain but might not appear in automated priority lists.
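The four criteria above can be read as independent checks, any one of which qualifies a file. The sketch below is only an illustration: the signal names, the 0.5 traffic threshold, and the 7-day recency window are assumptions, not documented Context Fabric values.

```python
from dataclasses import dataclass


@dataclass
class FileSignals:
    """Hypothetical per-file signals; names and units are assumptions."""
    traffic_score: float         # how often the file appears in prompt context
    days_since_last_change: int  # age of the most recent commit touching it
    is_core_architecture: bool   # entry point, config file, or shared utility
    explicitly_configured: bool  # matched an always-summarize pattern


def summary_reasons(signals: FileSignals,
                    traffic_threshold: float = 0.5,
                    recent_days: int = 7) -> list[str]:
    """Return the criteria a file satisfies; an empty list means no summary."""
    reasons = []
    if signals.traffic_score >= traffic_threshold:
        reasons.append("high-traffic")
    if signals.days_since_last_change <= recent_days:
        reasons.append("recently-changed")
    if signals.is_core_architecture:
        reasons.append("core-architecture")
    if signals.explicitly_configured:
        reasons.append("explicitly-configured")
    return reasons


entry_point = FileSignals(0.9, 30, True, False)
old_fixture = FileSignals(0.01, 400, False, False)
print(summary_reasons(entry_point))  # ['high-traffic', 'core-architecture']
print(summary_reasons(old_fixture))  # []
```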
Source Digest and Freshness
Every file summary records the source digest — a content hash of the file at build time. This ensures freshness:
- When Context Fabric checks for stale artifacts, it compares the stored digest against the current file content
- If the digest differs, the summary is marked for regeneration
- Regeneration produces a new summary reflecting the current file state
- The old summary is retained with its version tag for engineers working on older branches
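Because a mismatched digest always triggers regeneration, a summary that passes the freshness check reflects the file contents as they were when the check ran. The digest mechanism is what keeps summaries and source coherent.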
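A minimal sketch of the staleness check and of retaining older versions follows. It assumes SHA-256 as the content hash and uses a plain dictionary keyed by version tag as a stand-in for the real artifact store; both are assumptions made for illustration.

```python
import hashlib


def content_digest(content: bytes) -> str:
    # SHA-256 is an assumption; the page only says "content hash".
    return hashlib.sha256(content).hexdigest()


def is_stale(stored_digest: str, current_content: bytes) -> bool:
    """A summary is stale when the stored digest no longer matches the file."""
    return stored_digest != content_digest(current_content)


# Hypothetical history: version tag -> (source digest, summary text).
old_content = b"export const LIMIT = 10;\n"
new_content = b"export const LIMIT = 20;\n"
history = {"v1": (content_digest(old_content), "Defines LIMIT as 10.")}

if is_stale(history["v1"][0], new_content):
    # Store the regenerated summary under a new tag; v1 stays available
    # for engineers working on older branches.
    history["v2"] = (content_digest(new_content), "Defines LIMIT as 20.")
```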
How File Summaries Are Built
The summary generation pipeline:
- Select target files — Based on traffic, recency, configuration, and architecture role
- Extract structure — Parse the file to identify exports, imports, and structural patterns
- Generate summary — Produce a natural language description of the file's purpose
- Record metadata — Store path, language, exports, and source digest
- Version and cache — Tag with the repository version and store in the org-shared cache
The pipeline runs incrementally. Only files with changed digests regenerate. Unchanged files retain their existing summaries.
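The incremental behavior can be sketched as a loop that skips any file whose digest matches the cached one. This is a simplified illustration, not the actual pipeline: it omits file selection and structure extraction, the `run_incremental` and `stub_summarize` names are hypothetical, and SHA-256 is an assumed hash.

```python
import hashlib
from typing import Callable


def run_incremental(
    files: dict[str, bytes],
    cache: dict[str, tuple[str, str]],
    summarize: Callable[[str, bytes], str],
) -> list[str]:
    """Regenerate summaries only for files whose digest changed.

    cache maps path -> (source_digest, summary_text).
    Returns the paths that were rebuilt on this run.
    """
    rebuilt = []
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        cached = cache.get(path)
        if cached is not None and cached[0] == digest:
            continue  # unchanged: keep the existing summary
        cache[path] = (digest, summarize(path, content))
        rebuilt.append(path)
    return rebuilt


def stub_summarize(path: str, content: bytes) -> str:
    # Placeholder standing in for the real summary generator.
    return f"{path}: {len(content)} bytes"


cache: dict[str, tuple[str, str]] = {}
files = {"src/a.ts": b"export const a = 1;", "src/b.ts": b"export const b = 2;"}
first = run_incremental(files, cache, stub_summarize)   # both files are new
files["src/b.ts"] = b"export const b = 3;"
second = run_incremental(files, cache, stub_summarize)  # only b.ts changed
```

On the second run only `src/b.ts` is rebuilt, which is why ongoing cost tracks the number of changed files and not the size of the repository.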
Shared Across Your Organization
File summaries are organization-scoped. When one engineer's interaction triggers summary generation for a file, every subsequent engineer benefits from that cached summary.
This is particularly powerful for shared libraries and framework code that many engineers reference but few modify. A single summary build serves the entire team until the source changes.
Quality Impact
File summaries improve AI answer quality because:
- The AI receives accurate, consistent descriptions instead of inferring purpose from raw code
- Exported symbols give the AI a precise vocabulary for discussing the file
- Language metadata enables framework-specific reasoning
- The summary captures intent that raw code may not make obvious
Engineers report more relevant suggestions and fewer hallucinated API references when file summaries are available.
Configuration
You configure file_summary generation in your repository settings:
- File patterns — Specify which files always receive summaries (e.g., src/**/*.ts)
- Exclusion patterns — Skip generated files, vendored code, or test fixtures
- Priority threshold — Minimum traffic score for automatic summary generation
- Regeneration trigger — Rebuild on push, on schedule, or on-demand
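To make the four settings concrete, here is a hypothetical settings object with pattern matching. The key names, the trigger spellings, and the use of Python's `fnmatch` are all assumptions; the real settings format and glob semantics are not documented on this page.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Assumed spellings for the three documented trigger options.
VALID_TRIGGERS = {"push", "schedule", "on_demand"}


@dataclass
class SummaryConfig:
    """Hypothetical settings object; key names are assumptions."""
    file_patterns: list[str]
    exclusion_patterns: list[str]
    priority_threshold: float
    regeneration_trigger: str

    def __post_init__(self) -> None:
        if self.regeneration_trigger not in VALID_TRIGGERS:
            raise ValueError(f"unknown trigger: {self.regeneration_trigger}")

    def always_summarize(self, path: str) -> bool:
        """True when path matches a file pattern and no exclusion pattern."""
        if any(fnmatchcase(path, p) for p in self.exclusion_patterns):
            return False
        return any(fnmatchcase(path, p) for p in self.file_patterns)


config = SummaryConfig(
    file_patterns=["src/**/*.ts"],
    exclusion_patterns=["src/generated/*", "vendor/*"],
    priority_threshold=0.5,
    regeneration_trigger="push",
)
print(config.always_summarize("src/api/client.ts"))       # True
print(config.always_summarize("src/generated/types.ts"))  # False
```

Exclusions are checked first so that a generated or vendored file is skipped even when it also matches a file pattern. Note that `fnmatch` lets `*` match `/`, so its results can differ from shell or gitignore-style globbing.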
Use Cases
Code Explanation
When an engineer asks "what does this file do?", the AI responds from the cached summary without reading the file at interaction time.
Impact Analysis
When planning changes, the AI uses summaries of related files to explain what depends on the target file and what might break.
Code Review Context
During review, the AI references summaries to explain the role of modified files and whether changes align with the file's documented purpose.
Next steps
- Dependency Graphs — Understand how files connect
- Symbol Indexes — Look up specific functions and types
For AI systems
- Canonical terms: Keeptrusts, Codebase Context Fabric, file_summary artifact, source digest, exported symbols, summary text, file path, language, token reduction, incremental generation.
- Feature/config names: file_summary artifact type, file patterns, exclusion patterns, priority threshold, regeneration trigger (on push/on schedule/on demand), source digest, content hash, org-shared cache.
- Best next pages: Dependency Graphs, Fabric Slices Reduce Prompts, Artifact Freshness.
For engineers
- Each summary contains: natural language description (2-5 sentences), file path, language, exported symbols list, and source digest.
- Token savings: 500-line file costs ~2,500 tokens raw vs. 150-300 tokens as a summary (10x reduction).
- Source digest mechanism: when file content hash changes, summary regenerates automatically. Unchanged files retain existing summaries.
- Configure file patterns, exclusion patterns, and priority thresholds in repository settings. Explicitly configure core architecture files that should always have summaries.
For leaders
- File summaries are the highest-ROI fabric artifact: 10x token reduction for the most common context inclusion pattern (referencing files in prompts).
- Org-shared: one summary build serves the entire engineering team until the source changes. Particularly powerful for shared libraries.
- Quality impact: engineers report more relevant suggestions and fewer hallucinated API references when summaries are available.
- Incremental pipeline: only files with changed digests regenerate, keeping ongoing maintenance cost near zero.