Provenance Separation: Knowledge Base vs Fabric
Keeptrusts preserves distinct provenance for every piece of context that enters a prompt, whether it originates from the Knowledge Base or the Codebase Context Fabric. This separation ensures that audit trails can trace which organizational knowledge and which codebase context influenced any given response.
Use this page when
- You need to understand how Keeptrusts separates provenance between Knowledge Base and Fabric sources in cached responses.
- You are configuring audit trails that must attribute each piece of context to its origin system.
- You want to verify that provenance metadata is not blended or lost across cache fill and retrieval.
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
The Provenance Envelope
Every context chunk selected during prompt assembly carries a provenance envelope. This envelope travels with the chunk through the entire request lifecycle — from selection through response generation and into the citation record.
The base provenance envelope contains:
{
"source": "knowledge_base | codebase_context_fabric",
"source_id": "unique-identifier",
"version": "version-or-timestamp",
"retrieved_at": "2026-04-30T14:22:00Z",
"relevance_score": 0.89,
"token_count": 412
}
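As a minimal sketch of how a consumer might load the base envelope into a typed structure, the following Python parses the six base fields and rejects unknown `source` values. The names `BaseEnvelope` and `parse_base_envelope` are hypothetical and not part of any Keeptrusts API; only the field names come from the envelope above.

```python
from dataclasses import dataclass
from datetime import datetime

# The two source values named in the base envelope.
ALLOWED_SOURCES = {"knowledge_base", "codebase_context_fabric"}


@dataclass(frozen=True)
class BaseEnvelope:
    """Hypothetical typed view of the six base envelope fields."""
    source: str
    source_id: str
    version: str
    retrieved_at: datetime
    relevance_score: float
    token_count: int


def parse_base_envelope(raw):
    """Build a BaseEnvelope from a decoded JSON object (a dict)."""
    if raw["source"] not in ALLOWED_SOURCES:
        raise ValueError(f"unknown source: {raw['source']!r}")
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    retrieved_at = datetime.fromisoformat(raw["retrieved_at"].replace("Z", "+00:00"))
    return BaseEnvelope(
        source=raw["source"],
        source_id=raw["source_id"],
        version=str(raw["version"]),
        retrieved_at=retrieved_at,
        relevance_score=float(raw["relevance_score"]),
        token_count=int(raw["token_count"]),
    )
```

Extra fields such as `asset_id` or `file_path` are ignored by this parser, so the same function can read the base portion of either envelope type.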
Knowledge Base Provenance
KB provenance extends the base envelope with fields specific to the asset lifecycle:
| Field | Type | Description |
|---|---|---|
| asset_id | string (UUID) | Unique identifier of the KB asset |
| asset_version | integer | Version number of the asset at retrieval time |
| promotion_status | string | Current lifecycle status: draft, active, or archived |
| binding_id | string (UUID) | Identifier of the binding that made this asset available |
| asset_title | string | Human-readable title of the asset |
| promoted_at | string (ISO 8601) | When this version was promoted to active |
| promoted_by | string | User or system that promoted the version |
A full KB provenance envelope looks like:
{
"source": "knowledge_base",
"source_id": "asset-uuid-here",
"version": "3",
"retrieved_at": "2026-04-30T14:22:00Z",
"relevance_score": 0.92,
"token_count": 856,
"asset_id": "asset-uuid-here",
"asset_version": 3,
"promotion_status": "active",
"binding_id": "binding-uuid-here",
"asset_title": "Session Management Policy v3",
"promoted_at": "2026-04-15T09:00:00Z",
"promoted_by": "admin@example.com"
}
Codebase Context Fabric Provenance
Fabric provenance extends the base envelope with fields specific to indexed code:
| Field | Type | Description |
|---|---|---|
| workspace_id | string (UUID) | Identifier of the indexed workspace |
| cache_key | string | Cache key for the fabric index entry |
| file_path | string | Relative file path within the workspace |
| symbol | string | Code symbol (function, class, module) if applicable |
| last_indexed_at | string (ISO 8601) | When the fabric index last processed this file |
| commit_sha | string | Git commit SHA at indexing time (if available) |
| language | string | Programming language of the source file |
A full fabric provenance envelope looks like:
{
"source": "codebase_context_fabric",
"source_id": "workspace-uuid:src/auth/session.ts:validateSession",
"version": "2026-04-30T13:45:00Z",
"retrieved_at": "2026-04-30T14:22:00Z",
"relevance_score": 0.88,
"token_count": 234,
"workspace_id": "workspace-uuid-here",
"cache_key": "ws-uuid:src/auth/session.ts:1714480200",
"file_path": "src/auth/session.ts",
"symbol": "validateSession",
"last_indexed_at": "2026-04-30T13:45:00Z",
"commit_sha": "a1b2c3d4e5f6",
"language": "typescript"
}
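One way to check that provenance has not been blended is to confirm that an envelope only carries the fields that belong to its declared source. The sketch below does this with the field names from the two tables on this page; the function name `foreign_fields` is hypothetical, and treating any extra field as a violation is an assumption rather than documented Keeptrusts behavior.

```python
BASE_FIELDS = {
    "source", "source_id", "version",
    "retrieved_at", "relevance_score", "token_count",
}
KB_FIELDS = {
    "asset_id", "asset_version", "promotion_status", "binding_id",
    "asset_title", "promoted_at", "promoted_by",
}
FABRIC_FIELDS = {
    "workspace_id", "cache_key", "file_path", "symbol",
    "last_indexed_at", "commit_sha", "language",
}


def foreign_fields(envelope):
    """Return the set of fields that do not belong to the envelope's source."""
    source = envelope.get("source")
    if source == "knowledge_base":
        allowed = BASE_FIELDS | KB_FIELDS
    elif source == "codebase_context_fabric":
        allowed = BASE_FIELDS | FABRIC_FIELDS
    else:
        raise ValueError(f"unknown source: {source!r}")
    return set(envelope) - allowed
```

An empty result means the envelope is consistent with its source. For example, a `knowledge_base` envelope that also carries `file_path` would return `{"file_path"}`.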
Provenance in Prompt Assembly
During prompt assembly, Keeptrusts inserts provenance markers around each context section. These markers are invisible to the language model but are tracked internally for attribution:
[CONTEXT_START source=knowledge_base asset_id=uuid version=3]
...KB content here...
[CONTEXT_END]
[CONTEXT_START source=codebase_context_fabric workspace_id=uuid file_path=src/auth/session.ts]
...fabric content here...
[CONTEXT_END]
The prompt assembler maintains a mapping table that associates each token range in the assembled prompt with its provenance envelope. This table is used after response generation to compute citations.
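The mapping table can be pictured as a list of half-open token ranges, each paired with an envelope. The sketch below is illustrative only: `build_range_table` and `envelope_at` are hypothetical names, and whitespace splitting stands in for whatever tokenizer the assembler actually uses.

```python
from bisect import bisect_right


def build_range_table(chunks):
    """Concatenate chunks and record which token range each one occupies.

    chunks is a list of (text, envelope) pairs. Returns (tokens, table),
    where table holds (start, end, envelope) tuples with end exclusive.
    """
    tokens, table = [], []
    for text, envelope in chunks:
        chunk_tokens = text.split()  # assumption: stand-in for the real tokenizer
        start = len(tokens)
        tokens.extend(chunk_tokens)
        table.append((start, len(tokens), envelope))
    return tokens, table


def envelope_at(table, token_index):
    """Return the envelope covering token_index, or None if no range covers it."""
    starts = [start for start, _, _ in table]
    pos = bisect_right(starts, token_index) - 1
    if pos < 0:
        return None
    start, end, envelope = table[pos]
    return envelope if start <= token_index < end else None
```

Because the ranges are built in order and never overlap, a binary search over the start offsets is enough to resolve any token position to a single envelope.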
Provenance in Response Attribution
After the language model generates a response, the attribution engine analyzes which context chunks most likely influenced each span of the output. The engine produces citation records that link response text back to the originating source with the full provenance chain.
Each citation preserves:
- The original provenance envelope of the source chunk.
- The text span in the response that the citation supports.
- A confidence score indicating how strongly the response text correlates with the source chunk.
This means you can trace any part of a response back to either a specific KB asset version or a specific file and symbol in a codebase.
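To make the trace concrete, the sketch below models a citation record and resolves it to a human-readable origin. The `Citation` class, its field names (`span_start`, `span_end`, `confidence`), and `describe_origin` are assumptions for illustration; the actual citation record schema is not specified on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation record: envelope, response span, confidence."""
    envelope: dict
    span_start: int   # character offset where the supported response text begins
    span_end: int     # character offset where it ends (exclusive)
    confidence: float


def describe_origin(citation):
    """Summarize where the cited chunk came from, based on its envelope."""
    env = citation.envelope
    if env["source"] == "knowledge_base":
        return f"KB asset {env['asset_id']} version {env['asset_version']}"
    if env["source"] == "codebase_context_fabric":
        symbol = env.get("symbol")
        location = f"{env['file_path']}:{symbol}" if symbol else env["file_path"]
        return f"workspace {env['workspace_id']} at {location}"
    raise ValueError(f"unknown source: {env['source']!r}")
```

The branch on `source` is what provenance separation buys here: a KB citation resolves to an asset version, while a fabric citation resolves to a file and, when present, a symbol.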
Audit Trail Implications
Provenance separation enables several audit capabilities:
Compliance Auditing
You can query the event log to answer questions like:
- "Which responses used KB asset X version 2 before it was updated to version 3?"
- "How many responses referenced code from workspace Y in the last 30 days?"
- "Did any response combine a draft KB asset with production code context?"
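As an illustration of the last question, the sketch below scans exported event records for responses that combined a draft KB asset with fabric context. The event shape (a `response_id` plus a `provenance` list of envelopes) is an assumption, not the documented event log schema, and the check treats any fabric chunk as code context because the envelope does not indicate whether a workspace is production.

```python
def mixed_draft_responses(events):
    """Return IDs of responses that used both a draft KB asset and fabric context."""
    hits = []
    for event in events:
        envelopes = event["provenance"]  # hypothetical field holding the envelopes
        has_draft_kb = any(
            env["source"] == "knowledge_base"
            and env.get("promotion_status") == "draft"
            for env in envelopes
        )
        has_fabric = any(
            env["source"] == "codebase_context_fabric" for env in envelopes
        )
        if has_draft_kb and has_fabric:
            hits.append(event["response_id"])
    return hits
```

This query is only possible because each envelope keeps its own `source` and `promotion_status`; a blended record would not allow the two conditions to be tested separately.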
Version Tracking
When a KB asset is updated and a new version is promoted, you can identify all prior responses that used the old version. This supports impact analysis when policies or standards change.
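A minimal sketch of this impact analysis, assuming event records have been exported as dictionaries with a `response_id` and a `provenance` list of envelopes (a hypothetical shape, not the documented schema), groups affected responses by the older version they used:

```python
from collections import defaultdict


def responses_on_older_versions(events, asset_id, current_version):
    """Map each superseded asset_version to the response IDs that used it."""
    by_version = defaultdict(list)
    for event in events:
        for env in event["provenance"]:
            if (
                env.get("source") == "knowledge_base"
                and env.get("asset_id") == asset_id
                and env["asset_version"] < current_version
            ):
                response_id = event["response_id"]
                # A response may cite several chunks of the same asset version.
                if response_id not in by_version[env["asset_version"]]:
                    by_version[env["asset_version"]].append(response_id)
    return dict(by_version)
```

Grouping by version keeps responses from different policy revisions apart, which can help when each revision needs its own review.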
Staleness Detection
Fabric provenance includes last_indexed_at, letting you identify responses where the code context may have been stale relative to the actual codebase state at response time.
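One rough proxy for staleness is the age of the index at retrieval time, computed from two fields already in the fabric envelope. The sketch below uses hypothetical function names and an arbitrary one-hour default threshold. It cannot tell whether the file actually changed after indexing, only how old the index entry was when the chunk was retrieved.

```python
from datetime import datetime, timedelta


def _parse_utc(timestamp):
    # Rewrite a trailing "Z" so fromisoformat accepts it on older Python versions.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def index_age(envelope):
    """Time between the last index run and the retrieval of this chunk."""
    return _parse_utc(envelope["retrieved_at"]) - _parse_utc(envelope["last_indexed_at"])


def is_possibly_stale(envelope, max_age=timedelta(hours=1)):
    """Flag fabric envelopes whose index entry was older than max_age."""
    return index_age(envelope) > max_age
```

For the fabric envelope shown earlier on this page, the index age is 37 minutes (13:45 to 14:22), so it would not be flagged under a one-hour threshold but would be under a 30-minute one.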
Configuration
Provenance tracking is enabled by default. You can control the level of detail stored in the event log:
provenance:
  store_full_envelope: true
  include_in_response_metadata: true
  redact_file_paths: false
  retention_days: 90
| Field | Default | Description |
|---|---|---|
| store_full_envelope | true | Store complete provenance envelopes in the event log |
| include_in_response_metadata | true | Include provenance data in API response metadata |
| redact_file_paths | false | Replace file paths with hashes in stored provenance |
| retention_days | 90 | How long to retain provenance records |
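To give a sense of what `redact_file_paths` implies for stored records, the sketch below replaces `file_path` with a hash before storage. The choice of SHA-256 and the function name `redact_file_path` are assumptions; this page does not state which hash Keeptrusts uses, or whether other fields that embed the path (such as `source_id` or `cache_key` in the fabric example) are also redacted.

```python
import hashlib


def redact_file_path(envelope):
    """Return a copy of the envelope with file_path replaced by its SHA-256 hex digest."""
    redacted = dict(envelope)  # shallow copy so the original envelope is untouched
    path = redacted.get("file_path")
    if path is not None:
        redacted["file_path"] = hashlib.sha256(path.encode("utf-8")).hexdigest()
    return redacted
```

Because the hash is deterministic, records for the same file still group together in queries even though the path itself is no longer readable. KB envelopes have no `file_path` and pass through unchanged.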
Next steps
- See how citations display in responses in Citation and Provenance Display.
- Learn how provenance affects cache key construction in Cache Keys with Mixed Context.
For AI systems
- Canonical terms: Keeptrusts, provenance separation, provenance envelope, knowledge_base source, codebase_context_fabric source, audit trail, context markers.
- Exact feature/config names: provenance.store_full_envelope, provenance.include_in_response_metadata, provenance.redact_file_paths, provenance.retention_days, [CONTEXT_START]/[CONTEXT_END] markers, KB fields (asset_id, asset_version, promotion_status, promoted_by), fabric fields (workspace_id, cache_key, file_path, symbol, commit_sha).
- Best next pages: Citation and Provenance Display, Cache Keys with Mixed Context.
For engineers
- Every context chunk carries a provenance envelope with source, source_id, version, retrieved_at, relevance_score, and token_count.
- KB envelopes extend with: asset_id, asset_version, promotion_status, binding_id, asset_title, promoted_at, promoted_by.
- Fabric envelopes extend with: workspace_id, cache_key, file_path, symbol, last_indexed_at, commit_sha, language.
- Provenance is enabled by default; configure store_full_envelope, include_in_response_metadata, redact_file_paths, and retention_days in YAML.
- Use provenance for impact analysis: when a KB policy changes, identify all prior responses that used the old version.
For leaders
- Provenance separation provides a complete audit trail: trace any AI response back to the exact policy documents and code files that informed it.
- Supports compliance requirements: prove which version of a policy was active when a response was generated.
- Staleness detection via fabric last_indexed_at identifies responses where code context may not have reflected the actual codebase state.
- Configurable retention (default 90 days) balances compliance needs with storage costs.